[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

GB2615598A - Attesting to a set of unconsumed transaction outputs - Google Patents

Attesting to a set of unconsumed transaction outputs Download PDF

Info

Publication number
GB2615598A
GB2615598A GB2201967.3A GB202201967A GB2615598A GB 2615598 A GB2615598 A GB 2615598A GB 202201967 A GB202201967 A GB 202201967A GB 2615598 A GB2615598 A GB 2615598A
Authority
GB
United Kingdom
Prior art keywords
transaction
blockchain
node
outputs
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2201967.3A
Other versions
GB202201967D0 (en
Inventor
Patrick Coughlan Steven
Owen Davies Jack
Zhang Wei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nchain Licensing AG
Original Assignee
Nchain Licensing AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nchain Licensing AG filed Critical Nchain Licensing AG
Priority to GB2201967.3A priority Critical patent/GB2615598A/en
Publication of GB202201967D0 publication Critical patent/GB202201967D0/en
Priority to PCT/EP2023/050839 priority patent/WO2023156102A1/en
Publication of GB2615598A publication Critical patent/GB2615598A/en
Pending legal-status Critical Current

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3247Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving digital signatures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/56Financial cryptography, e.g. electronic payment or e-cash

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A set of unconsumed outputs of blockchain transactions is determined, and a hash tree to attest to the set is formed; wherein a leaf node 302 can each either A) designate Hi a corresponding subset the unconsumed outputs or B) comprise a placeholder value ø. At least one update to the set is determined and the hash tree is recomputed to reflect the update, wherein on at least one occasion the update involves removal of an output and on at least one occasion it involves addition of a new output. When an output is removed, the leaf node designating the removed output may be replaced with a new leaf node which either: A) designates an updated subset of one or more unconsumed outputs not including the removed output, or B) comprises a placeholder value. When a new output is added, a new leaf node may be inserted in the tree in place of a replaceable leaf node that either: A) designates an output being removed, or B) comprises a placeholder value. If no replaceable leaf node is available, the tree is expanded to accommodate the new leaf node. Affected values of the tree up the root 304 may be recomputed.

Description

ATTESTING TO A SET OF UNCONSUMED TRANSACTION OUTPUTS
TECHNICAL FIELD
The present disclosure relates to attesting to a set of unconsumed outputs (e.g. UTX05) of blockchain transactions.
BACKGROUND
A blockchain refers to a form of distributed data structure, wherein a duplicate copy of the blockchain is maintained at each of a plurality of nodes in a distributed peer-to-peer (P2P) network (referred to below as a "blockchain network") and widely publicised. The blockchain comprises a chain of blocks of data, wherein each block comprises one or more transactions. Each transaction, other than so-called "coinbase transactions", points back to a preceding transaction in a sequence which may span one or more blocks going back to one or more coinbase transactions. Coinbase transactions are discussed further below. Transactions that are submitted to the blockchain network are included in new blocks. New blocks are created by a process often referred to as "mining", which involves each of a plurality of the nodes competing to perform "proof-of-work", i.e. solving a cryptographic puzzle based on a representation of a defined set of ordered and validated pending transactions waiting to be included in a new block of the blockchain. It should be noted that the blockchain may be pruned at some nodes, and the publication of blocks can be achieved through the publication of mere block headers.
The transactions in the blockchain may be used for one or more of the following purposes: to convey a digital asset (i.e. a number of digital tokens), to order a set of entries in a virtualised ledger or registry, to receive and process timestamp entries, and/or to time-order index pointers. A blockchain can also be exploited in order to layer additional functionality on top of the blockchain. For example blockchain protocols may allow for storage of additional user data or indexes to data in a transaction. There is no pre-specified limit to the maximum data capacity that can be stored within a single transaction, and therefore increasingly more complex data can be incorporated. For instance this may be used to store an electronic document in the blockchain, or audio or video data.
Nodes of the blockchain network (which are often referred to as "miners") perform a distributed transaction registration and verification process, which will be described in more detail later. In summary, during this process a node validates transactions and inserts them into a block template for which they attempt to identify a valid proof-of-work solution. Once a valid solution is found, a new block is propagated to other nodes of the network, thus enabling each node to record the new block on the blockchain. In order to have a transaction recorded in the blockchain, a user (e.g. a blockchain client application) sends the transaction to one of the nodes of the network to be propagated. Nodes which receive the transaction may race to find a proof-of-work solution incorporating the validated transaction into a new block. Each node is configured to enforce the same node protocol, which will include one or more conditions for a transaction to be valid. Invalid transactions will not be propagated nor incorporated into blocks. Assuming the transaction is validated and thereby accepted onto the blockchain, then the transaction (including any user data) will thus remain registered and indexed at each of the nodes in the blockchain network as an immutable public record.
The node who successfully solved the proof-of-work puzzle to create the latest block is typically rewarded with a new transaction called the "coinbase transaction" which distributes an amount of the digital asset, i.e. a number of tokens. The detection and rejection of invalid transactions is enforced by the actions of competing nodes who act as agents of the network and are incentivised to report and block malfeasance. The widespread publication of information allows users to continuously audit the performance of nodes. The publication of the mere block headers allows participants to ensure the ongoing integrity of the blockchain.
In an "output-based" model (sometimes referred to as a UTXO-based model), the data structure of a given transaction comprises one or more inputs and one or more outputs. Any spendable output comprises an element specifying an amount of the digital asset that is derivable from the proceeding sequence of transactions. The spendable output is sometimes referred to as a UTXO ("unspent transaction output"). The output may further comprise a locking script specifying a condition for the future redemption of the output. A locking script is a predicate defining the conditions necessary to validate and transfer digital tokens or assets. Each input of a transaction (other than a coinbase transaction) comprises a pointer (i.e. a reference) to such an output in a preceding transaction, and may further comprise an unlocking script for unlocking the locking script of the pointed-to output. So consider a pair of transactions, call them a first and a second transaction (or "target" transaction). The first transaction comprises at least one output specifying an amount of the digital asset, and comprising a locking script defining one or more conditions of unlocking the output. The second, target transaction comprises at least one input, comprising a pointer to the output of the first transaction, and an unlocking script for unlocking the output of the first transaction.
In such a model, when the second, target transaction is sent to the blockchain network to be propagated and recorded in the blockchain, one of the criteria for validity applied at each node will be that the unlocking script meets all of the one or more conditions defined in the locking script of the first transaction. Another will be that the output of the first transaction has not already been redeemed by another, earlier valid transaction. Any node that finds the target transaction invalid according to any of these conditions will not propagate it (as a valid transaction, but possibly to register an invalid transaction) nor include it in a new block to be recorded in the blockchain.
SUMMARY
It may be desired for a party to attest to a set of unconsumed transaction outputs (e.g. UTX0s). This may be done using a hash tree, also known as a Merkle tree. A hash tree in general is a data structure comprising a root node, one or more internal layers, and a leaf layer, arranged in a hierarchy from root (highest layer) to leaf layer (lowest in the hierarchy). Each of the internal layers and the leaf layer comprises a plurality of nodes at that layer. The root node is a hash of the nodes in the next layer down in the hierarchy, and in each internal layer, each node is a hash of a different subset of the nodes at the next layer down in the hierarchy. The leaf nodes themselves are typically also hashes, each a hash of respective preimage data. To use such a structure to attest to (e.g. publicly commit to) a set of transaction outputs, each leaf can be used to designate a different corresponding subset of the transaction outputs (e.g. the UTX0s of a given transaction), e.g. by each leaf comprising hash of its corresponding subset of UTX0s (or the like). By publishing the root of the hash tree, this will make a public commitment to the publishing party's view of the UTXO set. This will enable other parties to prove that a given target UTXO is part of the attested-to set given the root, the target UTXO and information about the hashes along the path between the target's leaf and the root (the so-called "Merkle path").
However, there is an issue in that it would be computationally intense to have to recompute the entire hash tree every time the UTXO set (or such like) is updated.
According to one aspect disclosed herein, there is provided method comprising: determining a set of unconsumed outputs of blockchain transactions recorded on and/or submitted for recordal on a blockchain, and forming a hash tree to attest to the set. The hash tree comprises a plurality of layers arranged in a hierarchy from a leaf layer up to a root layer with one or more internal layers in between. The root layer comprises a root node, and each internal layer and the leaf layer each comprise a respective plurality of nodes with each node having an indexed position within its respective layer. At each layer but the leaf layer, each node within the layer comprises a hash of a combination of a different corresponding subset of at least two of the nodes at the next layer down in the hierarchy. According to the present disclosure, at least some of the leaf nodes can each either A) designate a corresponding subset of one or more of the unconsumed outputs or B) corn prise a placeholder value.
The method further comprises, on each of one or more occasions: I) determining an update to the set of unconsumed outputs, each update comprising an addition of one or more new outputs and/or removal of one or more outputs which have now been consumed, and II) recomputing the hash tree to reflect the update. On at least one of the occasions the update comprises removal of at least one output and on at least one of the occasions the update comprises the addition of at least one new output.
On each occasion the recomputing of the hash tree comprises: when any output is removed, removing the leaf node designating the removed output and replacing the removed leaf node, at the same position in the leaf layer, with a new leaf node which either: A) designates an updated subset of one or more unconsumed outputs not including the removed output, or B) comprises a placeholder value; and when any new output is added, forming a new leaf node designating a subset of one or more unconsumed outputs including at least the new output, wherein the new leaf node is inserted in the tree in place of a replaceable leaf node when available, at the same position in the leaf layer, a replaceable leaf node being a leaf node that either: A) designates an output being removed, or B) comprises a placeholder value; but otherwise if no replaceable leaf node is available, the tree is expanded to accommodate the new leaf node.
Thus, rather than restructuring the tree every time there is an update, the disclosed method uses a scheme of placeholder values, which could also be described as "dummy" leaves.
They can be inserted in place of the designation of removed UTX05 and then replaced, either in the current update or later, by a designation of newly added UTX05 (or more generally transaction outputs). One effect of this is that the operator has the freedom to not necessarily recompute the entire tree every time there is an update, but instead recompute only a subtree comprising the affected nodes.
Preferably the method allows a placeholder value to be added to replace a removed leaf node removed from any arbitrary position in the leaf layer. For example, the placeholder values are not constrained to being added only to pad out the right hand side of the tree. Similarly the method preferably allows a new node to be added to replace a placeholder value that may have been included at any position in the leaf layer, including the possibility of a placeholder value that is left of a non-placeholder leaf node (one that actually designates one or more transaction outputs). E.g. the new nodes are not necessarily only added in place of right-hand padding of the tree.
BRIEF DESCRIPTION OF THE DRAWINGS
To assist understanding of embodiments of the present disclosure and to show how such embodiments may be put into effect, reference is made, by way of example only, to the accompanying drawings in which: Figure 1 is a schematic block diagram of a system for implementing a blockchain, Figure 2 schematically illustrates some examples of transactions which may be recorded in a blockchain, Figures 3 and 4 schematically illustrate the concept of a Merkle tree by way of example, Figure 5 schematically illustrates a scheme for indexing nodes of a Merkle tree, Figure 6 schematically illustrates by way of example the concept of a holey Merkle tree, Figure 7 is a flow chart of an example method according to embodiments disclosed herein, and Figure 8 is a flow chart of a method of providing and executing a hash puzzle, Figure 9 is a schematic transaction diagram of a an example of a challenge transaction and a corresponding candidate transaction, Figure 10 is a schematic illustration of a an example of a process of updating a holey Merkle tree, and Figure 11 schematically illustrates a relationship between the changelog A, the new set of leaves.
DETAILED DESCRIPTION OF EMBODIMENTS
In the Bitcoin system and other UTXO-based blockchains, the state of the blockchain at a particular block height can be considered as the set of unspent transaction outputs recorded on chain, sometimes known as the UTXO set. A given node (or "miner") of the blockchain network may also maintain a local UTXO set that additionally includes those in the pool of pending transactions maintained at the node.
The UTXO set is used for validating incoming transactions, and each blockchain node therefore maintains a local copy of the UTXO set, according to the worldview of that node. However, the UTXO set must be derived from the historical transactions that have occurred on the blockchain.
In order to be confident of the state of the UTXO set at a particular block height, it is currently necessary to process and validate the entire historical chain of transactions up to the block height for which the UTXO set is being derived. This is a resource-intensive process, which makes it difficult for users of the blockchain network (i.e. non-nodes) to obtain a trustworthy view of the UTXO set at any given block height. Although all the miners are likely to have the same view of the UTXO set for a given block height, they would require a standardised deterministic approach to committing the set so that the commitment can be verified and attested to.
In the following disclosure, there is provided a deterministic mechanism for processing, structuring, and committing to the UTXO set, e.g. at any given block height, such that the mechanism can be utilised by any observer to check the membership of a particular UTXO at a given block height. The disclosure provides a new data structure for a UTXO set, structuring the UTXO set as a Holey Merkle Tree (HMT). The disclosure also provides an efficient algorithm to update the UTXO set by incremental deterministic UTXO set updates. Embodiments also provide a commitment scheme that ensures the integrity of the UTXO set, by incentivising miners to publicly commit to a particular UTXO set snapshot.
For instance, the disclosed embodiments for committing to the state of the UTXO set at a particular block height enable an efficient verification on a given UTXO set for the given block height. Once the UTXO set is verified, it can be used in a wide variety of applications to enable users to check membership of UTX05 in the set at a particular historical point in time.
1. EXAMPLE SYSTEM OVERVIEW Figure 1 shows an example system 100 for implementing a blockchain 150. The system 100 may comprise a packet-switched network 101, typically a wide-area internetwork such as the Internet. The packet-switched network 101 comprises a plurality of blockchain nodes 104 that may be arranged to form a peer-to-peer (P2P) network 106 within the packet-switched network 101. Whilst not illustrated, the blockchain nodes 104 may be arranged as a near-complete graph. Each blockchain node 104 is therefore highly connected to other blockchain nodes 104.
Each blockchain node 104 comprises computer equipment of a peer, with different ones of the nodes 104 belonging to different peers. Each blockchain node 104 comprises processing apparatus comprising one or more processors, e.g. one or more central processing units (CPUs), accelerator processors, application specific processors and/or field programmable gate arrays (FPGAs), and other equipment such as application specific integrated circuits (ASICs). Each node also comprises memory, i.e. computer-readable storage in the form of a non-transitory computer-readable medium or media. The memory may comprise one or more memory units employing one or more memory media, e.g. a magnetic medium such as a hard disk; an electronic medium such as a solid-state drive (SSD), flash memory or EEPROM; and/or an optical medium such as an optical disk drive.
The blockchain 150 comprises a chain of blocks of data 151, wherein a respective copy of the blockchain 150 is maintained at each of a plurality of blockchain nodes 104 in the distributed or blockchain network 106. As mentioned above, maintaining a copy of the blockchain 150 does not necessarily mean storing the blockchain 150 in full. Instead, the blockchain 150 may be pruned of data so long as each blockchain node 150 stores the block header (discussed below) of each block 151. Each block 151 in the chain comprises one or more transactions 152, wherein a transaction in this context refers to a kind of data structure. The nature of the data structure will depend on the type of transaction protocol used as part of a transaction model or scheme. A given blockchain will use one particular transaction protocol throughout. In one common type of transaction protocol, the data structure of each transaction 152 comprises at least one input and at least one output. Each output specifies an amount representing a quantity of a digital asset as property, an example of which is a user 103 to whom the output is cryptographically locked (requiring a signature or other solution of that user in order to be unlocked and thereby redeemed or spent). Each input points back to the output of a preceding transaction 152, thereby linking the transactions.
Each block 151 also comprises a block pointer 155 pointing back to the previously created block 151 in the chain so as to define a sequential order to the blocks 151. Each transaction 152 (other than a coinbase transaction) comprises a pointer back to a previous transaction so as to define an order to sequences of transactions (N.B. sequences of transactions 152 are allowed to branch). The chain of blocks 151 goes all the way back to a genesis block (Gb) 153 which was the first block in the chain. One or more original transactions 152 early on in the chain 150 pointed to the genesis block 153 rather than a preceding transaction.
Each of the blockchain nodes 104 is configured to forward transactions 152 to other blockchain nodes 104, and thereby cause transactions 152 to be propagated throughout the network 106. Each blockchain node 104 is configured to create blocks 151 and to store a respective copy of the same blockchain 150 in their respective memory. Each blockchain node 104 also maintains an ordered set (or "pool") 154 of transactions 152 waiting to be incorporated into blocks 151. The ordered pool 154 is often referred to as a "mempool". This term herein is not intended to limit to any particular blockchain, protocol or model. It refers to the ordered set of transactions which a node 104 has accepted as valid and for which the node 104 is obliged not to accept any other transactions attempting to spend the same output.
In a given present transaction 152j, the (or each) input comprises a pointer referencing the output of a preceding transaction 152i in the sequence of transactions, specifying that this output is to be redeemed or "spent" in the present transaction 152j. Spending or redeeming does not necessarily imply transfer of a financial asset, though that is certainly one common application. More generally spending could be described as consuming the output, or assigning it to one or more outputs in another, onward transaction. In general, the preceding transaction could be any transaction in the ordered set 154 or any block 151. The preceding transaction 152i need not necessarily exist at the time the present transaction 152j is created or even sent to the network 106, though the preceding transaction 152i will need to exist and be validated in order for the present transaction to be valid. Hence "preceding" herein refers to a predecessor in a logical sequence linked by pointers, not necessarily the time of creation or sending in a temporal sequence, and hence it does not necessarily exclude that the transactions 152i, 152j be created or sent out-of-order (see discussion below on orphan transactions). The preceding transaction 152i could equally be called the antecedent or predecessor transaction.
The input of the present transaction 152j also comprises the input authorisation, for example the signature of the user 103a to whom the output of the preceding transaction 152i is locked. In turn, the output of the present transaction 152j can be cryptographically locked to a new user or entity 103b. The present transaction 152j can thus transfer the amount defined in the input of the preceding transaction 152i to the new user or entity 103b as defined in the output of the present transaction 152j. In some cases a transaction 152 may have multiple outputs to split the input amount between multiple users or entities (one of whom could be the original user or entity 103a in order to give change). In some cases a transaction can also have multiple inputs to gather together the amounts from multiple outputs of one or more preceding transactions, and redistribute to one or more outputs of the current transaction.
According to an output-based transaction protocol such as bitcoin, when a party 103, such as an individual user or an organization, wishes to enact a new transaction 1521 (either manually or by an automated process employed by the party), then the enacting party sends the new transaction from its computer terminal 102 to a recipient. The enacting party or the recipient will eventually send this transaction to one or more of the blockchain nodes 104 of the network 106 (which nowadays are typically servers or data centres, but could in principle be other user terminals). It is also not excluded that the party 103 enacting the new transaction 152j could send the transaction directly to one or more of the blockchain nodes 104 and, in some examples, not to the recipient. A blockchain node 104 that receives a transaction checks whether the transaction is valid according to a blockchain node protocol which is applied at each of the blockchain nodes 104. The blockchain node protocol typically requires the blockchain node 104 to check that a cryptographic signature in the new transaction 152j matches the expected signature, which depends on the previous transaction 152i in an ordered sequence of transactions 152. In such an output-based transaction protocol, this may comprise checking that the cryptographic signature or other authorisation of the party 103 included in the input of the new transaction 152j matches a condition defined in the output of the preceding transaction 152i which the new transaction spends (or "assigns"), wherein this condition typically comprises at least checking that the cryptographic signature or other authorisation in the input of the new transaction 152j unlocks the output of the previous transaction 152i to which the input of the new transaction is linked to. The condition may be at least partially defined by a script included in the output of the preceding transaction 152i. Alternatively it could simply be fixed by the blockchain node protocol alone, or it could be due to a combination of these. Either way, if the new transaction 152j is valid, the blockchain node 104 forwards it to one or more other blockchain nodes 104 in the blockchain network 106. These other blockchain nodes 104 apply the same test according to the same blockchain node protocol, and so forward the new transaction 152j on to one or more further nodes 104, and so forth. In this way the new transaction is propagated throughout the network of blockchain nodes 104.
In an output-based model, the definition of whether a given output (e.g. UTXO) is assigned (or "spent") is whether it has yet been validly redeemed by the input of another, onward transaction 152j according to the blockchain node protocol. Another condition for a transaction to be valid is that the output of the preceding transaction 152i which it attempts to redeem has not already been redeemed by another transaction. Again if not valid, the transaction 152j will not be propagated (unless flagged as invalid and propagated for alerting) or recorded in the blockchain 150. This guards against double-spending whereby the transactor tries to assign the output of the same transaction more than once. An account-based model on the other hand guards against double-spending by maintaining an account balance. Because again there is a defined order of transactions, the account balance has a single defined state at any one time.
In addition to validating transactions, blockchain nodes 104 also race to be the first to create blocks of transactions in a process commonly referred to as mining, which is supported by "proof-of-work". At a blockchain node 104, new transactions are added to an ordered pool 154 of valid transactions that have not yet appeared in a block 151 recorded on the blockchain 150. The blockchain nodes then race to assemble a new valid block 151 of transactions 152 from the ordered set of transactions 154 by attempting to solve a cryptographic puzzle. Typically this comprises searching for a "nonce" value such that when the nonce is concatenated with a representation of the ordered pool of pending transactions 154 and hashed, then the output of the hash meets a predetermined condition.
E.g. the predetermined condition may be that the output of the hash has a certain predefined number of leading zeros. Note that this is just one particular type of proof-ofwork puzzle, and other types are not excluded. A property of a hash function is that it has an unpredictable output with respect to its input. Therefore this search can only be performed by brute force, thus consuming a substantive amount of processing resource at each blockchain node 104 that is trying to solve the puzzle.
The first blockchain node 104 to solve the puzzle announces this to the network 106, providing the solution as proof which can then be easily checked by the other blockchain nodes 104 in the network (once given the solution to a hash it is straightforward to check that it causes the output of the hash to meet the condition). The first blockchain node 104 propagates a block to a threshold consensus of other nodes that accept the block and thus enforce the protocol rules. The ordered set of transactions 154 then becomes recorded as a new block 151 in the blockchain 150 by each of the blockchain nodes 104. A block pointer 155 is also assigned to the new block 151n pointing back to the previously created block 151n-1 in the chain. The significant amount of effort, for example in the form of hash, required to create a proof-of-work solution signals the intent of the first node 104 to follow the rules of the blockchain protocol. Such rules include not accepting a transaction as valid if it spends or assigns the same output as a previously validated transaction, otherwise known as double-spending. Once created, the block 151 cannot be modified since it is recognized and maintained at each of the blockchain nodes 104 in the blockchain network 106. The block pointer 155 also imposes a sequential order to the blocks 151. Since the transactions 152 are recorded in the ordered blocks at each blockchain node 104 in a network 106, this therefore provides an immutable public ledger of the transactions.
Note that different blockchain nodes 104 racing to solve the puzzle at any given time may be doing so based on different snapshots of the pool of yet-to-be published transactions 154 at any given time, depending on when they started searching for a solution or the order in which the transactions were received. Whoever solves their respective puzzle first defines which transactions 152 are included in the next new block 151n and in which order, and the current pool 154 of unpublished transactions is updated. The blockchain nodes 104 then continue to race to create a block from the newly-defined ordered pool of unpublished transactions 154, and so forth. A protocol also exists for resolving any "fork" that may arise, which is where two blockchain nodes104 solve their puzzle within a very short time of one another such that a conflicting view of the blockchain gets propagated between nodes 104. In short, whichever prong of the fork grows the longest becomes the definitive blockchain 150. Note this should not affect the users or agents of the network as the same transactions will appear in both forks.
According to the bitcoin blockchain (and most other blockchains) a node that successfully constructs a new block 104 is granted the ability to newly assign an additional, accepted amount of the digital asset in a new special kind of transaction which distributes an additional defined quantity of the digital asset (as opposed to an inter-agent, or inter-user transaction which transfers an amount of the digital asset from one agent or user to another). This special type of transaction is usually referred to as a "coinbase transaction", but may also be termed an "initiation transaction" or "generation transaction". It typically forms the first transaction of the new block 151n. The proof-of-work signals the intent of the node that constructs the new block to follow the protocol rules allowing this special transaction to be redeemed later. The blockchain protocol rules may require a maturity period, for example 100 blocks, before this special transaction may be redeemed. Often a regular (non-generation) transaction 152 will also specify an additional transaction fee in one of its outputs, to further reward the blockchain node 104 that created the block 151n in which that transaction was published. This fee is normally referred to as the "transaction fee", and is discussed blow.
Due to the resources involved in transaction validation and publication, typically at least each of the blockchain nodes 104 takes the form of a server comprising one or more physical server units, or even whole a data centre. However in principle any given blockchain node 104 could take the form of a user terminal or a group of user terminals networked together.
The memory of each blockchain node 104 stores software configured to run on the processing apparatus of the blockchain node 104 in order to perform its respective role or roles and handle transactions 152 in accordance with the blockchain node protocol. It will be understood that any action attributed herein to a blockchain node 104 may be performed by the software run on the processing apparatus of the respective computer equipment. The node software may be implemented in one or more applications at the application layer, or a lower layer such as the operating system layer or a protocol layer, or any combination of these.
Also connected to the network 101 is the computer equipment 102 of each of a plurality of parties 103 in the role of consuming users. These users may interact with the blockchain network 106 but do not participate in validating transactions or constructing blocks. Some of these users or agents 103 may act as senders and recipients in transactions. Other users may interact with the blockchain 150 without necessarily acting as senders or recipients. For instance, some parties may act as storage entities that store a copy of the blockchain 150 (e.g. having obtained a copy of the blockchain from a blockchain node 104).
Some or all of the parties 103 may be connected as part of a different network, e.g. a network overlaid on top of the blockchain network 106. Users of the blockchain network (often referred to as "clients") may be said to be part of a system that includes the blockchain network 106; however, these users are not blockchain nodes 104 as they do not perform the roles required of the blockchain nodes. Instead, each party 103 may interact with the blockchain network 106 and thereby utilize the blockchain 150 by connecting to (i.e. communicating with) a blockchain node 106. Two parties 103 and their respective equipment 102 are shown for illustrative purposes: a first party 103a and his/her respective computer equipment 102a, and a second party 103b and his/her respective computer equipment 102b. It will be understood that many more such parties 103 and their respective computer equipment 102 may be present and participating in the system 100, but for convenience they are not illustrated. Each party 103 may be an individual or an organization. Purely by way of illustration the first party 103a is referred to herein as Alice and the second party 103b is referred to as Bob, but it will be appreciated that this is not limiting and any reference herein to Alice or Bob may be replaced with "first party" and "second "party" respectively.
The computer equipment 102 of each party 103 comprises respective processing apparatus comprising one or more processors, e.g. one or more CPUs, GPUs, other accelerator processors, application specific processors, and/or FPGAs. The computer equipment 102 of each party 103 further comprises memory, i.e. computer-readable storage in the form of a non-transitory computer-readable medium or media. This memory may comprise one or more memory units employing one or more memory media, e.g. a magnetic medium such as hard disk; an electronic medium such as an SSD, flash memory or [[PROM; and/or an optical medium such as an optical disc drive. The memory on the computer equipment 102 of each party 103 stores software comprising a respective instance of at least one client application 105 arranged to run on the processing apparatus. It will be understood that any action attributed herein to a given party 103 may be performed using the software run on the processing apparatus of the respective computer equipment 102. The computer equipment 102 of each party 103 comprises at least one user terminal, e.g. a desktop or laptop computer, a tablet, a smartphone, or a wearable device such as a smartwatch. The computer equipment 102 of a given party 103 may also comprise one or more other networked resources, such as cloud computing resources accessed via the user terminal.
The client application 105 may be initially provided to the computer equipment 102 of any given party 103 on suitable computer-readable storage medium or media, e.g. downloaded from a server, or provided on a removable storage device such as a removable SSD, flash memory key, removable [EPROM, removable magnetic disk drive, magnetic floppy disk or tape, optical disk such as a CD or DVD ROM, or a removable optical drive, etc. The client application 105 comprises at least a "wallet" function. This has two main functionalities. One of these is to enable the respective party 103 to create, authorise (for example sign) and send transactions 152 to one or more bitcoin nodes 104 to then be propagated throughout the network of blockchain nodes 104 and thereby included in the blockchain 150. The other is to report back to the respective party the amount of the digital asset that he or she currently owns. In an output-based system, this second functionality comprises collating the amounts defined in the outputs of the various 152 transactions scattered throughout the blockchain 150 that belong to the party in question.
Note: whilst the various client functionality may be described as being integrated into a given client application 105, this is not necessarily limiting and instead any client functionality described herein may instead be implemented in a suite of two or more distinct applications, e.g. interfacing via an API, or one being a plug-in to the other. More generally the client functionality could be implemented at the application layer or a lower layer such as the operating system, or any combination of these. The following will be described in terms of a client application 105 but it will be appreciated that this is not limiting.
The instance of the client application or software 105 on each computer equipment 102 is operatively coupled to at least one of the blockchain nodes 104 of the network 106. This enables the wallet function of the client 105 to send transactions 152 to the network 106. The client 105 is also able to contact blockchain nodes 104 in order to query the blockchain 150 for any transactions of which the respective party 103 is the recipient (or indeed inspect other parties' transactions in the blockchain 150, since in embodiments the blockchain 150 is a public facility which provides trust in transactions in part through its public visibility).
The wallet function on each computer equipment 102 is configured to formulate and send transactions 152 according to a transaction protocol. As set out above, each blockchain node 104 runs software configured to validate transactions 152 according to the blockchain node protocol, and to forward transactions 152 in order to propagate them throughout the blockchain network 106. The transaction protocol and the node protocol correspond to one another, and a given transaction protocol goes with a given node protocol, together implementing a given transaction model. The same transaction protocol is used for all transactions 152 in the blockchain 150. The same node protocol is used by all the nodes 104 in the network 106.
When a given party 103, say Alice, wishes to send a new transaction 152j to be included in the blockchain 150, then she formulates the new transaction in accordance with the relevant transaction protocol (using the wallet function in her client application 105). She then sends the transaction 152 from the client application 105 to one or more blockchain nodes 104 to which she is connected. E.g. this could be the blockchain node 104 that is best connected to Alice's computer 102. When any given blockchain node 104 receives a new transaction 152j, it handles it in accordance with the blockchain node protocol and its respective role. This comprises first checking whether the newly received transaction 152j meets a certain condition for being "valid", examples of which will be discussed in more detail shortly. In some transaction protocols, the condition for validation may be configurable on a per-transaction basis by scripts included in the transactions 152.
Alternatively the condition could simply be a built-in feature of the node protocol, or be defined by a combination of the script and the node protocol.
On condition that the newly received transaction 152j passes the test for being deemed valid (i.e. on condition that it is "validated"), any blockchain node 104 that receives the transaction 152j will add the new validated transaction 152 to the ordered set of transactions 154 maintained at that blockchain node 104. Further, any blockchain node 104 that receives the transaction 152j will propagate the validated transaction 152 onward to one or more other blockchain nodes 104 in the network 106. Since each blockchain node 104 applies the same protocol, then assuming the transaction 152j is valid, this means it will soon be propagated throughout the whole network 106.
Once admitted to the ordered pool of pending transactions 154 maintained at a given blockchain node 104, that blockchain node 104 will start competing to solve the proof-ofwork puzzle on the latest version of their respective pool of 154 including the new transaction 152 (recall that other blockchain nodes 104 may be trying to solve the puzzle based on a different pool of transactions154, but whoever gets there first will define the set of transactions that are included in the latest block 151. Eventually a blockchain node 104 will solve the puzzle for a part of the ordered pool 154 which includes Alice's transaction 152j). Once the proof-of-work has been done for the pool 154 including the new transaction 152j, it immutably becomes part of one of the blocks 151 in the blockchain 150. Each transaction 152 comprises a pointer back to an earlier transaction, so the order of the transactions is also immutably recorded.
Different blockchain nodes 104 may receive different instances of a given transaction first and therefore have conflicting views of which instance is 'valid' before one instance is published in a new block 151, at which point all blockchain nodes 104 agree that the published instance is the only valid instance. If a blockchain node 104 accepts one instance as valid, and then discovers that a second instance has been recorded in the blockchain 150 then that blockchain node 104 must accept this and will discard (i.e. treat as invalid) the instance which it had initially accepted (i.e. the one that has not been published in a block 151).
An alternative type of transaction protocol operated by some blockchain networks may be referred to as an "account-based" protocol, as part of an account-based transaction model. In the account-based case, each transaction does not define the amount to be transferred by referring back to the UTXO of a preceding transaction in a sequence of past transactions, but rather by reference to an absolute account balance. The current state of all accounts is stored, by the nodes of that network, separate to the blockchain and is updated constantly. In such a system, transactions are ordered using a running transaction tally of the account (also called the "position"). This value is signed by the sender as part of their cryptographic signature and is hashed as part of the transaction reference calculation. In addition, an optional data field may also be signed the transaction. This data field may point back to a previous transaction, for example if the previous transaction ID is included in the data field.
2. UTXO-BASED MODEL Figure 2 illustrates an example transaction protocol. This is an example of a UTXO-based protocol. A transaction 152 (abbreviated "Tx") is the fundamental data structure of the blockchain 150 (each block 151 comprising one or more transactions 152). The following will be described by reference to an output-based or "UTXO" based protocol. However, this is not limiting to all possible embodiments. Note that while the example UTXO-based protocol is described with reference to bitcoin, it may equally be implemented on other example blockchain networks.
In a UTXO-based model, each transaction ("Tx") 152 comprises a data structure comprising one or more inputs 202, and one or more outputs 203. Each output 203 may comprise an unspent transaction output (UTXO), which can be used as the source for the input 202 of another new transaction (if the UTXO has not already been redeemed). The UTXO includes a value specifying an amount of a digital asset. This represents a set number of tokens on the distributed ledger. The UTXO may also contain the transaction ID of the transaction from which it came, amongst other information. The transaction data structure may also comprise a header 201, which may comprise an indicator of the size of the input field(s) 202 and output field(s) 203. The header 201 may also include an ID of the transaction. In embodiments the transaction ID is the hash of the transaction data (excluding the transaction ID itself) and stored in the header 201 of the raw transaction 152 submitted to the nodes 104.
Say Alice 103a wishes to create a transaction 152j transferring an amount of the digital asset in question to Bob 103b. In Figure 2 Alice's new transaction 152j is labelled "Tx!'. It takes an amount of the digital asset that is locked to Alice in the output 203 of a preceding transaction 152i in the sequence, and transfers at least some of this to Bob. The preceding transaction 152i is labelled "Txo" in Figure 2. Txo and Tx/are just arbitrary labels. They do not necessarily mean that Txois the first transaction in the blockchain 151, nor that Tx/ is the immediate next transaction in the pool 154. Tx1 could point back to any preceding (i.e. antecedent) transaction that still has an unspent output 203 locked to Alice.
The preceding transaction Tx0 may already have been validated and included in a block 151 of the blockchain 150 at the time when Alice creates her new transaction Tri, or at least by the time she sends it to the network 106. It may already have been included in one of the blocks 151 at that time, or it may be still waiting in the ordered set 154 in which case it will soon be included in a new block 151. Alternatively Tx0 and Tx/ could be created and sent to the network 106 together, or Tx() could even be sent after Tx/ lithe node protocol allows for buffering "orphan" transactions. The terms "preceding" and "subsequent" as used herein in the context of the sequence of transactions refer to the order of the transactions in the sequence as defined by the transaction pointers specified in the transactions (which transaction points back to which other transaction, and so forth). They could equally be replaced with "predecessor" and "successor", or "antecedent" and "descendant", "parent" and "child", or such like. It does not necessarily imply an order in which they are created, sent to the network 106, or arrive at any given blockchain node 104. Nevertheless, a subsequent transaction (the descendent transaction or "child") which points to a preceding transaction (the antecedent transaction or "parent") will not be validated until and unless the parent transaction is validated. A child that arrives at a blockchain node 104 before its parent is considered an orphan. It may be discarded or buffered for a certain time to wait for the parent, depending on the node protocol and/or node behaviour.
One of the one or more outputs 203 of the preceding transaction Txo comprises a particular UTXO, labelled here UTX0o. Each UTXO comprises a value specifying an amount of the digital asset represented by the UTXO, and a locking script which defines a condition which must be met by an unlocking script in the input 202 of a subsequent transaction in order for the subsequent transaction to be validated, and therefore for the UTXO to be successfully redeemed. Typically the locking script locks the amount to a particular party (the beneficiary of the transaction in which it is included). I.e. the locking script defines an unlocking condition, typically comprising a condition that the unlocking script in the input of the subsequent transaction comprises the cryptographic signature of the party to whom the preceding transaction is locked.
The locking script (aka scriptPubKey) is a piece of code written in the domain specific language recognized by the node protocol. A particular example of such a language is called "Script" (capital S) which is used by the blockchain network. The locking script specifies what information is required to spend a transaction output 203, for example the requirement of Alice's signature. Unlocking scripts appear in the outputs of transactions. The unlocking script (aka scriptSig) is a piece of code written the domain specific language that provides the information required to satisfy the locking script criteria. For example, it may contain Bob's signature. Unlocking scripts appear in the input 202 of transactions.
So in the example illustrated, UTX00 in the output 203 of Txocomprises a locking script [Checksig PA] which requires a signature Sig PA of Alice in order for UTX00 to be redeemed (strictly, in order for a subsequent transaction attempting to redeem UTX00 to be valid). [Checksig PA] contains a representation (i.e. a hash) of the public key PA from a public-private key pair of Alice. The input 202 of Tx] comprises a pointer pointing back to Tx] (e.g. by means of its transaction ID, Tx1,00, which in embodiments is the hash of the whole transaction Txo). The input 202 of Do comprises an index identifying UTX0owithin Txo, to identify it amongst any other possible outputs of Txo. The input 202 of Txi further comprises an unlocking script <Sig PA> which comprises a cryptographic signature of Alice, created by Alice applying her private key from the key pair to a predefined portion of data (sometimes called the "message" in cryptography). The data (or "message") that needs to be signed by Alice to provide a valid signature may be defined by the locking script, or by the node protocol, or by a combination of these.
When the new transaction Tx/i arrives at a blockchain node 104, the node applies the node protocol. This comprises running the locking script and unlocking script together to check whether the unlocking script meets the condition defined in the locking script (where this condition may comprise one or more criteria). In embodiments this involves concatenating the two scripts: <Sig PA> < PA> II [Checksig PA] where "11" represents a concatenation and "<...>" means place the data on the stack, and "[...]" is a function comprised by the locking script (in this example a stack-based language). Equivalently the scripts may be run one after the other, with a common stack, rather than concatenating the scripts. Either way, when run together, the scripts use the public key PA of Alice, as included in the locking script in the output of Txo, to authenticate that the unlocking script in the input of Tx/ contains the signature of Alice signing the expected portion of data. The expected portion of data itself (the "message") also needs to be included in order to perform this authentication. In embodiments the signed data comprises the whole of Tx/ (so a separate element does not need to be included specifying the signed portion of data in the clear, as it is already inherently present).
The details of authentication by public-private cryptography will be familiar to a person skilled in the art. Basically, if Alice has signed a message using her private key, then given Alice's public key and the message in the clear, another entity such as a node 104 is able to authenticate that the message must have been signed by Alice. Signing typically comprises hashing the message, signing the hash, and tagging this onto the message as a signature, thus enabling any holder of the public key to authenticate the signature. Note therefore that any reference herein to signing a particular piece of data or part of a transaction, or such like, can in embodiments mean signing a hash of that piece of data or part of the transaction.
If the unlocking script in Txt meets the one or more conditions specified in the locking script of Tx° (so in the example shown, if Alice's signature is provided in Tx/ and authenticated), then the blockchain node 104 deems Tx/ valid. This means that the blockchain node 104 will add Tx/to the ordered pool of pending transactions 154. The blockchain node 104 will also forward the transaction Tx/to one or more other blockchain nodes 104 in the network 106, so that it will be propagated throughout the network 106. Once Tx/ has been validated and included in the blockchain 150, this defines UTX00 from Txo as spent. Note that Tx/ can only be valid if it spends an unspent transaction output 203. If it attempts to spend an output that has already been spent by another transaction 152, then Tx/will be invalid even if all the other conditions are met. Hence the blockchain node 104 also needs to check whether the referenced UTXO in the preceding transaction Tx° is already spent (i.e. whether it has already formed a valid input to another valid transaction). This is one reason why it is important for the blockchain 150 to impose a defined order on the transactions 152. In practice a given blockchain node 104 may maintain a separate database marking which UTX05 203 in which transactions 152 have been spent, but ultimately what defines whether a UTXO has been spent is whether it has already formed a valid input to another valid transaction in the blockchain 150.
lithe total amount specified in all the outputs 203 of a given transaction 152 is greater than the total amount pointed to by all its inputs 202, this is another basis for invalidity in most transaction models. Therefore such transactions will not be propagated nor included in a block 151.
Note that in UTXO-based transaction models, a given UTXO needs to be spent as a whole. It cannot "leave behind" a fraction of the amount defined in the UTXO as spent while another fraction is spent. However the amount from the UTXO can be split between multiple outputs of the next transaction. E.g. the amount defined in UTX0o in Txo can be split between multiple UTX05 in Txt. Hence if Alice does not want to give Bob all of the amount defined in UTX0o, she can use the remainder to give herself change in a second output of Tx4 or pay another party.
In practice Alice will also usually need to include a fee for the bitcoin node 104 that successfully includes her transaction 104 in a block 151. If Alice does not include such a fee, Txo may be rejected by the blockchain nodes 104, and hence although technically valid, may not be propagated and included in the blockchain 150 (the node protocol does not force blockchain nodes 104 to accept transactions 152 if they don't want). In some protocols, the transaction fee does not require its own separate output 203 (i.e. does not need a separate UTXO). Instead any difference between the total amount pointed to by the input(s) 202 and the total amount of specified in the output(s) 203 of a given transaction 152 is automatically given to the blockchain node 104 publishing the transaction. E.g. say a pointer to UTX0o is the only input to Do, and Tx/ has only one output UTX61/. If the amount of the digital asset specified in UTX00 is greater than the amount specified in UTX01, then the difference may be assigned (or spent) by the node 104 that wins the proof-of-work race to create the block containing UTX01. Alternatively or additionally however, it is not necessarily excluded that a transaction fee could be specified explicitly in its own one of the UTX05 203 of the transaction 152.
Alice and Bob's digital assets consist of the UTX0s locked to them in any transactions 152 anywhere in the blockchain 150. Hence typically, the assets of a given party 103 are scattered throughout the UTX0s of various transactions 152 throughout the blockchain 150.
There is no one number stored anywhere in the blockchain 150 that defines the total balance of a given party 103. It is the role of the wallet function in the client application 105 to collate together the values of all the various UTX05 which are locked to the respective party and have not yet been spent in another onward transaction. It can do this by querying the copy of the blockchain 150 as stored at any of the bitcoin nodes 104.
Note that the script code is often represented schematically (i.e. not using the exact language). For example, one may use operation codes (opcodes) to represent a particular function. "OP_..." refers to a particular opcode of the Script language. As an example, OP RETURN is an opcode of the Script language that when preceded by OP_FALSE at the beginning of a locking script creates an unspendable output of a transaction that can store data within the transaction, and thereby record the data immutably in the blockchain 150. E.g. the data could comprise a document which it is desired to store in the blockchain.
Typically an input of a transaction contains a digital signature corresponding to a public key PA. In embodiments this is based on the ECDSA using the elliptic curve secp256k1. A digital signature signs a particular piece of data. In some embodiments, for a given transaction the signature will sign part of the transaction input, and some or all of the transaction outputs. The particular parts of the outputs it signs depends on the SIGHASH flag. The SIGHASH flag is usually a 4-byte code included at the end of a signature to select which outputs are signed (and thus fixed at the time of signing).
The locking script is sometimes called "scriptPubKey" referring to the fact that it typically comprises the public key of the party to whom the respective transaction is locked. The unlocking script is sometimes called "scriptSig" referring to the fact that it typically supplies the corresponding signature. However, more generally it is not essential in all applications of a blockchain 150 that the condition for a UTXO to be redeemed comprises authenticating a signature. More generally the scripting language could be used to define any one or more conditions. Hence the more general terms "locking script" and "unlocking script" may be preferred.
3. HASH TREES A hash tree, also known as a Merkle tree, is a data structure which can be used to attest to (e.g. publicly commit to) the membership of a set of data items D. An example hash tree 400 is illustrated by way of example in Figure 3.
The hash tree 300 comprises a plurality of layers arranged in a hierarchy from a leaf layer (lowest layer) up to a root layer (highest layer) with one or more internal layers in between. The root layer comprises a single root node 304. Each internal layer comprises a respective plurality of internal nodes 303 within that layer. The leaf layer comprises a respective plurality of leaf nodes 302. Within each of the internal layers and the leaf layer, each node 303/302 has an indexed position within its respective layer. The root node 304 is a hash of a combination (typically a concatenation) of all the internal nodes 303 in the next layer down in the hierarchy (which will be two nodes 303 in the case of a binary tree). At each internal layer, each internal node 303 within the layer comprises a hash of a combination (typically a concatenation) of a different corresponding subset of at least two of the nodes 302/303 at the next layer down in the hierarchy. In the case of a binary tree, each internal node 303 will be the hash of the combination (e.g. concatenation) of a different respective pair of the nodes in the next layer down 302/303.
Each leaf node designates a respective data item 301, also labelled herein D. Each leaf node 302 could designate its respective data item DI by comprising an explicit copy of Di, in the clear. However, typically each leaf node 302 instead comprises a hash of a respective preimage wherein the respective preimage comprises D. In this case the data DI is not derivable from the leaf node 302, but nonetheless the leaf node 302 may be said to designate or signify its respective data item Di in that this particular leaf node 302 is the element of the tree 303 which nominates the respective data item as being part of the set of data items Do, Di, ...
In embodiments each subset of nodes at each layer (the subset that is combined and hashed from one layer to the next) may be exactly two nodes, i.e. the tree is a binary tree. In other words each node (except the leaves) branches to two other nodes at the next layer down. However non-binary hash trees are also possible. Typically also each subset of nodes at each layer is an exclusive subset. The term "Merkle tree" is sometimes assumed in the literature to imply a binary tree. However that limitation was not imposed by Mr Merkle in his original patent and that convention is not adopted here. More generally a Merkle tree as referred to herein may refer to hash tree of any degree of branching, or even uneven branching. The term Merkle tree may be used throughout the following description, but this may equally be replaced with the term hash tree anywhere herein.
One possible use of a Merkle tree would be to attest to a set of unconsumed transaction outputs 203 of blockchain transactions 152. These may be referred to as UTX05 throughout the following, but more generally the same principles could apply to other blockchain standards that are based on the concept of consumed and unconsumed transaction outputs but which do not necessarily use the terminology of UTX05. Consumed and unconsumed UTX05 or transaction outputs may also be referred to as "spent" and "unspent", respectively. However this does not necessarily imply a financial transaction and the term "spent" may be used in a more general sense to mean "used up", "redeemed" or "unlocked" (in the sense of unlocking a locking script in the case where locking and unlocking scripts are used). The "spent" terminology may be used throughout the following, but this may be replaced anywhere herein with "consumed" (and conversely unconsumed), or such like (used up, redeemed or unlocked).
The set of unspent UTX0s 203 being attested to by the Merkle tree 300 may be the current set of unspent transaction outputs that have already been successfully recorded in blocks 152 on the blockchain 150, i.e. already "mined". Alternatively the set UTX05 in question may be the union of set of UTX0s recorded in blocks 152 on-chain 150 and the set of UTX0s of pending transactions 152 currently in the pool of a given miner, or a given blockchain node 104 (i.e. a given node 104 of the blockchain network 106). Note that the term "node" may be used differently in different contexts to refer either to a node 104 of the blockchain network 106 or a node 302/303/304 of the Merkle tree 300, which are not the same thing. Note also that the term "miner" or "mining" as used most generally herein does not necessarily imply a financial reward for the task of incorporating blocks on chain. More generally "mining" may refer to the race to win the proof-of-work puzzle required to generate blocks 152 that are accepted by the blockchain network 106, and a "miner" is any party that participates in the proof-of-work race. The term miner as used anywhere herein could equally be replaced with operator of a node 104 of the blockchain network 106.
A Merkle tree could also be used to attest to other sets of UTX05. E.g. the set of UTX05 in question could be a more specific set of unspent UTX05, such as the set of unspent UTX05 of which a particular party is the payer 102a or payee 102b.
Whatever the set of UTX05 being attested to, then in order to use a Merkle tree 300 to attest to the membership of the set, each leaf node 302 designates (i.e. signifies, nominates or attests to) a different respective subset of the UTX05 in the set. This means that each data item DI comprises an indication (identifier) of the respective subset being designated by the respective leaf node 302. In embodiments the respective leaf node 302 designates its respective subset of UTX05 not by comprising the indication of that subset explicitly, but instead by comprising a hash of a preimage wherein the preimage comprises the indication of the respective subset (optionally the preimage could comprise the indication combined, e.g. concatenated, with another piece of data such as a salt). In other words DI is the preimage of the leaf node 302.
For each leaf node 302, the indication (comprised by Di) of the respective subset could take any suitable form. E.g. in embodiments each leaf node 302 may correspond to a different respective transaction (Tx) 152, and the subset of UTX0s 203 designated by the respective leaf node 302 may comprise all the unspent UTX05 in the transaction (which could be one or more outputs of the transaction). In this case the indication comprised by DI may comprise the transaction identifier (TxID) of the respective transaction along with an indication of which UTX05 within that transaction are unspent (or equivalently which outputs are spent). Alternatively each leaf node 302 could correspond to a different individual UTXO, i.e. a given leaf node 302 designates a single UTXO. In this case the indication comprised by Di could comprise a combination of TxID and the number of the particular UTXO 203 within the identified transaction 152 (e.g. output 0 or output 1). As another example, each leaf node 302 could correspond to all the UTX0s in a certain specified group of transactions 152 (e.g. range of TxIDs).
Whatever the mapping of UTX05 203 to leaves 302, the Merkle tree 301 (of which the instance illustrated in Figure 3 is just one example), allows a party to attest to or commit to the membership of the set of unspent UTX05. This is because the tree 300 allows another party to verify that a candidate data item DI is a member of the attested-to set. The principle is illustrated by way of example in Figure 4, where Do happens to be the candidate data item.
Say that an attesting party (the "attester") wishes to commit to the membership of a UTXO set, e.g. current UTX0s on chain and/or current UTX05 in their mempool, and that the attestation or commitment is being made for the benefit of at least one other party (the "attestee"). For example the attesting party could be a third-party service provider who provides a service of attesting to the UTXO set for the benefit of one or more miners (who in this example would be the attestees). Or as another example, the attesting party could be a miner who attests to their view of the UTXO set. In this case the attestee party (or parties) could be one or more other miners so as to promote consensus among miners, or could one or more end-users 102 of the blockchain 150.
Whatever their roles, the attesting party makes the Merkle root 304 available to the attestee party, such as by publishing it (e.g. on the blockchain 150), or sending it to them via an off-chain side channel (not shown). Say then the attestee wishes to check that some candidate data item Di -in this an indication of a candidate UTXO -is indeed a member of the set attested to by the Merkle root 304. The attestee can do this given only the candidate data item DI, the Merkle root 304, and a further portion of information called the "Merkle Proof" or "hash tree proof".
Call the leaf node 302 that designates the candidate data item Di the candidate leaf node 450-typically the hash H(DI) of the candidate data item D. The "Merkle path" is the subset of nodes 401 (which will include a leaf node 302 and one or more interior nodes 303) that are immediate siblings of each node lying along the path 402 between the candidate leaf node 450 and the Merkle root 302 (including the sibling of the candidate leaf node 450). In other words, consider a path 402 from the candidate leaf node 450 to the Merkle root 304. At each layer (including the leaf layer), there will be a node 302/303 that lies on that path 402, and that node will have one or more siblings 401 (exactly one in the case of a binary tree) which is/are the other node(s) 302/303 that are is/are part of the same subset at that layer-that is, the subset of nodes that are combined and hashed to form the node 303/304 at the next layer up along the path 402. In that art, the total subset of siblings 401 that lie along the path 402 are referred to as the "Merkle path" (even though they are not the path 402 through the tree per se). If the attested to party knows the candidate data item Di they which to check, and has been given the Merkle root 304 and Merkle path 401, then they can demonstrate that the candidate data item Di was part of the set attested to by the attesting party, by recomputing the Merkle root from the candidate data item DI Merkle path 401.
The Merkle path may be obtained by the attestee from, for example, either a miner/node who provides this as a service, or some non-node service provider. In embodiments it may be provided along with the Merkle root 304 by the party the attesting who supplies the Merkle root.
It can be seen by way of example from Figure 4 that the candidate data item DI and the complementary nodes of the Merkle path 401 are sufficient to recompute the Merkle root without needing to know the whole tree 300.
3.1 Example notations Notation Meaning TT, Merkle Tree representing the UTXO set at block height 71 LT, The set of leaves in Tn.
A Changelog array containing indices that changed when updating Tn. to T1 HI, The array of hashes at layer K ile The set of updated hashes at layer K [",], Array of indices or hashes at layer K P Partial Merkle Tree, a set of arrays of indices Hi The hash value of a node at index i in a given layer k 2k is the number of leaves in a binary Merkle tree where k is the number of layers S" The UXTO set at block height n Bn The set of transactions in the block with height n nn-Ht The set of outpoints that are either created or spent (but not both) between block 71 ± 1 and n + t, inclusive "n+1 Iii' The set of outpoints that are created and spent between block n + land n + t, inclusive D1 The set of outpoints that are either created or spent but not both in block n + 1 4+1 The set of outpoints that are created and spent in block n + 1 Table 0 a list of notations used in the description of certain embodiments disclosed herein.
3.2 Example formulation A Merkle tree is typically interpreted as a binary hash tree (though this is not essential). In a binary Merkle tree, each node in the tree may be given an index pair (ia. Wand is represented as N(i", is). The indices ia, is are simply numerical labels that are related to a specific position in the tree.
The construction of each of a Merkle tree's nodes may be governed by the following equations H(Di) ia =ip = I ia N(icr, H (ia, id) II N(id + 1, iii)) where i5 = + -1)12 and H is a cryptographic hash function, and where Di is the block of data underlying the ith leaf node.
An example of a binary Merkle tree constructed according to these equations is shown in Figure 3. From this diagram of a Merkle tree, we can see that the ia = = i case corresponds to a leaf node, which is simply the hash of the corresponding ith block of data Di. The ia # ip case corresponds to an internal or root node, which is generated by recursively hashing and concatenating child nodes in the tree until the specific node or the root is reached.
For example, the node N(0,3) is constructed from the four data blocks Bo, , B3 as: N(0,3) = H(N(0,1) II N(2,3)) = H[H (A( (0,0) II N (1,1)) II H (N(2,2) II N(3,3))1 = H[H(H(Do) II H(DO) II H(H(D2) II 1-1(D3))1.
The tree depth M is defined as the lowest level of nodes in the tree, and the depth m of a node is the level at which the node exists. For example, m -root = 0 and = = M, where M = 3 in Figure 4.
3.3 Merkle proofs The purpose of a Merkle tree in most applications is to facilitate a proof that some data block Di is a member of a list or set of N data blocks 73 E [D0,...,DN_O. Given a Merkle root and a candidate data block Di, this can be treated as a 'proof-of-existence' of the block within the set.
The mechanism for such a proof is known as a Merkle proof and consists of obtaining a set of hashes known as the Merkle path for a given data block Di and Merkle root R. The Merkle path for a data block is simply the minimum list of hashes required to reconstruct the root R by way of repeated hashing and concatenation, often referred to as the 'authentication path' for a data block.
If, given a Merkle root R, we wish to prove that the data block Do belongs to the set 23 E {D0, ..., DN_i} represented by R we can perform a Merkle proof as follows 1) Obtain the Merkle root R from a trusted source.
2) Obtain the Merkle path F from a source. In this case, F is the set of hashes: F = {N(1,1), N(2,3), N(4,7)). 10 3) Compute a Merkle proof using Do and F as follows: a. Hash the data block to obtain: N(0,0) = 1-1(D0).
b. Concatenate with N(2,2) and hash to obtain: N(0,1) = H(N(0,0) II N(1,1)).
c. Concatenate with N(3,4) and hash to obtain: N(0,3) = H(N(0,1) II N(2,3)). 20 d. Concatenate with N(4,7) and hash to obtain the root: N(0,7) = H(N(0,3) II N(4,7)), = IV(0,7).
e. Compare the calculated root R' with the root R obtained in step 1: i. If R' = R, the existence of Do in the tree, and therefore the data set D, is confirmed.
ii. If # R, the proof has failed and Do is not confirmed to be a member of D. This is an efficient mechanism for providing a proof-of-existence for some data as part of the data set represented by a Merkle tree and its root. For example, if the data Do corresponded to a blockchain transaction and the root R is publicly available as part of a block header then we can quickly prove that the transaction was included in that block.
The process of authenticating the existence of Do as part of our example Merkle tree is shown in Figure 4. Figure 4 shows a Merkle proof-of-existence of a data block Do, in the tree represented by a root R (304), using a Merkle path 402. This demonstrates that performing the Merkle proof for a given block Do and root R is effectively traversing the Merkle tree 'upwards' by using only the minimum number of hash values necessary.
3.4 Indices Denoting Arrays in Merkle Trees In this section, we establish alternative notations for the indices in a Merkle tree. In embodiments this may be used as an alternative to the indexing convention of Figure 3, though it is not excluded that both schemes could be used in parallel for different purposes. The scheme of Figure 3 has been described herein to state an example of a Merkle tree using Merkle's original conventions, and to show how nodes may be constructed from concatenated children. The following on the other hand establishes an example scheme for denoting arrays within Merkle trees.
In what follows we use lc to indicate the index of the layer in which a node resides, which runs from K = 0 for the leaf layer to lc = k for the root node layer.
Figure 5 shows an example of a Merkle tree, with indices iKwithin each layer K. Each row is indexed from 0 to the right, and the rows themselves are indexed from 0 up from the leaf layer. A node with a higher value of index iK within a given layer may be describes as to the "right". When speaking of a given layer, index iK may be abbreviated as just Ion the understanding of which layer K is in question.
The general formula for the number of nodes in a given layer is given by: nK = V lc E k} which can be written as nic = 24-' for the example Merkle tree shown in Figure 5.
Layer (n) No. nodes in layer (nK) Array of node indices in layer (AK) Formula for no. nodes in layer K = 0 16 Ao = [0,1, _14,15] no = 24-o = 24 = 16 K = 1 8 Ai = [0,1, ...,6,7] K = 2 4 A2 = [0,1,2,3] K = 3 2 A3 = [0,1] sr = k = 4 1 A4 = [0] Table 1: the layers, node numbers, and arrays containing layer indices of the Merkle tree shown in Figure 5.
3.5 Partial Merkle Tree Consider a Merkle tree T comprising a set of nodes N, including a set L of L leaf nodes, one root node R and a set I of exactly n1 = INI - -1 internal nodes.
A partial Merkle tree (PMT) is a collection of indexed nodes of a Merkle tree which can be used to prove that a proper subset of leaf nodes L' are members of the complete set of leaves L, which can be expressed mathematically as L' c L. It should be noted that the partial Merkle tree for a single leaf is simply its Merkle path.
Under a stricter definition, we can consider a partial Merkle tree as the minimum proper subset N' of the N nodes in the tree required to prove that all elements of L' are indeed elements of L. Recall that L is itself uniquely represented by R. In a balanced binary Merkle tree, the number of leaf nodes is necessarily given by L = 2k, for some integer k. The value k can also be used to determine the number of layers in a binary Merkle tree, which is simply given by k + 1. When referring to a particular layer within such a tree, we will use ic to denote the layer where ic E...,k}, and where we use the convention that the layer K = 0 is the leaf node layer and layer K = k is the root node layer.
The index i of a node situated in the layer lc is denoted by We can use the equations below to perform calculations on a node index as follows: io = iK bK = iK mod 2, cK = iK / K E f1,2, k} / K E / K E k -iK-1-bK-1 The first equation (1) allows us to calculate the index of the parent of the node of interest.
The second equation (2), which is also used in (1), allows us to calculate the parity of the index of the node (i.e. whether it is an odd or even index). The third equation (3) allows us to determine the index of a node's partner (i.e. the node which the node of interest is concatenated and hashed with in the tree), which we call a complementary index cK.
Using the equations above, we can construct an algorithm generatePMT() that takes a set of leaves L' from a complete Merkle tree, and the parameter k for the tree, and which generates a minimal partial Merkle tree corresponding to L'.
In the algorithm, assume L' is in fact an ordered subset of L. We can use the term array to refer to an ordered set herein.
The primary input L' of the generatePMT() algorithm is an array of leaf indices of the form if-11 0, 0, *** P where = len(E) = is the number of leaf nodes for which we wish to generate a PMT.
The algorithm also takes the total number of layers of the tree k = log2 iiL as input.
The output P of the g enerateP MT () algorithm is a partial Merkle tree of the form: P = (ig, c8, c61-1, A'2, ..., 4_1) = cg, 4_1) = [-No, ..* 4-1] = [[***10,[***]1, *** ,[***ik-li where A, is the set of node indices in the kth layer of the PMT, which is a proper subset of the indices of the nodes AK in the xth layer of the full Merkle tree and satisfies the relation 30 A;, c AK.
An example algorithm cieneratePMT( ): for determining a partial Merkle tree may be described as follows.
Inputs:= , , k 1. Initialise an array for the leaf indices ire, = , 2. For each entry e in A/0: Compute the complementary index c = e Hirywa 2. If c E 11'0, insert c into A'0 in numerical order.
3. Initialise and set Bo = [].
4. For K = 1 tO K = k -1: Initialise an array for internal node indices AIK = Set U"_i =A'K_jBIC-F Initialise a set of branch indices B, = [j* For (m = 0; 2m < 114-1i; rn + 1): i. Compute the index i;let = U"_][2m]/2 ii. Insert to BK.
Initialise a set of complement indices CK = []* For each entry e in B k: Compute the complement index c = e + (-i)e mod 2.
H. If the index c is not in BK, append c to CK.
iii. Set A = CK.
- Break if A'" = [0]. (gathered all the nodes at level k) t Lj A Set P = K-0 Outputs: P = (i8, ia, 4-1) = [[...]o, [...]k-11 Step 1 initialises an array A'0 corresponding to the nodes of the PMT in the leaf layer, which only contains the leaves provided to the algorithm by I/ at this point. Note that 11'0 is denoted with a " " to indicate it is a subset of the full array Ao of layer 0 nodes.
Step 2 populates the rest of A/0 by calculating the complementary indices co for all of the leaves already in the array and adds them to the array. When the complementary indices are added, they are added in numerical order, meaning each complementary index is placed next to (either immediately left or right of) the corresponding original leaf provided in L'.
Step 3 initialises a branch array Bo, which allows step 4 to be written as a loop.
S
Step 4 loops through the remaining internal layers lc = 1 to ec = k -land generates an array AiK for each layer, whose elements are the subset of nodes in that layer which are also in the PMT for L'.
This step introduces an array UK_i which is the ordered union of the PMT nodes 111K_i of the layer below and the branch nodes Bk_i of the layer below. The branch nodes of a given layer are all the nodes in that layer that can be calculated from other information (i.e. child nodes) that are already somewhere lower down in the PMT.
This means that the array UK_i includes all the nodes of the lower layer that we have access to for calculation (whether they are stored as part of the PMT or can be calculated 'on-thefly' from other parts), and therefore UK_i can be used as the starting point for determining the relevant arrays irK, 8,,,L'" in the layer above.
The rest of step 3 comprises two further subroutines. The first calculates the parent index iK for ever other entry in UK_i and adds them to the branch array Bic. The second calculates the complement partner indices c.,T for all the entries in Bic and adds these complement indices to CK if they are not already in BK, which avoids double-counting anything we need to store in that we know we can already calculate based on other information in the PMT.
The end of step 3 involves setting each respective A'" to the array CK we have just populated and deduplicated. There is also a break condition to halt the loop of step 3 if ArK is empty, meaning no further processing is required to generate the PMT.
Step 5 of the algorithm simply combines all the arrays A'0, in order to create the final partial Merkle tree IF.
4. HOLEY MERKLE TREE The present disclosure provides a novel scheme for maintaining a Merkle tree (or hash tree), whereby at least some of the leaf nodes can each either: A) designate a corresponding subset of one or more of the unconsumed outputs (e.g. by comprising a hash of a preimage that compromised an indication of the corresponding subset); or B) comprise a placeholder value, in other words a dummy value, which may also be referred to herein as a "hole". In embodiments every leaf node has the ability to become either type of node, and may be changed back and forth between the two from one update of the tree to the next.
The idea is illustrated by way of example in Figure 6. Here. Some of the leaf nodes 302 each designate a respective corresponding subset of the UTX05 203, as discussed previously. E.g. each comprises a hash Hi of an indication of its corresponding subset. The subset designated by each leaf could be one or more UTX0s, e.g. the subset of each leaf being an individual UTXO, or all the UTX05 within an individual transaction 152, or the UTX05 of more than one transaction (e.g. the transactions in a given range of TxIDs). For instance, in embodiments each leaf comprises a hash H1 of a respective preimage where the respective preimage comprises a TxID of a respective transaction 152 and an identifier of each UTXO 203 within the transaction (e.g. output 0, output 1, etc.).
One or more others of the leaf nodes may comprise "holes", represented herein by the empty sign "0". These are dummy or placeholder values that represent no UTX05. E.g. they could be zero or a string of zeros, or any other predetermined value that is understood to represent a hole in the leaf layer of the tree where no UTX05 are designated. A "hole" as referred to herein could be any representation of a null value.
The following describes a method of attesting to a current state of a set of UTX0s (or more generally unconsumed outputs of blockchain transactions in any output-based model). That is, computing a Merkle tree cumulating in a Merkle root which can later be used to verify the membership of the set. An example flow chart is shown in Figure 7. In embodiments the method may be used to publicly commit to the membership of the set by publishing the Merkle root, e.g. on the blockchain 150. Although alternatively it is not excluded that the method could instead be used to keep a private record, e.g. internally within an organization.
In embodiments the method may be performed by a third-party service provider who makes available the attestation to one or more nodes 104 (or "miners") of the blockchain network 106 in order to confirm the miner's view of the blockchain. Alternatively or additionally, the method may be performed by one or more miners who make the attestation available to other miners and/or users 102 of the blockchain network in order to promote confidence in the state of the chain.
The attested-to set could be, for example, only the set of UTX05 actually recorded in blocks 151 on the blockchain. Or alternatively it could be the set that additionally includes the UTX0s of pending transactions in the mempool of a given miner.
At some initial set-up stage S10, the method begins by determining an initial Merkle tree based on the current UTXO set at the set-up stage. This may include one or more holes 0, or it could begin fully occupied only with leaves 302 that designate UTX05. In embodiments the tree may be formed so as to be "balanced". That is to say, each layer contains a full set of nodes 303/303 (i.e. the maximum number of nodes at each layer given the branching ratio of the tree), with the leaf nodes all at the same layer. E.g. in a binary tree this means there are 2k leaf nodes all at the same, lowest layer where k is the number of layers. Depending on the number of UTX05 and the mapping of UTXO subsets to leaves 302, this may require including one or more holes 0 at the leaf layer. E.g. in a binary tree, if there are not exactly 2k UTXO subsets, where k is the number of layers, then one or more holes 0 will be needed to fill the gaps in the leaf layer so as to "fill up" the leaf layer, i.e. include a full set of leaf nodes 302 across the whole bottom layer of the tree. Figure 6 shows and example of this.
On some later occasion, at step 520, the method comprises determining an update to the set of UTX0s. For example this could be done the next time that a new block 151 is included on the blockchain 150 (in other words the block height has increased by one). Alternatively it could be in response to some other event, or after some periodic or random time interval has elapsed. E.g. lithe tree is being maintained by a node 104 of the blockchain network and includes pending transactions, then the update could be performed after the node 104 has received a predetermined number of new pending transactions into its pool of pending transactions ("mempool").
Whatever the trigger, at this stage the method at step 520 comprises determining a "delta set" representing the update. The delta set is the total set of new UTX0s being added and UTX05 which have now been spent and so are to be removed. On any given occasion (i.e. at each update), the delta set will comprise an addition of at least one new UTXO and/or removal of one or more UTX05 which have now been consumed. On at least some occasions, the update will very likely comprise both the removal of at least one now-spent UTXO and the addition of at least one new UTXO.
In embodiments, the party performing the method may determine the delta set by monitoring the blockchain 150 and/or mempool themself. Alternatively, they may determine the delta set by receiving it from another party who monitors the blockchain 150 on their behalf. For instance, the party performing the method may be a node 104 of the blockchain network 106, or a party operating one or more nodes 104 of the blockchain network 106-a "miner" for short -and the delta set may be determined by receiving a subscription to the delta set from a third party service provider who monitors the blockchain 150 for new blocks and regularly provides the delta set to one or more miners as a subscription service (e.g. each time the block height increases). Or if the set being attested to by the tree 300 is also to include pending transactions, the miner could determine the delta set as the union of the delta of on-chain UTX0s received in the subscription combined with the delta to the miner's mempool. Alternatively the miner may monitor the blockchain and determine the delta set directly themself.
In embodiments, the party performing the method may be the third-party service provider, and the determination of the delta set may be performed by the service provider monitoring the blockchain 150. In some such embodiments, the service provider may both use the delta set both to determine the update to the tree for the purpose of their own attestation, and to provide to one or more miners as a subscription service to inform them of updates to the blockchain 150.
At step 530, the method comprises recomputing the hash tree to reflect the update.
By whatever means the delta set was determined, the idea of the holey Merkle tree is that when there is an update to the UTXO set, sometimes entries will be removed and sometimes entries will be added. Using a holey Merkle tree, it is possible to add new entries in the tree in places where others have been removed, either in the current update round or from one round of update to the next. This will make recomputing the tree simpler.
E.g. consider a simple update where only one leaf node is removed, say H3, to reflect a spent UTXO; and only one new leaf added is added to reflect a new UTXO. The new leaf -call it H3A -may be added in place of the removed leaf H3, at the same position i=3 as the removed node. Or if only one new leaf node is to be added to reflect a new UTXO and no UTX05 have been removed, but there is a pre-existing hole 0 at, say, i=4, then the new node -call it H4°-may be added in place of the hold at position i=4.
Other scenarios are also possible. E.g. if more leaves designating now-spent UTX0s are removed in one round of update than new leaves designating new UTX05 are added, then on that occasion one or more new holes 0 may be added to fill the gaps. Then later, in a subsequent round of update (subsequent occasion), if more leaves designating new UTX05 are to be added than those being removed, the excess leaf or leaves designating the new UTXO(s) can replace an existing hole or holes 0 that was already present from a previous round of update (previous occasion). Also if the tree begins at step 510 with some holes already present, then these may be filled in subsequent rounds of update.
In some cases, only one of a subset of multiple UTX0s designated by a given leaf may have been spent, still leaving one or more other unspent UTX05 covered by the same leaf 302. In that case the corresponding leaf will be updated but not replaced with a hole 0. It may be replaced with an updated leaf that designates at least the remaining UTX0(s), and which could potentially also designate one or more new UTX0s. In other words an old leaf designating a first subset of UTX05 is replaced with an updated leaf designating a second, different subset of UTX05, instead of a hole 0, wherein the second subset could overlap with the first subset but is not identical.
If at any given update, an added UTXO cannot be accommodated by a hole 0 or a leaf being removed in the same update, then the tree may need to be expanded to the right. If the tree is to remain balanced, this will mean added a whole new side to the tree (a whole extra half in the case of a binary tree) plus new root node 304. Preferably the tree is kept balanced at each update, whether the update involves replacing holes or other previous leaves, or expanding the tree, padding, or any combination of these. It is not essential that the tree is kept balanced by the updates. However, it may be preferable as it minimises the number of layers in the tree, such that the extra computation is also minimised (e.g. in a binary tree, it is only padded up to 2^k not 2A(k+1)).
The process of recomputing the tree may be summarised as follows. When any output is removed, the leaf node designating the removed output is removed and replaced, at the same position in the leaf layer, with a new leaf node which either: A) designates an updated subset of one or more unconsumed outputs not including the removed output, or B) comprises a placeholder value (a "hole" 0). And when any new output is added, a new leaf node is formed designating a subset of one or more unconsumed outputs including at least the new output, wherein the new leaf node is inserted in the tree in place of a replaceable leaf node when available, at the same position in the leaf layer. A replaceable leaf node is one that either: A) designates an output being removed, or B) comprises a placeholder value. But otherwise, if no replaceable leaf node is available, the tree is expanded to accommodate the new leaf node.
Preferably, the recomputing of the tree comprises recomputing the hashes of only those nodes affected by the update, i.e. only the part of the tree affected. In other words, the only hashes that need to be recomputed are those that are themselves an updated leaf node 302 or that are internal nodes 303 at any level of parent (parent, grandparent, great-grandparent etc) of an updated leaf node 302. For example, referring to Figure 6, if the hole 0 at i=4 in the leaf layer is replaced with a new node H4° to designate a new UTXO (or UTX05), then only the hashes of H4's parent, grandparent, great-grandparent and the root 304 need to be recomputed, as well as H4A at the leaf layer of course. So referring to the index notation of Figure 5, the recomputed hashes would be those at: (k=0, i=4); (k=1; i=2); (k=2; i=1); (k=3; i=0) and the root 304 (k=4). The rest of the hash values are unaffected by the update.
An example algorithm for determining a partial Merkle tree has previously been described. Previously this algorithm was known but only for determining the minimum subset of the tree needed to verify that a candidate subset of leaves are members of the set. According to the present disclosure, such an algorithm may be re-purposed for determining the minimum subset of nodes that need to be recomputed to perform the most efficient update of a holey Merkle tree.
The recomputing of the hash tree may be performed by a "remove then add" procedure; i.e. first removing any and all leaf nodes designating consumed outputs from the set, then inserting any and all leaf nodes designating new unconsumed outputs. This approach helps reduce the overall size (i.e. number of leaves) of the tree at any given time. This is because updating the leaf set transaction-wise ensures that UTX05 are removed from the Merkle tree at the earliest possible opportunity in the algorithm, which in-turn ensures that the holes in the leaf set are positioned as far left in the ordered set at possible. This means that holes are always filled from as far left as possible, thus minimising the possible number of entries in the leaf set. However an "add then remove" procedure is also possible.
Optionally, in embodiments the method at the re-computation stage 530 may also provide for shrinking the tree by a layer if, after the update, the tree would otherwise now have a redundant number of layers given its degree of branching and the number of leaves that designate actual UTX05 compared to those that have been removed and would thus otherwise comprise holes 0. E.g. in the case of a binary tree, if there are now fewer than half of the 2k leaves occupied by nodes that actually designate UTX05, i.e. half or more of the nodes at the current leaf layer are now holes 0, then the tree can be shrunk by one layer.
At step 540, optionally the Merkle root 304 (R) is published as a public commitment to the membership of the attested-to UTXO set. This could be done by including it in a transaction 152 on the blockchain 150, or by other means, such as on a website. In the case of publishing on chain, the commitment may be included in an output 203 (e.g. unspendable output) of a transaction recorded on the blockchain 152, e.g. by means of an OP_DROP opcode, OP_RETURN opcode, or combination of OP_RETURN and OP_FALSE, depending on the locking script protocol being used.
Whether or not step 540 is performed, at a later time the method may loop back to step 520 to determine another update and recompute the tree again. For example this may be done each time the block height of the blockchain 150 increases (i.e. a new block has been "mined"), or periodically or randomly, or each time the number of pending transactions in the party's mempool increases by a threshold amount.
On at least one of the occasions (i.e. one of the loops) the update comprises the removal of at least one output, and on at least one occasion the update comprises the addition of at least one new output. The addition could happen on the same occasion (i.e. same update) as the removal, or a different occasion (i.e. different update). Sometimes an update may comprise both the removal of at least one output and the addition of at least one new output. On other occasions the update may comprise only addition or only removal.
Note: it is known generally in a Merkle tree to "pad" an otherwise unbalanced Merkle tree, whereby the tree effectively gets additional leaves added to its right-hand side to 'pad out' the tree to a balanced one. The purpose of padding is to make sure the number of leaves is the smallest possible power of 2, and the padding nodes are added at the end of the leaf layer (right-hand side). It depends on use cases. Sometimes, the padding nodes all have a fixed value, e.g., 0. In most cases, they are just the duplicate of the adjacent value (the left neighbour). It is possible that holes may be represented in the same way. However, conventional padding only refers to padding out the right hand side of a static tree so as to balance the tree (make the number of leaves a power of two). Whereas holes, as disclosed in the present disclosure, can be added in place of an actual content-carrying leaf node that has been removed from a dynamically updated tree. Preferably they can be created in any place in the list of leaves, while padding is only appended at the end. Furthermore, holes act as placeholders which can subsequently be replaced by new content-carrying leaves in further updates to the dynamic tree.
An optional additional feature relates to "generation transactions", also known as "coinbase" transactions (though most generally this does not necessarily imply a financial reward). A generation transaction is a transaction generated when a node 104 of the blockchain network 106 successfully wins the proof-of-work race to have a new block 151 generated and incorporated onto the blockchain 150-sometimes described as a new block being "mined". Typically the generation transaction contains an output that rewards the successful miner with a new portion of a digital asset (e.g. some new Satoshi). However, this is not essential. Either way, in some blockchain protocols, each generation transaction may have an associated maturity condition associated with it, which must be met before its one or more outputs can be consumed. For example the maturity condition may be that a certain block height threshold (e.g. 100 blocks) has been reached since the generation transaction was generated -i.e. a certain number of new blocks have been added to the blockchain. The protocol obeyed by all the nodes 104 of the network 106 will not allow such outputs to be spent until the maturity condition has been reached.
Conventionally, the outputs of generation transactions are only added to the UTXO set once the maturity condition has been met. However, this can be cumbersome to process when trying to maintain an ongoing data set. Consider the case where a generation transaction only matures after 100 blocks. This means that when a new block is found at, say, block height 1000, the party maintaining the UTXO set has to go back and fetch all the blocks the way back to block 900 (either form the blockchain 150 or a copy of these blocks in a local database) in order to verify the newly matured block. In embodiments of the present disclosure on the other hand, the party performing the method (e.g. miner or service provider) may avoid this by including the outputs of unmatured generation transactions in the UTXO set being attested to by the Merkle tree. In addition, the method comprises maintaining a list indicating whether the associated maturity condition has been met for each generation transaction (e.g. the respective block height at the time they were generated). This gives a small efficiency increase as it involves querying a smaller data structure.
In embodiments the list of unmatured transactions and their maturity status may be included in the preimage D of one or more of the leaves 302, so that the maturity status of the unmatured transactions is also attested to. Alternatively the list may simply be kept for internal reference by the party in question (e.g. miner or service provider).
5. EXAMPLE IMPLEMENTATION -UTXO COMMITMENT SCHEMES In this section, there is disclosed a UTXO commitment scheme that combines the following elements: 1. Incremental construction of UTXO set snapshots; 2. Structuring the incremental UTXO set as a partially-populated Merkle tree, termed a 'Holey Merkle Tree' (HMT); and 3. Commitment to a UTXO set snapshot, via a HMT, using an incentive mechanism.
The goal with elements (1) and (2) is to prescribe a deterministic process for determining and updating the state of the UTXO set on a block-by-block basis, such that these updates are efficient to perform. The aim of introducing element (3) is to provide a way for UTXO set snapshots to be produced, shared, and updated by blockchain nodes 104 in a way that requires minimal trust on the behalf of the user.
5.1 Incremental UTXO set snapshots A UTXO snapshot S, is defined as the complete set of spendable outpoints on the blockchain that are unspent at a block of height n, denoted as B. 5.1.1 Coinbase transactions An outpoint that is created in a coinbase transaction can only be spent after 100 blocks. If we do not record them at the time they are created, we need to fetch it in a historical block every time we have a new block. We propose to have an ordered list of outpoints in the UTXO snapshot dedicated to coinbase transactions that are to be matured. We call it the coinbase list.
For a UTXO snapshot STh the leftmost outpoints in the snapshot are reserved for coinbase outpoints and ordered in descending block height. If n > 100, then only the outpoints corresponding to the most recent 100 coinbase transactions are reserved and ordered in descending block height. For example, Sn = ttXithoo txid99111-, ...} = {thaseloo cbaSe99111, cbasei...} Note that, the first 100 transaction IDs does not necessarily imply the first 100 outpoints as one coinbase transaction can have multiple outputs. They are ordered in descending block height for convenience. When a new coinbase transaction is added, it is added to the beginning of the list, it naturally pushes the matured coinbase transaction txidi out of the coinbase list.
The coinbase list can be part of the UTXO set when creating a commitment. For example, they can be represented by a single leaf or multiple leaves. For simplicity, we will assume that the leaves or some of the leaves will be updated by every block.
5.1.2 Delta Set We now define the delta set to encode the changes to the UTXO set between two consecutive blocks or any two blocks.
A delta set D;FI is a collection of outpoints that are either 1. existing members of Sn and spent between Bn+i and 13"±t, or 2. created between Bn±i and 13"+t and still unspent at 13"±t.
The first type of outpoint is included in Dnn+Ti so that we can remove them from Sn. The second type of outpoint is included in Dniti so that we can add them to Sn. We do not need a flag to indicate which type an outpoint is, because the first type has its duplicate in Sn while the second type does not (i.e. we can use the duplication of an outpoint as a marker of its type).
An intermediate set /"n:Fq is a collection of outpoints that are created between.13,±1 and 13"±t and spent between Bn+i and 13n+t. These outpoints are neither included in D+Tf nor covered in Sn. It is optional to keep them for the purpose of defining a UTXO snapshot. It may become important to keep them for the purpose of auditing.
When t = 1, we abbreviate Dnn++1 as Dn+i, and ini++1 as /"Hi. Now we provide an algorithm to construct Dn+i and /n+1.
Given a UTXO snapshot 5, and a block /3"1, we construct a delta set Dn+i, an intermediate set /n+i, and an updated snapshot S1.+1 as follows: 1. Initialise two empty sets D,! = {}.
2. For each transaction in /3"1 a. extract the inputs as outpoints and add them to D b. extract the spendable outputs as outpoints and add them to D 3. For each duplicated element in D, remove both copies from D and add one copy to!.
4. Define Dn+i as D and /n+i as I. 5. Initialise another set S as S = Sn 6. For each outpoint in D11+1, a. if it exists in S, remove it from S. b. otherwise, add it to S. 7. Define 5n+1 as S. We can generalise this to multiple blocks. That is, if we have a snapshot 5, and we want to create a later snapshot S"t, we can gather the blocks 8n+1,Bn+2, ...,Bn+t, and consider it as a larger set of transactions Irin+1,n+t = Urnt+1B 1. We can then apply the same algorithm rin+i,"t to obtain Dnn:Fli, int and Sun. The only difference is the following modification to step 1: Add the spendable outputs of all the coinbase transactions as outpoints to the coinbase list ordered in descending block height in Sin.
5.2 Holey Merkle Tree The second component for generating a deterministic commitment to a UTXO at a point in time is to introduce a holey Merkle tree (HMT).
5.2.1 Example Design of a Holey Merkle Tree The idea here is to represent the UTXO set Sn at a given block Bn of height n as a Merkle tree Tn, whose root Rn can be attested to on the blockchain. See again Figure 6.
Using the notation (Tx/D, 0 to denote a transaction outpoint corresponding to a UTXO, the design of the Merkle tree Tn is as follows: i. The Merkle tree is a balanced, binary Merkle tree.
H. The leaves of the Merkle tree may consist of either: a. UTX05; or b. Holes UTX0s may be represented as either: a. TxID II i pairs; or b. TxID II vect < i > structures, where vect < i > is a Boolean vector indicating whether each UTXO is spent (0) or unspent (1).
c. TxID II array < i > e.g. TxID II [is, iv __id is an array of k output data structures.
Holes may be represented as a zero element (e.g. a 0 bit, or Boolean 'False').
5.2.1.1 Example Conventions The three options presented above for representing UTX05. iii.a), b), and c), respectively offer slightly different properties and trade-offs. Option a) is the most granular version, as a leaf in the Merkle tree is dedicated to each UTXO, while b) and c) both accommodate having multiple UTX0s included in the same leaf of the tree.
Option b) may in some cases generate a larger tree than a) because each leaf can represent multiple UTX05, i.e., less likely to generate a hole, but has the drawback that leaves may remain in the tree while only a single UTXO of a transaction is unspent, meaning that the spend-state of all of the transaction outputs must be kept until all have been spent, whereas a) allows each UTXO to be tracked independently. For example, if a transaction generates two outputs including a 'change' output, the change output may not be spent until a long time after the primary output, meaning we have to store the spend-state for both until the later time when the change output is also consumed. One interesting observation about option b) is that is does capture information about the spent UTXO (STX0) set of the chain, at least temporarily and for some transactions, which may be a benefit.
Option c) is identical to option b), except the objects stored for each UTXO include the full output information, rather than just the Boolean spend-state of each output.
From herein, we consider only case a) in our protocols, but note that each could be adapted easily for either cases b) or c).
In the following embodiments, the following conventions may be employed related to Merkle trees and blockchain transactions: * Items are always indexed from 0 (e.g. the leftmost leaf in a HMT has index 0).
* We denote transaction outpoints by (TxID,i) pairs, where i denotes the ith output (indexed from 0) in the transaction denoted by TxID.
* We denote the ordered array of leaves of a Merkle tree by L, and the Ath leaf of the tree is denoted by L[2].
5.2.1.2 Storage An HMT for a block of height rt, denoted by Tn, can be stored as a nested array, where each inner array represents the ordered set of hashes stored at a particular level of the tree, which are indexed from 0. This can be written as To, = [H0,111, H = 0, H1, H 24[110, H1, ..., H2k_i], where each inner array IHiK = [110, H1, ...,H2k_d represents the set of hashes at the Kth layer in the tree and the final inner array Hk = [R] is simply the root node.
This format for storing the HMT state will become useful when updating the state of the HMT by changing some of the leaves, which is detailed in the following section. When some of the leaves change between HMT states, these changes are propagated throughout the rest of the tree. Any hash in the tree that is a propagator of these changes up the tree will need to be updated.
5.2.2 Updating the Holey Merkle tree In order to define a functional HMT, a deterministic algorithm updateTree() may be specified to allow a particular HMT state, representing a UTXO set Sn at block height re, to be updated when a subsequent block Bn+1 is obtained.
Figure 10 shows an example of a process of updating the HMT state Tn T where changes to leaf nodes are highlighted in bold.
The overall algorithm updateTree() can be used to update the tree from one state T, to another state Tn+1 based on a set of changed leaves that are determined by a blockchain block. This algorithm may be defined as follows.
updateTree(): Tn+1<-updateLeaves(T" Bn+i) Inputs: Tn, /3,±1 1. Extract the ordered set of leaves from Tn as L = L. 2. L111 (-update Leaves(L, Bn+i)- 3. Ti-n+ <-update Hashes(T", Ln+i).
4. Return 72+1.
Outputs: 11,2+1 The algorithm operates by generating the updated set of leaves Ln+i, using the previous set Sn and new block data 8n+1 and applying a sub-algorithm updateLeaves(). The algorithm then uses the updated set of leaves, in conjunction with the previous tree T", to generate the new tree Tn+1 by invoking another sub-algorithm updateHashes(). It should be noted here that the output Tn+1 also includes both Ln+1 and Rn+i, by definition.
This means the enveloping algorithm updateTree() has the effect of updating the HMT over the interval of (at least) one block. We will now detail the two sub-algorithms needed; updateLeaves() and updateHashes().
5.2.2.1 Updating Leaves A sub-algorithm which may be specified is is the updateLeaves() algorithm, which is responsible for taking the leaf set of a current Merkle tree state LTh and updating its state according to the effect of a new block Bn+1.
This sub-algorithm updates the set of leaves from a previous state Lit, representing the UTXO set prior to receiving the block Bn+i, to a new set of leaves L11+1 representing the state of the UTXO set accounting for this new block.
Step 1 initialises a leaf set L that we will manipulate in the algorithm by removing and adding entries corresponding to UTX0s consumed and created respectively in Bn+1.
Step 2 initialises an array A which will store the indices at which changes (i.e. removals and additions) to Lit are made. We call this the changelog array'. This array is necessary to update node hashes later in the updateHashes() algorithm.
Step 3 is a transaction-wise loop that implements a 'remove-then-add' process for each transaction individually, before moving on to the next transaction in the block. The transactions are processed in the order they appear in the block.
Step 3.1 takes the current transaction and removes from L all outpoints consumed by it. The index in L at which each outpoint was previously recorded is inserted into the changelog A. Step 3.2 takes the current transaction and adds each of its outpoints to L at the left-most empty (0) entry position. If there is no empty entry in L, the outpoint is appended to the end of the list. The index in L at which a new outpoint is inserted/appended is recorded in A. Step 4 handles redundant or missing indices in L following step 3, to ensure the tree is balanced and of the correct depth. If L is a power of two, and its HMT will be a perfectly balanced binary tree, then any remaining indices larger than the length of L are removed, which handles the case where the HMT shrinks by one level following step 3. Otherwise, the minimum number of empty elements are appended to L that are required for the number of leaves to reach a power of 2. Once again any indices larger than the new size of the balanced tree are removed. The changelog array A is updated accordingly.
Step 5 the temporary leaf set L becomes the new leaf set 42,1 and both L"i and the changelog A are output by the algorithm.
An example of such an algorithm may be written in full as follows.
updateLeaves(): Lii+1, A <-updateLeaves(Ln, Bn+i) Inputs: Lit, B.",+1 1. Initialise a changelog array A = [ ].
2. For each transaction, in the order they appear in Bn+1: 1. For each input, in the order they appear in the transaction: 1. Extract the outpoint (TxID II i) referenced by the input.
2. Replace the outpoint entry in L with 0.
3. Insert the index of the modified entry of L into A. 2. For each output, in the order they appear in the transaction: 1. Extract the outpoint (TxID II i) of the output.
2. Locate the leftmost 0 element L[A] = 0 in L: S 1. If there is a 0 element L[A] = 0 in L: 1. Insert the outpoint TxID II i at L[A].
2. Insert the index A into A. 2. If there is no 0 element L[A] = 0 in L: 1. Append the outpoint TxID II i to L. 2. Insert the index A = len(L) -1 into A. 3. Compute r = len(L): 1. If 1 k: r = 2k: 1. Remove any indices] satisfying] > r -1 from A. 2. Else: 1. Append the minimum p empty elements 0 to L, such that r + p = 2k.
2. Insert the index r + 1 into A. 3. Remove any indices] satisfying] > r + p -1 from A. 4. Let Lit+1 = L, return Ln+1, A. 20 Outputs: L"i, A 5.2.2.1.1 Discussion There are two different ways in which the updateLeaves() algorithm could be written; namely using a remove-then-add paradigm as shown above, or alternatively using an add-then-remove paradigm.
The remove-then-add option parses a block transaction-wise. This means that the UTX0s consumed by the outputs of one transaction are removed from the leaf set, and then the UTX05 created by the inputs of that transaction are added to the leaf set, both of which occur before moving on to the next transaction in the block. In order to function correctly, this version requires the topological transaction ordering rule (TTOR) to be in effect, as is the case in the BTC and BSV blockchains.
An alternative to this is the add-then-remove option, which is shown in Appendix 5.1, whereby the block is parsed input-wise, and then output-wise. In other words, this option adds all the UTX05 created in the outputs of the new block to the leaf set, and then removes all the UTX0s consumed by the inputs of the block. This paradigm makes no assumption about the order of transactions in a block and is compatible with the canonical transaction ordering rule (CTOR) as is used in the BCH blockchain.
It is preferable to use the remove-then-add option specifically because this approach minimises the overall size (i.e. number of leaves) of the HMT at any given frame in time. This is because updating the leaf set transaction-wise ensures that UTX05 are removed from the Merkle tree at the earliest possible opportunity in the algorithm, which in-turn ensures that the holes in the leaf set are positioned as far left in the ordered set at possible. This means that holes are always filled from as far left as possible, thus minimising the possible number of entries in the leaf set.
5.2.2.2 Updating the Merkle Tree Node Hashes A second sub-algorithm that may be specified is the updateHashes() algorithm. This is responsible for updating the hashes of the previous HMT state Tr, to form the new HMT state Tn+1 based on the changes to the UTXO set caused by the new block B"±i. The main steps are as follows: Step 1 of the updateHashes() algorithm initialises a new tree using the previous state T" as the basis. Note that in the case where the new state has a different size (i.e. different k) then step 1 should accommodate this, either by adding additional empty arrays for new layers or removing arrays from higher layers.
Step 2 obtains the ordered set of updated leaf hashes, denoted Lan+i, based on the changelog indices A and the new leaf set L11+1. This can be done by selecting each element in L"1 at each index stored in A. This set can also be labelled as He = /ALI_ to represent the changes to hashes Ho in the leaf layer (K = 0) of the HMT.
Step 3 updates the hashes in the leaf layer Ho, by replacing elements in Ho with the new elements in He at the corresponding index and saves the resulting Blio.
Step 4 generates a partial Merkle tree using generatePMT() algorithm described in section 2.3, taking the changelog indices A as input. Note that the output is another set of arrays of indices P, which will help to identify nodes that are required to update the tree.
Step 5 is a loop, containing two further loops, which ascends through the remaining layers of the new HMT state Tn+1 up to and including the new root node at layer k. For each layer, this step is to identify hash values that are required to update the next layer above. It combines the hashes Hic_1(P[K -11) of the PMT in the K -1th layer with the updated hashes Hek_1 just calculated in the K -1th layer to create a set of hashes H_1 which is the subset of HK_I. required to calculate the new hashes in HK in the Kth layer.
The first of the smaller loops in step 5 iterates through pairs of elements of H_1 to calculate the changed elements He in layer k. The second smaller loop then updates the values in H, with the changed elements we have just calculated and saves them.
Step 6 finally returns the new tree state Tn+1 as an array of the arrays Ho, H1, ..., Hk just updated in the previous step.
An example of such an algorithm may be written in full as follows.
updateHashes(): Tn+1<-updateHashes(T", L"±i, Inputs: Tn., Ln+1 1. Initialise a new tree state using the previous state T"1 <-T. 2. Use A and L"1 to obtain (for ic = 0) the updated leaf hashes Hg' = LL = [... , ... 0.
3. Update and save Ho in Toi+1 by replacing...,Hii, ... with..., HP__ taken from Re.
4. Generate a partial Merkle tree P generatePMT(A,k). Note k = log2 ten(Ln+i).
5. For layers = 1 to yr = k: i. Obtain the indices P[K -1] = [..., i, ...] from the partial Merkle tree.
H. Obtain HL1 = 511,_1(P[K -1]) U H,act1 = [..., Hi, U [ ] K-1 Initialise an array of changed hashes H. iv. For (m = 0; 2m < IH;c_i I; m = m ± 1): Add the changed hash Itgcs[m] H(H;c_i[m] II H1(m 11), where [HP22], = [Hi+ik_i).
v. For (i = 0; < ± 1): 1. Update Ill K in T11+1 by replacing Hi <- 2. Save H K. 6. Return Toi+i = [Ho, Hk]n+i Outputs: T11+1 5.2.3 How is a partial Merkle tree related to the HMT? Consider a HMT in state To, with leaves Lit, and its next state Tu+i, with leaves 1,2+1. The changelog array A contains all of the indices of elements in L."1 which represent changes (either insertions or deletions) which are to be reflected in generating the new tree To+i from the old tree Tr, as the starting point.
We can identify the proper subset Lain+iof leaf hashes representing the changes. These are the leaves which will propagate updates to internal node hashes, and the root hash throughout the new Merkle tree Tm+1.
Figure 11 shows the relationship between the changelog A, the new set of leaves Li.,±1 and the sub-set of leaves Lan+1 which cause changes to the internal nodes and root hash of the new HMT In the updateHashes() algorithm, the goal is to update the HMT state Tn+1 77" by recomputing the HMT root There are at least two options for achieving this; the first option is simply to recompute the entire Merkle tree using the complete set of new leaves L11+1. The second, and far more desirable option, is to recompute only the internal nodes required to propagate the changes from 4+1 <-L to the new root hash 12,+1. This second option is more computationally efficient than the first, and so we have chosen to implement it in the updateHashes() algorithm.
The hashes required to propagate the changes caused by the new values of leaves L' to the new root Ri2+1 are identical to the partial Merkle tree of L'.
This observation shows that the PMT for a given set of leaves is equivalent to the nodes required to propagate any changes causes by changes in the values of that set of nodes.
Therefore, if we generate the PMT for a set of changed nodes in the leaf layer of a HMT, we obtain the nodes required to propagate the change up to the leaf node. Combining these old hashes specified by the PMT with the new hashes from the same layer, we will be able to calculate the new hashes for the next layer above. This is what step 5 does in the updateHashes(), unifying the set of nodes that have changed in the layer below with the PMT nodes in the layer below to create a set that contains everything required to propagate a given change to the next level of the HMT.
5.3. DISCUSSION 5.3.1 Ordering of L In the example design of the HMT in 5.2.1, we have employed the paradigm that a given set of ordered leaves Li uniquely determines the corresponding HMT denoted 11, which represents the UTXO set at a given block Bi, and its properties. These properties include: 1. the total data size of the tree, 2. the total number of nodes in the tree; and 3. the tree depth.
We have used this fact throughout the update mechanism outlined subsequently in 3.2.1, where we have composed the updateTree() algorithm such that it first invokes the sub-algorithm updateLeaves() before eventually updating the required hash values of the nodes in the Merkle tree 5.3.2 Scenarios for L"i In 5.2.1, it was stated that the HMT design is that of a balanced, binary Merkle tree. This means that the cardinality of the set of leaves L for any HMT instance, corresponding to any UTXO set Sn at any block height, can be expressed as L = 2k, a power of 2 where the power k + 1 denotes the number of layers of the Merkle tree.
When the UTXO set Sn itself cannot be expressed as a power of 2, it can be padded either by repetition of the rightmost leaf data or with zeros, as is commonplace. It should be noted that this padding stage is implemented in the updateLeaves() sub-algorithm of 3.2.2.
6. HASH BOUNTY SCHEME According to another aspect disclosed herein, there is provided a method of incentivizing a party to publicly commit to the membership of a set of data items D. This may be to commit to a set of UTX05, as discussed previously, such as to commit to the current state of a blockchain 150 or the UTX0s on chain plus those pending in a mempool. In this case the method could be performed for example by a third party service provider to incentivize miners (operators of nodes 104) to publicly commit to their view of the UTXO set, or the method could be performed by a miner themselves in order to incentivize other miners.
Alternatively the method could be used to incentivize parties to commit to the membership of a set of any other kind of data items. The scheme could be used to represent anything where the set has elements that may be created and destroyed/consumed. For example, 5,inventory logs may be used to track a set of inventory items that can be either in either a sold' or 'unsold' state. In this case the data items D could represent the inventory items. Another example would be a transparent ticketing system where the ticket-provider wishes to attest to the set of remaining tickets (e.g. seats in a football stadium) as and when they are individually sold. This may help reduce fraud and re-selling in secondary markets by providing a record of which seats were sold at which time, the state of the entire 'stadium' of seats now being mapped to the HMT.
Whatever the set being attested to, the method works as follows. Reference may be made to Figure 8.
At step T10, the challenging party (e.g. service provider) places a hash puzzle in a transaction and has the transaction recorded on a blockchain. In the case where the set of data items D to be attested to comprises a UTXO set, then the blockchain used to publish the hash puzzle could be the same blockchain 150 as the UTXO set relates to, or a different blockchain.
A hash puzzle is a challenge, also called a hash challenge, in the form of a short program placed in a transaction recorded on a blockchain -call this the challenge transaction. For example the program may take the form of a locking script in an output 203 of a transaction in a UTXO-based (or output-based) model. Alternatively the program could take the form of a smart contract in a transaction of an account-based model. Either way, the program comprises a reference hash value, being the hash of a preimage wherein the preimage comprises a reference piece of information which the challenger wishes to challenge a challengee party (e.g. a miner) to prove that they know. The program of the hash puzzle also comprises some code (e.g. script) which will receive a candidate piece of information from the challengee, hash a preimage comprising the candidate piece of information to produce a candidate hash, compare the candidate hash with the reference hash stored in the challenge transaction, and on condition that they match release a reward to the challengee (e.g. a quantity of a digital asset such as some Satoshi).
Hence at step T20, the method comprises the challengee sending a candidate transaction to be recorded on the same blockchain as the challenge transaction. This may comprise the challengee sending or propagating the candidate transaction to one or more nodes 104 of the blockchain network 106. The candidate transaction comprises a pointer to the challenge transaction, along with the candidate piece of information, or the preimage that comprises the candidate piece of information (optionally the candidate piece of information could be combined, e.g. concatenated, with another element such as a salt in order to form the preimage). For example in a UTXO-based (or output-based) model, the candidate piece of information (or preimage) may be included in the unlocking script of an input 202 that points to the output of the challenge transaction that contains the hash puzzle. Or in an account-based model the candidate transaction may comprise a smart contract that comprises the candidate piece of information and references the smart contract of the challenge transaction.
At step T30 the blockchain network 106 attempts to validate the candidate transaction at one or more of the nodes 104 of the blockchain network 106. This comprises executing the hash puzzle, e.g. by running the locking script together with the unlocking script in the case of a UTXO-based (or output-based) model, or by executing the smart contracts in the case of an account-based model. The node 104 executing the puzzle will thus take the candidate piece of information from the candidate transaction, and hash the candidate preimage that comprises the candidate piece of information. If the candidate preimage comprises more data than just the candidate piece of information (e.g. another piece of data concatenated with it), then the additional data may either be received in the candidate transaction or added from the challenge transaction before the candidate preimage is hashed to generate the candidate hash.
At step T40 the node 104 in question checks whether the resulting candidate hash equals the reference hash from the challenge transaction. If so, then at step 150 the candidate transaction is validated and recorded on chain, thus rewarding the challengee for proving knowledge (and publicly attesting to that knowledge) of the candidate piece of information. lion the other hand the candidate hash does not equal the reference hash, then at step 160 the candidate transaction is not validated and instead is rejected by the node 104, and so the challengee does not receive their reward.
S
According to an aspect disclosed herein, a hash puzzle such as described above can be used to incentivize one or more challengee parties to publicly commit to a view of a set of data items D, such as a UTXO set.
To do this, the reference piece of information tested-for by the challenge transaction comprises the Merkle tree root 304 of the set as determined by the challenging party (e.g. service provider). In the case of a UTXO set, this would be the challenger's attestation of their view of the current UTXO set, and the challenge amounts to a challenge to the one or more challengees (e.g. miners) to publicly corroborate one another's view of the UTXO set.
Such a scheme can be used to promote confidence in the state of the chain. The challenger and challengee(s) all use the same algorithm for computing the Merkle tree 300, so this is completely deterministic.
However, there would be an issue if the scheme was deployed in a simple form like this, in that a malicious party (e.g. a rogue miner) could observe the Merkle root supplied by a legitimate challengee -based on the legitimate challengee's own determination of the UTXO set -and the malicious party could then submit a candidate transaction of their own copying the observed Merkle root, and try to get this validated before the legitimate party's transaction.
In embodiments the challenge transaction may additionally require a signature of a specific challengee or challenegees to unlock, such as by means of a pay-to-public-key (P2PK) puzzle or pay-to-public-key-hash (P2PKH) puzzle included in the locking script along with the main hash puzzle as an additional condition for redeeming the output of the challenge transaction. However, this may not be enough assuming the challenger is making the challenge open to multiple challengee parties, as then one of the challengees could still copy the other's answers. For example, the challenegee may publish multiple instances of the challenge transaction on chain, one for each of the potential challengees, each requiring a signature of the respective challengee to unlock. Or the challenger could publish a single challenge transaction on chain that will accept the signature of any of multiple different challengees as alternative conditions for unlocking.
To guard against copying therefore, it is disclosed herein that the hash puzzle should in fact test for two pieces of information: the Merkle root 304 of the attested-to set, and additional piece of information that is specific to a particular challengee. This could be a piece of information which is unique to the challengee, or at least suitably rare or diverse that another challengee is unlikely to be allocated the same value. It could be a piece of information which is known only to the challenger and the challengee, or at least that is practically unlikely or computationally burdensome for another party to guess or uncover. In embodiments the additional piece of information may be allocated randomly to the challengee. In embodiments the additional piece of information may be another one of the nodes 302/303 (e.g. internal note 303) other than the root node 304. In embodiments this may be a randomly allocated one of the nodes (e.g. a random internal node 303). In embodiments each of a plurality of potential challenges (e.g. miners) may be allocated different such pieces of additional information, each specific to the respective challengee -for example each may be allocated a different and/or random one of the nodes (e.g. internal node302) in the tree 300 as their respective piece of additional information. As an alternative example, the piece of additional information could be a secret password or piece of personal information known only to the challenger and the particular challengee.
At step T10 the challenger arranges to have recorded on a blockchain, e.g. 150, a challenge transaction. If the challenger is a party other than a miner such as a third party service provider, this comprises sending the challenge transaction (directly or via an intermediate party) to one or more of the blockchain nodes 104 to be propagated through the network 106 and thereby validated and incorporated into a block 151 on chain. If the challenger is themselves a miner, they could validate the challenge transaction for recordal themselves, and/or propagate to one or more nodes 104 of one or more other miners to validate. Note that the blockchain on which the challenge transaction is recorded could be the same or a different blockchain as that on which the transactions of the previously discussed UTXO set are recorded. For ease of reference it may be assumed in the following that they are the same blockchain, but this is not limiting.
At step T20 the challengee submits to be recorded on chain a candidate transaction which points to the challenge transaction. If the challengee is a party other than a miner, this comprises sending the candidate transaction (directly or via an intermediate party) to one or more of the blockchain nodes 104 to be propagated through the network 106 and thereby validated and incorporated into a block 151 on chain. If the challengee is themselves a miner, they could validate the candidate transaction for recordal themselves, and/or propagate to one or more nodes 104 of one or more other miners to validate.
The challenge transaction comprises code, e.g. a locking script in a UTXO based model or a smart contract in an account-based model, which (steps T30-T40) is run at a node 104 of the blockchain network 106 when a candidate transaction pointing to the challenge transaction is received at the node 104. The code tests for three things in the candidate transaction (e.g. in an unlocking script or smart contract of the candidate transaction): a valid signature of the challengee, the Merkle root 304 attesting to the set of data items D, and the additional piece of information.
Figure 9 schematically illustrates the challenge transaction 152Ch (which is recorded on the blockchain 150 already) and the candidate transaction 152Can (which is to be validated for recordal on the blockchain 150). The candidate transaction 152Can comprises: (e.g. in an input 202) a pointer 951 to the challenge transaction 152Ch (e.g. its transaction ID TXID_Ch); and (e.g. in an unlocking script in the input 202) a signature 952 of the challengee party, a candidate instance 953 of the Merkle root 304 being attested to, and a candidate instance 951 of the additional piece if information (e.g. another of the nodes 303 of the tree which has been assigned to that particular challengee as an additional challenge element). Or more generally, the candidate transaction 152Can may be said to comprise candidate preimage data that comprises at least the candidate instance 953 of the root 304 and the candidate instance 954 of the additional information, wherein the preimage data could optionally comprise one or both of these elements combined with (e.g. concatenated with) some further data.
The challenge transaction 152Ch comprises (e.g. in a locking script in an output 203 of the challenge transaction 152Ch) a short program (e.g. script) 900 is run as part of the process of validating the candidate transaction 152Can. This program or code 900 comprises a first element (e.g. a Checksig) 901 that verifies the signature 952 of the challengee from the candidate transaction 152Can, and a second element (hash puzzle) which verifies the candidate instance 953 of the Merkle root and the candidate instance 954 of the additional piece of information from the candidate transaction 152.
The hash puzzle 902 comprises reference hash result data. In embodiments, the hash puzzle 902 comprises two separate constituent hash puzzles. In that case, the reference hash result data comprises two separate hash values: a first hash value h1, which was determined by taking a hash of a reference instance of the Merkle root 304 (or more generally preimage data that comprises at least the reference instance of the root); and a second hash value h2 which was determined by taking a hash of a reference instance of the piece of additional information (or more generally preimage data that comprises at least the reference instance of the additional information). A first one of the constituent hash puzzles comprises the first hash value h1, along with code which will hash the candidate instance 953 of the Merkle root 304 from the candidate transaction 152 (or more generally the candidate preimage data that comprises the candidate instance 953 of the Merkle root 304), and check whether this equals h1. A second one of the constituent hash puzzles comprises the second hash value h2, along with code which will hash the candidate instance 954 of the piece of additional information from the candidate transaction 152Can (or more generally the candidate preimage data that comprises the candidate instance 954 of the additional information), and check whether this equals h2. On condition that both checks are passed, as well as the signature 952 being verified, then the candidate transaction 152Can is validated for recordal in a block 151 on the blockchain 150 (step T50), and the challengee is assigned the reward defined in the challenge transaction 152Ch.
In an alternative implementation, the hash puzzle 902 comprises a single combined hash puzzle. In this case the reference hash result data in the hash puzzle in the challenge transaction 152Ch comprises a combined hash value h, having been determined by taking a hash of reference preimage data that comprises a combination (e.g. concatenation) of the reference instance of the Merkle root 304 and the reference instance of the piece of additional information (and optionally further data). In this case, the hash puzzle 902 will hash candidate preimage data that comprises the reference instance 953 of the Merkle root 304 and the refence instance 954 of the additional information combined in the same way (e.g. concatenated) as the reference instances were to produce h. On condition that the result matches h, and the signature is verified, then the candidate transaction 152Can is validated for recordal in a block 151 on the blockchain 150, and the challengee is assigned the reward defined in the challenge transaction 152Ch.
In embodiments, the challenger (e.g. third-party service provider) may publish multiple challenges on the blockchain 150, each along the lines discussed above, but each requiring the signature of a different potential challengee and a different respective piece of additional information 954 specific to the respective challengee. For example, this could be done by including multiple different challenge transactions152Ch on chain, each with a different puzzle 902 requiring a different piece of additional information 954 specific to the respective challengee; or by including the different hash puzzles 902 in different outputs 203 of the same challenge transaction 152Ch; or a combination of these two approaches (some puzzles 902 in different transactions 152, some in different output 203 of the same transaction).
This way all the challengees (e.g. miners) are incentivized to share their public attestation as to the membership of the set of data items D (e.g. UTX05), but no one challengee can perform a replay attack by observing the Merkle root 953 in another's pending candidate transaction 152Can, because the first challengee's reward will be tied to a different piece of additional information 954 than the other's. For instance, in the example where the challengees are miners attesting to the UTXO set, and additional piece of information 954 for each challengee is a respective randomly allocated non-root node 302/303 (e.g. internal node 303), then typically there might be expected to be of the order of 10 miners and 1000 nodes in the tree 300. Therefore the chance of a malicious miner being allocated the same node 302/303 as another miner is quite small.
7. EXAMPLE IMPLEMENTATION -COMMITTING TO UTXO STATE In earlier sections, we have detailed how the UTXO set at a particular block height can be treated as a snapshot of the state of the chain at that block height. We have also shown how this snapshot can be efficiently modelled using a HMT structure, which can be deterministically and efficiently updated to reflect changes to the UTXO set state as new blocks are produced.
This model has significant utility when applied to a commitment scheme for the state of the UTXO set. Such a scheme, whereby the UTXO set state is reliably attested at regular intervals, would enable lightweight and timely bootstrapping of Bitcoin nodes (and other third parties) to the network.
Typically, when a node joins the network at the 'current' chaintip, it is unable to validate new transactions until it has obtained a trustworthy copy of the UTXO state snapshot at that chaintip; we term this process 'bootstrapping' the node. This is trivial in the case where the node to be bootstrapped already has a peer in the Bitcoin network that it trusts, in that it can simply obtain a copy of the chaintip UTXO snapshot from that trusted peer.
However, the challenge becomes much greater in the absence of a trusted peer. In the worst case, the bootstrapping node will have to perform a full blockchain synchronisation, often referred to as an initial block download (IBD), whereby the node must obtain the entire history of the chain and replay it in sequence to re-derive the current UTXO state independently. This is a resource-intensive and time-consuming process, which also prevents the node from producing new blocks and validating new transactions until the bootstrapping is completed.
In this section, there is disclosed a series of UTXO commitment schemes that solve (or at least improve) the bootstrapping problem. The scheme in section 7.1 provides a simple method for miners to attest to the UTXO state at a given block height as an additional service and can be implemented by either an individual or collective of nodes.
In section 7.2, we separately introduce a bounty-based scheme. In this, a third party UTXO set service provider can offer a bounty for anybody to redeem by successfully attesting to the UTXO set snapshot of a previous block.
In section 7.3 we combine these two mechanisms into a hybrid scheme that collectively enables incentives for all miners to continuously calculate, verify and attest UTXO snapshots as the blockchain grows over time.
In all of these schemes, UTXO snapshots may be modelled using the HMT defined earlier, due to its efficient-update property.
7.1 Miner Commitments In a simple case, the miners that secure the blockchain network can attest to the UTXO set state of any block that they produce. The attestation can simply be in the form of the Merkle root for the HMT of the previous block, which can be included in the coinbase (or another) transaction in the block being produced by the miner.
Consider a miner M1 who is attempting to mine block B. The full procedure for M1 to commit to the UTXO state of the previous block Bn_i, represented by the HMT root Rn_1, is as follows.
1) M1 independently obtains its copy of the previous block's HMT root Rn_1 at the start of a new block interval (i.e., once 8n_1 has been mined).
2) 141 creates a coinbase transaction CT x, which includes hn_1 = H(Rn_i) as a data push.
3) M1 builds a candidate block B" on top of B"_1 based on the transactions they have seen during the interval, which includes CTx".
4) If M1 successfully finds a valid proof-of-work and publishes B before another miner finds an alternative valid block, then CTx" becomes the on-chain attestation of 8,1, containing the commitment hn_1.
Note that this process only allows for the snapshot of a given block to be committed in a subsequent block i.e. the UTXO snapshot for the (n -1)th block can only be committed in the nth block at the earliest. In fact, it is not possible to commit to Rn in block Bn because Rn, is effectively an accumulator of all of the other transaction IDs in Bn, and if any of these transactions (including the coinbase) were to contain R, then we would have introduced a circular hash reference, which is not possible.
This process can be performed by a single miner for each block they successfully mine and can be extended such that the miner includes the commitments of all blocks since they last successfully mined a block. It would also be feasible to allow a delay in publishing block hashes such that all network nodes are likely to have been able to calculate the commitment; for example, if the UTXO state of fin were to be included in B11+10, this gives an interval of 10 blocks in which miners can compute the required commitment H(R"). It would be possible to include the attestation R, in block B, if we were to change the definition of the UTXO snapshot to exclude the coinbase outpoints of CTx". This would allow R, to be committed within CTx" and would not stop any future transactions in CTx",i to CTx11+99 from being mined due to the coinbase maturity rule. The outpoints of CT x" would still need to be accounted for somewhere in this 100-block window in order to ensure that the UTXO set state is accurate after CTx11±99.
Another variation of this method might involve adding the UTXO snapshot commitment to a transaction in the relevant block other than the coinbase. We may assume in this case that the transaction would be signed by a well-known public key, and possibly linked to a MinerlD. Moreover, it may be desirable for many other miners on the network to add attestations to the same UTXO snapshot in this way, such that trust can be built up in the validity of the snapshot.
7.1.1 Bootstrapping process Given that miners are regularly committing to UTXO set snapshots in this manner, a node attempting to bootstrap from B, can simply do the following: 1) Obtain the list of block headers B0,B1,...,B, to verify the proof-of-work and integrity of the current chain tip B. 2) Obtain the coinbase transaction CT x11 for the chaintip Bn, and the corresponding Merkle proof F(CTxn).
3) Verify the Merkle proof F(CTxn) to check the integrity of CTxn.
4) If it passes, extract the commitment R"1 for the previous block B"1.
5) Obtain the full HMT Tn_1, verifying its root is equal to R,1, and the transaction set for B. 6) Execute updateTree(Tn_i,Bn) to obtain the UTXO set state at the chaintip R. 7) Validate incoming transactions and subsequent blocks as normal, independently validating the rest of the chain history in the background/parallel if desired.
We have assumed in the above that the delay in publishing UTXO snapshot commitments is only a single block, and that the commitments are made in the coinbase transaction of each block, rather than some arbitrary transaction.
7.2 Hash-based bounty scheme The previous scheme suffers from the requirement of trust in the miner, or group of miners, committing to the correct UTXO set snapshot as the chain evolves. In addition, the method is proposed as an additional 'free' service by Bitcoin miners, which means the outcomes may be less reliable due to the lack of a strong incentive to perform the task correctly.
We hereby propose an alternative scheme based on hash puzzle bounties for the required commitment information. This scheme introduces a well-motivated third-party Charlie, such as a regulator, government body, or payment processor, to provide the required bounty aimed to incentivise the mining nodes to generate hash commitments. We call Charlie a 'UTXO set service provider' for the remainder of this section.
The protocol has two phases, (i) a set-up by Charlie, followed by (ii) a fulfilment phase carried out by one or more miners. Consider the scenario where we want to incentivise miners to commit to the UTXO snapshot of a mined block Bn. The set-up phase is as follows: 1. Derive the UTXO set Sr, for block B. 2. Construct the HMT of the UTXO set to obtain the Merkle root, Rn.
3. Hash R, to get h.n.
4. Construct a bounty transaction for each known miner, locking funds with a hash puzzle requiring R11 to be redeemed. These bounty transactions can be mined from /3,±1 onwards.
Each bounty transaction should be bound to a particular miner. One way to achieve this is to make sure that the spendable bounty output also requires a signature from a key-pair belonging to the Miner ID associated with that miner, e.g. a signature that can be verified by PKA/t" for a miner Alice, where PKA/t" is a public key that is linked to Alice's Miner ID in some way.
However, it is conceivable that these bounty transactions would still be subject to a replay attack, whereby the first miner to redeem their bounty would reveal Rn, which the remaining miners simply copy to redeem their own bounties. One solution to this is to also include a second puzzle in each bounty transaction, such that each secondary puzzle is unique to a particular miner. This secondary puzzle could simply be a hash puzzle requiring the redeemer to provide the hash value of a randomly-chosen node on the HMT. Including this addition, our bounty locking scripts would take the form: [Hash Puzzle (R11)] [HashPuzzle (RandNode,)][Checksig PIC] where i denotes the ith miner in the bounty scheme.
Once the bounty transactions have been broadcast, each of the miners in the scheme can perform the following steps to redeem the bounty: 1. Derive the UTXO set ST, for block B. 2. Construct the HMT of the UTXO set to obtain the Merkle root, Rn.
3. Hash R, to get It,.
4. Claim the bounty transaction by providing Rn, the specified node RandNodei and a signature Sig PKi verifiable by PKi in a transaction that spends the output of the ith bounty transaction.
This scheme essentially allows all the participating miners to be able to earn a bounty as reward for successfully committing to the correct UTXO snapshot.
7.2.1 Bootstrapping procedure Consider the above hash-based bounty scheme is in place, where at least one bounty transaction TxnEcuntY for UTXO snapshot commitment R, and at least one corresponding bounty redemption transaction Tx,Redeem have both been broadcast for inclusion in block Bn+i, We also assume that the miners' public keys are known and linked to MinerlD, and that the public key PiCcnanie of the UTXO set service provider Charlie is well-known.
A node attempting to bootstrap from Bn can do the following: 1. Obtain the list of block headers Bo, B1, ...,B" to verify the proof-of-work and integrity of the current chain tip Bn.
2. Obtain the bounty and redemption transactions TxnBounty,TxThRed., e.g. from the mempool.
3. Verify that the Trgecteern satisfies the unlocking script of Tx"B° ntY by providing R. 4. Check that the signature in the input of Tx.gedeem is valid against PK Charlie* 5. Obtain the full HMT Tn for the chaintip B", and verify its root is equal to R. 6. Validate incoming transactions and subsequent blocks as normal, independently validating the rest of the chain history in the background/parallel if desired.
After step 5 is completed, the bootstrapping node has the UTXO set snapshot of the chaintip, and trusts it is correct because it is consistent with the hash puzzle provided by the trusted UTXO set provider Charlie, which contains H(R).
7.3. Competitive scheme The bounty scheme of section 7.2 is useful in providing some financial incentive for miners to commit to UTXO snapshots, making it an improvement over the scheme in section 7.1.
However, the previous scheme still only offers a relatively weak incentive for most miners to calculate the commitment correctly, because once a single miner reveals the result it can be easily replayed. Adding a bespoke hash puzzle for a random node ensures that even these miners will do some work, but they are still unmotivated to perform the full computation of Rn independently.
The cause of this weakness is that the scheme in section 7.2 is inherently uncompetitive. Hence, we now disclose a competitive variant of the scheme. This scheme rewards only the first miner to correctly produce the UTXO commitment of the relevant block, which creates a race between all the miners to successfully find the correct solution.
Consider the same scenario as in section 7.2, but with exactly m participating miners. The modified set-up phase is as follows: 1. Derive the UTXO set Si-, for block B. 2. Construct the HMT of the UTXO set to obtain the Merkle root, R. 3. Hash Rn to get hn.
4. Generate a random token a (e.g. a 256-bit string).
S. Construct a single bounty transaction T4"nt37, locking funds with a hash puzzle requiring both Rn and a to be redeemed. This bounty transaction can be mined from Bn±i onwards.
6. Broadcast T xnBount-v publicly and send a privately to each of the m miners.
The inclusion of the token a ensures that the reward will be earned by only a participating miner. It would be possible to make this a completely public bounty, whereby anybody could commit to the UTXO snapshot. However, the benefit of limiting this to only miners is that their commitments are in a sense backed by their reputation, which is itself associated with their accumulated proof of work, amongst other things. In other words, it is important for most users that the miners themselves attest to the UTXO state correctly.
The process a miner follows to redeem the bounty may be exactly as in section 7.2, with the only difference being that each miner will attempt to publish their own redemption transaction in their own block. If we assume that all miners are calculating R" correctly and as quickly as possible, then the winner of the bounty is simply determined by which miner successfully creates the next block first, effectively piggy-backing the native proof of work consensus mechanism to decide the winner.
The simple case of just using the alpha token as an extra hash puzzle is still vulnerable to the replay attack (an unscrupulous miner can copy another's answer) -in this case the alpha token just serves as an extra knowledge proof.
In a different implementation, a more secure use of an alpha token (i.e. unique info known only to a given miner) may be achieved by, e.g., requiring that the signing key is a combined key using alpha; i.e. P = (SminerA Saipha) * G, whereis saw ha _ another private key derived from alpha, such as Scapha = alpha, or Scapha = alpha mod ii, or Saipha H( alpha). In another variant the script could require both the miner signature for PintwA and a second R-puzzle signature, where the 'r' in the puzzle is derived from the miner's alpha.
7.3.1. Bootstrapping The bootstrapping process for a node joining the network may also be identical to section 7.2, other than the fact that there will only be one bounty redemption transaction that can be obtained.
7.3.2. Extension -Multiple layers and miners In this scenario, we currently have one bounty per UTXO snapshot, which can only be won by a single miner. This has the benefit of creating competition to provide the UTXO snapshot, which improves the likelihood that it will be correct and provided quickly. However, we can extend this model to achieve two additional benefits: Commit to lower layers of the HMT, providing more information about it to bootstrapping nodes; and Incentivise multiple different miners to commit to the same UTXO snapshot.
These two benefits allow users to accumulate confidence in the correctness of the UTXO snapshot commitment, and we can aim to reach a minimum threshold (e.g. simply majority) of network hash power that commits to the same root.
The modification that allows us to achieve this may be summarised as follows: 1) The UTXO set provider operates multiple bounty 'rounds' per UTXO snapshot that requires commitments.
2) The winner of the first round receives the highest reward, for attesting to the HMT root R. 3) Each subsequent round offers a decreasing bounty reward, for attesting to a node(s) one layer down in the HMT.
4) The winner of each round is removed from the participation set of miners for the next round, which is achieved by the UTXO service provider distributing a fresh a token to the remaining participants in each round only.
5) Rounds continue until either there are no more miners, or no more nodes in the HMT to commit to.
By making these modifications, we have created a bounty system that should result in multiple miners' commitment to different parts of the same HMT for a given UTXO snapshot. Each subsequent commitment, at a lower level in the HMT, can be checked against the previously-revealed higher layers in order to prove that all of the commitments are part of the same snapshot. Therefore, each round effectively provides a fresh commitment to the same snapshot but from a different miner, which builds confidence in the snapshot itself. In addition, this has the benefit of revealing more of the full HMT that a bootstrapping node needs to fetch anyway, and all these commitments can be made available to the new miner via the mempool during the block interval when they join.
8. FURTHER REMARKS
S
Other variants or use cases of the disclosed techniques may become apparent to the person skilled in the art once given the disclosure herein. The scope of the disclosure is not limited by the described embodiments but only by the accompanying claims.
For instance, some embodiments above have been described in terms of a bitcoin network 106, bitcoin blockchain 150 and bitcoin nodes 104. However it will be appreciated that the bitcoin blockchain is one particular example of a blockchain 150 and the above description may apply generally to any blockchain. That is, the present invention is in by no way limited to the bitcoin blockchain. More generally, any reference above to bitcoin network 106, bitcoin blockchain 150 and bitcoin nodes 104 may be replaced with reference to a blockchain network 106, blockchain 150 and blockchain node 104 respectively. The blockchain, blockchain network and/or blockchain nodes may share some or all of the described properties of the bitcoin blockchain 150, bitcoin network 106 and bitcoin nodes 104 as described above.
In preferred embodiments of the invention, the blockchain network 106 is the bitcoin network and bitcoin nodes 104 perform at least all of the described functions of creating, publishing, propagating and storing blocks 151 of the blockchain 150. It is not excluded that there may be other network entities (or network elements) that only perform one or some but not all of these functions. That is, a network entity may perform the function of propagating and/or storing blocks without creating and publishing blocks (recall that these entities are not considered nodes of the preferred bitcoin network 106).
In other embodiments of the invention, the blockchain network 106 may not be the bitcoin network. In these embodiments, it is not excluded that a node may perform at least one or some but not all of the functions of creating, publishing, propagating and storing blocks 151 of the blockchain 150. For instance, on those other blockchain networks a "node" may be used to refer to a network entity that is configured to create and publish blocks 151 but not store and/or propagate those blocks 151 to other nodes.
Even more generally, any reference to the term "bitcoin node" 104 above may be replaced with the term "network entity" or "network element", wherein such an entity/element is configured to perform some or all of the roles of creating, publishing, propagating and storing blocks. The functions of such a network entity/element may be implemented in hardware in the same way described above with reference to a blockchain node 104.
It will be appreciated that the above embodiments have been described by way of example only. More generally there may be provided a method, apparatus or program in accordance with any one or more of the following Statements.
Statement 1. A method comprising: determining a set of unconsumed outputs of blockchain transactions recorded on and/or submitted for recordal on a blockchain; and forming a hash tree comprising a plurality of layers arranged in a hierarchy from a leaf layer up to a root layer with one or more internal layers in between, wherein the root layer comprises a root node, and each internal layer and the leaf layer each comprise a respective plurality of nodes with each node having an indexed position within its respective layer, and wherein at each layer but the leaf layer, each node within the layer comprises a hash of a combination of a different corresponding subset of at least two of the nodes at the next layer down in the hierarchy; wherein at least some of the leaf nodes can each either A) designate a corresponding subset of one or more of the unconsumed outputs or B) comprise a placeholder value; and the method further comprises, on each of one or more occasions: I) determining an update to the set of unconsumed outputs, each update comprising an addition of one or more new outputs and/or removal of one or more outputs which have now been consumed, and II) recomputing the hash tree to reflect the update; wherein on at least one of the occasions the update comprises removal of at least one output and on at least one of the occasions the update comprises the addition of at least one new output. On each occasion the recomputing of the hash tree comprises: when any output is removed, removing the leaf node designating the removed output and replacing the removed leaf node, at the same position in the leaf layer, with a new leaf node which either: A) designates an updated subset of one or more unconsumed outputs not including the removed output, or B) comprises a placeholder value; and when any new output is added, forming a new leaf node designating a subset of one or more unconsumed outputs including at least the new output, wherein the new leaf node is inserted in the tree in place of a replaceable leaf node when available, at the same position in the leaf layer, a replaceable leaf node being a leaf node that either: A) designates an output being removed, or B) comprises a placeholder value; but otherwise if no replaceable leaf node is available, the tree is expanded to accommodate the new leaf node.
Statement 2. The method of statement 1, wherein each node designating a corresponding subset of one or more unconsumed outputs designates its corresponding subset by comprising a hash of a corresponding leaf preimage, the leaf preimage comprising an indication of the corresponding subset of one or more unconsumed outputs.
Statement 3. The method of statement 1 or 2, wherein on at least one occasion the recomputing of the hash tree on at least one occasion comprises only recomputing the hashes of subtree of the hash tree affected by the update.
Statement 4. The method of statement 1, 2 or 3, wherein the unconsumed outputs include one or more outputs of one or more generation transactions generated when a new block is mined onto the blockchain, wherein each generation transaction has an associated maturity condition which must be met before its one or more outputs can be consumed, and wherein the method comprises maintaining a list indicating whether the associated maturity condition has been met for each generation transaction.
Statement S. The method of any preceding statement wherein on each occasion the recomputing of the hash tree comprises: first removing any and all leaf nodes designating consumed outputs from the set, then inserting of any and all leaf nodes designating new unconsumed outputs.
Statement 6. The method of any preceding statement, wherein on at least one occasion, the recomputing of the hash tree comprises shrinking the tree by removing a layer if, after the update, the tree would otherwise contain a redundant number of layers for the number of leaf nodes that actually designate UTX0s.
Statement 7. The method of any preceding statement, comprising having recorded on said blockchain or another blockchain a challenge transaction which comprises code configured to: receive and verify a signature of a challengee party from a candidate transaction submitted for recordal on the same blockchain as the challenge transaction; also receive from the candidate transaction candidate preimage data comprising a candidate instance of the root node and a candidate instance of an additional piece of information specific to the challengee party; perform a hash operation on the candidate preimage data, thereby producing a candidate hash result; verify whether the candidate hash result matches a reference hash result included in the challenge transaction, the reference hash result being a result of applying the same hash operation to reference preimage data comprising a reference instance of the root node and a reference instance of the additional piece of information; and allow the candidate transaction to be validated for recordal on the same blockchain as the challenge transaction and thereby redeem a reward specified in the challenge transaction on condition that the signature and the candidate hash result are verified by said verifying steps.
Statement 8. The method of statement 7, wherein the other piece of information comprises an identified one of the nodes in one of the internal layers.
Statement 9. The method of any preceding statement, comprising, by a publishing party following the re-computation on at least one of the occasions, publishing the root node or an attestation of the root node as a public commitment to the set of unconsumed outputs.
Statement 10. The method of statement 9, wherein the publishing comprises including the root node or attestation in a commitment transaction recorded on the blockchain.
Statement 11. The method of statement 9 or 10, wherein the publishing comprises publishing the attestation wherein the attestation comprises a hash of a commitment preimage, the commitment preimage comprising the root node.
Statement 12. The method of any of statements 9 to 11, wherein the publishing party is a service provider providing a commitment service to miners of the blockchain transactions and/or other parties.
Statement 13. The method of statement 12, further comprising, on each occasion: determining a delta set comprising an indication of only the outputs being removed or added in the update, not other outputs that are not being removed or added; and making the delta set available to one or more of the miners as part of a subscription to the updates.
Statement 14. The method of any of statements 7 to 11, wherein the method is performed by a miner of the blockchain transactions to commit to the miner's view of the set of unconsumed outputs.
Statement 15. The method of any preceding statement, wherein the set of unconsumed outputs is the set of unconsumed outputs of only the blockchain transactions recorded in blocks on the blockchain, not blockchain transactions merely submitted for recordal but not yet validated.
Statement 16. The method of statement 15, wherein each of said occasions is when a new block is mined onto the blockchain.
Statement 17. The method of any of statements 1 to 14, wherein the method is performed by a miner and the set of unconsumed outputs comprises transactions in a pool of transactions submitted for recordal on the blockchain but not yet validated.
Statement 18. The method of statement 17, wherein said occasions are in response to one or more new blockchain transactions being submitted into the pool, or are periodic.
According to another aspect disclosed herein, there may be provided a method comprising the actions of Computer equipment comprising: memory comprising one or more memory units; and processing apparatus comprising one or more processing units, wherein the
S
memory stores code arranged to run on the processing apparatus, the code being configured so as when on the processing apparatus to perform the method of any preceding statement.
According to another aspect disclosed herein, there may be provided a computer program embodied on computer-readable storage and configured so as, when run on one or more processors, to perform the method of any of statements 1to 18.
Other variants or applications of the presently disclosed techniques may become apparent to a person skilled in the art once given the disclosure herein. The scope of the present disclosure is not limited by the described embodiments but only by the appendant claims.

Claims (20)

  1. CLAIMS1. A computer-implemented method comprising: determining a set of unconsumed outputs of blockchain transactions recorded on and/or submitted for recordal on a blockchain; and forming a hash tree comprising a plurality of layers arranged in a hierarchy from a leaf layer up to a root layer with one or more internal layers in between, wherein the root layer comprises a root node, and each internal layer and the leaf layer each comprise a respective plurality of nodes with each node having an indexed position within its respective layer, and wherein at each layer but the leaf layer, each node within the layer comprises a hash of a combination of a different corresponding subset of at least two of the nodes at the next layer down in the hierarchy; wherein at least some of the leaf nodes can each either A) designate a corresponding subset of one or more of the unconsumed outputs or B) comprise a placeholder value; and the method further comprises, on each of one or more occasions: I) determining an update to the set of unconsumed outputs, each update comprising an addition of one or more new outputs and/or removal of one or more outputs which have now been consumed, and II) recomputing the hash tree to reflect the update; wherein on at least one of the occasions the update comprises removal of at least one output and on at least one of the occasions the update comprises the addition of at least one new output; and wherein on each occasion the recomputing of the hash tree comprises: - when any output is removed, removing the leaf node designating the removed output and replacing the removed leaf node, at the same position in the leaf layer, with a new leaf node which either: A) designates an updated subset of one or more unconsumed outputs not including the removed output, or B) comprises a placeholder value; and - when any new output is added, forming a new leaf node designating a subset of one or more unconsumed outputs including at least the new output, wherein the new leaf node is inserted in the tree in place of a replaceable leaf node when available, at the same position in the leaf layer, a replaceable leaf node being a leaf node that either: A) designates an output being removed, or B) comprises a placeholder value; but otherwise if no replaceable leaf node is available, the tree is expanded to accommodate the new leaf node.
  2. 2. The method of claim 1, wherein each node designating a corresponding subset of one or more unconsumed outputs designates its corresponding subset by comprising a hash of a corresponding leaf preimage, the leaf preimage comprising an indication of the corresponding subset of one or more unconsumed outputs.
  3. 3. The method of claim 1 or 2, wherein on at least one occasion the recomputing of the hash tree on at least one occasion comprises only recomputing the hashes of subtree of the hash tree affected by the update.
  4. 4. The method of claim 1, 2 or 3, wherein the unconsumed outputs include one or more outputs of one or more generation transactions generated when a new block is mined onto the blockchain, wherein each generation transaction has an associated maturity condition which must be met before its one or more outputs can be consumed, and wherein the method comprises maintaining a list indicating whether the associated maturity condition has been met for each generation transaction.
  5. 5. The method of any preceding claim, wherein on each occasion the recomputing of the hash tree comprises: - first removing any and all leaf nodes designating consumed outputs from the set, - then inserting of any and all leaf nodes designating new unconsumed outputs.
  6. 6. The method of any preceding claim, wherein on at least one occasion, the recomputing of the hash tree comprises shrinking the tree by removing a layer if, after the update, the tree would otherwise contain a redundant number of layers for the number of leaf nodes that actually designate UTX0s.
  7. 7. The method of any preceding claim, comprising having recorded on said blockchain or another blockchain a challenge transaction which comprises code configured to: - receive and verify a signature of a challengee party from a candidate transaction submitted for recordal on the same blockchain as the challenge transaction; - also receive from the candidate transaction candidate preimage data comprising a candidate instance of the root node and a candidate instance of an additional piece of information specific to the challengee party; - perform a hash operation on the candidate preimage data, thereby producing a candidate hash result; - verify whether the candidate hash result matches a reference hash result included in the challenge transaction, the reference hash result being a result of applying the same hash operation to reference preimage data comprising a reference instance of the root node and a reference instance of the additional piece of information; and - allow the candidate transaction to be validated for recordal on the same blockchain as the challenge transaction and thereby redeem a reward specified in the challenge transaction on condition that the signature and the candidate hash result are verified by said verifying steps.
  8. 8. The method of claim 7, wherein the other piece of information comprises an identified one of the nodes in one of the internal layers.
  9. 9. The method of any preceding claim, comprising, by a publishing party following the re-computation on at least one of the occasions, publishing the root node or an attestation of the root node as a public commitment to the set of unconsumed outputs.
  10. 10. The method of claim 9, wherein the publishing comprises including the root node or attestation in a commitment transaction recorded on the blockchain.
  11. 11. The method of claim 9 or 10, wherein the publishing comprises publishing the attestation wherein the attestation comprises a hash of a commitment preimage, the commitment preimage comprising the root node.
  12. 12. The method of any of claims 9 to 11, wherein the publishing party is a service provider providing a commitment service to miners of the blockchain transactions and/or other parties.
  13. 13. The method of claim 12, further comprising, on each occasion: - determining a delta set comprising an indication of only the outputs being removed or added in the update, not other outputs that are not being removed or added; and - making the delta set available to one or more of the miners as part of a subscription to the updates.
  14. 14. The method of any of claims 7 to 11, wherein the method is performed by a miner of the blockchain transactions to commit to the miner's view of the set of unconsumed outputs.
  15. 15. The method of any preceding claim, wherein the set of unconsumed outputs is the set of unconsumed outputs of only the blockchain transactions recorded in blocks on the blockchain, not blockchain transactions merely submitted for recordal but not yet validated.
  16. 16. The method of claim 15, wherein each of said occasions is when a new block is mined onto the blockchain.
  17. 17. The method of any of claims 1 to 14, wherein the method is performed by a miner and the set of unconsumed outputs comprises transactions in a pool of transactions submitted for recordal on the blockchain but not yet validated.
  18. 18. The method of claim 17, wherein said occasions are in response to one or more new blockchain transactions being submitted into the pool, or are periodic.
  19. 19. Computer equipment comprising: memory comprising one or more memory units; and processing apparatus comprising one or more processing units, wherein the memory stores code arranged to run on the processing apparatus, the code being configured so as when on the processing apparatus to perform the method of any preceding claim.
  20. 20. A computer program embodied on computer-readable storage and configured so as, when run on one or more processors, to perform the method of any of claims 1 to 18.
GB2201967.3A 2022-02-15 2022-02-15 Attesting to a set of unconsumed transaction outputs Pending GB2615598A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB2201967.3A GB2615598A (en) 2022-02-15 2022-02-15 Attesting to a set of unconsumed transaction outputs
PCT/EP2023/050839 WO2023156102A1 (en) 2022-02-15 2023-01-16 Attesting to a set of unconsumed transaction outputs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2201967.3A GB2615598A (en) 2022-02-15 2022-02-15 Attesting to a set of unconsumed transaction outputs

Publications (2)

Publication Number Publication Date
GB202201967D0 GB202201967D0 (en) 2022-03-30
GB2615598A true GB2615598A (en) 2023-08-16

Family

ID=80820858

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2201967.3A Pending GB2615598A (en) 2022-02-15 2022-02-15 Attesting to a set of unconsumed transaction outputs

Country Status (2)

Country Link
GB (1) GB2615598A (en)
WO (1) WO2023156102A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240184759A1 (en) * 2022-12-06 2024-06-06 Beijing Volcano Engine Technology Co., Ltd. Data processing method and apparatus based on merkle tree

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117479235B (en) * 2023-12-28 2024-03-19 中通信息服务有限公司 Scheduling management method and system for terminal network facilities

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190103973A1 (en) * 2017-09-29 2019-04-04 R3 Ltd. Hash subtrees for grouping components by component type
WO2019120320A2 (en) * 2019-03-28 2019-06-27 Alibaba Group Holding Limited System and method for parallel-processing blockchain transactions
GB2592338A (en) * 2019-07-25 2021-09-01 Nchain Holdings Ltd Digital contracts using blockchain transactions

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190103973A1 (en) * 2017-09-29 2019-04-04 R3 Ltd. Hash subtrees for grouping components by component type
WO2019120320A2 (en) * 2019-03-28 2019-06-27 Alibaba Group Holding Limited System and method for parallel-processing blockchain transactions
GB2592338A (en) * 2019-07-25 2021-09-01 Nchain Holdings Ltd Digital contracts using blockchain transactions

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240184759A1 (en) * 2022-12-06 2024-06-06 Beijing Volcano Engine Technology Co., Ltd. Data processing method and apparatus based on merkle tree

Also Published As

Publication number Publication date
GB202201967D0 (en) 2022-03-30
WO2023156102A1 (en) 2023-08-24

Similar Documents

Publication Publication Date Title
US20220278859A1 (en) Digital contracts using blockchain transactions
US20230394063A1 (en) Merkle proof entity
WO2023156102A1 (en) Attesting to a set of unconsumed transaction outputs
US20230275770A1 (en) Pseudo-random selection on the blockchain
KR20230101883A (en) Merkle Proof Entity
US20240205030A1 (en) Uniform resource identifier
GB2615752A (en) Attesting to membership of a set
WO2023160921A1 (en) Data exchange attestation method
WO2023072778A1 (en) Sharded merkle tree
WO2024099693A1 (en) Blockchain transaction
EP4413688A1 (en) Implementing a layer 2 token protocol using a layer 1 blockchain
WO2024041862A1 (en) Blockchain transaction
WO2024061546A1 (en) Enforcing constraints on blockchain transactions
GB2621857A (en) Blockchain transaction
EP4413686A1 (en) Redacting content from blockchain transactions
WO2024061617A1 (en) Atomic swap token trades
GB2621405A (en) Attesting to knowledge of blockchain transaction outputs
GB2621144A (en) Wrapped encryption
WO2023208832A1 (en) Blockchain transaction