Why is the full Merkle path needed to verify a transaction?

8

5

If I have transaction A and wish to verify that it is in the block, and I also have the hash of transaction B and the hash of the hashes of A and B, then can't I just hash A and then see if what I get by hashing A's hash and B's hash is equal to what is in the Merkle tree? Why do I have to go all the way to the root?

Jeff

Posted 2017-01-09T17:23:06.233

Reputation: 213

Answers

12

If you have A, HB and HAB, you can obviously check whether A fits. As Jestin noted, this is essentially a full Merkle tree with two leaves.

However, as a thin client, the only thing you have readily available is the Merkle root (which is in the block header), and you get told about A. The intermediate levels of the Merkle tree are not provided, so to calculate them yourself you'd need the block's complete set of transactions.

https://i.stack.imgur.com/2Ep7y.png Image via Mastering Bitcoin

So, for a thin client, we use the Merkle branch instead. For the Merkle branch, we just need the transaction's position in the block's transaction list and the hashing partners at each level, instead of the complete set of transactions. A Merkle branch is impractical to fake because that would require finding a hash collision (which is not feasible, or mining wouldn't work). So by going up the tree and combining our result with the respective hashing partner at each level, we finally get the Merkle root. Thus we can prove membership of the transaction in the block.

In the example image, you only need to provide the blue information to link HK to the Merkle root, whereas checking the whole Merkle tree would require all transactions from A to P and the Merkle root.
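As a rough illustration of the check described above, here is a minimal Python sketch. The function names (`dsha256`, `verify_branch`) and the toy four-leaf tree are made up for this example; it uses double SHA-256 as Bitcoin does, but the leaves are placeholder byte strings rather than real serialized transactions, and byte-order details of real block data are ignored.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for its Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_branch(leaf_hash: bytes, index: int, partners: list, root: bytes) -> bool:
    """Hash upward from the leaf and compare the result with the known root.

    `index` is the transaction's position in the block (0-based); its lowest
    bit at each level tells us whether we are the left or the right input.
    """
    h = leaf_hash
    for partner in partners:
        if index % 2 == 0:
            h = dsha256(h + partner)   # we are the left child
        else:
            h = dsha256(partner + h)   # we are the right child
        index //= 2
    return h == root


# Toy four-leaf tree (hypothetical data standing in for transactions A..D).
leaves = [dsha256(name) for name in (b"A", b"B", b"C", b"D")]
h_ab = dsha256(leaves[0] + leaves[1])
h_cd = dsha256(leaves[2] + leaves[3])
root = dsha256(h_ab + h_cd)

# Full branch for C (index 2): partner D, then H_AB.
ok = verify_branch(leaves[2], 2, [leaves[3], h_ab], root)

# Only one hashing partner: we reach H_CD, which cannot be compared with
# anything the thin client knows, so the check against the root fails.
short = verify_branch(leaves[2], 2, [leaves[3]], root)

# A wrong hashing partner changes every hash above it.
forged = verify_branch(leaves[2], 2, [dsha256(b"X"), h_ab], root)

print(ok, short, forged)
```

The `short` case is the situation from the question: stopping after one hashing partner yields an intermediate hash, and the thin client has no trusted copy of that intermediate hash to compare against.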

Murch

Posted 2017-01-09T17:23:06.233

Reputation: 41 609

But don't you need the path which in fact consists of intermediate levels? Jeff 2017-01-09T20:38:49.820

@Jeff: I've edited to clarify. :) Murch 2017-01-09T20:40:56.173

I assume the client who wants to verify the existence of the transaction tells the full node (?) the transaction id (?) and the full node then provides the path? What makes the client believe a given transaction is in a particular block? And when would we first detect that transaction K is not in the block? When you say "blue" I think you mean solid blue? Jeff 2017-01-09T20:48:17.077

Thin clients usually get only the block headers. They give full nodes a bloom filter that matches the transactions they are interested in. In response, full nodes tell them about the transactions that match the filter. The thin client then requests the Merkle branch for any transaction it is interested in. It is hard to prove that a transaction is not in a block, but it is easy to prove that it is. If nobody can provide proof that K is in the block, it probably isn't. And yes, I meant solid blue. Murch 2017-01-09T23:51:31.387

I still don't understand why we need the full Merkle branch: why wouldn't a single hashing partner be sufficient? Jeff 2017-01-10T17:09:38.140

@Jeff: If you were trying to verify that K is in the block, and you only get H<sub>L</sub>, you can merely calculate H<sub>KL</sub>, but you don't know whether H<sub>KL</sub> is actually in the tree. To check that, you have to hash it with H<sub>IJ</sub>, but again, you don't know if the resulting H<sub>IJKL</sub> is in the tree. So, you go up until you can calculate the Root, because if your calculation matches the information that you have from the block header, you know that the transaction fits the block header. Murch 2017-01-10T22:32:48.740

@Jeff: Meanwhile, you're fairly sure that the block header is correct, because it adheres to the difficulty requirement and it would be quite the investment to fake a block header that adheres to the difficulty requirement. Even better when you learn about new block headers that reference the block header of your transaction's block and also adhere to the difficulty requirement. Murch 2017-01-10T22:34:13.187

Thanks for your explanation. Is there an example with a small tree that demonstrates this? Jeff 2017-01-13T05:25:43.567

@Jeff: The example I gave corresponds to the image in the answer. Or what do you mean? Murch 2017-01-13T17:18:32.633

Well, one question is: when you say "you don't know whether H<sub>KL</sub> is actually in the tree", does this mean that a "malicious" full node might deliberately give you a false hashing partner? Jeff 2017-01-13T23:35:20.923

@Jeff: If they give us a wrong hashing partner, a few levels further up the result will not match the Merkle root, unless they managed a successful second-preimage attack, i.e. found a second input that creates the same given output. The latter is thought to be infeasible for SHA-256d. Murch 2017-01-14T00:42:56.613

When starting with Hk and given Hl, how does one know which of (Hk + Hl) or (Hl + Hk) to compose when checking back up to the root? Does the Merkle Path include a branch-left/right indicator? GoZoner 2018-05-15T17:01:17.620

@GoZoner: Good question! The full information set includes the transaction you're getting the branch for, the transaction's position in the block's transaction list and the list of its hashing partners up to the Merkle root. From the position, you can calculate whether each hashing partner goes on the left or on the right. I've updated my answer. Murch 2018-05-15T17:25:50.980
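To make the left/right calculation concrete, here is a small Python sketch. The helper name `partner_sides` is invented for this illustration, and it assumes a 0-based position and a tree whose depth is known.

```python
def partner_sides(index, depth):
    """For each level, say on which side the hashing partner sits.

    If our own node has an even position at a level, we are the left input
    and the partner is on the right; if odd, the partner is on the left.
    Moving up one level halves the position.
    """
    sides = []
    for _ in range(depth):
        sides.append("right" if index % 2 == 0 else "left")
        index //= 2
    return sides


# K is the 11th of the 16 transactions A..P, i.e. position 10 (0-based).
sides_for_k = partner_sides(10, 4)
print(sides_for_k)
```

For K this gives right, left, right, left, which lines up with the partners H<sub>L</sub>, H<sub>IJ</sub>, H<sub>MNOP</sub> and H<sub>ABCDEFGH</sub> in the image.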

1

What you are proposing is essentially a Merkle tree with 2 transactions. The hash of the hashes is the Merkle root, and the hash of B is the rest of the Merkle path. You are going all the way to the root.

When you scale this up, you can provide the next highest node in the tree...but what verifies that it belongs in the tree? You'll need to provide the hashes all the way up in order to verify, because it's the Merkle root that is hashed into the proof of work algorithm.

Jestin

Posted 2017-01-09T17:23:06.233

Reputation: 8 339

<Jestin> But why would this fail if transaction A was one of 100s of transactions? If A were not in the block, then A's hash would not be in the hash of the hashes of A and B, right?Jeff 2017-01-09T18:07:05.297

@Jeff, are you saying that 100 hashes would be concatenated and then hashed? If so, that's an O(n) solution, compared to the O(log n) solution that a Merkle tree provides. Jestin 2017-01-09T18:10:13.760

I may not be (and probably am not) understanding the procedure which involves a "Merkle path", but it seems to me that the absence of A in the block would become apparent without having to use all of the nodes in the path. Jeff 2017-01-09T18:14:54.890

2

I think the part that you are missing is that intermediate nodes in the tree are not always available, only the root. Therefore, the path back to the root must also be provided. Check the "transaction-data" section of the developer guide (https://bitcoin.org/en/developer-guide#transaction-data), specifically the part about Simplified Payment Verification.

Jestin 2017-01-09T19:19:54.353

1

Merkle trees are used extensively by SPV nodes. SPV nodes don’t have all transactions and do not download full blocks, just block headers. In order to verify that a transaction is included in a block, without having to download all the transactions in the block, they use an authentication path, or a merkle path. To understand why we need an authentication path, you need to understand how Merkle trees work.

A merkle path is used to prove inclusion of a data element. A node can prove that a transaction K is included in the block by producing a merkle path that is only four 32-byte hashes long (128 bytes total). The path consists of the four hashes (shown with a shaded background in the figure) HL, HIJ, HMNOP, and HABCDEFGH. With those four hashes provided as an authentication path, any node can prove that HK (with a black background at the bottom of the diagram) is included in the merkle root by computing four additional pair-wise hashes HKL, HIJKL, HIJKLMNOP, and the merkle tree root (outlined with a dashed line in the figure).

Figure: A merkle path used to prove inclusion of a data element.
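The 16-transaction example can be reproduced with a short Python sketch. The helper names (`build_levels`, `merkle_path`) are made up for this illustration, and the leaves are placeholder hashes of the letters A to P rather than real transactions; only the tree shape and the double SHA-256 pairing follow Bitcoin.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin's Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_levels(leaf_hashes):
    """Return every level of the tree, from the leaves up to the root."""
    levels = [leaf_hashes]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # an odd level pairs its last hash with itself
        levels.append([dsha256(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def merkle_path(levels, index):
    """Collect the hashing partner at each level below the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1            # flip the lowest bit to get the neighbour
        if sibling >= len(level):      # last node on an odd level
            sibling = index
        path.append(level[sibling])
        index //= 2
    return path


names = "ABCDEFGHIJKLMNOP"
leaves = [dsha256(c.encode()) for c in names]
levels = build_levels(leaves)
root = levels[-1][0]

k = names.index("K")
path = merkle_path(levels, k)          # H_L, H_IJ, H_MNOP, H_ABCDEFGH
path_bytes = sum(len(h) for h in path)

# Recompute H_KL, H_IJKL, H_IJKLMNOP and finally the root from H_K and the path.
h, idx = leaves[k], k
for partner in path:
    h = dsha256(h + partner) if idx % 2 == 0 else dsha256(partner + h)
    idx //= 2
recomputed = h

print(len(path), path_bytes, recomputed == root)
```

The path holds four hashes for 16 transactions; a full node has to build the whole tree to produce it, but the verifier only needs H<sub>K</sub>, the position, the four partners and the root from the block header.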

user2203937

Posted 2017-01-09T17:23:06.233

Reputation: 89