How can I track transactions in a way that is not affected by their malleability?

7

2

All the recent problems caused by transaction malleability have brought to center stage the fact that tracking a transaction id is not good enough.

So here is my question: what is good enough?

The best I can come up with is to create my own hash of the transaction I want to track before sending it to the network. Let's call that hash the "tracking id". That hash would cover only the signed portions of the transaction. The elements hashed would be:

  • for all inputs: txid, vout
  • for all outputs: amount, address

Given the fact that the transaction must still be declared valid by the official client, would this "tracking id" be robust enough to track a transaction no matter how it may have been covertly "adjusted" through transaction malleability?

If this scheme does not work, do you know of any way that would allow me to safely track the transaction?

LordOfThePigs

Posted 2014-02-12T12:03:48.607

Reputation: 275

Answers

4

Since none of the answers given were fleshed out enough to properly answer my question, I'm posting my own answer, which synthesizes everything that has been said:

The best way to track a transaction in a way that is resistant to transaction malleability is to track only the data that has been signed. To do so, I suggest using a double SHA-256 of the signed data. The documentation of the OP_CHECKSIG script operation (and its very useful diagram) hints at which procedure should be followed.

A malleability-invariant "Tracking Hash" can be obtained from a pay-to-pubkey-hash transaction as follows:

  1. Verify the technical validity of the transaction, as is done by the reference bitcoin client. This includes verifying the signatures of the transaction. This is a crucial step: if you skip it, the tracking hash is useless.
  2. Double SHA-256 hash all of the following things:
    1. the version number
    2. the number of inputs
    3. for each input :
      1. the transaction ID of the output to spend
      2. the index of the output to spend
      3. the sequence number
      4. (Note: We intentionally do not hash the input script or the length of the input script)
    4. the number of outputs
    5. for each output:
      1. the amount
      2. the length of the output script
      3. the whole output script
    6. the lockTime

Doing this should give you a tracking hash that is the same for all equivalent transactions, no matter how they may have been modified by taking advantage of transaction malleability.

Note that this tracking hash is not part of the bitcoin protocol, so you will not find it in the blockchain, and you won't be able to look it up using blockchain.info.

LordOfThePigs

Posted 2014-02-12T12:03:48.607

Reputation: 275

What's the minimal amount of things to hash for this to work?

Pacerier 2014-05-22T16:05:14.460

You must include every single byte that is signed, and must exclude every single byte that is not signed. If you don't respect that, you'll be vulnerable to some type of transaction malleability. You must follow this list precisely.

LordOfThePigs 2014-05-26T11:06:48.730

3

The signing hash is immutable, since that's precisely what you signed. Using the signing hash to track the transaction should ensure you are not vulnerable to any malleability attacks.

David Schwartz

Posted 2014-02-12T12:03:48.607

Reputation: 46 931

But aren't there many different signatures in a transaction (one per input)? Or is the signing hash something else?

LordOfThePigs 2014-02-12T23:18:48.880

The signing hash is the thing you sign to produce the signatures. The entire purpose of the signature algorithm is to ensure the signing hash cannot be changed, so if someone could change that without invalidating the signature, ECDSA itself would be broken and we'd have much bigger problems than malleability.

David Schwartz 2014-02-12T23:24:06.590

So there are as many signing hashes as there are inputs, correct?

LordOfThePigs 2014-02-12T23:26:00.500

No. There is one and only one signing hash for the transaction.

David Schwartz 2014-02-12T23:27:10.773

I'm sorry if I appear dense, but I can't find any reference to a TX wide signing hash here. Do you have a reference where I could find out more about it?

LordOfThePigs 2014-02-12T23:33:22.667

If I understand the OP_CHECKSIG flow properly, the signing hash you are referring to is actually different for each input of the transaction, since it is built by hashing a copy of the current transaction where all the txinput scripts of the inputs that are not being signed are zeroed out. So there is really one signing hash per input, but those signing hashes cannot be changed without invalidating the transaction.

LordOfThePigs 2014-02-12T23:52:45.373

Ahh, then you would have to build a signing hash where only the signatures are zeroed out. That would ensure you were protected from anyone changing the signatures in any way.

David Schwartz 2014-02-13T00:09:00.243

Well, with what you've just said, and the fact that I stated that the transaction has first been verified to be valid using the official bitcoin client, it seems that the solution I give in my question would be good enough.

LordOfThePigs 2014-02-13T00:15:45.727

I agree. Any hash of all the signed data and none of the signature itself should be safe.

David Schwartz 2014-02-13T00:38:55.437

Was there not a problem with the signatures themselves being malleable? Someone could change the sign of the signature and the signature would still be valid? I may have misunderstood.

Jannes 2014-02-14T09:52:54.053

Yes, that's why you hash the signed data and none of the signature. They can change the signature all they want, it won't change the signed data.

David Schwartz 2014-02-14T11:19:33.107

1

It's obvious that given the malleability issue you shouldn't use the transaction hash to track a transaction. If your wallet software needs to track a specific transaction it should check its inputs against the block chain, and track it based on that.

The idea is to always rely on the block chain and check inputs/outputs for yourself. Don't rely on the transaction hash for identification purposes.


Edit: One way to implement a solution is to track the referenced output instead of the transaction hash. If you get two unconfirmed transactions, that have different hashes but use the same referenced output, you know something fishy is going on. Therefore a good way to start tracking things would be to make sure the referenced outputs (within an input of a transaction) are not being used more than once.
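As a rough sketch of that idea, the snippet below remembers which unconfirmed transaction ids have tried to spend each referenced output and reports any output claimed by more than one id. The class name `OutpointTracker` and its `observe` method are invented for this example; how the inputs are obtained from the network or from bitcoind is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (txid of the transaction holding the output, index of that output)
Outpoint = Tuple[str, int]


class OutpointTracker:
    """Hypothetical helper: maps each referenced output to the txids spending it."""

    def __init__(self) -> None:
        self._spenders: Dict[Outpoint, Set[str]] = defaultdict(set)

    def observe(self, txid: str, outpoints: Iterable[Outpoint]) -> List[Outpoint]:
        """Record an unconfirmed transaction.

        Returns the referenced outputs that are now claimed by more than one txid.
        """
        conflicts = []
        for outpoint in outpoints:
            self._spenders[outpoint].add(txid)
            if len(self._spenders[outpoint]) > 1:
                conflicts.append(outpoint)
        return conflicts


tracker = OutpointTracker()
funding = ("cc" * 32, 0)                          # dummy referenced output
first = tracker.observe("aa" * 32, [funding])     # first spender seen
second = tracker.observe("bb" * 32, [funding])    # same output, different txid
print(first)        # []
print(len(second))  # 1
```

A reported conflict only says that two different hashes reference the same output. Whether it is a mutated copy of the same payment or a genuine double spend would still have to be decided by comparing the signed data of the two transactions.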

Luca Matteis

Posted 2014-02-12T12:03:48.607

Reputation: 4 784

Yeah, thanks. My question already makes it obvious that I know that I can't use the transaction hash. Would my tracking id solution work though? What parts of the inputs and outputs should I track against?

LordOfThePigs 2014-02-12T12:42:59.677

Inputs are the crucial part, since they need to reference something in the block chain. Therefore I'd suggest tracking inputs. An output, on the other hand, can change (you can send different amounts to various people).

Luca Matteis 2014-02-12T12:56:30.133

0

Once a transaction has 1 confirmation the Tx hash becomes immutable.

Tx hashes are widely used to track transactions (e.g., blockchain.info)

Therefore, I would rephrase your question to be:

How do I track a transaction from creation to 1 confirmation and capture the Tx hash that has been confirmed?

I think that's the right question.

Further, I'd specify that the solution should be implementable by the bitcoind API.

So... If I create Tx1 what API call should I use to determine if Tx1 has 1 confirmation?
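One possible starting point, as far as I know, is bitcoind's `gettransaction` wallet RPC, whose JSON result includes a `confirmations` field. The sketch below only inspects such a result once it has been decoded into a Python dict; the helper name `has_enough_confirmations` and the sample dict are made up for illustration, and no RPC call is made here.

```python
from typing import Any, Mapping


def has_enough_confirmations(tx_info: Mapping[str, Any], min_confirmations: int = 1) -> bool:
    """tx_info is assumed to be the decoded JSON result of `gettransaction <txid>`."""
    return int(tx_info.get("confirmations", 0)) >= min_confirmations


# Hand-written stand-in for an RPC result; a real one has many more fields.
sample = {"txid": "ab" * 32, "confirmations": 3}
print(has_enough_confirmations(sample))     # True
print(has_enough_confirmations(sample, 6))  # False
```

Keep in mind that if the transaction was mutated before being mined, the confirmed copy carries a different hash, so a lookup by the original hash may never reach the threshold.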

user789235

Posted 2014-02-12T12:03:48.607

Reputation: 126

2

You're assuming the block that holds the transaction is not orphaned. That can happen.

David Schwartz 2014-02-12T21:52:00.633

Correct, I'm aware of that problem. And with transaction malleability, it could be that when that orphaned block gets ejected from the main chain, then the equivalent transaction that is present in the main chain (if it is) may not have the same txid either.

LordOfThePigs 2014-02-12T23:20:28.890

You are correct - I forgot about orphaned blocks. So, you'd have to say it becomes statistically immutable after n confirmations; where n is a number that is sufficient for your purposes. Alternatively, you could say it is the transaction hash that matches your inputs and outputs contained in the longest chain.

user789235 2014-02-13T01:46:08.237

1

This may be a good rephrase of the question, but it's not an answer. Should have been a comment to the OP.

Jannes 2014-02-14T09:56:31.973