Indexing of transactions in the blockchain

0

Say we have transaction T(1) from many years ago, at block height 125001; then a new transaction T(2), consuming all the outputs of T(1), comes up around block height 568720 (that's today). There are a lot of blocks in between, many GB of data, and before processing T(2) the nodes need to efficiently check that the outputs of T(1) were never spent in all the years since block height 125001. This looks like a scalability problem. How do the nodes usually do that? Do they perform a linear search? Do they have a B-tree index of all active transactions? A hash index? Is there a more efficient way to find out? Is the blockchain structured in a way that makes this search inherently trivial? What about the nodes that do not store the whole blockchain history?

Thanks a lot.

Guillermo Prandi

Posted 2019-03-25T11:45:45.577

Reputation: 103

Answers

2

Each full node maintains a UTXO set, which contains all the unspent transaction outputs. Each full node builds this set during its initial sync by processing every block from the genesis block up to the current tip. Unlike the blockchain itself, the UTXO set is kept largely in RAM to allow fast queries (UTXOs are actually stored in $DATADIR/chainstate, but are aggressively cached in memory). When a new block is received and validated, the node removes the UTXOs that the block consumes and adds the outputs of the new transactions to the UTXO set. So to validate T(2), a node only needs to look up the outputs of T(1) in this set; it never has to scan the blocks in between.
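To make the lookup concrete, here is a minimal sketch of the idea in Python. It is not Bitcoin Core's code: the names `UtxoSet`, `Tx`, `TxOut` and `apply_tx` are made up for illustration, a plain in-memory dict stands in for the on-disk chainstate database plus its cache, and script/signature checks are left out entirely. The point is only that "was this output ever spent?" becomes a single keyed lookup on (txid, output index).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# An outpoint identifies one output: (transaction id, output index).
OutPoint = Tuple[str, int]


@dataclass(frozen=True)
class TxOut:
    value: int  # amount in satoshis (hypothetical simplified output)


@dataclass
class Tx:
    txid: str
    inputs: List[OutPoint]   # outpoints this transaction spends
    outputs: List[TxOut]     # new outputs this transaction creates


class UtxoSet:
    """Toy UTXO set: a dict keyed by outpoint (assumption: fits in memory)."""

    def __init__(self) -> None:
        self._unspent: Dict[OutPoint, TxOut] = {}

    def is_unspent(self, outpoint: OutPoint) -> bool:
        # Average O(1) lookup, independent of how old the output is.
        return outpoint in self._unspent

    def apply_tx(self, tx: Tx) -> None:
        # Reject a transaction that lists the same outpoint twice.
        if len(set(tx.inputs)) != len(tx.inputs):
            raise ValueError("duplicate input")
        # Every input must refer to an output that is still unspent.
        for outpoint in tx.inputs:
            if outpoint not in self._unspent:
                raise ValueError(f"missing or already spent: {outpoint}")
        # Remove the consumed outputs, then add the newly created ones.
        for outpoint in tx.inputs:
            del self._unspent[outpoint]
        for index, out in enumerate(tx.outputs):
            self._unspent[(tx.txid, index)] = out


# Usage: T(1) creates two outputs, T(2) later spends both of them.
utxos = UtxoSet()
t1 = Tx("t1", [], [TxOut(50), TxOut(25)])  # no inputs, like a coinbase
t2 = Tx("t2", [("t1", 0), ("t1", 1)], [TxOut(75)])
utxos.apply_tx(t1)
utxos.apply_tx(t2)
```

After `t2` is applied, the outputs of `t1` are gone from the set, so a second attempt to spend them fails the lookup immediately, no matter how many blocks separate the two transactions.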

Ugam Kamat

Posted 2019-03-25T11:45:45.577

Reputation: 5 180

Would that be https://charts.bitcoin.com/btc/chart/utxo-set-size ? 69 MB sounds like a very small size to worry about scalability. Then again, it sounds too small.

Guillermo Prandi 2019-03-25T12:34:07.723

It sounds like it's more like this: https://statoshi.info/dashboard/db/unspent-transaction-output-set (2.8 GB?). It's maintained in memory, but how will that scale with time? Will forks be needed or something like that?

Guillermo Prandi 2019-03-25T12:38:16.977

1

The latter (https://statoshi.info/dashboard/db/unspent-transaction-output-set) is more accurate. There is still ongoing work on how to optimize this.

Ugam Kamat 2019-03-25T15:59:30.487