What data structure should I use to model a Blockchain

1

2

I'm working on a functional mini-blockchain implementation in Scala. I've given this a lot of thought but still can't work out the most suitable data structure to model a blockchain. Specifically:

1.) It can't be a simple linked list or stack, because there are forks, which means that two different blocks can refer to the same block. Should it be a list of top blocks, i.e. a list of the topmost (height-wise) blocks? I could then traverse the blockchain starting from any of the topmost blocks.

2.) How do I make a block point to the previous block? I know that every block contains the hash of the previous block, but how is that enough for me to go from a block to the block it refers to in my blockchain data structure?

I would really appreciate any help I can get.

Aarsh Shah

Posted 2018-01-27T06:17:09.883

Reputation: 11

Answers

1

First, a blockchain is its own data structure, and it's best to think of it as one. It is a way of organizing data that yields desirable properties. However, that still does not explain how to hold one in memory or in storage.

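In Bitcoin, the blockchain is stored on disk (not in memory). Bitcoin Core keeps the raw blocks in flat files and uses LevelDB, a key/value database, for the block index: the keys are derived from the block hashes, and the values record each block's metadata and where the block sits on disk.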

For an in-memory equivalent, a hash map is what you are looking for: a map whose key is a hash of the contents (such as the hash of a block header) and whose value is the block itself. From there, the header's Merkle root would point to another hash map of transactions arranged in a Merkle tree (as per the protocol).
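As a rough illustration of the hash-map idea, here is a minimal Scala sketch. Everything in it is an assumption made for the example, not Bitcoin's actual format: the `Block` fields, the single SHA-256 over a string, the `GenesisPrev` marker, and the "highest tip wins" rule (Bitcoin selects the chain with the most cumulative work). It shows that following a block's previous hash is just a map lookup, and that a fork simply shows up as more than one tip.

```scala
import java.security.MessageDigest

object ChainSketch {
  type Hash = String

  // Assumed marker for "no parent"; used only by the genesis block.
  val GenesisPrev: Hash = "0" * 64

  // Hypothetical, simplified block. A real header has more fields
  // (timestamp, nonce, Merkle root, ...).
  final case class Block(prevHash: Hash, payload: String) {
    // SHA-256 over the block contents, hex-encoded; this is the map key.
    lazy val hash: Hash =
      MessageDigest.getInstance("SHA-256")
        .digest((prevHash + "|" + payload).getBytes("UTF-8"))
        .map("%02x".format(_)).mkString
  }

  // Immutable store: every known block by hash, plus each block's height.
  final case class Chain(blocks: Map[Hash, Block], heights: Map[Hash, Int]) {
    // Accept a block only if it is a genesis block or its parent is known.
    def add(b: Block): Option[Chain] = {
      val parentHeight =
        if (b.prevHash == GenesisPrev) Some(-1) else heights.get(b.prevHash)
      parentHeight.map { h =>
        Chain(blocks + (b.hash -> b), heights + (b.hash -> (h + 1)))
      }
    }

    // Going from a block to the block it refers to is a lookup by prevHash.
    def parent(b: Block): Option[Block] = blocks.get(b.prevHash)

    // Tips are blocks that no other block points to; a fork yields several.
    def tips: Set[Hash] = blocks.keySet -- blocks.values.map(_.prevHash)

    // Simplification: choose the highest tip as the block to build on.
    def bestTip: Option[Block] = {
      val ts = tips
      if (ts.isEmpty) None else Some(blocks(ts.maxBy(heights)))
    }

    // Walk from a block back to genesis by repeated lookups.
    def pathToGenesis(from: Block): List[Block] =
      from :: parent(from).map(pathToGenesis).getOrElse(Nil)
  }

  val empty: Chain = Chain(Map.empty, Map.empty)

  // Usage: genesis, two competing children (a fork), one branch extended.
  val g: Block  = Block(GenesisPrev, "genesis")
  val a1: Block = Block(g.hash, "a1")
  val b1: Block = Block(g.hash, "b1")
  val a2: Block = Block(a1.hash, "a2")
  val demo: Chain =
    List(g, a1, b1, a2).foldLeft(empty)((c, b) => c.add(b).getOrElse(c))
}
```

In this sketch `demo.tips` contains both `a2` and `b1`, so the fork is visible without any extra structure, and a miner would build on `demo.bestTip`. Recomputing `tips` on every call is fine for a toy; a real store would maintain the tip set incrementally.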

However, an in-memory solution (which seems to be what you are after) is a terrible solution for a blockchain. The data simply grows too quickly. You will want to store the data to disk and have a fast way to query it. This is why Bitcoin uses an existing database such as leveldb.

Jestin

Posted 2018-01-27T06:17:09.883

Reputation: 8 339

This is a great suggestion. I think I'll go with the LevelDB implementation. A couple of questions:

1.) How would I model a fork, i.e. the fact that two blocks can exist at the same height at a given point in time? This is especially important when choosing the parent block to mine against.

2.) If all blocks are stored on disk, isn't validating a new block a very time-consuming operation, since we need to validate each transaction in the block against all transactions ever made (by querying the DB) to prevent a double spend? – Aarsh Shah 2018-01-27T10:46:51.637

3.) What do you think of an implementation where I store the block headers (80 bytes each) in memory and store the transaction data, which occupies the bulk of the space, in LevelDB? This would make it faster to validate the blockchain by verifying the correctness of the prevHash field of each block. – Aarsh Shah 2018-01-27T10:47:04.397

0

Actually, it occurs to me: I wonder whether you could use Git?

You could just use a file naming convention in which each file's name is the hash of its header (the header including a Merkle root or some other hash of the contents), and have the header section of each file refer to the previous file. This allows for forks.
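A minimal Scala sketch of that naming convention follows, using only the JDK. The layout is an assumption made up for illustration (a one-line header of the form `<previous file name> <hash of body>`, followed by the body); it is not how Git itself stores objects, and `write` and `previousOf` are hypothetical helper names.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}
import java.security.MessageDigest

object FileStoreSketch {
  def sha256Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256").digest(bytes)
      .map("%02x".format(_)).mkString

  // Writes one "block" file and returns its name.
  // Assumed layout: header line "<prev> <hash of body>", then the body.
  // The file name is the hash of the header line.
  def write(dir: Path, prev: String, body: String): String = {
    val header = prev + " " + sha256Hex(body.getBytes(UTF_8))
    val name   = sha256Hex(header.getBytes(UTF_8))
    Files.write(dir.resolve(name), (header + "\n" + body).getBytes(UTF_8))
    name
  }

  // Reads a file's header and returns the name of the file it refers to.
  def previousOf(dir: Path, name: String): String = {
    val text = new String(Files.readAllBytes(dir.resolve(name)), UTF_8)
    text.takeWhile(_ != ' ')
  }
}
```

Writing two files with the same `prev` value gives two files that both refer back to the same parent, which is exactly a fork; walking the chain means calling `previousOf` repeatedly until the assumed genesis marker is reached.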

Willtech

Posted 2018-01-27T06:17:09.883

Reputation: 2 657

Thanks for the reply.

1.) Could you please elaborate a bit on how Git stores its files? Even some links would do. 2.) How would you leverage this to store the blockchain in memory? – Aarsh Shah 2018-01-27T09:06:22.620

I was thinking that you were looking for a simple file structure for storing a blockchain; I probably could have started the second paragraph with "or". Git would allow for some interesting operations. You could probably read "Git from the Bottom Up" and "Pro Git" (too big for the inline viewer; you have to download it). The Bitcoin blockchain is a little too big to store in memory, generally speaking, at ~150 GB.

Willtech 2018-01-27T12:15:20.920