The simplest explanation of a blockchain is as a chain of cryptographically linked blocks of data. Brett Benefield, Scott Herford, and Kyle Thomas write:
Each one of these blocks can have multiple pieces of information in it such as transaction information or asset data as an example. When a new block is added, it is added to the end of the chain of blocks and then a new hash value is computed. This relatively simple feature ensures that once the blockchain is large enough it is computationally secure preventing a malicious party from modifying any previous blocks. (Benefield, Herford and Thomas 2018, 1)
The blockchain is append-only by design. As an implementation feature, multiple copies of the blockchain are maintained and validated through a consensus mechanism. This increases the security of the blockchain, because a malicious actor would have to gain control of enough nodes, and rewrite each copy quickly enough, to achieve consensus that the altered chain is the correct one. To use Benefield, Herford, and Thomas’s term, this difficulty makes the distributed blockchain computationally secure.
The technology used to link the blocks of data is the cryptographic hash, which is a mathematical digest of a set of data. This cryptographic hash is the fingerprint of the block. It is computationally infeasible to recover the contents of the block from its hash, and it is practically impossible to change the block without altering the hash.
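As a minimal sketch of the fingerprint idea, assuming SHA-256 as the hash function (the `fingerprint` helper and the sample data are illustrative, not taken from any particular blockchain):

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical block contents that differ by a single character.
original = b"Alice pays Bob 5"
tampered = b"Alice pays Bob 6"

# The two digests are unrelated, and neither reveals the input.
print(fingerprint(original))
print(fingerprint(tampered))
```

Changing even one character of the input yields a completely different digest, which is what makes the hash usable as a fingerprint.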
A blockchain is a chain of cryptographically linked data blocks. The first block in a blockchain, known as the Genesis Block, comprises digital data and the cryptographic hash of that data. Each subsequent block also contains the hash of the header of the previous block.
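The linking described above could be sketched roughly as follows. The block layout here (a header holding `previous_hash` and `data_hash`, serialized as JSON) and the helper names are assumptions made for illustration, not the format of any real blockchain:

```python
import hashlib
import json
from typing import List, Optional


def hash_header(header: dict) -> str:
    """Hash a header by serializing it deterministically and applying SHA-256."""
    encoded = json.dumps(header, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def hash_data(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def make_block(data: str, previous_header: Optional[dict]) -> dict:
    """Build a block; the genesis block has no previous header to hash."""
    header = {
        "previous_hash": hash_header(previous_header) if previous_header else None,
        "data_hash": hash_data(data),
    }
    return {"header": header, "data": data}


def is_chain_valid(chain: List[dict]) -> bool:
    """Check every data hash and every link to the previous header."""
    for index, block in enumerate(chain):
        header = block["header"]
        if header["data_hash"] != hash_data(block["data"]):
            return False
        if index > 0 and header["previous_hash"] != hash_header(chain[index - 1]["header"]):
            return False
    return True


chain = [make_block("genesis data", None)]
chain.append(make_block("second block", chain[-1]["header"]))
chain.append(make_block("third block", chain[-1]["header"]))
print(is_chain_valid(chain))
```

If someone edits the data in an earlier block and also recomputes that block's `data_hash`, the header changes, so the `previous_hash` stored in the following block no longer matches and validation fails.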
So what is a block, really? The block is a data structure called a container. The container consists of a header containing metadata (data about data) and the data itself. In the diagram above, the header contains only the hash of the previous block. To make the blockchain useful, we need to define the data that can be contained within the block and the type of metadata used within the header.
There is no data structure inherent to the blockchain. A blockchain can be stored in a flat file or in a database. Because the blocks can be stored in any order, we need a way to identify the order of the blocks. For this, we add a timestamp. Next, according to Emmanuelle Anceaume et al., we also need a way to determine block validity. The method used to determine validity in the Bitcoin network is proof of work.
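To illustrate the point about storage order, here is a small sketch in which blocks are kept in arbitrary order and put back in sequence using the timestamp in the header. The `Header` and `Block` classes, the fixed integer timestamps, and the all-zero `previous_hash` for the first block are assumptions chosen for the example:

```python
import hashlib
import json
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Header:
    previous_hash: str
    timestamp: int

    def digest(self) -> str:
        payload = json.dumps(
            {"previous_hash": self.previous_hash, "timestamp": self.timestamp},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Block:
    header: Header
    data: str


def order_blocks(blocks: List[Block]) -> List[Block]:
    """Restore chain order from unordered storage using the header timestamp."""
    return sorted(blocks, key=lambda block: block.header.timestamp)


def links_are_consistent(ordered: List[Block]) -> bool:
    """Confirm that each block points at the header hash of the one before it."""
    return all(
        later.header.previous_hash == earlier.header.digest()
        for earlier, later in zip(ordered, ordered[1:])
    )


genesis = Block(Header("0" * 64, 1000), "genesis")
second = Block(Header(genesis.header.digest(), 1600), "second")
third = Block(Header(second.header.digest(), 2200), "third")

stored = [third, genesis, second]
random.shuffle(stored)  # storage order carries no meaning
ordered = order_blocks(stored)
print([block.data for block in ordered])
```

The hash links can then be used to cross-check the order that the timestamps suggest, as `links_are_consistent` does here.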
Proof of work adds a nonce to the block. “A nonce is a random number that acts as a placeholder variable in mathematical functions.” (Stewart, Chapple and Gibson 2015) The proof of work involves replacing the nonce with different values and computing the cryptographic hash until you find one that satisfies a particular criterion. In the case of Bitcoin, the task is to find a hash that begins with a particular number of zeroes, a target that is adjusted periodically as the total mining power of the network changes.
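A toy version of that search might look like the sketch below. It assumes single SHA-256 over the block data concatenated with the nonce, counts leading zeroes in the hex digest, and increments the nonce rather than drawing random values; Bitcoin's actual header format and target comparison are more involved. The `mine` function and its parameters are hypothetical names:

```python
import hashlib
from typing import Tuple


def mine(block_data: str, difficulty: int) -> Tuple[int, str]:
    """Try nonces until the hex digest starts with `difficulty` zeroes."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("example block", difficulty=3)
print(nonce, digest)
```

Each additional required hex zero multiplies the expected number of attempts by 16, while checking a claimed nonce takes a single hash. That asymmetry is what makes the work easy to verify and expensive to produce.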
Finally, the transactions to be added to the block are stored in a data structure called a Merkle Tree. (See below for more information.) This is a hash-based data structure that provides efficient storage and secure verification of the data. The result of all this is a blockchain.
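As a rough sketch of how a Merkle root is computed: hash each transaction, then hash the digests in pairs until one remains. This example assumes single SHA-256 and duplicates the last digest when a level has an odd count; Bitcoin follows the same duplication convention but applies SHA-256 twice. The function names are illustrative:

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Reduce the transaction hashes pairwise until a single root remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last digest
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


transactions = [b"tx-a", b"tx-b", b"tx-c"]
print(merkle_root(transactions).hex())
```

Only the root needs to go in the block header. Changing any transaction changes the root, so the header hash commits to every transaction in the block.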
Proof of Work is a consensus algorithm used to determine who gets to create the next block, and requires a majority of the nodes to agree. This is very democratic, but it relies on the size of the network to prevent fraud. There are other consensus algorithms used in different blockchain implementations, such as “Proof of Stake”, “Proof of Activity”, “Proof of Burn”, “Proof of Capacity”, and “Proof of Authority”. Each implementation achieves Byzantine fault tolerance in its own way. Thus, once again, the infrastructure determines the system's resistance to Byzantine faults.
There are some other things you need to know.
- The blockchain is a journal, not a ledger. A journal is a list of transactions; a ledger is an analyzed summary of the journal.
- The blockchain is not inherently secure. It is basically a text file, and visible to everyone with a copy. Securing the blockchain is an implementation feature.
- The blockchain is not immutable. There have been several cases where public blockchains have had to be rolled back. Unlike the transaction log in a database, rolling back the blockchain revises the historical record, changing the audit trail.
- Blockchain is neither a specification nor a protocol. It is a set of concepts that can be implemented in a variety of fashions. Your use case sounds like a private, permissioned blockchain, in which case you should investigate Hyperledger.
- Blockchain is not a database, at least not a terribly good one. Public blockchain implementations satisfy neither the ACID nor the BASE database properties. In particular, because the blockchain can be rolled back, in essence changing history, blockchains are not Durable. Blockchains place a premium on consistency over availability, which means they are not performant.
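The journal versus ledger distinction in the first point above can be sketched in a few lines. The transaction tuples and the `summarize` function are made up for the example:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# The journal: an ordered list of (sender, receiver, amount) transactions.
journal: List[Tuple[str, str, int]] = [
    ("alice", "bob", 5),
    ("bob", "carol", 2),
    ("alice", "carol", 1),
]


def summarize(entries: List[Tuple[str, str, int]]) -> Dict[str, int]:
    """Derive a ledger (net balance per account) from the journal."""
    balances: Dict[str, int] = defaultdict(int)
    for sender, receiver, amount in entries:
        balances[sender] -= amount
        balances[receiver] += amount
    return dict(balances)


ledger = summarize(journal)
print(ledger)
```

The blockchain stores only the journal; any ledger view, such as account balances, has to be computed from it.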
Good luck.
"Blockchain technology" is as meaningless as "artificial intelligence" or "web 2.0". It refers to a collection of ideas, most of which are pure marketing hype. Some of those ideas are useful, but rarely standalone. If you think you need a "from scratch blockchain" without understanding the fundamental problems it solves (hint: not much!), I'm afraid you'll just waste your time. – Pieter Wuille – 2018-08-29T23:03:37.147
I have found this https://readthedocs.org/projects/blockchain-guide/downloads/pdf/latest/ – Web Sailor – 2018-08-29T23:14:20.317
There is no official documentation. "blockchain technology" has no central authority nor anything official as it is just an abstract concept. – Andrew Chow – 2018-08-30T03:23:20.153