Yes, we know, running a full node is painful. According to NodeReal’s BSC Annual Report 2023, running a full node requires a minimum of 1.6 TB of storage.
At the time of writing, that requirement has already exceeded 2 TB. However, only a minority of key-value pairs have been accessed recently. With this in mind, BSC developers have come up with a non-consensus state expiry scheme to minimize the state data that full nodes have to store.
Say goodbye to full node pain: Introducing the smallest full node ever.
| | Account Trie (GB) | Contract Trie (GB) | Account Snap (GB) | Contract Snap (GB) |
|---|---|---|---|---|
| PBSS + State Expiry (not pruned) | | | | |
| PBSS + State Expiry (pruned trie only) | | | | |
| PBSS + State Expiry (pruned trie and snap) | | | | |
Note: Chain data here refers to all state data, excluding the ancient store (older blocks, headers, transactions, receipts, etc.) and the transaction index.
Based on the results above, pruning away expired trie and snap key-value pairs reduces chain data size by 79.4%. Another significant advantage is that the additional state epoch metadata increases total chain data size by only 1.02%. With an epoch period of 1 month, a transaction accesses expired state approximately 6% of the time.
In this design, we introduced a new piece of state metadata called the state epoch. A state epoch is the unit of measurement used to determine whether a state has expired. An epoch period is a fixed number of blocks (e.g. one epoch every 100,000 blocks). Under our state expiry rule, once a state falls at least 2 epochs behind the latest epoch, it is considered expired and can be pruned away.
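The expiry rule can be sketched in a few lines of Go. This is a minimal illustration, not the BSC implementation: the names `EpochPeriod`, `Epoch`, and `IsExpired` are hypothetical, the period of 100,000 blocks is the example value from the text, and numbering epochs from 0 is an assumption.

```go
package main

// EpochPeriod is the number of blocks per state epoch. 100,000 is the
// example value from the text; the real network parameter may differ.
const EpochPeriod uint64 = 100_000

// Epoch returns the state epoch a block number falls into, assuming
// epochs are numbered from 0 (an assumption of this sketch).
func Epoch(blockNumber uint64) uint64 {
	return blockNumber / EpochPeriod
}

// IsExpired applies the rule described above: a state is expired once its
// epoch trails the latest epoch by at least two epochs.
func IsExpired(stateEpoch, latestEpoch uint64) bool {
	return latestEpoch >= stateEpoch+2
}
```

Under these assumptions, a state last touched in block 50,000 (epoch 0) is still live while the chain is in epoch 1 and becomes expired from block 200,000 (epoch 2) onward.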
The state epoch metadata is stored in the branch nodes of the contract trie, as shown in the following figure:
The epoch map has one entry per child node (i.e. 16), and each entry records the epoch of the corresponding child node.
Every state access operation (i.e. SLOAD and SSTORE) requires a traversal from the root of the contract trie to the value in the leaf node. During the traversal, the branch nodes along the path are visited and their state epochs are checked against the state expiry rule. If the state is expired, an error is returned and the parent process performs a state revive operation.
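To make the epoch map and the check during traversal concrete, here is a simplified sketch. It assumes a bare in-memory trie with only branch and leaf nodes; the `Node` type, its fields, and the `Get` function are hypothetical names for illustration and do not mirror the actual BSC trie code, which also deals with extension nodes, hashing, and database access.

```go
package main

import "errors"

var (
	// ErrExpired signals that the lookup crossed an expired child slot.
	ErrExpired = errors.New("state expired")
	// ErrNotFound signals that the path does not exist in the trie.
	ErrNotFound = errors.New("state not found")
)

// Node is a simplified trie node. A branch has up to 16 children (one per
// nibble) plus a parallel epoch map; a leaf only carries a value.
type Node struct {
	Children [16]*Node
	Epochs   [16]uint64 // Epochs[i] is the epoch recorded for Children[i]
	Value    []byte     // set on leaf nodes only
}

// Get walks the nibble path from the root. At every branch it checks the
// epoch of the child it is about to follow against the expiry rule
// (at least two epochs behind the latest epoch means expired).
func Get(root *Node, path []byte, latestEpoch uint64) ([]byte, error) {
	n := root
	for _, nib := range path {
		if n == nil || nib > 15 || n.Children[nib] == nil {
			return nil, ErrNotFound
		}
		if latestEpoch >= n.Epochs[nib]+2 {
			return nil, ErrExpired // caller is expected to revive the state
		}
		n = n.Children[nib]
	}
	if n == nil || n.Value == nil {
		return nil, ErrNotFound
	}
	return n.Value, nil
}
```

A caller would treat `ErrExpired` differently from `ErrNotFound`: the former means the value may still exist and has to be revived, the latter means the key is genuinely absent.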
During the offline expired-state pruning process, each contract trie is scanned and its expired subtries are identified. Expired subtries are then pruned: their trie nodes are deleted from the database, which shrinks the trie. The following figure shows an example:
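The pruning pass described above can be sketched as a recursive scan. This is an illustrative in-memory version under stated assumptions: `Node`, `Prune`, and `countNodes` are hypothetical names, and the sketch simply drops child pointers, whereas a real Merkle trie would presumably keep the child's hash in the parent so that a later proof can still be checked against it.

```go
package main

// Node is a simplified trie node with a per-child epoch map.
type Node struct {
	Children [16]*Node
	Epochs   [16]uint64 // Epochs[i] is the epoch recorded for Children[i]
	Value    []byte
}

// countNodes returns the number of nodes in the subtrie rooted at n.
func countNodes(n *Node) int {
	if n == nil {
		return 0
	}
	total := 1
	for _, c := range n.Children {
		total += countNodes(c)
	}
	return total
}

// Prune removes every subtrie whose recorded epoch is at least two epochs
// behind latestEpoch and returns how many nodes were dropped. Subtries
// that are still live are scanned recursively.
func Prune(n *Node, latestEpoch uint64) int {
	if n == nil {
		return 0
	}
	removed := 0
	for i, c := range n.Children {
		if c == nil {
			continue
		}
		if latestEpoch >= n.Epochs[i]+2 {
			removed += countNodes(c)
			n.Children[i] = nil // the whole expired subtrie goes away
			continue
		}
		removed += Prune(c, latestEpoch)
	}
	return removed
}
```

Because an expired child is removed together with everything beneath it, the scan never needs to descend into expired subtries, which keeps the pass proportional to the live part of the trie plus the nodes being deleted.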
If an expired state has been pruned away and a state operation needs to access it, a state revive must be performed. State revive is done by requesting an MPT proof of the expired state from an entity called remoteDB. A remoteDB is simply an archive node or a regular full node that holds the entire state. The following figure illustrates the interaction between the different types of nodes:
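The revive flow (access, expired error, fetch proof, verify, re-insert) can be sketched as follows. Everything here is a simplified assumption: `RemoteDB`, `Proof`, `LocalState`, `MapRemote`, and `GetWithRevive` are hypothetical names, the `Proof` struct is a placeholder for a real MPT proof (which carries the trie nodes on the path from the state root to the value), and proof verification is passed in as a function rather than implemented.

```go
package main

import "errors"

var (
	ErrExpired  = errors.New("state expired")
	ErrBadProof = errors.New("proof rejected")
)

// Proof is a placeholder for an MPT proof of a single key.
type Proof struct {
	Key   string
	Value string
}

// RemoteDB is a hypothetical interface over a node that still holds the
// entire state (an archive node or an unpruned full node).
type RemoteDB interface {
	GetProof(key string) (Proof, error)
}

// MapRemote is a toy RemoteDB backed by a map, for illustration only.
type MapRemote map[string]string

func (m MapRemote) GetProof(key string) (Proof, error) {
	v, ok := m[key]
	if !ok {
		return Proof{}, errors.New("unknown key")
	}
	return Proof{Key: key, Value: v}, nil
}

// LocalState models a pruned full node: Live holds retained values and
// Pruned marks keys whose subtrie was removed by the pruner.
type LocalState struct {
	Live   map[string]string
	Pruned map[string]bool
}

// Get returns ErrExpired for keys that were pruned away.
func (s *LocalState) Get(key string) (string, error) {
	if s.Pruned[key] {
		return "", ErrExpired
	}
	return s.Live[key], nil
}

// GetWithRevive reads a key and, if it is expired, fetches a proof from
// the remoteDB, checks it with verify, and re-inserts the value locally.
func GetWithRevive(s *LocalState, remote RemoteDB, verify func(Proof) bool, key string) (string, error) {
	v, err := s.Get(key)
	if !errors.Is(err, ErrExpired) {
		return v, err
	}
	p, err := remote.GetProof(key)
	if err != nil {
		return "", err
	}
	if p.Key != key || !verify(p) {
		return "", ErrBadProof
	}
	s.Live[key] = p.Value
	delete(s.Pruned, key)
	return p.Value, nil
}
```

The point of verifying before re-inserting is that the pruned node does not have to trust the remoteDB: in the real scheme the proof is checked against state the node already holds, so a faulty or malicious remoteDB cannot inject arbitrary values.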
Refer to this guide to experience this feature.
We have the smallest full node, but the next milestone is to build the smallest and most performant full node. In the future, the development team aims to improve full node performance by optimizing remoteDB queries, pruning, revive, and more!
We understand that it has not been easy to run a BSC full node, and that’s why we built this feature for you. Try it out and let us know your thoughts! If you face any issues, raise a GitHub issue on the BSC repository and the team will be happy to assist you.
State growth is a problem, and we found a way to mitigate it. By labeling state with additional state epoch metadata, expired state can be pruned away and recovered later using a proof.