BEP(Idea): State Expiry On BNB Chain

BEP(Idea v0.1): State Expiry On BNB Chain

1.Motivation

Storage has always been a big challenge for blockchain, since the data may not be deleted and it keeps growing.

Ethereum proposed EIP-4444, trying to prune history block, header, receipt to make the node lighter. But it didn’t touch the storage, which could be even larger than blocks.

Before addressing the storage challenge, there are some universal rules that we think are reasonable:

  • Very old data is most likely seldomly used.
  • The on-chain storage is very expensive, it would be better for users to delete unused data timely. But if they fail to do it, old data can be dropped from the chain.
  • Simply deleting is not acceptable, since the user should have the ownership of its own data. It is necessary to provide a mechanism to restore it in case the user wants it back and they may need to pay for it.
  • Users should be able to know the state of its account.

Keeping these rules in mind, helps us find potential solutions to address the storage challenges.

2.Where We Are

There are several different approaches to address the state growing problem across the blockchain industry during the past few years.

2.1.ReGenesis: Stateless

It was posted at June 2020, refer: ReGenesis Explained

Generally it is about only keeping the root hash and rebuilding the world state from scratch at a deterministic schedule, like 1,000,000 blocks. The block producer(miner or validator) only needs to keep the latest rebuilt state, which is much smaller and the unused state no longer needs to be revived.

It is a bit like the concept of Stateless, users can revive the useful state by sending transactions with witnesses.

The problem is that it could have a very heavy state revive at the beginning of ReGenesis, since it empties the whole state on ReGenesis.

2.2.Economy: Pay For Storage

Some of the chains try to solve this problem by economy, they may use different storage rent policies, but generally, users need to pay for the storage, as long as users stop paying for it, their data would be deleted.

Some of the projects you may refer:

Actually it is a good idea, but it needs to be done at the very beginning. For chains that are already running, it is not easy to change its economy model.

2.3.Epoch Based State Expiry By Vitalik

Vitalik introduced the idea of epoch based state expiry and posted several articles on it:

#1: 2021.01.30: Alternative bounded-state-friendly address scheme

#2: 2021.02.12: Theory of Ethereum State Size Management (recommend)

#3: 2021.02.23: Resurrection-conflict-minimized state bounding, take 2

#4: 2022.01.04: A few paths to statelessness and state expiry (recommend)

The main points include:

  • Refresh by touching
  • 4 bytes epoch prefix for account address
  • Epoch State Trees
  • A new opcode: Create3
  • Verkle Tree

I think valitak has explained his idea clearly in his blog, I will not repeat it here, you may directly refer article #2 and #4 to understand his proposal.

It is a great design, but it is quite complicated and there is tremendous work that needs to be done. And the Ethereum community is not quite active on this topic, maybe it is not that urgent for Ethereum and has a relatively low priority compared to other EIPs like staking withdrawal or proto-danksharding. There is no EIP for it yet, it could take a very long time to make it in ethereum.

3.Proposal For BNB Smart Chain

Although BSC(BNB Smart Chain) was launched at around Aug 2020, it now has much bigger storage size compared to Ethereum and gives state expiry a higher priority.

I would propose an idea trying to eliminate the storage pressure for BSC at some level and make a tradeoff between perfectness and complexity.

It is still an idea yet, needs to be improved.

3.1.General Workflow

a.ExpireDepth & TouchedBlock

There will be a configuration: ExpireDepth, which demonstrates the maximum block depth that can be accessed, it could be: 10,000,000.

There will be another element TouchedBlock added into the Account structure to record the latest block that an account has been successfully touched.

type Account struct {
Nonce uint64
Balance *big.Int
Root []byte
CodeHash []byte
TouchedBlock uint64
}

When a transaction tries to access a contract storage, it has to check the TouchedBlock of the target account. If the account is expired, i.e. it has not been accessed for a long time, the access will be denied and the transaction will be reverted. Otherwise, the access will be ok.

b.State Expire Operation

There is nothing that needs to be done, since state expiration occurs automatically as long as the TouchedBlock has not been updated for a long time.

When the contract account expires, its storage state would no longer be accessible.

Quite similar to downgrade a contract account to an EOA account, but the “EOA” account has: root, nonce, balance, codehash, LastAccessedBlock. And the “EOA” account can be revived later to upgrade back to a contract account.

c.State Revive Operation

  • A new transaction type will be added, it can provide witnesses to revive the state,the witness data could have a size limit.
  • The state object can be revived through several transactions, if the witness is too large, which means partial revive will be supported
  • Anyone can restore the data as long as it is willing to pay, once all the state of the object is ready, the validator could update its LastAccessedBlock to activate it.

d.Witness Tree & Node

Introduce these 2 concepts to support partial revival.

  • Witness Tree: it is the partial revived tree.
  • Witness Node: it is a special node type in Witness Tree, it provides proof but its branch nodes are not revived. Accessing the branch nodes of Witness Node will cause the transaction to revert as well.

e.EOA Account

Currently the EOA accounts will never expire, since it does not take too much storage compared to contract accounts. Only contract storage will expire.

But in the future, if EOA accounts grow to an unacceptable level, maybe they can be aggregated into an Witness Node, but it could face the same problem as Vitalik described: the shrink of the usable address space. It is that urgent for BNB Chain yet, could leave it right now.

3.2.Some Thoughts

a.Opcode: Create2

Create2 could generate contracts at the same address.

It is not allowed to create a new contract on an expired address, and will revert the transaction. The expired contract has to be fully revived before Create2 opcode can be executed.

b.Account Level or Key Level

Key level could add more complexity, if we add the TouchedBlock for every Key, it also gives pressure to storage.

So Account level is preferred.

c.Rollups & Abstract Account & Verkle Tree

Need further investigation to see if it will impact the implementation of Rollups, Abstract Account and Verkle Tree.

d.The Block Size

If the block is full of state recovery transactions, which contains lots of witness data, it could make the block very big. It should be avoided.

e. MEV & Incentive

Validator may prefer or not prefer these witness transactions?

How to encourage validators to include such kinds of transactions, low incentive would make validators unwilling to include it.

And some accounts could do extra work to prevent its data from being dropped, they may send heartbeat transactions to keep it alive. This is not what we intended, and we need to fix it.

f. Security

Need further investigation.

g.Compatibility

More transactions could be reverted, if they access expired state.

1 Like

The v0.1 design would add metadata to the node, so it is not preferred.
Update the v0.2 design, which has very big difference to the v0.1.

BEP(v0.2): Hybrid Mode State Expiry

1.Summary

This BEP proposes a practical solution to address the problem of increasing world state storage on the BNB Smart Chain, by removing expired storage state.

2.Motivation

Storage presents a significant challenge for many blockchains, as new blocks are continually generated, and transactions within these blocks could invoke smart contracts that also add more state to the blockchain.

A large storage size can cause several side effects on the chain, such as higher hardware requirements, increased network resources required for downloading and performing p2p sync, and performance degradation due to MPT write amplification.

Due to the high volume of traffic, the storage size on BSC grows very rapidly. As of the end of 2022, a pruned BSC full node snapshot file is approximately 1.6TB in size, compared to approximately 1TB just one year ago.

The 1.6TB storage consists mainly of two parts:

  • Block Data (~2/3), which includes the block header, block body, and receipt;
  • World State (~1/3), which includes the account state and Key/Value (KV) storage state.

The Ethereum community proposed EIP-4444 to address the first part, which is to prune old block data. However, EIP-4444 does not address the second part, which is more challenging.

There have been many discussions on how to implement state expiry, with one proposed solution involving the removal of EOA accounts, extension of the address space, and the use of Verkle Trees to reduce the witness size. However, implementing such a solution would be a significant undertaking that could have a substantial impact on the ecosystem, and may not be feasible in the short term. As BSC continues to face high traffic volumes, it is essential to develop a short-term solution for state expiry that can be implemented quickly, while still retaining the ability to upgrade to a long-term solution once it becomes available.

3.Specification

3.1.Hierarchy View

a.MPT Tree Mode: Single

The current MPT tree on BSC is a single tree composed of two trie trees, namely the L1 AccountTrie and L2 StorageTrie. The L1 account trie tree stores the account state, which includes the Nonce, Balance, StorageRoot, and Codehash for both EOA and Contract accounts. And each contract account also has its own L2 storage trie tree, which stores its KeyValue (KV) state that is updated by the corresponding smart contract. The world state is accessed through this single-layered MPT tree.

b.MPT Tree Mode: Hybrid

This proposal introduces the concept of an “epoch,” which refers to a period of time with a unit of blocks. The details of the epoch will be explained further later on.

With the epoch concept in mind, this proposal suggests the introduction of a hybrid MPT tree mode, where the L1 account trie tree will remain unchanged, while the L2 storage trie tree will be regenerated at the start of each epoch. In other words, the hybrid MPT tree consists of a single account trie and an epoch-based storage trie, as illustrated in the following diagram.

c.Expired Storage Trie

The BSC full node will only need to keep the L2 storage trie for the current epoch and the previous epoch. The L2 storage trie for older epochs will not be accessible and can be pruned.

3.2.The Components

a.Data Availability

The expired state will be kept by the DA for witness service, allowing users to revive their expired state. The details of data availability will not be covered in this proposal but can be provided by third-party projects such as the BNB Green Field project.

b.New Transaction Type

A new transaction type will be added, which can contain the witness. The EVM will be upgraded to support this new transaction type, along with an appropriate gas metering policy.

c.Epoch

The epoch period is a key parameter for this proposal, and by default, it could be set to 2/3 years for the BSC mainnet. Given a BlockInterval of 3 seconds, the EpochPeriod would be calculated as follows: EpochPeriod = 365 * (2/3) * 24 * 60 * 60 / 3 = 7,008,000.

d.Epoch Storage Trie

The access to the storage trie in Epoch 0 will remain unchanged, and can be directly referenced by its storage root. However, starting from Epoch 1, it will be referenced by EpochIndex_StorageRoot, meaning that at the beginning of a new epoch, the storage trie will be empty. The state access in the new epoch will be updated to the new storage trie.

e.Stub Node

The MPT trie node will be extended to support stub nodes, which could have a bitmap to indicate whether the target position is expired or not.

f.Witness Protocol

Witness protocol will define the format of the witness and it will be defined in another BEP.

g.State Revive

If the missed state is from the previous epoch, the state revive process will be automatic and will place the state in the current epoch. However, if the missed state is from an even earlier epoch, the user will need to provide a witness to revive it.

3.3.General Workflow

a.How To Expire

In contrast to the single tree mode, the hybrid MPT tree mode does not require metadata. The BSC node only needs to keep the storage trie for the latest two epochs, and the KV slots in older epochs will automatically expire. As previously mentioned, there is only one global L1 account trie, so EOA accounts will not expire; only the KV slots of the L2 storage trie will be subject to expiration.

b.How To Access KV In New Epoch

Here are the three cases for accessing the state in the proposed hybrid MPT tree mode:

  1. If the state has already been accessed in the current epoch, it can be returned or updated in the current epoch tree.
  2. If the state was last accessed in the previous epoch, it will be automatically moved to the new epoch before it can be accessed. Once it is moved, it can be accessed in the same way as in case 1.
  3. If the state is cold data and has not been accessed in the current or previous epoch tree, it is expired. Trying to access expired state will cause the transaction to revert. To access the expired state, the user needs to revive it first.

c.How To Revive State

A new transaction type will be added, which contains witnesses to revive the state.

The full storage trie can be revived through several transactions, and if the witness is too large, partial revival will be supported.

Anyone can restore the data as long as they are willing to pay.

3.3.Other Approaches To State Expiry

There are several different approaches to handle state expiry, which can be classified based on how the 2-layered trie trees are managed.

a.Single Tree

Metadata will be added to record when a node is visited and when it will expire. With the single tree mode, there is only one global world state, so there will be less state revive conflict and it will be easy to revive the state. However, the additional metadata would increase storage costs and greatly impact performance if every state access needs to be recorded on the chain.

b.Two Epoch Trees

This approach involves regenerating both the L1 account trie and L2 storage trie in each epoch, providing a comprehensive solution for state expiry. However, there are still two unsolved issues to address: account resurrection conflict and witness size. Additionally, this approach has a significant impact on the ecosystem and relies on other infrastructure, such as address extension and Verkle trees.

c.Summary

Single Tree Two Epoch Tree Hybrid Tree
L1 Account Trie Single Two Single
L2 Storage Trie

(Each Contract)|Single|Two|Two|
|Cons|1.addition metadata cost|1.great impact to ecosystem

2.many challenges to support state revive|1.incomprehensive, since L1 account trie will be kept.|
|Pros|1.no resurrection conflict|1.no metadata

2.friendly to Verkle Tree

3.comprehensive|1.no metadata

2.no resurrection conflict|

In summary, the Hybrid Tree mode proposed in this document offers a more practical solution for state expiry. In the long term, it could transition to the ideal Two Epoch Tree mode once it becomes available.

4.Rational

4.1.Why Keep The L1 Account Trie

There are several reasons to keep it:

  • The size of the L1 account trie is relatively small, constituting only around 4% of the L2 storage trie on BSC as of the end of 2022.
  • The L1 account trie contains crucial information about user accounts, such as their balance and nonce. If users were required to revive their accounts before accessing their assets, it would significantly impact their experience.
  • By retaining the L1 account trie, the witness verification process can be much simpler.

4.2.Reasonable Epoch Period

The state will expire if it has not been accessed for at least 1 epoch or at most 2 epochs. On average, the expiry period is 1.5 epochs. If we set the epoch period to represent 2/3 of a year, then the average state expiry period would be one year, which seems like a reasonable value.

5.Forward Compatibility

5.1.Account Abstraction

Account abstraction implementation will be impacted, as these accounts could be stored in the L2 storage trie and could be expired.

5.2.L2 Rollup: Optimism & ZK

Rollups could be impacted if the rollup transactions try to access expired storage.

6.Backward Compatibility

6.1.Transaction Execution

The current transaction types will be supported, but if the transaction tries to access expired storage, then it could be reverted.

6.2.User Experience

There are several changes that could affect user experience. The behavior of many DApps may change, and users will have to pay to revive their expired storage. If the revival size is very large, the cost could be expensive.

6.3.RPC API

Some of the APIs could be impacted, such as: getProof, eth_getStorageAt…

7. License

The content is licensed under CC0.

1 Like

v0.2 is not quite friendly to prune and witness provider, so we are working on v0.3, which has a clearer architecture, easy for prune and witness provider.

BEP(v0.3): Hybrid Mode State Expiry

1.Summary

This BEP proposes a practical solution to address the problem of increasing world state storage on the BNB Smart Chain, by removing expired storage state.

2.Motivation

Storage presents a significant challenge for many blockchains, as new blocks are continually generated, and transactions within these blocks could invoke smart contracts that also add more states to the blockchain.

A large storage size can cause several side effects on the chain, such as higher hardware requirements, increased network resources required for downloading and performing p2p sync, and performance degradation due to MPT write amplification.

Due to the high volume of traffic, the storage size on BSC grows very rapidly. As of the end of 2022, a pruned BSC full node snapshot file is approximately 1.6TB in size, compared to approximately 1TB just one year ago.

The 1.6TB storage consists mainly of two parts:

  • Block Data (~2/3), which includes the block header, block body, and receipt;
  • World State (~1/3), which includes the account state and Key/Value (KV) storage state.

The Ethereum community proposed EIP-4444 to address the first part, which is to prune old block data. However, EIP-4444 does not address the second part, which is more challenging.

There have been many discussions on how to implement state expiry, with one proposed solution involving the removal of EOA accounts, extension of the address space, and the use of Verkle Trees to reduce the witness size. However, implementing such a solution would be a significant undertaking that could have a substantial impact on the ecosystem, and may not be feasible in the short term. As BSC continues to face high traffic volumes, it is essential to develop a short-term solution for state expiry that can be implemented quickly, while still retaining the ability to upgrade to a long-term solution once it becomes available.

3.Specification

3.1.Hierarchy View

a.Current MPT Tree

The current MPT(Merkle Patricia Tries) tree on BSC is a single tree composed of two trie trees, namely the L1 AccountTrie and L2 StorageTrie. The L1 account trie tree stores the account state, which includes the Nonce, Balance, StorageRoot, and Codehash for both EOA and Contract accounts. Each contract account also has its own L2 storage trie tree, which stores its KeyValue (KV) state that is updated by the corresponding smart contract. The world state is accessed through this single-layered MPT tree.

b.Hybrid MPT Tree

This proposal introduces the concept of an “epoch,” which refers to a period of time with a unit of blocks. The details of the epoch will be explained further later on.

With the epoch concept in mind, this proposal suggests the introduction of a hybrid MPT tree mode, where the L1 account trie tree will remain unchanged, while the L2 storage trie will have a new Shadow Tree which will keep the access record for the MPT tree. In other words, the hybrid MPT tree consists of a single account trie and an epoch-based storage trie, as illustrated in the following diagram.

The storage trie in Epoch 0 will stay unchanged, but since Epoch 1 the storage trie will have a new trie node type as its root: RootNode. It consists of 3 elements: EpochIndex, MPT Root and Shadow Root.

  • EpochIndex: the latest epoch index in which the trie is accessed
  • MPT Root: same as the current MPT Root, it will pointer to the root trie node.
  • Shadow Root: the root hash of the shadow tree, which is used to record the access history of the storage trie

The RootNode and ShadowTree are very important in this proposal and the details will be revealed later.

3.2.The Components

a.Data Availability

The expired state will be kept by the DA for witness service, allowing users to revive their expired state. The details of data availability will not be covered in this proposal but can be provided by third-party projects such as the BNB Greenfield project.

b.New Transaction Type

A new transaction type will be added, which would contain the witness. The EVM will be upgraded to support this new transaction type, along with an appropriate gas metering policy.

There will be another BEP to explain the details.

c.Epoch

The epoch period is a key parameter for this proposal, and by default, it could be set to 2/3 years for the BSC mainnet. Given a BlockInterval of 3 seconds, the EpochPeriod would be calculated as follows: EpochPeriod = 365 * (2/3) * 24 * 60 * 60 / 3 = 7,008,000.

d. ShadowTree & ShadowNode

Before introducing the new concepts of ShadowTree and ShadowNode, it is important to have a clear understanding of the layout of the classical MPT, which is depicted in the diagram below:

There are three types of nodes in the MPT tree: Branch Node, Extension Node and Leaf Node.

As the name suggests, ShadowTree is the shadow version of the MPT tree and is also comprised of three types of ShadowNode:

  • ShadowBranchNode
  • ShadowExtensionNode
  • ShadowLeafNode

The Layout of the ShadowTree can be depicted as the following diagram:

Unlike the MPT tree, Shadow Tree is not a patricia tree, it can not be iterated by the Shadow Root. It is a shadow mapping to the MPT tree and can be accessed indirectly by the MPT tree.

The key to of the shadow node can be: ShadowKey =${TrieNodeHash}${suffix}, it is a combination of the corresponding trie node hash and a suffix. The client may need to keep the shadow node update record to support archive node or block rewind.

Shadow Leaf Node

The ShadowLeafNode is always nil, since it is the end of the path, no need to keep the metadata of its child.

type ShadowLeafNode struct {}

Shadow Extension Node

It has a single element: ShadowHash, which is calculated based on shadow information of its child.


type ShadowExtensionNode struct {
  ShadowHash *common.Hash
}

func (node *ExtensionNode) ShadowNode() ShadowExtensionNode {
  // to calculate the ShadowHash of a ShadowExtensionNode
  var shadowHash *common.Hash
  child := node.child
  if child.isLeafNode() {
    shadowHash = nil
  } else if child.isBranchNode() {
    branch := child.ShadowBranchNode
    shadowHash := Hash(branch.ShadowHash, branch.EpochMap)
  } else {
    // unreachable,
  }
  return ShadowExtensionNode{shadowHash}
}

Shadow Branch Node

For branch nodes, there would be 16 items corresponding to each of the sixteen possible nibble values ​​of the keys in which they are traversed. And the ShadowBranchNode would have two elements:

  • ShadowHash: a unique hash value of the following shadow nodes.
  • EpochMap: a metadata to keep the corresponding item’s last accessed epoch index.

type ShadowBranchNode struct {
  ShadowHash *common.Hash
  EpochMap []uint16
}

func (node *BranchNode) ShadowNode() ShadowBranchNode {
  // Get the epoch infor of itself
  epochSelf := node.EpochIndex
  // Get the epoch infor of its children
  epochChild := node.EpochChild()

  // to calculate the ShadowHash of a ShadowBranchNode
  var shadowHash *common.Hash
  hashList := []common.Hash{}
  for i, child := range node.children {
    // skip expired node.
    if epochSelf >= epochChild[i] + 2 {
      continue
    }
    if child.isLeafNode() {
      continue
    }
    if child.isExtensionNode() {
      ext := child.ShadowExtensionNode
      hashList = append(hashList, ext.ShadowHash)
      continue
    }
    if child.isBranchNode() {
      br := child.ShadowBranchNode
      brHash := Hash(br.ShadowHash, br.EpochMap)
      hashList = append(hashList, brHash)
      continue
    }
    // unreachable
  }
  if len(hashList) == 0 {
    shadowHash = nil
  } else {
    shadowHash = Hash(hashList)
  }
  return ShadowBranchNode{shadowHash, epochChild}
}

Actually, only the ShadowBranchNode will be stored on disk to save IO space, ShadowExtensionNode and ShadowLeafNode will be kept in memory and generated on need.

e.RootNode

type RootNode struct {
  Epoch uint16
  MptRoot common.Hash
  ShadowRoot common.Hash
}

// to calculate the ShadowRoot
r := rootNode
mRoot := GetMptRootNode(r.MptRoot)
if mRoot.isBranchNode() {
  br:= mRoot.ShadowNode()
  r.ShadowRoot = Hash(br.ShadowHash, br.EpochMap)
} else if mRoot.isExtensionNode() {
  ext:= mRoot.ShadowNode()
  r.ShadowRoot = Hash(ext.ShadowHash)
} else if mRoot.isLeafNode() {
  r.ShadowRoot = Hash(nil)
}

// to calculate the StorageRoot
Account.StorageRoot = Hash(r.Epoch, r.MptRoot, r.ShadowRoot)

The storage tries of Epoch 0 do not have the RootNode, if the storage trie is accessed after Epoch 1 then a RootNode will be generated and its hash will be kept in Account.

And the RootNode will never expire.

f.Witness Protocol

Witness protocol will define the format of the witness and it will be defined in another BEP.

g.State Revive

If the missed state is from the previous epoch, the state revive process will be automatic and will place the state in the current epoch. However, if the missed state is from an even earlier epoch, the user will need to provide a witness to revive it.

3.3.General Workflow

a.How To Expire

Accounts will not expire, only the L2 storage trie will be subject to expiration, which is used to keep the state of KV slots.

The state expiry is trie node level, the latest epoch index that the trie node was accessed is recorded by its parent, which can be used to determine if the trie node is expired or not. Since we only keep the latest 2 epochs, older trie nodes will not be accessible.

These expired trie nodes can be pruned, but in order to keep the proof generation capability, it is better to only prune the shadow node that is >=2 epochs behind its parent.

b.How To Access KV In New Epoch

To access the KV, there would be a path in MPT tree, for each trie node in the path, it has to check its corresponding shadow node first, if it is expired then the transaction would be reverted immediately.

And there are three cases to access the state in the proposal :

  1. If the state has already been accessed in the current epoch, it can be returned or updated in the current epoch tree.
  2. If the state was last accessed in the previous epoch, the shadow node will be updated, so that it is refreshed.
  3. If the state is cold data and has not been accessed in the current or previous epoch tree, it is expired. Trying to access expired state will cause the transaction to revert. To access the expired state, the user needs to revive it first.

c.How To Revive State

A new transaction type will be added, which contains witnesses to revive the state.

The full storage trie can be revived through several transactions, and if the witness is too large, partial revival will be supported.

Anyone can restore the data as long as they are willing to pay.

It is optional to revive the corresponding shadow node, if it is not provided or mismatched, then the elements in epoch map will all be reset to 0 and the ShadowHash will be nil.

d.Node Sync Mode: Light & Full & Snap

Light & Full sync mode will stay unchanged, but snap sync will need extra work in the world state heal phase. It will have to sync and heal the shadow tree as well.

4.Rationale

4.1.Why Keep The L1 Account Trie

There are several reasons to keep it:

  • The size of the L1 account trie is relatively small, constituting only around 4% of the L2 storage trie on BSC as of the end of 2022.
  • The L1 account trie contains crucial information about user accounts, such as their balance and nonce. If users were required to revive their accounts before accessing their assets, it would significantly impact their experience.
  • By retaining the L1 account trie, the witness verification process can be much simpler.

4.2.Why Not Create A New L2 Storage Trie

In this proposal, the trie skeleton will be kept in a new epoch. There are other approaches

which will generate a new trie tree from scratch at the start of a new epoch. Although they

provide a comprehensive solution for state expiry, there are still two unsolved issues to address: account resurrection conflict and witness size. Additionally, they would have a significant impact on the ecosystem and rely on other infrastructure, such as address extension and Verkle Tree.

By keeping the skeleton of the trie, it would be much easier to do witness verification and have less impact on the current ecosystem.

4.3.Reasonable Epoch Period

The state will expire if it has not been accessed for at least 1 epoch or at most 2 epochs. On average, the expiry period is 1.5 epochs. If we set the epoch period to represent 2/3 of a year, then the average state expiry period would be one year, which seems like a reasonable value.

5.Forward Compatibility

5.1.Account Abstraction

Account abstraction implementation will be impacted, as these accounts could be stored in the L2 storage trie and could be expired.

5.2.L2 Rollup: Optimism & ZK

Rollups could be impacted if the rollup transactions try to access expired storage.

6.Backward Compatibility

6.1.Transaction Execution

The current transaction types will be supported, but if the transaction tries to access or insert through expired nodes, then it could be reverted.

6.2.User Experience

There are several changes that could affect user experience. The behavior of many DApps may change and users will have to pay to revive their expired storage. If the revival size is very large, the cost could be expensive.

6.3.Web3 API

Some of the APIs could be impacted, such as: getProof, eth_getStorageAt…

6.4.Snap Sync

The snap sync mode will heal the world state after the initial block sync. The procedure of world state healing in snap sync mode will need to be updated.

6.5.Archive Node

More storage volume would be needed for the archive node, since more metadata will be generated in each epoch. The increased size could be remarkable, which would make the current archive node reluctant to keep the whole state of BSC mainnet. Archive service may have to be supported in other approaches.

6.6.Light Client

The implementation of the light client would be impacted, since the proof of the shadow tree would also be needed.

7. License

The content is licensed under CC0.

BEP(v0.4): State Expiry With Verkle Tree

Motivation

BSC has completed the preliminary design and implementation of state expiry before, but BSC will always maintain full compatibility with the EVM ecosystem, so based on previous practice and design experience, it will bring State Expiry to the latest verkle tree design.

Verkle Tree is the next important feature after Eth’s Dencun hard fork. It will impact a lot in the EVM ecosystem, like state proof, witness, statelessness, etc. This proposal will be based on Verkle Tree and promise to be compatible with the statelessness feature.

Specification

Verkle Tree Transition

Verkle Tree is the next generation state trie structure, the purpose is to further reduce the state proof size, in preparation for the subsequent statelessness.

The main differences between it and MPT are as follows:

  1. The trie structure is extended from 16-ary to 256-ary;
  2. Merge the two-layer trie structure, L1 Account Trie and L2 Contract Storage Tries into a single huge trie;
  3. Divide the Contract Code into Code Chunks and save them in the trie;
  4. The cryptographic proof is switched to Pedersen+IPA instead of hash-based;

The above pic is from Verkle tree structure | Ethereum Foundation Blog.

In order to implement State Expiry in the verkle tree, it is necessary to convert the MPT of the BSC to VKT. This part will not be described in detail, but it is an important part of the scheme.

State Epoch

The state is generated on the chain through the transaction initiated by the user. Before, the user only needs to pay once to store it permanently, but now all the states accessed by the user exchange cannot exceed two state epochs.

At the same time, in order to greatly improve the user experience, whenever the state is accessed, the status will be automatically renewed, and the state will be expired in the next of the next epoch.

This proposal will not change the current transaction gas pricing mechanism, and only contract storage slots could expire, other states will never be expired.

Epoch-based Metadata

The proposal only adds epoch as metadata in the value node. Since the value in the verkle tree is divided into two parts, the epoch information is encoded at the end of the second part, and the epoch is generally expressed as 2 bytes in size, which also does not exceed the limit of the IPA algorithm.

In addition to the verkle tree, the epoch metadata is also saved redundantly in the snapshot, especially after the value node expires, which is used to judge expired data.

LeafNode Operation

Because the state expires only in the value node of the LeafNode, there are two situations for the commitment update of the LeafNode:

  1. The children’s value node is not expired, and the operation is as usual;
  2. If any number of children value nodes expire, any access to expired value nodes will be reverted, and when CRUD other nodes, the LeafNode will be updated by commitment diff;

Resurrection Conflict

Generally speaking, there are two cases of resurrection conflict:

  1. After a status expires, the user creates a new state in the same slot, which conflicts with the previous expired state;
  2. The state has expired many times in history, and the user revives the non-recently expired state when reviving;

Both of these can lead to abnormal user behavior and the risk of attack. However, this proposal can avoid the above two situations.

After the slot expires, the original cryptographic proof will still be retained in the Verkle Tree. Therefore, the corresponding proof must be provided to resume it. When the slot expires multiple times, the latest expired proof will be used.

Witness

Since only the value of the LeafNode in the whole tree will expire, when generating the witness, it is only necessary to prove that the value is included in the commitment of C1 or C2 of the LeafNode, and the value should carry the expired epoch information, and the EVM will automatically refresh the epoch metadata after resurrection.

type ReviveWitness struct {
	witnessType byte // only support leaf value IPA Proof for now 
    address *common.Address // target account address 
    proofList []LeafValueIPAProof // revive multiple slots (same address)
}

type LeafValueIPAProof struct {
	key []byte // key, to find stem & leaf index
    proof []byte // proof value for commitment of C1 or C2
    value []byte // value with history epoch metadata
}

Witness Size

The size of the witness is small enough, and the most important thing is the constant size, so the cost of restoring a fixed number of states grows linearly and is predictable.

New Transaction type

To support state revive, this BEP introduces a new transaction type, while the current transaction type can still be used in most scenarios, called LegacyTx, the new transaction type in order to maintain compatibility, begin with a single byte REVIVE_STATE_TX_TYPE in the RLP format, the complete transaction definition is as follows:

type SignedReviveStateTx struct {
	message: ReviveStateTx
    signature: ECDSASignature
}
type ReviveStateTx struct {
	Nonce    	uint64          // nonce of sender account 
    GasPrice 	*big.Int        // wei per gas
    Gas      	uint64          // gas limit 
    To       	*common.Address // nil means contract creation 
    Value    	*big.Int        // wei amount 
    Data     	[]byte          // contract invocation input data 
    WitnessList	[]ReviveWitness  
}

The ReviveStateTx structure is similar to LegacyTx, with the StateWitness array added for batch state revival. ReviveStateTx will verify witness and revive state first, and continue to execute transfer, contract calls and other behaviors as normal transactions.

If the user only wants to revive the states of the contract, it is suggested to simply read the contract or guarantee that the transaction can not revert.

State Expire & Prune

A state with no access in two epochs will be marked as expired, and all expired states cannot be accessed again, and the expired value will be deleted through asynchronous prune.

Any access behavior with expired status will cause revert and return the corresponding error code to the user.

Rationale

Why Keep All Account Data And Contract Codes

There are several reasons to keep it:

  1. The size of the account data is relatively small, constituting only around 4% of the storage slots on BSC as of the end of 2022.
  2. The account data contains crucial information about user accounts, such as their balance and nonce. If users were required to revive their accounts before accessing their assets, it would significantly impact their experience.

Reasonable Epoch Period

The state will expire if it has not been accessed for at least 1 epoch or at most 2 epochs. On average, the expiry period is 1.5 epochs. If we set the epoch period to represent 2/3 of a year, then the average state expiry period would be one year, which seems like a reasonable value.

Why Not Expire Intermediate Nodes of The Trie

Because in the Verkle Tree proposal, all Account Data, Contract Code Chunks, and Contract Storage Slots are stored in one tree. At the same time, the width of each node in the tree is expanded to 256.

So if only Storage Slots expire, there is a very small probability that a large number of subtrees will expire as a whole, and at the same time, the storage cost of adding more Archive Nodes to the previous shadow tree is avoided.

Finally, the intermediate nodes of VKT are reserved, and only the leaf value nodes are expired.

Storage Growth

This proposal is not a perfect solution. The purpose is to slow down the state growth of BSC in the future. At the same time, it can reduce the state maintenance of historically low-frequency access and make the validator/full node lighter.

However, this proposal retains the intermediate nodes, and cannot completely remove the storage impact of the existing state or the new state. Yes, it can only reduce the speed of storage increase.

Forward Compatibility

Account Abstraction

Account abstraction implementation will be impacted, as these accounts could be stored in the L2 storage trie and could be expired.

L2 Rollup: Optimism & ZK

Rollups could be impacted if the rollup transactions try to access expired storage.

Statelessness

Statelessness is a proposal with great influence in the community, which can further improve decentralization and reduce node operating costs.

Fortunately, this proposal has the greatest flexibility, retains the original tree structure, can continue to optimize the solution in the future, and upgrade to statelessness is also supported.

Backward Compatibility

Transaction Execution

The current transaction types will be supported, but if the transaction tries to access or insert through expired nodes, then it could be reverted.

User Experience

There are several changes that could affect user experience. The behavior of many DApps may change and users will have to pay to revive their expired storage. If the revival size is very large, the cost could be expensive.

Web3 API

Some of the APIs could be impacted, such as getProof, eth_getStorageAt

Snap Sync

The snap sync mode will heal the world state after the initial block sync. The procedure of world state healing in snap sync mode will need to be updated.

Archive Node

More storage volume would be needed for the archive node, since more metadata will be generated in each epoch. Archive service may have to be supported by other approaches.

Light Client

The implementation of the light client would be impacted since the proof of expired state would also be needed. And in this proposal, only the archive node could prove an expired state.

License

The content is licensed under CC0.

1 Like

I am afraid that VerkleTree is unfriendly to StateExpiry, we have a new design StateExpiry v0.5, will be posted soon, which could be much more applicable to BSC.

1 Like

Also with read-op like account is not in a dirty state? just for a confirmation.

Another question: I think all data in trie or verkle tree is the consensus data, same with the expired flag. When and how to mark the node expired? I know two epochs, but after two epochs is there an async task to mark them?

Also I saw BSC is also supporting Erigon client, if this proposal only works for geth, how to keep compatible with Erigon?

A1: read-op is also an access operation, the state will be renewed.
A2: with the latest v0.5(update soon), each KV slot would have its latest accessed epoch information, the epoch information will not be used to calculate the root hash. So the MPT of contract stay unchanged. And there is no async task to mark the expired node.
A3: Erigon need to be updated as well, if this proposal was enabled.

1.Summary

This BEP proposes a practical solution to address the problem of increasing world state storage on the BNB Smart Chain, by removing expired storage state.

2.Motivation

3.Specification

Variable Description
Epoch A time span defined by block number, if set it to ~⅔ year, if BSC validators produce block every 3 second, it would be (365*2/3)*24*60*60/3 = 7,008,000
NumKeysLastEpoch Number of keys that are accessed during last epoch
NumKeysLive Number of keys that are not expired, i.e. accessible
NumKeysTotal Total number of keys, both expired and not expired
HotContractThreshold The threshold to determine if the contract is hot or not, hot means a certain percentage of live states were accessed during the last epoch. If the majority of the state were accessed recently, then the state of the contract will not expire.
WitnessPrice The gas price of 1 byte witness, as witness will be part of the block and needs extra effort for validator to collect it, it could be more expensive than gas price of calldata.

3.1.Design Guide

a.The Two Principles

  • Rule 1: Permanent delete is unacceptable, expired state must be able to be revived.
  • Rule 2: Contract’s execution logic should stay unchanged, if it fails to get the correct state, the execution result will be discarded, i.e. transaction will be reverted.

b.Better To Have

  • Simple protocol

  • Less impact to UX

  • Affordable price: storage access fee, state revive fee…

3.2.The Components

There will be some new components introduced and some existing components will be updated.

a.New Components

  • Meta Info: it is used to record the contract’s storage status.
  • Rent Model: if a user wants to keep their state from being expired, they will need to pay for the storage rent fee. The rent model will define the detailed rules of how rent fees will be charged.
  • Revive & Witness: it will define the procedure for user to revive their expired state

And witness will define the format of witness to revive state, to make it more compact, witness could be merged to remove duplicated bytes.

  • Account & Block: the structure of Account and Block is quite important, they will be upgraded to support state expiry.
  • Big Scan: it is a special phase to collect the storage information of each contract by scanning all the contract accounts. It could take several weeks, depending on the capability of the nodes.
  • Epoch: it represents a certain time span, state expiry will be Epoch based, one Epoch could be around ⅔ year.
  • SystemContract(StateExpiry): a new system contract would be created to manage the state expiry policies.

b.Existing Components

  • TxPool: transactions with expired state accessed need to be treated specially, tx pool module may need to be upgraded to make it more efficient.
  • Sync: will be upgraded to support sync of witness and metainfo.
  • Prune: it will be upgraded to support prune expire state, beside MPT trie node, it also needs to prune the storage of snapshot.
  • Snapshot: snapshot will be upgraded to record the status of each KV
  • Governance: some of the arguments need to be adjusted dynamically, e.g. the ScanSpeed, RentPrice, ContractLivenessThreshold…
  • Gas Metering: will be upgraded to support witness charge.

3.3.General Workflow

a.From User’s Perspective

Users will not need to be aware of StateExpiry, generally before users send out the transaction, they would query a RPC node about the estimated gas needed. The RPC nodes will estimate the gas needed based on execution fee and the fee to revive all the expired states. If the estimated gas needed is acceptable, the user will send out the transaction as usual.

When the transaction is propagated to the tx pool of validator, the validator would know whether the transaction needs to access any expired state or not, if yes, the validator would be required to collect the witness and rewarded accordingly. It is ok if the validator does not get the witness, then the validator will not be allowed to include this transaction.

b.From Node’s Perspective

In Epoch 0

Nothing will be changed, nodes just operate as usual.

In Epoch 1

State will not expire when entering Epoch 1, so the whole state will be there in Epoch 1, no rent will be charged either. But the meta information will be updated on block finalization to record the latest status.

In Epoch 2+

Since Epoch 2, state expiry will start to work. The state of the contract will be expired if none of the following requirements are met:

  • The state has been accessed in current or last epoch.
  • It is hot, i.e. the percentage of keys accessed during the last epoch is greater than or equal to $LivenessThreshold (unable to do it without big scan, )
  • It has enough RentBalance left for KV pairs in old epochs, KV in current and previous epoch will be exempted. (not determine, if rent model will be used)

3.4.The Meta

a.Definition

Version uint8 // in case meta definition upgrade in the future?
RLP {
  EpochRecords []{epoch, numKeys} // for epoch based rent charge
  NumKeysTotal uint64 // without keys in epoch0 if no big scan
  RentBalance uint64 // deduct on access
}

// if no rent, no meta content needed, can be empty

b.How to persistent the meta bytes

There will be a meta hash in account structure, which will be the key to access encoded meta information.

c.How To Update The MetaInfo

It will be updated on block finalization, not by transaction level. When a block is finalized, then the access record will be determined, we will know the keys that are accessed, created or deleted. These information will be updated to the meta structure

d.How To Calculate The MetaHash

Metahash is just simply the keccak256 hash of the meta bytecode

3.5.Epoch Information In Trie Node

It would be very simple, only extend the current branch node, which will include an epoch map, which is used to mark its children’s accessed epoch value. Hash calculation will include this epoch map element, so once the epoch map is updated, the corresponding intermediate nodes will be updated as well.

3.5.About GetState/SetState

If GetState tries to access an expired state, then the transaction will be reverted.

For SetState, it must do GetState first, if GetState fails due to accessing expired state, then the transaction will be reverted.

And since delete operation can be treated the same as set, it also needs to perform GetState first.

3.6.Gas Metering

Depends on the witness size used in this contract, the cost can calculated simply by: WitnessSize * WitnessPrice

If several transactions have overlapped KVs to revive, the first transaction would probably pay more, as when the later transaction to access these KVs, they are already revived.

It is somehow reasonable, as the first transaction will be executed first, so it will pay slightly more.

3.7.Support Snapshot

Snapshot will still be corresponding to the MPT structure.

Once the MPT is shrinked due to more sub-paths being expired, which will make the MPT end up with some boundary nodes. Boundary nodes are trie nodes that are either leaf nodes or intermediate nodes with at least one of its children expired.

The snapshot shrink can be conducted by off-line prune according to the MPT.

a.By Off-Line Prune

After off line prune of the MPT trie, the boundary will be generated. Then just go through the MPT tree, prune the snapshot according to the intermediate boundary node.

// For contratc A, it has several several hashed keys with epoch marked:
0x000000000000000000000000_00000000, epoch 1
0x000000000000000000000000_01000000, epoch 0
0x000000000000000000000000_01010000, epoch 0
0x000000000000000000000000_01020000, epoch 0
0x000000000000000000000000_01030000, epoch 0
0x000000000000000000000000_02000000, epoch 1
0x000000000000000000000000_03000000, epoch 1

// Currently, it is in epoch 2, so after prune, the tree will be
0x000000000000000000000000_00000000, epoch 1
0x000000000000000000000000_01, epochMap(0,0,0,0) // 4 children expired in epoch 0
0x000000000000000000000000_02000000, epoch 1
0x000000000000000000000000_03000000, epoch 1

b.Handle State Revive

TBD

c.Handle New Key Insert

TBD

3.8.The Rent Model

a.Rent Policy

  • KVs accessed in current or previous epoch will not be kept, no extra fee needed
  • Liveness:

If the percentage of not expired KV accessed in the last epoch is greater than a threshold(30%? governable), then these alive KVs will not be expired.

update: no liveness check, since no big scan

  • The Rent Balance will be used to pay for these alive KVs, pay by best

User could save a certain range of Epoch, user may prefer to only save the KV of a few recent epochs

b.How To Fill The RentBalance

There will be a system contract to handle it, users just need to call it with the target address & balance provided, the system contract will help add the rent balance to the target address.

The balance can not be reclaimed.

But if the user sends the balance to an un-existed address, can it be refunded?

// StateExpiryContract: 0x0000000000000000000000000000000000001008
// or Precompile Contract?
func AddRentBlance(target Address) {
  balance = GetValue() // value in transaction
  metaInfo = GetMetaInfo(target)
  metaInfo.rentBalance += balance
}

c.How To Determine The RentPrice

TBD

d.How To Charge Rent Fee

On first access of metainfo in a new Epoch, i.e. CurrentEpoch is not in EpochRecord, there will be a rent price for each epoch

EpochRecords []{epoch, numKeys} // for epoch based rent charge

if curEpoch in EpochRecords {
  // rent fee already charged, once per epoch
  return
}

// first time to access
var NewEpochRecords := {curEpoch, 0} // reset
EpochRecordsSorted := sort(EpochRecords) // decending order
for epoch, numKeys := range EpochRecordsSorted {
  fee := GetEpochFee(epoch, numKeys)
  if RentBalance >= fee {
    NewEpochRecords := append({epoch, numKeys})
    RentBalance -= fee
  }
}

3.9.Precompile Contracts: BSCStateExpiry

This contract will has some variable to set for governance

contract BSCStateExpiry {
  uint256 public scanSpeed;
  mapping(uint256 -> uint256) public rentPriceMap;
  uint256 public livenessThreshold;
  function updateParam(string key, bytes value) {
    if (Memory.compareStrings(key, "ScanSpeed")) {
      scanSpeed = BytesToTypes.bytesToUint256(32, value);
    } else if (Memory.compareStrings(key, "RentPrice")) {
      rentPrice = BytesToTypes.bytesToUint256(32, value);
      rentPriceMap[curEpoch] = rentPrice
    } else if (Memory.compareStrings(key, "LivenessThreshold")) {
      livenessThreshold = BytesToTypes.bytesToUint256(32, value);
    }
  }
}

3.10.New Account Structure

type Account struct {
Nonce uint64
Balance *big.Int
Root []byte
CodeHash []byte
MetaHash []byte // for StateExpiry
}

3.12.New Block Structure (TBD)

  • witness can be aggregated into block
  • witness can be discarded after a certain period, like blob data, since witness can be generated by re-executing the blocks.
type Witness struct {
addr Hash
keys []Hash
proof []byte
}

type Block struct {
header *Header
uncles []*Header
transactions Transactions
witness []Witness // for StateExpiry
}

3.12.State Revive

  • transaction does not need to provide a witness, just pay the gas fee as usual, the validator will help collect the witness.
  • partial revive will be supported, but need at least 1 KV
  • depends on boundary nodes, as witnesses must begin from a boundary node?

3.13.Prune (Important)

off-line prune(bloom-filter) + off-line prune(epoch based, go through the tree, there is a safety check)

And if PBSS is enabled, on second off-line is needed

3.14.Remote DB

may not need to cover in BEP

3.15.New RPC

query new account.meta, KV…

x.Some Challenges

  • meta storage(fixed): extend the account
  • meta hash(fixed): based on the metaHash + epochChangeSet
  • TxPool, expensive to check expired transactions, need to execute first (still ok?)
  • snapshot: to keep it compact, how to shrink?

can be supported, see: leveldb/include/leveldb/iterator.h at 068d5ee1a3ac40dabd00d211d5013af44be55bea · google/leveldb · GitHub

  • possible to remove the witness? like blob data? (open)
  • What if there are lots of states that revive demand? Like an old project becoming popular again… has to revive by 1 by

4.Rationale

  • Keep L1 Account Trie Now, could introduce GC mechanism to remove “tiny account”

  • Verkle Still Can Be Used In Storage Trie?

Must:

  • State Must Be Recoverable
  • Contract Behavior Should Be As Expected, access expired key can revert

Bettwe To Have:

  • Affordable revive cost

4.1.Why Keep The L1 Account Trie

There are several reasons to keep it:

  • The size of the L1 account trie is relatively small, constituting only around 4% of the L2 storage trie on BSC as of the end of 2022.
  • The L1 account trie contains crucial information about user accounts, such as their balance and nonce. If users were required to revive their accounts before accessing their assets, it would significantly impact their experience.
  • By retaining the L1 account trie, the witness verification process can be much simpler.

4.2.Why Not Create A New L2 Storage Trie

In this proposal, the trie skeleton will be kept in a new epoch. There are other approaches

which will generate a new trie tree from scratch at the start of a new epoch. Although they

provide a comprehensive solution for state expiry, there are still two unsolved issues to address: account resurrection conflict and witness size. Additionally, they would have a significant impact on the ecosystem and rely on other infrastructure, such as address extension and Verkle Tree.

By keeping the skeleton of the trie, it would be much easier to do witness verification and have less impact on the current ecosystem.

4.3.Reasonable Epoch Period

The state will expire if it has not been accessed for at least 1 epoch or at most 2 epochs. On average, the expiry period is 1.5 epochs. If we set the epoch period to represent 2/3 of a year, then the average state expiry period would be one year, which seems like a reasonable value.

4.4.Why Rent Model

less impact to user’s business

5.Forward Compatibility

5.1.Account Abstraction

Account abstraction implementation will be impacted, as these accounts could be stored in the L2 storage trie and could be expired.

5.2.L2 Rollup: Optimism & ZK

Rollups could be impacted if the rollup transactions try to access expired storage.

5.3.VerkleTree & Stateless

5.4.PBSS

6.Backward Compatibility

6.1.Transaction Execution

The current transaction types will be supported, but if the transaction tries to access or insert through expired nodes, then it could be reverted.

6.2.User Experience

There are several changes that could affect user experience. The behavior of many DApps may change and users will have to pay to revive their expired storage. If the revival size is very large, the cost could be expensive.

6.3.Web3 API

Some of the APIs could be impacted, such as: getProof, eth_getStorageAt…

6.4.Snap Sync

The snap sync mode will heal the world state after the initial block sync. The procedure of world state healing in snap sync mode will need to be updated.

6.5.Archive Node

More storage volume would be needed for the archive node, since more metadata will be generated in each epoch. The increased size could be remarkable, which would make the current archive node reluctant to keep the whole state of BSC mainnet. Archive service may have to be supported in other approaches.

6.6.Light Client

The implementation of the light client would be impacted, since the proof of the shadow tree would also be needed.

7. License

The content is licensed under CC0.

How to update the Epoch-Map only in the block-execution progress ignoring other scenes like RPC-query, using a specified flag to distinguish?

I don’t have a deep understanding of geth and state expiration. Here I have a question about design philosophy: the consistency of snapshot and MPT
in my opinion,

  1. Geth implements snapshot and MPT update at the transaction level in a way similar to database transactions
  2. The snapshot is used to speed up the query of the trie tree, and the snapshot needs to be consistent with the MPT at all times
  3. The prune task here refers to deleting data. Is the data already marked as expired during block verification?
  4. If the above conclusion is correct, can you describe in more detail the process of snapshot and MPT data status expiration