Dear BNB Community,
NodeReal is one of the core contributors to Greenfield, and we are thrilled to share our Greenfield Executable design proposal for discussion with the community.
Our proposal introduces a “data operation” capability to Greenfield. We believe this is important because it makes Greenfield more flexible and better able to support the various data use cases in the ecosystem.
Any comments are welcome!
(This proposal is split across several posts because of the word limit. A well-formatted version is available at greenfield-executable-design/Design_Proposal.md at main · node-real/greenfield-executable-design · GitHub)
Greenfield Executable (Design Proposal)
Requirement
Basic capabilities
1. Support “Executable” data on Greenfield with access control
1.1. Executable - The object with executable permission that conducts data operations
- Reads specific data or data snippets
- Deletes existing data or creates new data
- Invokes other executables with data
- Operates on data stored on Greenfield storage SPs
- Must have the correct permissions for the data it operates on
- Capable of performing all kinds of data operations
1.2. Access control
- The executable is an entity whose access to other data is subject to access control.
- Limitations on the executable:
  - The executable can act on the data/content only if it is allowed.
  - The executable can invoke other executables only if it is allowed.
  - All system/native-related behavior must be handled under the control of the Greenfield framework or SDK.
  - The executable can create/delete data only with granted access.
1.3. Capability of the executable
- The executable is read-only code that cannot be dynamically changed at runtime.
- The executable consumes gas to run operations (binary or micro-code) on the data.
- The execution result is not required to be verified by consensus: the result itself does not have to be on-chain, although the execution status is recorded on-chain.
2. Help build up the Greenfield ecosystem at the programming language level
2.1. Be a key player in the BSC ecosystem
- Prioritize data operation over token operation.
- Cover data processing scenarios as much as possible.
- Leverage the token chain (BSC) for any token operations and economics.
2.2. Provide programming language level support to build a friendly ecosystem
- Rust, Go (TinyGo), AssemblyScript, C
- Targets WebAssembly
- Provides Greenfield executable SDK APIs for each language
  - Standard API for calling Greenfield-related functions
- More languages (not on day one)
  - Popular languages to help widen the use cases of Greenfield for web2/web3 and cover all kinds of data processing.
  - Limited capability of general-purpose programming languages
    - Java, JavaScript, Python, etc.
  - More generalized runtime environment for full-featured programming languages (long-term target).
3. Consolidate the security of data operations on Greenfield
3.1. Language level
- The language should not allow unmonitored behaviors (syscalls, I/O, etc.)
- Must avoid data leakage and unauthorized access
3.2. Whole stack level (sandboxing)
- Ensure that a compromised system cannot create security holes in the virtual machine.
4. Principle - Open and Close
- Open
  - The execution system should be open enough to satisfy different data computation requirements.
- Close
  - The execution system should be closed enough to protect the data and code from misuse and leaks.
5. Execution proof mechanism
- A fraud-proof or validity-proof system should be in place to help users verify and challenge the execution results.
Design and architecture overview
Design Overview
- Executables are stored as objects in the creator’s storage SPs. Each executable is stored as a package, together with metadata that holds the configuration used at runtime. The meta-info (hash) of the package is recorded on-chain.
- The creation, invocation, and return of the executable are recorded as transactions.
  - The invocation is triggered by a transaction that specifies the executable, the invocation type, and the parameters.
  - The result contains a receipt that records the CPU/RAM usage, gas fee, and proof of execution, among other details.
- The permission design principle for executables is to treat them as entities with permissions and to avoid runtime checks as much as possible to reduce complexity. During development, the developer statically grants the executable access rights to all the data it interacts with. Prior to invocation, the executable must have access to the data from the invoker or third-party data providers (note that neither the invoker nor the executable developer necessarily has permission for all touchable data). At runtime, only minimal checks are performed to avoid interrupting the program. Additionally, the executable is granted access rights to any newly generated data by default. The data is stored in the correct bucket at the end of execution.
- By default, the executable only has access to the data it is packaged with and the data to which it has been specifically granted access (static access). Once it is deployed, any third-party user/account can grant or revoke the executable’s access to data by submitting transactions (dynamic access). At invocation time, the invoker can grant the executable access to additional data before it runs, for any data to be used at runtime (dynamic permission).
- A new copy-execute-destroy mode will be introduced. This mode elects an execute service provider, sets up the execution environment, and copies the executable and all necessary data into the execution environment (Prologue). It then runs the executable (Conduct), submits the results, and finally destroys the environment (Epilogue). Please see the description of the three phases below.
- To invoke an executable, the invoker sends a Tx to Greenfield. The Tx specifies the executable to be invoked, the input data, and the gas information. Greenfield then identifies the relevant service provider (SP) to set up the execution environment and invoke the executable. Upon completion, receipts are generated and sent to Greenfield as a Tx.
- At runtime, the execution environment is deployed on the SP side as a sandbox, and the invoker pays gas to run the executable. Part of the gas is paid to the execute SP for the computation resources consumed, and part is paid to the storage SP if new data is generated and stored. If the gas is exhausted during execution, the program pauses and creates a “Pause” typed Tx, which notifies the invoker to top up the fee. The invoker can then decide whether to top up by sending a new Tx. Once the fee is paid, the executable resumes. If the gas is not topped up or the pause times out, the execution process exits.
- At execution time, three phases are designed: Prologue, Conduct, and Epilogue.
  - The Prologue phase handles preparatory work such as permission checks, data preparation, and environment setup. This includes selecting the execute SP, setting up the execution environment, copying all necessary data, performing gas checking, and launching the executable.
  - The Conduct phase performs the actual execution work. It consumes gas, makes inside/outside calls, and handles pause/resume operations. It then generates the results to return.
  - The Epilogue phase handles result checking, data finalization, generating receipt transactions, destroying the execution environment, and updating the receipt on-chain.
- To protect against data/code leaks, the execution system should enforce permission control on the executable at runtime. This includes access checking, syscall monitoring, and out-call guarding. Access checking can be fast and simple at runtime, or even eliminated in production, because permissions are already checked in the Prologue.
- From the SP’s perspective, the executable and runtime can be provided as a FaaS service, where the invocation transaction acts as a trigger. Greenfield places no limitations on the deployment and implementation of the entire execution solution.
- The runtime system should introduce a mechanism for generating a proof of execution, and provide users with a way to verify and challenge it.
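As a rough illustration of the Prologue / Conduct / Epilogue phases and the gas pause/resume behavior described above, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the `Invocation` fields, the one-gas-per-step pricing, the receipt layout, and the function names are hypothetical and not part of any Greenfield API. The pause timeout is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Invocation:
    executable_id: str
    inputs: list            # object IDs passed in by the invoker
    granted: set            # object IDs the executable is allowed to access
    gas: int
    steps_done: int = 0
    sandbox: dict = field(default_factory=dict)


def prologue(inv):
    """Check permissions, then copy the inputs into a fresh sandbox."""
    if not set(inv.inputs) <= inv.granted:
        return False
    inv.sandbox = {obj: f"copy-of-{obj}" for obj in inv.inputs}
    return True


def conduct(inv, total_steps, gas_per_step=1):
    """Run until finished or until gas runs out; returns 'done' or 'paused'."""
    while inv.steps_done < total_steps:
        if inv.gas < gas_per_step:
            return "paused"  # a "Pause" Tx would be emitted at this point
        inv.gas -= gas_per_step
        inv.steps_done += 1
    return "done"


def epilogue(inv, status):
    """Build the receipt and destroy the sandbox."""
    receipt = {"executable": inv.executable_id, "status": status,
               "steps": inv.steps_done, "gas_left": inv.gas}
    inv.sandbox.clear()
    return receipt


def invoke(inv, total_steps):
    if not prologue(inv):
        return epilogue(inv, "rejected")
    status = conduct(inv, total_steps)
    # A paused run keeps its sandbox so it can continue after a top-up.
    return epilogue(inv, status) if status == "done" else {"status": status}


def resume(inv, extra_gas, total_steps):
    """Model the resume Tx: add gas and continue from where the run stopped."""
    inv.gas += extra_gas
    status = conduct(inv, total_steps)
    return epilogue(inv, status) if status == "done" else {"status": status}
```

Calling `invoke` with too little gas returns a paused status and leaves the sandbox in place; a later `resume` with enough gas finishes the run, emits the receipt, and clears the sandbox.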
Workflow
1. Developers create the executable package by writing code and compiling it to a Greenfield-compatible executable binary (wasm). They then combine it with metadata to generate a package.
2. The developer creates a transaction to deploy the executable, using the transaction type “putObject” and marking the object as executable.
3. The Greenfield node receives the transaction, checks the related permissions, and stores the executable package in the creator’s primary storage provider (SP). It then propagates the package to the secondary SPs. Once the package is finalized (the block is generated), the executable object information (hash and metadata) and ABI are recorded on the chain.
4. Users who want to invoke the executable issue a transaction specifying the invocation metadata. This metadata contains the executable ID (address/hash), input parameters, and invocation type.
5. The node identifies the transaction, performs permission checks, and prepares to satisfy the prerequisites of the user’s invocation. It then conducts the prologue work, which includes the following permission checks (using on-chain information):
   - The invoker has permission to run the executable.
   - The executable has permission to access the input data provided by the invoker.
   - The executable has permission to access its third-party shared data.

   After completing all the checks, the node randomly selects an execute SP to set up the execution environment.
6. The execute SP then builds the execution environment, which includes parsing the executable configuration, verifying the bytecode (if necessary), configuring the execution engine, preparing all required data, and launching the VM (mainly the Prologue phase).
7. The executable runs on the VM during this step, which usually involves loading, interpreting, resource management, data access, input/output calls, and more.
8. During step 7, there may be data access, creation, or deletion that requires the SP to conduct the related data operations. In such cases, the corresponding out-call is triggered to act as a bridge.
9. During step 7, queries for on-chain information may occur (although rarely, thanks to the preparation steps in the prologue). These are served by out-calls to the Greenfield node. The meta service can also be used for data queries, especially when the node API is unstable due to network traffic, system throttling, API bandwidth limits, etc., or when the queried data cannot be read from the chain directly. Note that when pause/resume happens at runtime, the executable pauses and makes an out-call to the Greenfield framework, which creates the related Tx and then waits for a signal from the resume Tx.
10. After execution is complete, a receipt is generated and sent to the node to be recorded on the chain. Users can find the proof in the receipt to verify the execution result. If any data was created or deleted at runtime, the changes are finalized to the storage SPs during this step.
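The invocation metadata and the three prologue permission checks listed in the workflow could look roughly like the sketch below. It is only an illustration under stated assumptions: the `InvokeTx` fields, the `(principal, action, resource)` tuple encoding of on-chain permissions, and the action names `"execute"` and `"read"` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvokeTx:
    invoker: str
    executable_id: str
    invocation_type: str
    inputs: tuple          # object IDs supplied by the invoker
    shared: tuple = ()     # third-party shared objects the executable uses


def prologue_checks(tx, grants):
    """Return the list of failed checks; an empty list means the run may proceed.

    `grants` is a set of (principal, action, resource) tuples standing in
    for on-chain permission records.
    """
    failures = []
    # Check 1: the invoker may run the executable.
    if (tx.invoker, "execute", tx.executable_id) not in grants:
        failures.append("invoker may not run the executable")
    # Check 2: the executable may access the invoker's input data.
    for obj in tx.inputs:
        if (tx.executable_id, "read", obj) not in grants:
            failures.append(f"executable may not read input {obj}")
    # Check 3: the executable may access its third-party shared data.
    for obj in tx.shared:
        if (tx.executable_id, "read", obj) not in grants:
            failures.append(f"executable may not read shared data {obj}")
    return failures
```

Collecting every failure instead of stopping at the first one is a design choice of this sketch; it lets a rejected invocation report all of the missing grants at once.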
Execute Service Provider (Execute SP)
Similar to Greenfield storage providers for data storage, there are Greenfield execute service providers dedicated to providing the execution environment and resources that support Greenfield executables.
To become an execute SP, a provider must register by making a deposit on Greenfield as its “service staking”. Greenfield validators go through a dedicated governance procedure to vote on the election of execute SPs. Execute SPs are encouraged to publish their information and demonstrate their capability to the community, as they must provide a professional execution environment with quality and security assurance.
The challenge system for storage SPs also applies to execute SPs. Users, validators, storage SPs, other execute SPs, and Greenfield itself may challenge an execute SP over data integrity, resource availability, and security breaches, among other issues. The challenger needs to provide “proof”, and the validators help verify it and vote. If the challenge succeeds, the challenger and validators are rewarded, whereas the challenged SP is punished by having part or all of its stake slashed (depending on the severity of the issue).
Execute SP Pool & election
All registered execute SPs are added to the execute SP pool along with their capability descriptions. Each executable declares the minimal resource requirements for its execution. At invocation time, Greenfield selects an execute SP from the pool based on the executable’s minimal requirements and the invoker’s specified preferences.
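A minimal sketch of this selection step follows. The capability fields (`cpu_cores`, `memory_mb`, `region`) and the use of a preferred region as the invoker’s preference are assumptions for illustration only; the proposal does not define them. The workflow mentions random selection, whereas this sketch uses a deterministic tie-break purely so that the result is reproducible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecuteSP:
    name: str
    cpu_cores: int
    memory_mb: int
    region: str


def select_execute_sp(pool, min_cpu, min_memory_mb, preferred_region=None):
    """Pick an execute SP that meets the executable's minimal requirements.

    Returns None when no registered SP qualifies.
    """
    eligible = [sp for sp in pool
                if sp.cpu_cores >= min_cpu and sp.memory_mb >= min_memory_mb]
    if not eligible:
        return None
    # Honor the invoker's preference when possible, otherwise fall back.
    preferred = [sp for sp in eligible if sp.region == preferred_region]
    candidates = preferred or eligible
    # Deterministic tie-break for the sketch: the smallest SP that fits.
    return min(candidates, key=lambda sp: (sp.cpu_cores, sp.memory_mb, sp.name))
```

The preference is treated as soft here: if no eligible SP matches it, the selection falls back to any SP that satisfies the minimal requirements.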
Elastic Scaling (not in day one)
Given the various computational scenarios of executables, it is reasonable for execute SPs to provide elastic scaling capability. To support this, each execute SP adds its scaling capability limit to its SP description. The executable can specify its recommended scaling, and the invoker can provide their preference at invocation time.
Permission
The permission system plays a key role in the Greenfield executable. Deciding which parts of the data are accessible to the executable and to accounts (creator, invoker, 3rd party) is vital, and these access permissions must be consistent with the overall Greenfield permission system.
A straightforward idea is to combine “static” and “dynamic” permissions. The executable would have “static permission” specified by the developer at development time (assigned at install time). It would also have “dynamic permission” assigned by the 3rd party data provider and the invoker at runtime. Static permission is a fixed property bound to the executable and takes effect on every execution. Dynamic permission is a property that may change from invocation to invocation; it is tied to the invoker/3rd party and only takes effect at runtime. Different invocations can have different permissions assigned.
This is similar to Android applications, where the app developer specifies which data can be used by the app internally (under the application’s installed folder) and which data needs to be granted by the system. The application user decides whether to grant the permission at runtime (in practice, when launching the application or at installation time).
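The static-plus-dynamic combination can be pictured as a set union computed per invocation. The sketch below is an assumption about how this might be modeled, not the proposal’s implementation; the class name and the object IDs are hypothetical.

```python
class ExecutablePermissions:
    """Static grants are fixed at deploy time; dynamic grants apply to one run."""

    def __init__(self, static_grants):
        self._static = frozenset(static_grants)

    def for_invocation(self, dynamic_grants=()):
        """Effective access for one invocation: static plus that run's dynamic grants."""
        return self._static | frozenset(dynamic_grants)
```

Because the dynamic grants are passed in per call and never stored on the object, access granted for one invocation does not carry over to the next one.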
For a Greenfield executable, three kinds of data can be accessed at runtime: the executable’s internal data, the invoker’s input data, and third-party public or shared data. The Greenfield permission work modes must handle each of them correctly and safely.
The internal data is packaged with the executable and stored in the same bucket on the developer (creator)’s SPs. The executable natively has access to this data, and there is no access check when the executable uses it at runtime.
The input data are the invoker-provided parameters of the executable. For access to this data, the executable inherits permission from the invoker at preparation time, which gives it the correct access to exactly the input data. Be aware that in this case only access to the given data is inherited. Greenfield should guarantee that no permission leaks to the executable that would let it reach any other data of the invoker. (At the implementation level, with copy-exec-destroy mode, these steps can be just a permission check plus a data copy.)
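To make the “only the given data is inherited” rule concrete, here is a small sketch. The function name and the representation of access as a set of object IDs are assumptions for illustration.

```python
def inherit_input_access(invoker_access, input_ids):
    """Return the access the executable inherits for one invocation.

    Only the objects explicitly passed as inputs are inherited, and only
    if the invoker itself has access to them.
    """
    missing = [obj for obj in input_ids if obj not in invoker_access]
    if missing:
        raise PermissionError(f"invoker has no access to: {missing}")
    # Nothing else the invoker can access leaks into the result.
    return frozenset(input_ids)
```

In this sketch, an invoker with access to three objects who passes only one of them as input gives the executable access to that one object and nothing more.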
The third-party data is a little more complex: it plays the role of “shared” data but may not be public. From the executable’s point of view, there are three kinds of shared data:
a. “Executable provided”: shared data provided by the executable, meaning the executable internally accesses shared data that is not in the package. Note that, for data security, the invoker does not need to have permission to access this kind of data.
b. “Invoker provided”: shared data provided by the invoker, meaning the executable accepts input data that it does not own. In this case, the invoker must ensure that access to the data has been granted to them, so that it can be inherited by the executable.
c. “3rd party provided (dynamically)”: data that can be shared with the executable dynamically, meaning third parties can grant or revoke the executable’s access on-chain at any time. This ensures that neither the executable developer nor the invoker can necessarily “touch” the data ahead of time; the data is available to the executable only within the specified time span and under the specified limitations. To implement this feature, the 3rd party data provider first grants data access to the executable through a transaction; then, at runtime, the permissions are checked and the data is copied as usual.
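The dynamically granted, time-limited access in (c) might be modeled as below. This is a sketch under assumptions: the registry is an in-memory stand-in for on-chain grant records, time is an integer (for example a block height or timestamp), and the half-open span `[start, end)` is a choice made here, not something the proposal specifies.

```python
class GrantRegistry:
    """In-memory stand-in for on-chain third-party grants (hypothetical)."""

    def __init__(self):
        self._grants = {}  # (executable_id, object_id) -> (start, end)

    def grant(self, executable_id, object_id, start, end):
        """Record a grant Tx: access is valid for start <= now < end."""
        if end <= start:
            raise ValueError("end must be later than start")
        self._grants[(executable_id, object_id)] = (start, end)

    def revoke(self, executable_id, object_id):
        """Record a revoke Tx; revoking a missing grant is a no-op."""
        self._grants.pop((executable_id, object_id), None)

    def is_allowed(self, executable_id, object_id, now):
        """Check performed before copying data into the execution environment."""
        span = self._grants.get((executable_id, object_id))
        return span is not None and span[0] <= now < span[1]
```

A provider would call `grant` ahead of the invocation, the check would run before any data is copied, and a `revoke` takes effect for every later check.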
All these kinds of data access can be achieved with copy-exec-destroy mode, by first checking permissions and then copying all the needed data.
(To be continued…)