RFC 005: Optimistic Execution
- 2023-06-07: Refactor for Cosmos SDK (@facundomedica)
- 2022-08-16: Initial draft by Sei Network
Before ABCI++, the first and only time a CometBFT blockchain's application layer would know about a block proposal was after the voting period, at which point CometBFT would invoke the BeginBlock, DeliverTx, EndBlock, and Commit ABCI methods of the application, with the block proposal contents passed in.
With the introduction of ABCI++, the application layer now receives the block proposal before the voting period commences. This can be used to optimistically execute the block proposal in parallel with the voting process, thus reducing the block time.
Given that the application receives the block proposal at an earlier stage (ProcessProposal), the proposal can be executed in the background so that when FinalizeBlock is called the response can be returned instantly.
The newly introduced ABCI method ProcessProposal is called after a node receives the full block proposal of the current height but before prevote starts. CometBFT states that preemptively executing the block proposal is a potential use case for it:
- Contains all information on the proposed block needed to fully execute it.
- The Application may fully execute the block as though it was handling
- However, any resulting state changes must be kept as candidate state, and the Application should be ready to discard it in case another block is decided.
- The Application MAY fully execute the block — immediate execution
Nevertheless, synchronously executing the proposal preemptively would not improve block time, because it would only change the order of events (the time we would like to save would be spent at ProcessProposal instead of FinalizeBlock).
Instead, we need to make block execution asynchronous by starting a goroutine in ProcessProposal (whose termination signal is kept in the application context) and returning a response immediately. That way, the actual block execution would happen at the same time as voting. When voting finishes and FinalizeBlock is called, the application handler can wait for the previously started goroutine to finish, and commit the resulting cache store if the block hash matches.
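The flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SDK implementation: `store`, `pending`, `processProposal`, and `finalizeBlock` are hypothetical names, the state is a plain map, and the "cache store" is a simple copy (a real cache store would record writes instead of copying).

```go
package main

import "fmt"

// store is a stand-in for the application's committed state (assumption).
type store map[string]string

// branch returns a copy so that writes do not touch the underlying store.
func (s store) branch() store {
	c := make(store, len(s))
	for k, v := range s {
		c[k] = v
	}
	return c
}

// pending is a hypothetical handle to one optimistic execution.
type pending struct {
	hash  string        // hash of the proposal being executed
	cache store         // candidate state, not yet committed
	done  chan struct{} // closed when the goroutine has finished
}

// processProposal starts executing txs on a branch and returns immediately,
// so the caller can respond to consensus while execution continues.
func processProposal(root store, hash string, txs [][2]string) *pending {
	p := &pending{hash: hash, cache: root.branch(), done: make(chan struct{})}
	go func() {
		defer close(p.done)
		for _, tx := range txs {
			p.cache[tx[0]] = tx[1] // "execute" the transaction on the branch
		}
	}()
	return p
}

// finalizeBlock waits for the goroutine and commits the branch only when the
// decided block hash matches the optimistically executed proposal.
func finalizeBlock(root store, p *pending, decidedHash string) bool {
	<-p.done
	if p.hash != decidedHash {
		return false // candidate state is discarded
	}
	for k, v := range p.cache {
		root[k] = v
	}
	return true
}

func main() {
	root := store{"balance": "10"}
	p := processProposal(root, "H1", [][2]string{{"balance", "7"}})
	// Voting would happen here, in parallel with the goroutine.
	fmt.Println(finalizeBlock(root, p, "H1"), root["balance"])
}
```

Because the goroutine only writes to the branch, discarding a proposal that was not decided requires no rollback of the committed state.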
Assuming the average voting period takes P and the average block execution takes Q, this would reduce the average block time by P + Q - max(P, Q), which is equal to min(P, Q).
Sei Network reported Q=~300ms during a load test, meaning that optimistic execution could cut the block time by up to ~300ms (the full amount whenever the voting period takes at least as long as execution).
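As a worked example of the formula, the sketch below computes the saving for the reported Q and a few voting periods. The function name `blockTimeSaving` and the P values are assumptions chosen for illustration; only Q=~300ms comes from the text.

```go
package main

import (
	"fmt"
	"time"
)

// blockTimeSaving returns P + Q - max(P, Q), i.e. min(P, Q).
func blockTimeSaving(voting, execution time.Duration) time.Duration {
	longer := voting
	if execution > longer {
		longer = execution
	}
	return voting + execution - longer
}

func main() {
	q := 300 * time.Millisecond // execution time reported by Sei Network
	// Hypothetical voting periods, for illustration only.
	for _, p := range []time.Duration{
		100 * time.Millisecond,
		300 * time.Millisecond,
		600 * time.Millisecond,
	} {
		fmt.Printf("P=%v Q=%v saving=%v\n", p, q, blockTimeSaving(p, q))
	}
}
```

The saving is capped by the shorter of the two phases: once voting takes longer than execution, further gains have to come from faster voting, not from optimistic execution.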
The following diagram illustrates the intended flow:
[Flowchart: Optimistic Execution Goroutine -.-> FinalizeBlock]
- If a proposal is rejected by the local node, this node won't attempt to execute the proposal.
- The app must drop any previous Optimistic Execution data if ProcessProposal is called; in other words, abort and restart.
The execution context needs to have the following information:
- The block proposal (
- Termination and completion signal for the OE goroutine
The OE goroutine would run on top of a cached branch of the root multi-store (which is the default behavior for FinalizeBlock, as we only write to the underlying store once we've reached the end).
The OE goroutine would periodically check whether a termination signal has been sent to it, and stop if so. Once the OE goroutine finishes the execution, it will set the completion signal.
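One way to realize the termination and completion signals is a cancellable `context.Context` plus a channel that is closed on exit. The sketch below is an assumption about how this could look, with hypothetical names (`oeContext`, `runOE`); it checks for termination between transactions, which is one possible reading of "periodically".

```go
package main

import (
	"context"
	"fmt"
)

// oeContext is a hypothetical holder for the execution context.
type oeContext struct {
	proposalHash []byte             // identifies the proposal being executed
	cancel       context.CancelFunc // sends the termination signal
	done         chan struct{}      // completion signal, closed on goroutine exit
	executed     int                // txs executed; read only after done is closed
	aborted      bool               // true if the run stopped early
}

// runOE executes txs one by one and checks for termination between them.
func runOE(hash []byte, txs []string, exec func(tx string)) *oeContext {
	ctx, cancel := context.WithCancel(context.Background())
	oe := &oeContext{proposalHash: hash, cancel: cancel, done: make(chan struct{})}
	go func() {
		defer close(oe.done)
		defer cancel() // release context resources on normal completion
		for _, tx := range txs {
			select {
			case <-ctx.Done():
				oe.aborted = true
				return
			default:
			}
			exec(tx)
			oe.executed++
		}
	}()
	return oe
}

// abort sends the termination signal and waits for the goroutine to exit.
func (oe *oeContext) abort() {
	oe.cancel()
	<-oe.done
}

func main() {
	oe := runOE([]byte("hash"), []string{"tx1", "tx2"}, func(tx string) {
		fmt.Println("executing", tx)
	})
	<-oe.done // wait for the completion signal
	fmt.Println("executed:", oe.executed, "aborted:", oe.aborted)
}
```

Checking between transactions means a single long-running transaction delays the abort until it returns; a finer-grained check would need cooperation from the transaction execution itself.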
Application developers will have the option to enable or disable Optimistic Execution, which is disabled by default. To enable it, the application will need to pass the SetOptimisticExecution option to NewBaseApp. The feature is opt-in because it will be considered experimental until it's proven to be stable. Note that individual nodes should not enable or disable this feature on their own, as it could lead to inconsistent behavior.
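The opt-in mechanism follows the functional-options pattern. The text names SetOptimisticExecution and NewBaseApp but does not give their signatures, so the sketch below uses local stand-ins (`App`, `Option`, `WithOptimisticExecution`, `NewApp`) that are assumptions and not the SDK's types.

```go
package main

import "fmt"

// App is a local stand-in for the SDK's base application (not the real type).
type App struct {
	optimisticExecution bool
}

// Option mutates an App during construction.
type Option func(*App)

// WithOptimisticExecution is a stand-in for the SetOptimisticExecution option.
func WithOptimisticExecution() Option {
	return func(a *App) { a.optimisticExecution = true }
}

// NewApp applies options in order; OE stays disabled unless requested.
func NewApp(opts ...Option) *App {
	app := &App{}
	for _, opt := range opts {
		opt(app)
	}
	return app
}

func main() {
	fmt.Println("default:", NewApp().optimisticExecution)
	fmt.Println("opted in:", NewApp(WithOptimisticExecution()).optimisticExecution)
}
```

Making the choice at construction time keeps it in the application's code, which fits the note that it should be an application-level decision and not a per-node setting.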
Upon receiving a ProcessProposal call, the SDK would adopt the following procedure if OE is enabled:
    abort any running OE goroutine and wait for goroutine exit
    if height > initial height:
        set OE fields
        kick off an OE goroutine that optimistically processes the proposal
    respond to CometBFT
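The ProcessProposal procedure above could look roughly like this in Go. It is a sketch under assumptions: `optimisticExecutor` and its fields are hypothetical, calls are assumed to arrive one at a time, and the boolean return value only reports whether an OE goroutine was started.

```go
package main

import "fmt"

// optimisticExecutor is a hypothetical helper; names are illustrative only.
type optimisticExecutor struct {
	initialHeight int64
	hash          string        // "OE fields": proposal currently being executed
	stop          chan struct{} // closed to request termination
	done          chan struct{} // closed when the goroutine exits
}

// abort stops a running OE goroutine, if any, and waits for it to exit.
func (e *optimisticExecutor) abort() {
	if e.done == nil {
		return
	}
	close(e.stop)
	<-e.done
	e.stop, e.done = nil, nil
}

// processProposal mirrors the procedure above and reports whether OE started.
func (e *optimisticExecutor) processProposal(
	height int64, hash string, execute func(stop <-chan struct{}),
) bool {
	e.abort() // abort any running OE goroutine and wait for exit
	if height <= e.initialHeight {
		return false // first block: respond without optimistic execution
	}
	e.hash = hash // set OE fields
	e.stop, e.done = make(chan struct{}), make(chan struct{})
	go func(stop, done chan struct{}) {
		defer close(done)
		execute(stop)
	}(e.stop, e.done)
	return true // the caller responds to CometBFT right away
}

func main() {
	e := &optimisticExecutor{initialHeight: 1}
	execute := func(stop <-chan struct{}) { <-stop } // runs until aborted
	fmt.Println(e.processProposal(1, "A", execute))  // initial height: OE skipped
	fmt.Println(e.processProposal(2, "B", execute))  // OE started
	fmt.Println(e.processProposal(2, "C", execute))  // new round: abort and restart
	e.abort()
}
```

Passing the channels into the goroutine as arguments keeps each run bound to its own signals, so a later restart cannot be confused with the run it replaced.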
OE won't be enabled during the first block of the chain, regardless of the configuration. This is because during the first block InitChain writes to the state that FinalizeBlock then builds on, so FinalizeBlock starts with an already initialized state. If an abort were needed, we would have to run InitChain again or discard/recover the state; either way complicates the implementation too much just for a single block.
Upon receiving a FinalizeBlock call, the SDK would adopt the following procedure:
    if OE is enabled:
        abort if the currently executing OE's hash doesn't match the proposal's hash
        if aborted:
            process the proposal synchronously
        else:
            wait for the OE goroutine to finish
    else:
        process the proposal synchronously
    respond to CometBFT
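The FinalizeBlock procedure can be sketched in the same style. Again this is an illustration under assumptions: `oeRun`, `startRun`, and `finalizeBlock` are hypothetical, a nil run stands for "OE disabled or never started", and the string result stands in for the real response.

```go
package main

import "fmt"

// oeRun is a hypothetical record of one optimistic execution.
type oeRun struct {
	hash   string
	stop   chan struct{} // closed to request termination
	done   chan struct{} // closed when the goroutine exits
	result string        // valid only after done is closed
}

// startRun launches execute in the background, as ProcessProposal would.
func startRun(hash string, execute func(stop <-chan struct{}) string) *oeRun {
	r := &oeRun{hash: hash, stop: make(chan struct{}), done: make(chan struct{})}
	go func() {
		defer close(r.done)
		r.result = execute(r.stop)
	}()
	return r
}

// finalizeBlock follows the procedure above and reports whether the
// optimistic result was used.
func finalizeBlock(
	run *oeRun, decidedHash string, execute func(stop <-chan struct{}) string,
) (result string, usedOE bool) {
	if run != nil {
		if run.hash == decidedHash {
			<-run.done // wait for the OE goroutine to finish
			return run.result, true
		}
		close(run.stop) // abort the mismatching run
		<-run.done
	}
	// Synchronous path: the stop channel is never closed.
	return execute(make(chan struct{})), false
}

func main() {
	execute := func(stop <-chan struct{}) string {
		select {
		case <-stop:
			return "" // aborted before doing any work
		default:
			return "app-hash"
		}
	}
	res, usedOE := finalizeBlock(startRun("A", execute), "A", execute)
	fmt.Println(res, usedOE)
}
```

The synchronous path is shared by the "OE disabled" and "aborted" cases, so a node always produces a response even when the optimistic work has to be thrown away.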
This can only be implemented on SDK versions using ABCI++, meaning v0.50.x and above.
- Shorter block times for the same amount of transactions
- Adds some complexity to the SDK
- Multiple rounds on a block that contains long-running transactions could be problematic, since each new round aborts and restarts the optimistic execution
- Each node can decide whether to use this feature or not.