Mastering the Final Boss in Blockchain Scalability: State Growth
 
Fuel Labs
February 2nd, 2024
 

This is a repost from Nick Dodson’s blog on Jan. 30, 2024.

Introducing Native State Rehydration, State Minimizing Techniques, and a State Minimizing Transaction Model.

I’ve given several talks about state growth recently, and having seen more discussion of the topic on X, I thought I’d expand on my presentations and the approach we’re taking with Fuel.

So let’s jump straight in. One of the most significant hurdles that blockchains face is the issue of 'state growth'—a problem that, if left unchecked, could destroy the scalability and efficiency of blockchain networks. Let’s explore what state growth is, why it's a problem, and the solutions proposed to keep blockchains lean and functional as they scale.

Understanding the Processing Bottlenecks of a Blockchain

Before diving into the complexities of state growth, let's break down the three main components of a blockchain that are typically the bottlenecks for scaling network usage:

  • Execution. The work the CPU does to ensure proper syncing, validation, and future block creation. ✅ Solved: Many options address this, such as more efficient virtual machines (FuelVM, Stylus, SVM, MoveVM), parallel transaction execution (using all cores of your CPU), and better precompiles (preset functions in a VM).

  • Data (both storage and availability). The actual transaction data that drives state transitions, allows other nodes to synchronize with the blockchain network, and enables fraud or validity proving for rollups. ✅ Solved: A few options address this, like EIP-4844, sharding designs, and external data-availability layers like Celestia, EigenDA, and Avail.

  • State. This is the active stored information in a local database that ensures proper chain validation and state transitions. It typically sits in the “hot path” of blockchain processing, requires a lot of random access on disk, and incurs a lot of IO, which is typically the slowest area of processing aside from signatures and hashing. ❌ Not Solved.

Each of these components plays a crucial role in a blockchain's operation, but it's the 'state' that we're particularly interested in when discussing growth issues.


The Challenge of State Growth

State growth refers to the ever-increasing accumulation of data that must be stored in full and managed by nodes in a blockchain network. Because state is something that grows over time, it’s often dismissed as a “future problem”. However, as state growth snowballs toward the limits of what nodes can handle, operating a node becomes drastically more burdensome, and state becomes the bottleneck for scalability, one that proves fatal when it impedes broader adoption and slows innovation.

State growth leads to bloated blockchains, where slower transaction times and higher storage costs become the norm, which in turn, can limit a network's scalability and accessibility. Sound familiar? That’s because tackling state growth will be the next catalyst to supercharge rollup economics, not unlike its predecessor problem, throughput, which sparked the rollup revolution.

State size approximations of popular EVM chains.

Data is indicative and used for illustrative purposes only.

…but rollups don’t solve state growth?

Rollups allow Ethereum to open the door to “something new”. Existing solutions address the execution layer, with some modular solutions going a step further to tackle data availability. But if these new solutions don't address the core issue of state, then you’re back to battling a zero-sum game. Any blockchain designed today, rollup or not, that doesn’t have a strategy for tackling state growth will ultimately be limited by state bloat, irrespective of its execution or data environment.

Active addresses (not the same as state size, but should be loosely correlated) on Arbitrum and Optimism. Source - Etherscan.

Comparing State Designs

To illustrate the problem, let's compare how Bitcoin and Ethereum manage state:

  • Bitcoin State: Utilizes a UTXO (Unspent Transaction Output) set which is simpler and has traditionally been easier to manage, but with limited programmability.

  • Ethereum State: Includes account balances, smart contract code, and smart contract state—encompassing token balances, approvals, and more.

The Bitcoin state management model is streamlined but limited in scope. Its UTXO model tracks state as individual transaction outputs, each of which is either unspent and available to future transactions, or spent and thus archived in the blockchain history. Only the unspent outputs are live state, which makes the model relatively manageable and helps keep the state from growing uncontrollably with every transaction. However, this simplicity comes at the cost of limited programmability compared to Ethereum's Turing-complete system.
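To make the UTXO idea concrete, here is a minimal Python sketch of a UTXO set. The names (`Utxo`, `UtxoSet`, `apply_transfer`) are illustrative assumptions, not Bitcoin's actual data structures, and signatures and scripts are left out entirely. The point it tries to show is that spending removes entries from the live state, so the set only tracks what is still unspent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utxo:
    owner: str
    amount: int


class UtxoSet:
    """Toy UTXO set: the live state is only the unspent outputs."""

    def __init__(self):
        # (tx_id, output_index) -> Utxo
        self.unspent = {}

    def add(self, tx_id, index, utxo):
        self.unspent[(tx_id, index)] = utxo

    def apply_transfer(self, tx_id, inputs, outputs):
        """Spend `inputs` (keys into the set) and create `outputs`."""
        if len(set(inputs)) != len(inputs):
            raise ValueError("duplicate input")
        # A KeyError here means an input is unknown or already spent.
        spent = [self.unspent[key] for key in inputs]
        if sum(u.amount for u in spent) < sum(o.amount for o in outputs):
            raise ValueError("outputs exceed inputs")
        # Spent outputs leave the live state; only history remembers them.
        for key in inputs:
            del self.unspent[key]
        for i, out in enumerate(outputs):
            self.unspent[(tx_id, i)] = out


utxos = UtxoSet()
utxos.add("genesis", 0, Utxo("alice", 10))
utxos.apply_transfer("tx1", [("genesis", 0)], [Utxo("bob", 7), Utxo("alice", 3)])
```

After the transfer, the set holds the two new outputs and nothing else. An account model, by contrast, keeps an entry for every account and storage slot that has ever been written until something explicitly clears it.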

Contrast this with Ethereum's state model, a rich ecosystem of account balances, smart contract code, and myriad contract states, where each interaction is a thread in the ever-growing tapestry of data. This constant state evolution, while a testament to Ethereum's versatility, poses significant scalability challenges. As the state inflates with every smart contract execution and transaction, it leads to a bloated network with increased storage requirements and slower processing times, which in turn throttles innovation and user adoption.

The contrast between Bitcoin's and Ethereum's approaches to state management underscores a fundamental trade-off in blockchain design: the simplicity and efficiency of state management versus the complexity and potential of on-chain operations.


Proposed Solutions to State Growth

Several strategies have been proposed to manage state growth:

Letting State Grow

Accepting state growth in exchange for greater bandwidth usage. This is not a good option as it puts higher requirements on full nodes, which restrains the decentralization of the network.

State Rent

Charging fees for storing state data, with the trade-off of potential issues like 'tree rot' (if all of the state elements in Ethereum are in one tree and you forget some of the leaves, you corrupt some of the branching paths), among other issues.

Statelessness

Full nodes would not need to store state, relying instead on state proofs included with transactions and blocks. Essentially, this moves state away from the layer 1 chain to rollups. This is the direction Ethereum is going, but there are a lot of unanswered questions about how efficient and maintainable this will be.

Verkle trees (vitalik.eth.limo)

Un-Merkleizing the State

A technical approach to managing state data differently. Effectively, you would use full nodes to validate everything, or sample with light clients, and forget the state tree altogether.

Application-Level State Compression

Using calldata techniques to compress state data. Essentially, you are trading state for bandwidth. Higher bandwidth demands constrain the network, so the savings in state have to be weighed against the cost to infrastructure robustness and efficiency.

Example 1: Uniswap V3 staker (left side image). State must be rehydrated over bandwidth. This enables a very state-minimized design, and calldata is much cheaper than storage on Ethereum.

Example 2: Compressed NFTs (right side image). Merkleize NFT ownership data and store the root in state.

Compressed NFTs - Helios.
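As a rough illustration of the compressed-NFT idea, the Python sketch below Merkleizes a small ownership list, keeps only the 32-byte root as "state", and checks an ownership claim against it using a proof supplied by the caller. Everything here is an assumption made for illustration: the leaf encoding, the `0x00`/`0x01` hash prefixes, and duplicating the last node on odd levels are toy choices, not the scheme any particular NFT standard uses.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(token_id: int, owner: str) -> bytes:
    # Hypothetical leaf encoding: "token_id:owner", prefixed to separate leaves from nodes.
    return h(b"\x00" + f"{token_id}:{owner}".encode())


def node_hash(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)


def build_levels(leaves):
    """Return every level of the tree, from the leaves up to the root (non-empty input)."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # pad odd levels by duplicating the last node
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Collect (sibling_hash, we_are_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append((level[index ^ 1], index % 2 == 0))
        index //= 2
    return proof


def verify(root, leaf, proof):
    acc = leaf
    for sibling, we_are_left in proof:
        acc = node_hash(acc, sibling) if we_are_left else node_hash(sibling, acc)
    return acc == root


owners = ["alice", "bob", "carol"]
levels = build_levels([leaf_hash(i, o) for i, o in enumerate(owners)])
root = levels[-1][0]  # the only value that would live in contract state
carol_ok = verify(root, leaf_hash(2, "carol"), prove(levels, 2))
mallory_ok = verify(root, leaf_hash(2, "mallory"), prove(levels, 2))
```

The ownership list and the proofs travel over bandwidth; the chain only has to hold `root`. That is the state-for-bandwidth trade described above.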

And Now…Native State Rehydration.


Fuel’s State Philosophy

By leveraging the UTXO model, there are several “freebies” that you get:

  • Localized State Trees: No global state tree, only local state trees for each smart contract.

  • Native Assets: All asset transfers only touch a single state element. Native assets can be used to represent non-value state (e.g., an NFT to represent ownership). These do not need to be Merkleized, simplifying the state.

  • No Approval State: Eliminating unnecessary state changes from approve and transferFrom functions.

The UTXO model allows for all of this while retaining rich cryptographic light clients and verifiability — creating a “fast mode” for true interoperability (more on this in a future post). The main philosophy behind Fuel’s approach is: use more bandwidth and execution, while using less state. But how?

Native State Rehydration

Native state rehydration describes the methods that Fuel developers can use to dehydrate or compartmentalize the state. The state is then rehydrated over bandwidth, so it can be accessed again when needed. This contrasts with Ethereum's conventional (“use smart contracts for everything”) approach, which relies on contract state lookups.

The new approach:

  • Store root hashes / state changes only

  • Present data over bandwidth to “rehydrate” state

  • Provide state minimized techniques for the developer to leverage this.
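The cycle described in the list above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `Chain`, `register`, `apply`, and the JSON-then-SHA-256 commitment are hypothetical names and choices for illustration, not Fuel's API or encoding. The chain keeps a 32-byte commitment, the caller presents the full state with the transaction, and the chain checks it before accepting the change.

```python
import hashlib
import json


def commit(state: dict) -> bytes:
    """32-byte commitment to a state dict (canonical JSON, then SHA-256)."""
    encoded = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).digest()


class Chain:
    """Stores one 32-byte commitment per application, never the full state."""

    def __init__(self):
        self.commitments = {}

    def register(self, app_id, initial_state):
        self.commitments[app_id] = commit(initial_state)

    def apply(self, app_id, supplied_state, update):
        # Rehydrate: the caller ships the full state over bandwidth,
        # and the chain only checks it against the stored commitment.
        if commit(supplied_state) != self.commitments[app_id]:
            raise ValueError("supplied state does not match stored commitment")
        new_state = update(dict(supplied_state))
        # Dehydrate again: keep only the new commitment.
        self.commitments[app_id] = commit(new_state)
        return new_state


chain = Chain()
chain.register("counter", {"count": 0})
new_state = chain.apply("counter", {"count": 0}, lambda s: {**s, "count": s["count"] + 1})
```

Someone still has to hold the full state off-chain and resend it with each transaction. That is the extra bandwidth and execution this approach accepts in exchange for less on-chain state.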

State Minimized Techniques

A focus on bandwidth and execution over state storage. Fuel gives the developer many ways to do things other than just smart contract storage:

  • Scripts: Ephemeral logic is included in transactions, not stored in state. Unlike EVM transactions which can call a contract directly (but can only call a single contract), Fuel transactions execute a script, which may call zero or more contracts.

  • Predicates: Lightweight, stateless contracts. A predicate is a new, pure mechanism for authorizing transactions. A predicate can only access the data in a transaction; it cannot view the current chain state. Predicates can be used, among other things, to enable native (stateless) account abstraction.

Learn more about predicates in this post by Ryan Sproule: A predicate is not a smart contract, but it still allows custom authentication logic for spending coins. This means that predicates can be spent without the need for a private key, unlike any EVM transaction. In practice, this means users can construct predicates that can be spent fully permissionlessly. When combined with the Fuel concept of scripts, the user experience for interacting with smart contracts becomes supercharged.
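One way to picture a predicate is as a pure function from a transaction to true or false. The Python sketch below is a conceptual model only: real Fuel predicates are written in Sway and compiled to FuelVM bytecode, and the `Output`, `Transaction`, and `make_swap_predicate` names are hypothetical. It shows a coin that anyone may spend, with no private key involved, as long as the spending transaction pays the maker enough of the requested asset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    recipient: str
    asset: str
    amount: int


@dataclass(frozen=True)
class Transaction:
    outputs: tuple


def make_swap_predicate(maker: str, asset: str, min_amount: int):
    """Build a predicate that authorizes spending if the maker gets paid."""

    def predicate(tx: Transaction) -> bool:
        # Only transaction data is inspected; no chain state is read.
        paid = sum(
            o.amount for o in tx.outputs if o.recipient == maker and o.asset == asset
        )
        return paid >= min_amount

    return predicate


order = make_swap_predicate("alice", "USDC", 100)
fills = order(Transaction((Output("alice", "USDC", 100),)))
underpays = order(Transaction((Output("alice", "USDC", 99),)))
```

Because the check depends only on the transaction itself, validating it needs no state lookups, which is what makes this style of authorization stateless.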

State-Minimized Transaction Model

Combining state minimized techniques with the UTXO model allows us to create a new Flexible Transaction Model. This allows more options to form multi-party complex transactions that don't require smart contracts for accessing state.

The Fuel UTXO based transaction model.

What would this look like in practice? Example:

Smart-contract wallets (with only one 32 byte state element)

  • Contract state is stored in a single root hash in a UTXO

  • State is rehydrated over bandwidth when needed

  • UTXOs ensure light client verifiability without global merkle tree

  • Only requires one IO read

  • State can be changed when state UTXO is spent

  • No loss of smart contract wallet functionality vs. Ethereum

  • Bandwidth and execution are prioritized over state

  • All done at the native level (predicates)
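The wallet example above might look roughly like the Python sketch below. It is a toy model under several assumptions: `Ledger`, `spend`, and the JSON-based `state_root` are hypothetical, signers are plain names because signature verification is left out, and nothing here reflects Fuel's actual transaction format. What it tries to capture is that the wallet's only on-chain footprint is one 32-byte value in a UTXO, and that changing the wallet's state means spending that UTXO and creating a new one.

```python
import hashlib
import json


def state_root(config: dict) -> bytes:
    """32-byte commitment to the wallet configuration (toy encoding)."""
    encoded = json.dumps(config, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).digest()


class Ledger:
    def __init__(self):
        # utxo_id -> 32-byte state root; the wallet's only stored state
        self.utxos = {}

    def create(self, utxo_id, config):
        self.utxos[utxo_id] = state_root(config)

    def spend(self, utxo_id, new_utxo_id, supplied_config, signers, new_config):
        root = self.utxos.get(utxo_id)  # the single state read
        if root is None:
            raise KeyError("unknown or already spent UTXO")
        # The full configuration arrives with the transaction and is checked here.
        if state_root(supplied_config) != root:
            raise ValueError("supplied config does not match state root")
        # Signature checks are elided; signers are assumed to be authenticated names.
        approvals = set(signers) & set(supplied_config["owners"])
        if len(approvals) < supplied_config["threshold"]:
            raise PermissionError("not enough owner approvals")
        # Spending the state UTXO is what changes the wallet's state.
        del self.utxos[utxo_id]
        self.utxos[new_utxo_id] = state_root(new_config)


old = {"owners": ["alice", "bob", "carol"], "threshold": 2}
new = {"owners": ["alice", "bob", "carol"], "threshold": 3}
ledger = Ledger()
ledger.create("u0", old)
ledger.spend("u0", "u1", old, ["alice", "bob"], new)
```

Replaying the old configuration against `u0` fails, because that UTXO no longer exists once it has been spent.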

Fuel’s architecture is designed to incorporate all of these features, along with state-minimized execution, to create a package purpose-built for Ethereum rollups. Fuel brings new capabilities to the Ethereum ecosystem while preserving security through final settlement on Ethereum.

While the battle against state growth continues, the tools and strategies, such as those from Fuel, offer hope for a scalable and efficient future. As the proverb goes, "Necessity is the mother of invention," and in the world of blockchain, the necessity to conquer state growth has indeed led to some zero-to-one solutions.