As Ethereum network activity continues to trend higher, the network is becoming increasingly overwhelmed with transaction volume. Due to the rising demand for transactions combined with the limited number of state updates (for the sake of simplicity, we can think of this as changes to account balances on chain) that can be made in a given time span, the fees necessary to interact with the Ethereum base layer have become prohibitively expensive for many users. The average transaction fees on Ethereum since August have ranged from $15 to $55 per transaction, as illustrated in the following chart from Messari Research. While these averages are likely skewed to the upside by users willing to pay a premium to mint NFTs (an action which at times requires gas fees orders of magnitude higher than those shown on the chart), it remains the case that for Ethereum to achieve mass adoption, users will need a cheaper way to interact with the network.
While Ethereum is working towards increasing the throughput of the base layer via a sharding approach with ETH 2.0, a number of solutions are being developed that allow for scalability improvements by handling transactions and computation off-chain. Layer 2s are a subset of these solutions characterized by the fact that they derive their security from the blockchain itself rather than from a separate consensus mechanism. As is apparent in the following chart from L2Beat, Ethereum users have been moving funds to L2 to take advantage of cheaper gas, liquidity mining incentives, and potential token airdrops.
Due to Ethereum’s high gas fees, alternative blockchains that offer lower fees have become increasingly popular. However, there are compromises required to provide these scalability advantages on a single chain. The “blockchain trilemma”, a term coined by Vitalik Buterin, describes the reality that blockchains must make tradeoffs between scalability, security, and decentralization. In general, these other blockchains (Solana, for example) provide higher throughput by increasing the amount of state-update space in each block and/or generating blocks faster than Ethereum. By increasing the number of transactions that the blockchain can handle in a given time frame, these networks effectively increase the supply of state-update space which results in a lower cost to transact compared to Ethereum.
However, validating these larger, faster blocks requires much more powerful and sophisticated hardware than validating on Ethereum does. For example, Solana validators require a machine with a bare minimum of 128 gigabytes of RAM, significantly more than the vast majority of consumer hardware provides. As a result, far fewer people have the hardware necessary to validate blocks, which limits the decentralization of the network. While it is significantly more expensive to become an Ethereum validator than a Solana validator today due to the 32 ETH requirement (roughly $130k at the time of writing), Ethereum has been around since 2015, so the cost to acquire sufficient Ether was far lower for early users than it is today. Furthermore, because Ethereum’s hardware requirements are minimal, most of the cost to validate is an investment in the network itself, a much more attractive expenditure than sinking funds into non-appreciating hardware. As a result of Ethereum’s longer history and the minimal sunk costs required to validate, it currently has ~275,000 validators compared to Solana’s ~1350. While it would be ideal for both networks to make validation more accessible to the average user, the decentralization of the nodes is a far more pressing concern. Most users will not choose to validate, but every user relies on validation being sufficiently decentralized to protect the integrity of their transactions.
Some view the sacrifices made by alternative L1s as contrary to the decentralized ethos of public blockchains. The fewer validators there are securing the network, the more feasible it is for a single bad actor to gain majority control or for the validators to collude with malicious intent. In practice, sacrificing some amount of decentralization to avoid paying exorbitant gas fees is a compromise that many users are willing to make. Ultimately, alternative L1s provide a cost effective alternative to Ethereum, but they don’t truly scale Ethereum’s properties since they compromise on decentralization.
Monolithic vs Modular Blockchains
So how can Ethereum defy the blockchain trilemma and scale without sacrificing security or decentralization? In order to understand how Layer 2 solutions can provide genuine scalability improvements, it is important to first establish three critical roles of a public blockchain: consensus, execution, and data availability. First, a blockchain must provide a highly secure consensus mechanism which ensures that every state change made complies with the set of financial rules by which everyone wants to play (nobody can steal another person’s assets, nobody can spend assets they do not own, etc.). Second, a blockchain must perform computation to execute transactions and change the state of the blockchain. Finally, a blockchain must provide data space in each block to record new state changes in a publicly accessible way.
While blockchains today are monolithic in their design, a new concept known as a modular blockchain may provide the breakthrough needed to scale blockchains without compromising decentralization or security. As the name implies, a monolithic blockchain is designed to fulfill each of the three critical roles on a single blockchain. The need for a monolithic chain to multitask is what results in the blockchain trilemma: assuming no reasonable user would accept weak security on a blockchain, monolithic chains must either sacrifice scalability in favor of decentralization, or vice versa. A modular blockchain philosophy, by comparison, envisions handling consensus on the main layer while delegating at least one of the other two critical roles to a separate layer. Since we can maximize two of the three trilemma properties on each layer, this allows us to favor security and decentralization on the consensus layer (L1) while maximizing scalability and security on the execution layer (L2). As long as the transactions made on the execution layer are enforced and finalized by the highly decentralized consensus layer, this approach can scale Ethereum without compromising decentralization.
Early Layer 2 Solutions
Plasma and state channels are two scaling solutions that have been around for a number of years but have somewhat fallen out of favor as of late. State channels work by locking part of the blockchain via a smart contract. Two or more users will enter into this contract and deposit funds to be used in the channel. These users can then “send” funds back and forth to each other via signed transactions that could be posted on Layer 1, but are not until the channel closes. These signed transactions act as a record of the net change between the users’ accounts. Once a user wants to exit the state channel, they can submit a signed transaction to the smart contract to be posted on Layer 1 which, if accepted, closes the channel. However, before the user can exit there is a waiting period (usually 24 hours) in which another member of the state channel (or a third party representative) can challenge the validity of the signed transaction. If the user attempting to exit posted a transaction that was not the most recent transaction in the channel, the challenger will submit the most recent transaction to the smart contract, proving that the initial submission did not represent the up-to-date net change in balances. The smart contract will then act as a judge, posting the most recent signed transaction on-chain and protecting the integrity of the state channel. While this approach can be applied to smart contract logic, the actual implementation of state channels for this purpose is relatively complex due to the number of possible edge cases that must be accounted for to ensure that everything works as intended.
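The exit-and-challenge flow described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not any production channel design: the `ChannelContract` class, the `nonce` field used to order states, and the helper names are all hypothetical, and an HMAC stands in for the real on-chain signature check (a real contract would verify signatures against public keys and would never hold secret keys).

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedState:
    nonce: int         # assumption: a higher nonce means a more recent state
    balances: tuple    # e.g. (("alice", 7), ("bob", 3))
    signatures: tuple  # one signature per participant, in sorted name order


def sign(key: bytes, nonce: int, balances: tuple) -> str:
    # Stand-in for a real signature: an HMAC over the state contents.
    message = repr((nonce, balances)).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_state(keys: dict, nonce: int, balances: tuple) -> SignedState:
    sigs = tuple(sign(keys[name], nonce, balances) for name in sorted(keys))
    return SignedState(nonce, balances, sigs)


class ChannelContract:
    """Hypothetical Layer 1 contract acting as judge for one channel."""

    def __init__(self, keys: dict):
        self.keys = keys
        self.pending = None  # state submitted for exit, awaiting challenges
        self.closed = False

    def _fully_signed(self, state: SignedState) -> bool:
        expected = tuple(
            sign(self.keys[name], state.nonce, state.balances)
            for name in sorted(self.keys)
        )
        return expected == state.signatures

    def request_exit(self, state: SignedState) -> None:
        if self.closed or not self._fully_signed(state):
            raise ValueError("exit rejected")
        self.pending = state  # the waiting period starts here

    def challenge(self, newer: SignedState) -> bool:
        # A challenger wins by presenting a validly signed, more recent state.
        if self.closed or self.pending is None or not self._fully_signed(newer):
            return False
        if newer.nonce > self.pending.nonce:
            self.pending = newer
            return True
        return False

    def finalize(self) -> dict:
        # Called once the waiting period has elapsed (time is not modeled).
        if self.pending is None:
            raise ValueError("no exit requested")
        self.closed = True
        return dict(self.pending.balances)
```

In this sketch, a user who tries to exit with a stale state simply has it replaced by the challenger's newer one before the channel settles.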
Plasma works by using Merkle trees to create non-custodial child chains that are tethered to Layer 1, which provides very high throughput and significantly lower gas fees than Ethereum L1. Unlike rollups, plasma does not support general smart contract execution, and transaction data is not made available on-chain. Furthermore, in order to withdraw to L1, users must wait 7-14 days so that others have time to challenge potentially fraudulent transactions.
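To make the role of Merkle trees concrete, the sketch below shows how a single root hash can commit to many child-chain transactions, and how one transaction can later be proven to belong to that commitment. It is an illustrative sketch only: SHA-256 stands in for Ethereum's Keccak-256, the function names are hypothetical, odd levels are padded by duplicating the last node (one common convention, not necessarily what any plasma implementation uses), and the input list is assumed to be non-empty.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list) -> list:
    # Hash adjacent pairs to build the next level up the tree.
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by duplicating the last node
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list, index: int) -> list:
    """Return the sibling hashes needed to rebuild the root from one leaf."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1  # the other node in this pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

Only the root needs to be posted to Layer 1; a user exiting the child chain supplies the leaf and its short proof, which is why the full transaction data never has to appear on-chain.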
The primary drawback of plasma and state channels is that neither effectively handles smart contract execution. While it is possible to do so with state channels, it is far more complicated to implement than with recent alternatives, and plasma sacrifices this capability altogether. With regard to gas fees, state channels effectively amortize fees over the number of transactions that take place in the channel. Since users must interact with an L1 smart contract at least twice (sometimes more in the event of a fraud attempt) in order to use the channel, an extremely high number of transactions would need to take place in order to offer any cost savings over rollups. While plasma solutions can provide lower user gas fees than some rollup options, these cost savings are also amortized and thus to some extent reliant on plasma transaction volume. There are also already L2 solutions available (ZK rollups) that provide very inexpensive payment processing without some of plasma’s shortcomings like the long waiting period to withdraw. Soon these solutions will also be able to execute smart contracts and, for users who don’t mind off-chain data availability, provide near gasless transactions and smart contract interactions.
Today, rollups are widely considered to be the future of Ethereum scaling due to their myriad of benefits when compared to state channels and plasma, a sentiment reflected in the following chart from Etherscan.
Rollups are by far the most popular Layer 2 solution at the moment. Rollups improve scalability by executing transactions off-chain, batching and compressing the transaction data, and then posting it on Layer 1 through a smart contract. By moving the execution of the transaction off-chain, fees incurred by users decrease and transaction speed increases, all while maintaining a comparable level of security to that of Layer 1. But how does Ethereum know if state updates that are posted to Layer 1 are valid? The L1 smart contract deployed by the rollup not only processes deposits, withdrawals, and transactions, but also verifies proofs related to the validity of each transaction block. The method by which each rollup verifies the validity of transaction data is what differentiates the two main variations: Optimistic rollups and ZK rollups. As illustrated in the chart below from Vitalik Buterin’s An Incomplete Guide to Rollups, both types of rollup provide significant scalability improvements.
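The "batch, compress, post" step can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: real rollups use purpose-built binary encodings rather than JSON and zlib, and the field names and the `build_batch` and `recover_transactions` helpers are hypothetical.

```python
import hashlib
import json
import zlib


def build_batch(transactions: list, pre_state_root: str, post_state_root: str) -> dict:
    """Package many off-chain transactions into one Layer 1 submission."""
    raw = json.dumps(transactions, separators=(",", ":")).encode()
    return {
        "pre_state_root": pre_state_root,    # state before the batch
        "post_state_root": post_state_root,  # state the operator claims afterwards
        "tx_data": zlib.compress(raw, 9),    # compressed data posted on-chain
        "tx_data_hash": hashlib.sha256(raw).hexdigest(),
        "raw_size": len(raw),
    }


def recover_transactions(batch: dict) -> list:
    """Anyone can rebuild the transaction list from the posted data."""
    raw = zlib.decompress(batch["tx_data"])
    if hashlib.sha256(raw).hexdigest() != batch["tx_data_hash"]:
        raise ValueError("posted data does not match its commitment")
    return json.loads(raw)
```

Because the transaction data itself is posted, any observer can recompute the state and check the operator's claimed `post_state_root`, which is the property both rollup variants build their proofs on.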
Optimistic rollups are “optimistic” in that they assume all rollup transactions are valid by default and rely on game theory to enforce transaction validity. In the event that a rollup validator broadcasts a transaction that is successfully identified as being fraudulent, their staked assets (collateral posted by validators to back up their promise of good behavior) will be slashed. These fraudulent transactions are identified by users who submit fraud proofs to formally challenge the validity of the transaction in question. The rollup then executes a fraud proof by running the transaction using the prior updated state to determine its validity. If the prover is correct in their suspicion, they are usually awarded with some of the misbehaving validator’s slashed funds. There are numerous benefits to this approach. One is that Optimistic rollups are naturally EVM compatible, meaning that anything that can be done on Layer 1 can also be done using an Optimistic rollup. ZK rollups (at least currently) do not share this property, and are instead limited to specific applications rather than being able to execute any arbitrary smart contract. Additionally, Optimistic rollups are much easier to build and deploy than ZK rollups. However, a significant drawback is that, by nature of the fraud-proof system, users usually must wait for a significant amount of time (usually one to two weeks) before withdrawing their funds. Since Optimistic rollups assume all transactions are valid, they must allocate a challenge period between when transaction data is posted to Layer 1 and when funds are eligible for withdrawal from the rollup. This provides adequate time for users to detect potentially fraudulent transactions. It is worth noting, however, that liquidity providers on third party bridging services like Hop exchange can offer users near-instant withdrawals from Optimistic rollups for a fee.
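The fraud-proof check described above amounts to re-running the disputed transaction against the prior state and comparing the result with what the validator claimed. The sketch below is a toy model under stated assumptions: state is a plain dictionary of balances, a hash of that dictionary stands in for a real state root, and the function names and the 50% challenger reward are hypothetical choices, not parameters of any live rollup.

```python
import hashlib
import json


def state_root(state: dict) -> str:
    """Hash of the full state; a stand-in for a real Merkle state root."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def apply_tx(state: dict, tx: tuple) -> dict:
    """Execute a simple transfer and return the new state."""
    sender, recipient, amount = tx
    if amount <= 0 or state.get(sender, 0) < amount:
        raise ValueError("invalid transaction")
    new_state = dict(state)
    new_state[sender] -= amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def resolve_challenge(prior_state: dict, tx: tuple, claimed_root: str,
                      validator_stake: int) -> dict:
    """Re-execute the disputed transaction and slash the validator on a mismatch."""
    try:
        correct_root = state_root(apply_tx(prior_state, tx))
    except ValueError:
        correct_root = None  # the transaction should never have been included
    if correct_root == claimed_root:
        return {"fraud": False, "slashed": 0, "challenger_reward": 0}
    return {
        "fraud": True,
        "slashed": validator_stake,
        "challenger_reward": validator_stake // 2,  # assumed reward share
    }
```

An honest claim survives the challenge untouched, while a claim that credits the recipient with more than was sent produces a different root and costs the validator its stake.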
Optimism vs Arbitrum
Two of the most popular Optimistic rollups are Optimism and Arbitrum. The primary difference between the two is the fraud proof implementation. Optimism uses a single-round fraud proof, which must re-execute all of the transactions in the questionable rollup transaction batch on mainnet. Arbitrum, on the other hand, implements a multi-round, interactive fraud-proof approach. This means that instead of rerunning an entire batch’s worth of rollup transactions on mainnet, the two parties in a dispute can narrow down their disagreement to as few transactions as possible. These specific transactions are then executed on mainnet to check for validity, which reduces cost significantly when compared to checking the entire batch of transactions, many of which are likely not of concern to the user who posted the fraud proof. It is worth noting that Optimism is looking to implement a similar interactive proof model for their version 2.
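The narrowing-down process in an interactive fraud proof is essentially a binary search over the batch. The sketch below is a simplified illustration, not Arbitrum's actual protocol: it assumes each party can list the state root it believes holds after every transaction, that both agree on the starting state, and that they disagree on the final one. The function name is hypothetical.

```python
def find_disputed_step(asserter_roots: list, challenger_roots: list) -> tuple:
    """Bisect two lists of per-transaction state roots.

    roots[i] is the state root after i transactions, so both lists have
    length n + 1 for a batch of n transactions. Returns (step, rounds):
    the parties agree on the state before transaction number `step`
    (1-based) and disagree on the state after it.
    """
    if len(asserter_roots) != len(challenger_roots):
        raise ValueError("both parties must cover the same batch")
    if asserter_roots[0] != challenger_roots[0]:
        raise ValueError("parties must agree on the starting state")
    if asserter_roots[-1] == challenger_roots[-1]:
        raise ValueError("nothing to dispute")

    lo, hi = 0, len(asserter_roots) - 1  # invariant: agree at lo, disagree at hi
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if asserter_roots[mid] == challenger_roots[mid]:
            lo = mid  # the disagreement starts later
        else:
            hi = mid  # the disagreement starts at or before mid
        rounds += 1
    return hi, rounds
```

Only the single transaction identified at the end needs to be executed on mainnet, and the number of rounds grows with the logarithm of the batch size rather than with the batch size itself.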
Optimism Forks: Boba and Metis
Two newer Optimistic rollups are Boba Network and Metis Andromeda. Each is considered an Optimism “Fork”, a rollup based on Optimism’s codebase but with changes made to provide different features to users. Some of Boba Network’s unique features include hybrid compute, which allows smart contracts to trigger off-chain computation on a service like AWS and then import the result of that computation back into the smart contract. This enables dapp developers to use far more complex algorithms and implement them in a wider variety of programming languages than would be possible in an EVM-constrained environment. Boba also integrates the liquidity provider withdrawal model (like the one used by Hop exchange) into their native bridge, allowing users to pay a fee to exit the rollup before the end of the transaction challenge period.
Metis provides features focused on the development and expansion of the metaverse. One advantage Metis provides over other L2s is a framework for DACs (decentralized autonomous companies), a variant of DAOs (decentralized autonomous organizations). While DAOs have been remarkably popular as of late, the DAO framework is missing important features that are necessary for companies to operate in a decentralized manner. In a DAO, a user’s voting weight is determined solely by the amount of DAO tokens they own. This creates issues since voting power is determined in large part by a DAO member’s financial wherewithal without taking into account their competence or history in the organization. There is also no easy way for DAOs to encode the subdivisions that exist in most companies. Metis’ DAC framework provides solutions to these issues, allowing decentralized companies to track users’ metrics, assign them roles and corresponding privileges, and implement horizontal subdivisions like human resources and payroll departments. Metis’ stated goal with this framework is to make it easier for web3 companies, as well as web2 companies and traditional organizations, to implement a decentralized structure while maintaining the necessary corporate stratification for the company to run efficiently.
Another notable feature of Metis is that it provides blockchain middleware, essentially tooling that makes life easier for developers. Metis’ middleware solution is known as Metis Polis and allows developers to abstract away some of the low-level details associated with building on blockchain, speeding up development time and lowering the barrier to entry for those with web2 experience. The services provided by Polis include a smart contract domain service, traditional authentication methods, a smart contract API service, and a built-in block explorer. The smart contract domain service works like ENS but for smart contracts, allowing developers to name a smart contract and address it by name rather than by its raw hexadecimal string. Traditional authentication methods allow users to interact with the protocol via passwords and authentication tokens rather than by signing with private keys. The smart contract API service provides developers with the means to authenticate and send transactions to any smart contract using HTTP, which helps to onboard web2 devs who are comfortable with REST APIs. Finally, the built-in block explorer allows application developers to see all transactions made by their application in one place (even if the transactions are executed through multiple different smart contracts), without the need to use a public block explorer like Etherscan.
ZK rollups work similarly to Optimistic rollups in that the execution of transactions is handled on Layer 2. The critical difference is that a ZK rollup broadcasts a validity proof (known as a zero knowledge proof, hence the “ZK” abbreviation) along with the batched transaction data to the Layer 1 smart contract. The validity proof is easily verifiable by other validators and demonstrates that every transaction in the batch is valid. Because every batch of state updates comes with a validity proof, there is no need for users to monitor the network and post fraud proofs. Therefore, unlike on Optimistic rollups, there is no need for such a long waiting period before withdrawing funds from a ZK rollup. Another advantage of ZK rollups is that the more transactions that are included in a transaction batch, the lower the amortized gas cost is per transaction since the size of a zero knowledge proof grows very slowly (and in some cases not at all) as the number of transactions included in the proof increases.
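The amortization argument can be made concrete with a little arithmetic. The sketch below assumes the simplest case mentioned above, where the cost of verifying a proof on Layer 1 does not grow with the batch size, so each transaction pays an equal share of that fixed cost plus its own data cost. The function name and the gas figures in the usage note are purely illustrative and are not measurements from any rollup.

```python
def amortized_gas_per_tx(num_txs: int, proof_verification_gas: float,
                         data_gas_per_tx: float) -> float:
    """Per-transaction gas when one fixed-cost proof covers the whole batch."""
    if num_txs <= 0:
        raise ValueError("a batch needs at least one transaction")
    return proof_verification_gas / num_txs + data_gas_per_tx
```

With a hypothetical 500,000 gas to verify a proof and 300 gas of data per transaction, a batch of 100 transactions works out to 5,300 gas each, while a batch of 1,000 works out to 800 gas each. As the batch grows, the per-transaction cost approaches the data cost alone.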
There are, of course, drawbacks to ZK rollups as well. One significant issue is that zero knowledge proofs often require a significant computational overhead, multiple orders of magnitude higher than running the computation directly. Zero knowledge proving systems may require specialized hardware to maximize efficiency, similar to the ASICs required to mine Bitcoin. This barrier to entry has the potential to increase centralization due to the smaller number of entities that have, or can attain, this hardware. Another significant issue is that no ZK rollups currently live on mainnet are EVM compatible, so they are unable to execute arbitrary smart contracts and instead are relegated to specific applications. With all this being said, it is important to note that ZK rollup tech is still in its infancy and there are a number of projects making rapid progress towards eliminating the current drawbacks.
StarkWare is a major ZK rollup provider that has processed over 200 billion dollars in volume, more than any other rollup available (including Optimistic rollups) by a significant margin. StarkWare has provided applications like dYdX, a decentralized derivatives trading platform; Sorare, an NFT-based fantasy soccer game; Immutable, a gasless NFT marketplace; and DeversiFi, a portfolio management and trading platform, with custom-tailored, permissioned, ZK-proof-based scaling solutions through their StarkEx platform. These applications can choose between two different approaches with different security and cost tradeoffs. One option is to use a true ZK rollup, which improves scalability substantially and provides on-chain data availability. On-chain data availability results in a higher level of security since anyone with access to the internet can verify the transaction data as it is posted to the Ethereum blockchain, but posting this data costs roughly 1% of a Layer 1 transaction.
Applications that are willing to sacrifice some amount of security and accept some centralization in favor of further reducing fees can opt to use StarkEx’s Validium system. This solution works just like a ZK rollup, except that transaction data is stored off-chain and managed by a Data Availability Committee, a group of significant entities in the crypto industry. These committee members sign every batch of transactions posted on-chain to indicate their possession of the underlying data. In theory, the fact that each of the entities in the committee is a well known member of the blockchain industry in areas with established legal jurisdictions will incentivize them against any bad behavior enabled by their position, like freezing user assets. However, although it is unlikely blockchain companies would ever want to freeze user assets, the very thing that makes them trustworthy (easy identifiability and subject to strong legal jurisdictions) also makes them vulnerable to state coercion. For example, operators could be legally obligated to implement KYC regulations and freeze the accounts of users who do not comply.
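The committee sign-off can be sketched as a simple quorum check. This is a hedged illustration rather than StarkEx's actual mechanism: the quorum threshold, the function names, and the member names in the usage note are assumptions, and an HMAC stands in for the real signature scheme (a real verifier would check signatures against public keys).

```python
import hashlib
import hmac


def attest(member_key: bytes, batch_hash: bytes) -> bytes:
    # Stand-in for a committee member's signature over the batch hash.
    return hmac.new(member_key, batch_hash, hashlib.sha256).digest()


def batch_accepted(batch_data: bytes, attestations: dict,
                   committee_keys: dict, quorum: int) -> bool:
    """Accept a batch only if enough committee members vouch for its data."""
    batch_hash = hashlib.sha256(batch_data).digest()
    valid = sum(
        1
        for name, signature in attestations.items()
        if name in committee_keys
        and hmac.compare_digest(signature, attest(committee_keys[name], batch_hash))
    )
    return valid >= quorum
```

For example, with a three-member committee (`m1`, `m2`, `m3`) and an assumed quorum of two, a batch signed by `m1` and `m2` is accepted, while a batch carrying only one valid attestation is rejected. The check shows what the committee guarantees (that members claim to hold the data) and also what it cannot: nothing on-chain forces them to hand that data over.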
StarkWare also recently launched the alpha version of their StarkNet zkRollup solution on mainnet, which enables general computation on a permissionless ZK rollup. This is possible through the use of their Cairo programming language, which is Turing complete and allows for the creation of zero knowledge proofs of any arbitrary computation. Developers who want to use this system will need to use Cairo rather than Solidity (the native programming language of Ethereum) in the near term, but a Solidity-to-Cairo transpiler is in the works at StarkWare. Furthermore, StarkWare has plans to roll out a system known as Volition in the near future, which will enable individual users (rather than just applications like dYdX) to choose whether they use a ZK rollup or Validium approach depending on their data-availability preferences.
Another promising ZK rollup project is zkSync, created by Matter Labs. zkSync has been live on mainnet since the summer of 2020 and is currently usable strictly for payment processing. However, Matter Labs is currently developing an EVM-compatible ZK rollup, which would allow arbitrary smart contracts written in Solidity to be executed with all the benefits of ZK rollups. While this is not available on mainnet yet, the testnet version is live along with a fully ported Uniswap V2 dapp. Matter Labs, like StarkWare, is pursuing a two-solution approach for zkSync 2.0 that is similar to StarkWare’s Volition system. zkSync 2.0 will make use of their general purpose ZK rollup to provide users with on-chain data availability. For users willing to sacrifice L1 data availability for lower fees, zkSync 2.0 provides a solution for maximal scalability known as zkPorter. This is comparable to the Validium system developed by StarkWare, but rather than relying on a small group of trusted third parties, zkPorter data will be made available on a decentralized proof of stake network. Furthermore, both solutions under zkSync 2.0 will be fully interoperable, meaning liquidity will not be fragmented between ZK rollup and zkPorter users.
Polygon (formerly Matic Network) was one of the first projects to focus on scaling Ethereum and has recently made a significant foray into the ZK rollup space. Polygon aims to provide a diverse repertoire of scalability solutions to their users including the Polygon POS sidechain (or “commit chain”), a data availability layer known as Polygon Avail, and a host of ZK rollup solutions, each with their own strengths. Polygon indicates on their website that they plan to offer an Optimistic rollup solution for Ethereum as well, but it seems that they will be primarily focused on the ZK side of the rollup family moving forward. Polygon recently hosted a “zk day” event and announced the addition of Polygon Zero (formerly Mir Protocol) to their family of scaling solutions. Polygon Zero is a ZK rollup solution that will implement recursive zero knowledge proofs. The recursive property of these proofs enables much faster proof generation. During the “zk day” livestream, the Polygon Zero team demonstrated the generation of a proof in 35 milliseconds on a MacBook Air. This is significant because it addresses one of the major issues around ZK rollups, namely the potential for centralization of proof creators due to the currently high hardware requirements to create zero knowledge proofs. If proofs can be generated in under a second on standard consumer laptops, the vast majority of users interacting with the network will have the opportunity to participate in proof production.
Polygon Nightfall is a hybrid solution currently in development that uses elements of both ZK rollups and Optimistic rollups to create a private transfer and payment platform on Ethereum. Users deposit their assets to a shield contract which uses ZK proof technology to provide privacy on Layer 1. This shield contract then uses Optimistic rollup assumptions (a 7-day withdrawal time, with the validity of transactions enforced by fraud proofs rather than validity proofs) to provide security.
Polygon Miden and Polygon Hermez are the other two ZK rollups in the Polygon family, with Miden using STARK-based proofs and Hermez utilizing SNARK-based proofs. Hermez is currently live but only usable for payment processing, while Miden is in development and will support arbitrary smart contract computation upon release. STARKs and SNARKs are the proof styles used by StarkWare and zkSync, respectively. To avoid going too much into the technical details of each, it suffices to understand that each proving system has its own strengths and weaknesses.
Adoption: Bridging, Tokens, and TVL
While Layer 2 solutions have the potential to be a game-changer for widespread Ethereum adoption, this adoption will only occur if users are able to easily move their funds over from mainnet. Unlike on Ethereum alternatives like Solana, Ethereum users who wish to interact with the network in an inexpensive manner need to “bridge” their funds to Layer 2, which requires a rather expensive on-chain transaction. One solution to this issue is for centralized exchanges to allow users to withdraw ETH directly to their Layer 2 of choice. This way, users can avoid the expensive Layer 1 bridge fees and interact with Ethereum solely via Layer 2s from the time they buy their coins to the time they sell them. Both Crypto.com and Binance now offer withdrawals to Arbitrum, and there will certainly be more options (both in terms of exchanges offering L2 withdrawals and the L2s available on each exchange) in the future as Layer 2 ecosystems become more developed and attract more capital. There are a number of bridging protocols that allow users to move assets between L1 and a number of different L2s. Since L2 to L2 bridging is inexpensive, as long as users can avoid touching Layer 1 they will have the ability to move back and forth between L2s in a cost effective manner. Furthermore, bridges that connect alternative L1s and Ethereum L2s have the potential to bring users back to Ethereum who previously left (or avoided it altogether) due to the high fees.
Another way Layer 2 projects are looking to drive adoption is by releasing their own token and providing liquidity mining incentives to encourage users to move assets to their L2 in return for very high yield. Two examples of how effective this approach can be are Boba and Metis, the only two general purpose rollups to have released a native token. After the release of their tokens and high-APY liquidity incentives on their native DEXs, the TVLs of both Boba and Metis have shot up dramatically, quickly surpassing well-established, non-tokenized L2 solutions as illustrated by the L2Beat TVL rankings in the image below. A number of use-case-specific L2 applications like Loopring, ImmutableX, and DeversiFi also have native utility tokens.
Unsurprisingly, many of the L2 solutions which have yet to release a token have hinted at or explicitly stated that they will also launch one. zkSync has confirmed that they will have a token; StarkWare’s official position is “no comment” according to their discord; Optimism has declined to give any information on whether they will launch a token; while Arbitrum has explicitly stated that “there will be no Airdrop, ICO, or presale.” Much of this hesitance to commit one way or another is likely the result of the recent surge of “airdrop hunters” (users who use protocols solely for the purpose of meeting the requirements to be given free tokens). Too many airdrop hunters make it more difficult for protocols to identify and give tokens to their genuine early adopters. That said, the hype around a potential token launch certainly seems to be an effective marketing strategy. As we can see from the Dune Analytics dashboard below, the number of unique addresses on zkSync has gone parabolic ever since word spread about their upcoming token. Only time will tell if all of these new users will stay and support the zkSync ecosystem or leave after the token is launched.
The Ethereum Foundation is very focused on making Ethereum a rollup-friendly blockchain. This choice is apparent in a recent post from Vitalik regarding EIP 4488 where he says in the opening lines: “Rollups are in the short and medium term, and possibly the long term, the only trustless scaling solution for Ethereum… there is greater urgency in doing anything required to help facilitate an ecosystem-wide move to rollups.” It seems clear at this point that rollups will be the cornerstone of Ethereum scaling for the foreseeable future. While there may be a place for both plasma and state channels for certain use cases, it is likely that most applications will be best served by either ZK rollup or Optimistic rollup solutions in the near term. In the medium to long term, ZK rollups that can execute arbitrary smart contract logic will likely be the ideal solution for most applications as they require fewer trust assumptions than Optimistic rollups and provide significant cost amortization benefits as they increase in scale.
By Lucas Streckenbach
January 1 2021
Disclaimer: This report is for educational purposes and should not be construed as investment advice. Additionally, the author may hold any of the assets mentioned.