An Ethereum 2.0 Sharding Design Everyone Can Understand

When we buy breakfast at 7-11, if there is only one cashier, we have to wait in a long line to check out; if there are two cashiers, it will be twice as fast; if there are four cashiers, maybe there is no need to queue. This is the basic logic of sharding, which is to divide the work of one person among multiple people to improve efficiency.

From the perspective of Ethereum's distributed ledger: before sharding, there was only one ledger, the main chain, which could process approximately 12 to 45 transactions per second. Whenever transaction volume exceeded this, transactions had to queue and the network became congested. Sharding turns one ledger into 64 ledgers that process transactions simultaneously, which is equivalent to a 7-11 with 64 cash registers.

The logic of sharding is simple, so why is it so difficult to implement? Because splitting one ledger into 64 creates many new problems, and sharding technology exists to solve them. This article works through these problems to explain what Ethereum 2.0 sharding is all about.

01 How to shard

1. Assign transactions to shards

A shard contains transactions and validators who package transactions into blocks. The first step to complete sharding is to determine how to assign transactions and validators to a shard. Let’s first look at assigning transactions.

Let's use the story of three villages to understand: there is a fishing village, a hunter village, and a farmer village. There are often transactions within and between villages, but there is no currency, so everyone keeps accounts. In the past, one account book was used to record the accounts of the three villages, which was a bit slow. Now it has been changed to three account books, so which account book should be used to record which accounts?

One method is to set out three ledgers and record each incoming transaction in whichever ledger has no queue. The problem is that every ledger must then hold everyone's account information; otherwise I might join your queue only to find that your ledger has no record of my account.

Because of this, a major problem with this sharding approach is that it cannot reduce the amount of data stored on a single ledger, and this storage requirement is a high threshold for nodes that want to participate in bookkeeping; this approach also needs to solve the double-spending problem, because a person can spend the same money in different shards at the same time.

Another method is for the fishing village, the hunter village, and the farmer village to have one account book each. Each account book contains only the account information of its own village and records only transactions within that village. In this way, the three account books can be written at the same time, with high efficiency and low storage requirements. This is exactly the sharding method adopted by Ethereum: state sharding, where each shard stores the account state of its own shard and nothing else. In the actual implementation, users choose which shard to join, rather than being grouped by "natural village".
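The one-ledger-per-village idea can be sketched in a few lines of Python. This is only an illustration: the `Shard` class, the `transfer_within` helper, and the account names are hypothetical and are not Ethereum's actual data structures. The point is that each ledger holds only its own accounts and must reject anything it has no record of.

```python
class Shard:
    """One ledger that stores only the balances of its own accounts."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)

    def transfer_within(self, sender, receiver, amount):
        """Apply a transfer only if both accounts live on this shard."""
        if sender not in self.balances or receiver not in self.balances:
            return False  # cross-shard: this ledger lacks one of the accounts
        if self.balances[sender] < amount:
            return False  # insufficient funds
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        return True


fishing = Shard("fishing", {"A": 120, "C": 30})
hunter = Shard("hunter", {"B": 10})

ok_local = fishing.transfer_within("A", "C", 20)  # both in the fishing village
ok_cross = fishing.transfer_within("A", "B", 20)  # B is not on this ledger
```

The local transfer succeeds, while the transfer to B is refused because the fishing village's ledger knows nothing about B.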

The biggest problem with state sharding is: what if people in the fishing village want to trade with people in the hunter village? The fishing village's ledger does not have the hunter village's accounts, and the hunter village's ledger does not have the fishing village's accounts. This is in fact the biggest challenge facing sharding technology: cross-shard communication. Only when this problem is fully solved can Ethereum 2.0 be put to use. The second part of this article discusses some solutions.

2. Assign validators to shards

After arranging transactions to different shards, the next problem to be solved is how to assign account keepers to a certain shard, that is, assign validators.

Ethereum has 64 shards, each with 128 validators. If a shard's validators were fixed or predictable, an attacker could easily take control of the shard by bribing 2/3 of the 128. What can be done? Ethereum's solution is to select each shard's validators at random from the full validator set and to replace them every 6.4 minutes (the length of an epoch). This leaves an attacker with a probability of less than one in a trillion of controlling 2/3 of the validators in a shard (see Reference 1 for the reasoning).
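The "one in a trillion" figure can be checked with a rough calculation. The sketch below makes two assumptions that the article does not state: the attacker controls 1/3 of all validators, and committee seats are drawn independently (a binomial model). The function name `takeover_probability` is made up for this illustration.

```python
from math import comb


def takeover_probability(committee_size, attacker_share):
    """Probability that at least 2/3 of a randomly drawn committee
    belongs to the attacker, under a binomial model."""
    needed = (2 * committee_size + 2) // 3  # smallest count reaching 2/3
    return sum(
        comb(committee_size, k)
        * attacker_share ** k
        * (1 - attacker_share) ** (committee_size - k)
        for k in range(needed, committee_size + 1)
    )


# Assumption: the attacker holds one third of all validators.
p_128 = takeover_probability(128, 1 / 3)
print(p_128)
```

Under these assumptions the result for a 128-member committee is below one in a trillion, which matches the claim above.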

One of the main tasks of the beacon chain is to assign validators to the shard chains, and the heart of that task is randomness. First, randomness is important: if validators cannot be assigned randomly, the security of the ledger cannot be guaranteed. Second, randomness is hard: achieving it on a blockchain is extremely difficult, and arguably no random-number algorithm has yet been both rigorously proven and implemented at engineering scale.

Ethereum's solution is to use RANDAO+VDF to provide random numbers to achieve randomness. It is easy to understand if we break RANDAO down into RAN (random) and DAO. It means that each person in a group of people independently proposes a random number, and then all the random numbers are combined to generate the final random number used. Because it is difficult for anyone to know the numbers provided by others, it is also difficult to predict the final number of the combination.

However, the RANDAO model has a flaw, that is, the person who provides the last number has the opportunity to cheat: he knows the sum of the random numbers provided by all the previous people, and can adjust the number he provides to make the final result favorable to himself.
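The combining step and the last-mover flaw can be shown with a toy model in Python. It follows the simplified description in this article, not the beacon chain's actual RANDAO. The `combine` function, the sample numbers, and the choice of "an even result" as the attacker's goal are all illustrative assumptions.

```python
import hashlib
from itertools import count


def combine(contributions):
    """Mix all contributions into one number by XOR-ing their SHA-256 hashes."""
    mix = 0
    for value in contributions:
        digest = hashlib.sha256(value.to_bytes(32, "big")).digest()
        mix ^= int.from_bytes(digest, "big")
    return mix


honest = [11, 22, 33]  # numbers already provided by earlier participants

# The last participant sees `honest` and searches for a number that makes
# the final result even (standing in for "favorable to himself").
last = next(n for n in count() if combine(honest + [n]) % 2 == 0)
final = combine(honest + [last])
```

Because the last participant can evaluate `combine` before committing, he simply keeps trying candidates until the outcome suits him.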

To solve this problem, Ethereum introduced the VDF (Verifiable Delay Function). Its job is simple: it keeps the last person from working out the combined result of all the earlier numbers before submitting their own, so the final random number cannot be manipulated. (For a detailed introduction to RANDAO+VDF, see Reference 2.)
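The "delay" half of the idea can be illustrated with a hash chain. This is a toy, not a real VDF: a real VDF also lets anyone check the output quickly, whereas this chain can only be checked by redoing the work. The function name `slow_mix`, the seed, and the step count are assumptions made for illustration.

```python
import hashlib


def slow_mix(seed, steps):
    """Hash the seed `steps` times in a row. Each step needs the previous
    output, so the work cannot be skipped or done in parallel."""
    out = seed
    for _ in range(steps):
        out = hashlib.sha256(out).digest()
    return out


seed = b"combined RANDAO output"
result = slow_mix(seed, 100_000)
```

If the number of steps is chosen so that the computation takes longer than the window for submitting numbers, the last participant cannot learn the final result in time to adjust his contribution.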

3. Shard storage by relayers

You may have noticed that rotating validators creates a new problem: a validator is assigned to keep accounts for the fishing village, and then reassigned to the hunter village. Without all the account information, how can they keep accounts? But with all the account information, they hold a full ledger, and state sharding is defeated.

To solve this problem, Ethereum proposed an important new design: stateless client. To put it simply, the ledger of the fishing village is kept in the fishing village, and the ledger of the hunter village is kept in the hunter village. The validator does not hold the ledger, but is only responsible for running back and forth between different villages to keep accounts.

So who keeps the ledgers of the different villages? Ethereum introduces the role of relayers (state providers), who store the account state of the shards, each relayer serving only one particular shard. The relayers' work is easy to understand, but how to pay for their services and how to keep them honest are new problems. Designing these mechanisms is also a governance issue that community members should help discuss.

The actual situation of stateless clients is much more complicated than described above. The composition of a "transaction" itself differs from before sharding: it must be accompanied by witness data proving that it is valid. Put simply, in 1.0 the validator has to store the old account state itself in order to verify a new transaction; in 2.0 the transaction has to bring the old account state along and hand it to the validator for verification.

However, we cannot require every user to store all the old account state just to be able to prove their own transactions. This is where a relayer comes in: it stores the full account state of the shard and, on request, supplies the transaction's witness data to the validator on the user's behalf.
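One common way to build such witness data is a Merkle proof against a state root. The sketch below is a minimal illustration of that technique, not the Ethereum 2.0 specification. The function names, the account encoding such as `b"A:120"`, and the power-of-two leaf count are simplifying assumptions.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all levels of a Merkle tree, leaf hashes first.
    Assumes the number of leaves is a power of two."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_witness(levels, index):
    """Relayer side: collect the sibling hashes on the path from leaf to root."""
    witness = []
    for level in levels[:-1]:
        witness.append(level[index ^ 1])  # sibling at this level
        index //= 2
    return witness


def verify(root, leaf, index, witness):
    """Validator side: recompute the root from the leaf and its witness."""
    node = h(leaf)
    for sibling in witness:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


accounts = [b"A:120", b"C:30", b"D:7", b"E:55"]  # state held by the relayer
levels = build_tree(accounts)
root = levels[-1][0]                              # all the validator keeps

witness = make_witness(levels, 0)                 # proof for account A
valid = verify(root, b"A:120", 0, witness)
forged = verify(root, b"A:999", 0, witness)
```

The validator holds only `root`, yet it accepts the genuine balance and rejects the forged one. The relayer needs the whole tree, which is why storage moves to it.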

On March 11, Vitalik Buterin published an article proposing to use polynomial commitments instead of state roots, and the technique applies here. It uses zero-knowledge proofs to prove transactions: in effect, the validator is given the result of a computation over the data to check, rather than all the relevant data itself. This can greatly reduce the size of witness data and cut the associated overheads. (For a detailed introduction to Vitalik's new method, see Reference 3; for a detailed introduction to the stateless client, see Reference 4.)

At this step, the work of dividing one ledger into multiple ledgers, that is, sharding, is completed.

02 Cross-shard transactions

If people in the fishing village only trade with people in the fishing village, and people in the hunter village only trade with people in the hunter village, then each village can just keep its own account, which does not require any new technology. But what if people in the fishing village want to trade with people in the hunter village? How can different ledgers communicate with each other? This is exactly the most difficult problem faced by state sharding.

There are two ways to solve this problem: one is synchronous (tight coupling) and the other is asynchronous (loose coupling).

Suppose there is a person named A in the fishing village and a person named B in the hunter village, and A wants to give B 100 yuan. Synchronous means that when A initiates the transfer, the accountants of both the fishing village and the hunter village know about the transaction and its progress. The fishing village's accountant subtracts 100 from A in the ledger, the hunter village's accountant adds 100 to B, and the transaction is complete: the two villages generate their new blocks in step.

Asynchronous means that when A initiates the transfer, the fishing village's ledger subtracts 100 from A and generates a new block. The hunter village's accountant later receives this message in some way and, after confirming that A's money has indeed been reduced, adds 100 to B in his own ledger. The transaction is complete, but the two villages generated their new blocks at different times.

The synchronous method looks friendly, since executing a transaction looks and feels the same as on an unsharded chain, but it hides a big problem: it struggles with "continuous state changes". What does this mean?

If A only transfers 100 yuan to B, then once both villages have heard the transaction, each can easily be sure how the other will record it: the fishing village's ledger subtracts 100 from A, and the hunter village's adds 100 to B. But suppose A transfers 100 to B and then another 50 to B, while holding only 120 yuan in total. This is a continuous state change, and now it is hard for each village to be sure how the other will keep its accounts:

If every validator had to communicate with every validator on the other side, the communication overhead would increase dramatically, and agreeing on a definite result would be extremely difficult. If the communication went through the village heads of both sides, each village would first need to run a round of consensus, after which its village head would tell the other side a definite result. Besides adding overhead, this is hard to achieve because Ethereum's consensus mechanism cannot itself deliver a definite result (finality).

The asynchronous method is not troubled by continuous state changes, because its approach is to wait: once your state is settled, I take the next step. After the fishing village has finished recording A's account, the hunter village sees whether 100 or 50 was subtracted from A, and then adds 100 or 50 to B accordingly.
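The "wait, then act" idea can be sketched with receipts. This is a simplified model, not Ethereum's actual cross-shard protocol: the `Shard` class, the `outbox` list, and the method names are hypothetical. It uses the example above, where A holds 120 and tries to send 100 and then 50.

```python
class Shard:
    def __init__(self, balances):
        self.balances = dict(balances)
        self.outbox = []  # receipts for debits already recorded here

    def debit(self, sender, receiver, amount):
        """Step 1, on the sender's shard: subtract and emit a receipt."""
        if self.balances.get(sender, 0) < amount:
            return False
        self.balances[sender] -= amount
        self.outbox.append((receiver, amount))
        return True

    def credit_from(self, source):
        """Step 2, on the receiver's shard, later: credit only what the
        source shard has actually recorded."""
        while source.outbox:
            receiver, amount = source.outbox.pop(0)
            self.balances[receiver] = self.balances.get(receiver, 0) + amount


fishing = Shard({"A": 120})
hunter = Shard({"B": 0})

first = fishing.debit("A", "B", 100)  # A: 120 -> 20
second = fishing.debit("A", "B", 50)  # rejected, only 20 left
hunter.credit_from(fishing)           # the hunter village acts on settled receipts
```

The hunter village never has to guess: it credits B with 100 because that is the only debit the fishing village actually recorded.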

The problem with the asynchronous method is atomicity failure. A transaction is supposed to be atomic, executed either in full or not at all, but with the asynchronous method one part of a transaction may be confirmed while the other part is discarded.

For example, the block in which the fishing village subtracted 100 from A ends up confirmed on the fishing village's main chain, while the block in which the hunter village added 100 to B ends up on an abandoned fork of the hunter village's chain. Atomicity failure is a problem, but one that can be solved by design. For a detailed introduction, see Reference 5 at the end of the article.

Another problem with the asynchronous method is overhead in time, communication, and storage, that is, the waiting time and resources needed to complete a cross-shard transaction. How information is passed between shards determines the size of these overheads. The different overheads are interrelated and hard to minimize all at once, so the design goal is to strike a balance. The future performance of Ethereum 2.0 will be largely determined by how this information is passed.

Ethereum has discussed several asynchronous architecture models. The latest was proposed by Vitalik at the DevCon 5 conference in October 2019. The basic idea is to use the beacon chain to pass information: in each slot (12 seconds), every shard chain produces a block and crosslinks it with the beacon chain block. In this way, any shard can learn the information of all other shards through the beacon chain when packaging its new transactions. Different shards are asynchronous by one slot.

This method reduces the waiting time for cross-shard transactions, but places higher demands on the beacon chain, which must store proof data for all shards. It also increases the number of crosslinks (the original design was to crosslink once per epoch, that is, every 6.4 minutes or 32 blocks), which inevitably raises the related overheads. For this reason, the number of Ethereum shards was cut from 1024 to 64, reducing the total number of crosslinks from another direction. (For a detailed introduction to this architecture, see Reference 6.)
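The trade-off is easy to see with the numbers given in the text. The short calculation below counts the maximum number of crosslinks per epoch under each design. It assumes every shard manages to crosslink at every opportunity, and the constant and function names are made up for illustration.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32

epoch_minutes = SECONDS_PER_SLOT * SLOTS_PER_EPOCH / 60  # 6.4


def crosslinks_per_epoch(shards, per_slot):
    """Each shard crosslinks either once per slot or once per epoch."""
    return shards * (SLOTS_PER_EPOCH if per_slot else 1)


old_design = crosslinks_per_epoch(1024, per_slot=False)      # 1024
naive_per_slot = crosslinks_per_epoch(1024, per_slot=True)   # 32768
new_design = crosslinks_per_epoch(64, per_slot=True)         # 2048
```

Crosslinking every slot with 1024 shards would multiply the beacon chain's load by 32. With 64 shards the load is only about twice that of the original design.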

Judging from current sharding design schemes, the synchronous model leans toward shards communicating with each other directly, while the asynchronous model leans toward shards communicating through a third party instead. The former faces the problem of communication volume, while the latter faces the problem of balancing multiple overheads. The design and implementation of cross-shard transactions are still in progress, and it is not yet certain which architecture Ethereum 2.0 will eventually adopt.

03 Cross-shard smart contracts

After introducing sharding and cross-shard transactions, we come to the final boss on the road to Ethereum 2.0: cross-shard smart contracts. The difference between cross-shard transactions and cross-shard smart contracts is that transactions only involve global variables, while smart contracts also have local variables. What trouble do local variables bring?

After sharding, Ethereum has 64 ledgers from a physical perspective, but only one ledger from an abstract perspective. Imagine a ledger as a big tree whose leaves each store one account's state; the 64 ledgers are 64 trees. If the roots of these trees are handed to the beacon chain, they form a new, bigger tree, and the 64 ledgers combine into one (this is only an approximate metaphor).

In cross-shard transactions, when a shard needs to know the account status of another shard, it can always find the leaf that stores the status along the tree, and then change the account status of its own shard to complete the transaction. It can be considered that through this tree, different shards have completed the exchange of information.
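The "tree of trees" metaphor can be sketched as a two-level hash. This is a rough illustration under simplifying assumptions: a flat hash stands in for each shard's real state tree, and the function names and account data are hypothetical.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def shard_root(accounts):
    """Summarise one shard's account states as a single hash.
    (A flat hash stands in for the real per-shard tree.)"""
    encoded = "".join(
        f"{name}:{balance};" for name, balance in sorted(accounts.items())
    )
    return h(encoded.encode())


def combined_root(shards):
    """Beacon-chain side: hash the shard roots together into one root."""
    return h(b"".join(shard_root(s) for s in shards))


shards = [{"A": 120}, {"B": 10}, {"C": 5}]
before = combined_root(shards)

shards[1]["B"] = 110  # one account changes on one shard
after = combined_root(shards)
```

A change to a single account on a single shard changes the combined root, which is what lets the separate ledgers be treated as one abstract ledger.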

But for cross-shard smart contracts, the problem arises. The data stored on the leaves of this tree are all global variables, without local variables. If a smart contract on one shard calls a smart contract on another shard, how do the two pass the information of local variables? This tree cannot provide services for them.

It can also be understood that cross-shard transactions only need to look at global variables, that is, the first-level state, and cross-shard smart contracts need to look at local variables, that is, they also need to look at the second-level state. The design difficulty of cross-shard transactions and cross-shard smart contracts is not on the same order of magnitude.

So far, there is no systematic design for cross-shard smart contracts, but there are two proposals. One is to put related smart contracts into the same shard for execution, that is, to eliminate the need for cross-shard smart contracts; the other is to use SIMD (single instruction multiple data) technology to enable smart contracts to be executed in parallel.

Ethereum 2.0 will introduce smart contracts in Phase 2, which means that cross-shard smart contracts will not be realized until Phase 2. Only after taking this step can Ethereum truly enter the 2.0 era.

The above is an introduction to the design of Ethereum sharding and the difficulties in that design. Ethereum 2.0 is still in the early stages of implementation, and the following keywords are worth watching at this stage: state sharding, stateless clients, and randomness.

References:

1. "Minimum Committee Size Explained"; author, Chih-Cheng Liang; https://medium.com/@chihchengliang/minimum-committee-size-explained-67047111fa20

2. "Ethereum 2.0: Randomness"; author, Bruno Škvorc; translation, Jhonny, Ajian; https://ethfans.org/posts/two-point-oh-randomness

3. "Using polynomial commitments to replace state roots"; author, Vitalik Buterin; https://ethresear.ch/t/using-polynomial-commitments-to-replace-state-roots/7095

4. "Relay Networks and Fee Markets in Eth2.0"; author, John Adler; translation, IAN LIU, Ajian; https://ethfans.org/posts/relay-networks-and-fee-markets-in-eth-2

5. "The Authoritative Guide to Blockchain Sharding: The Concept and Challenges"; author, Alexander Skidanov; translation, Jhonny, Echo, Ajian; https://ethfans.org/posts/the-authoritative-guide-to-blockchain-sharding-part-1

6. "Eth2 shard chain simplification proposal"; author, Vitalik Buterin; https://notes.ethereum.org/@vbuterin/HkiULaluS

7. "An Engineer's Guide to ETH 2.0"; author, James Prestwich; translation, Aisling, Qiqi, stormpang, Ajian; https://ethfans.org/posts/what-to-expect-when-eths-expecting

8. "Merge blocks and synchronous cross-shard state execution"; author, Vitalik Buterin; https://ethresear.ch/t/merge-blocks-and-synchronous-cross-shard-state-execution/1240
