Chain and Producer | How far is the sharding technology, which is regarded as the future of the public chain, from us?
卢晓明
2018-06-22 10:16
There are four key issues that sharding technology needs to solve.

The public chain has become a hotly contested battleground.

Picks-and-shovels businesses such as exchanges and wallets already have their dominant players, and opportunities there are hard to find. Public chains are no different. Countless people have said that today's blockchain field is like the Internet of the 1990s: everyone wants to be the protocol and the operating system.

In the PC era, Microsoft became the dominant player in the industry. In the era of mobile Internet, Google ranks first in market capitalization. Maybe we can't predict which scenarios will have a future when they are moved to the chain in the future, or even whether the blockchain will have a future. But what we can be sure of is that if the blockchain can really bring value to the Internet, a public chain that can carry such a mission must be needed.

However, the reality is that such a public chain has not yet appeared.

EOS carries high hopes, but its main network has only just launched, so it is too early to judge. As things stand, the blockchain "impossible triangle" of scalability, security, and decentralization is still not fully resolved. Generally speaking, there are four common approaches. The first is to change the consensus mechanism, as with Hyperledger's PBFT or EOS's DPoS, which often sacrifices some decentralization. The second is to change the network structure, as with IOTA and Byteball, which use a DAG (directed acyclic graph) data structure instead of a blockchain. The third is to use off-chain solutions directly, such as sub-chains/side chains, off-chain state channels, and even cross-chain middleware. The fourth is sharding.

The basic idea is to divide the nodes in the network into different shards, each of which processes different transactions, so that transactions unrelated to one another can be handled in parallel and network concurrency increases. The defining characteristic of a sharding scheme is that network throughput grows as the number of nodes grows.

However, this technology is fairly complex, and many problems have to be solved in a concrete implementation. Few projects have actually implemented a sharding scheme. Zilliqa is one of them; its market value once rose to 21st place among cryptocurrencies.

Recently, Odaily interviewed Jia Yaoqi, co-founder and technical director of Zilliqa, who shared Zilliqa's progress, future plans, pros and cons of various blockchain expansion solutions, and his views on the industry.

The concept of sharding comes from the database field: the database is divided into multiple slices, and these slices are placed on different storage devices (partitions), so that the amount of data in each partition is much smaller and the system's performance requirements can be met. Industry insiders believe that sharding improves system performance and scalability, but at the same time makes system development more complicated. For example, if two records live on different servers and an association has to be established between them, the record indicating the association will probably have to be stored in both partitions. In addition, once a transaction has to be processed across data partitions, performance drops sharply. With this in mind, it is easier to understand the problems that have to be solved when sharding is implemented in the blockchain field.

Zilliqa's research on blockchain sharding began in 2015, when Prateek Saxena and Loi Luu, a teacher and a student at the National University of Singapore, published a paper on sharding, "A Secure Sharding Protocol For Open Blockchains", at CCS, a top international security conference.

*This article mainly explores further possibilities for the next phenomenon-level public chain, including underlying design and consensus mechanisms. Since the project is still at an early stage and the market has yet to mature, Odaily does not endorse the project, and this article does not constitute investment advice.

We have organized the interview in question-and-answer form as follows:

1. Features of Zilliqa: sharding technology, PoW+PBFT hybrid consensus mechanism

Odaily: What is the sharding strategy used by Zilliqa? How is the sharding technology implemented?

Jia Yaoqi: Zilliqa's sharding technology can be understood in this simple way. Suppose we have a network of 1000 nodes. Zilliqa will automatically divide the network into 10 shards of 100 nodes each, and all shards can process transactions in parallel. If each shard can process n different transactions per second, then all shards together can process 10n transactions per second. This is horizontal scaling: network throughput increases linearly with the number of nodes, which is a feature other scaling methods do not have.
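The arithmetic in this answer can be written out as a small sketch. The function name and the per-shard figure of 250 TPS are illustrative assumptions, not Zilliqa parameters; the only thing taken from the interview is that shards work on disjoint transactions, so their throughput adds up.

```python
def network_throughput(total_nodes: int, nodes_per_shard: int, tps_per_shard: float) -> float:
    """Aggregate TPS when each shard processes a disjoint set of transactions."""
    num_shards = total_nodes // nodes_per_shard
    return num_shards * tps_per_shard


n = 250.0  # hypothetical per-shard throughput, chosen only for illustration

print(network_throughput(1000, 100, n))  # 10 shards, i.e. 10 * n
print(network_throughput(2000, 100, n))  # doubling the nodes doubles the shards
```

The sketch ignores cross-shard traffic and bandwidth limits, both of which come up later in the interview.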

The kinds of sharding we are currently working on include network sharding, transaction sharding, and smart contract sharding (or computational sharding).

The most important of these is network sharding, because the other sharding mechanisms are built on top of it. Put simply, network sharding groups the entire network; each group is called a shard, and all shards process different transactions at the same time. During this process, we ensure security by constantly refreshing the network and the shards. According to our paper published at the CCS security conference, when the number of nodes per shard is no less than 600, the probability that a third of them are malicious is one in a million.
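To get a feel for why shard size matters so much, here is a rough back-of-the-envelope model. It is not the analysis from the paper: it assumes that one quarter of all nodes in the network are malicious (an assumption, the interview does not state the fraction) and that shard members are drawn independently, so the number of malicious nodes in a shard is binomial.

```python
from fractions import Fraction
from math import comb


def prob_shard_compromised(shard_size: int, malicious_fraction: Fraction) -> float:
    """P(at least one third of a randomly drawn shard is malicious), binomial model."""
    threshold = -(-shard_size // 3)  # ceil(shard_size / 3)
    p = malicious_fraction
    q = 1 - malicious_fraction
    # Exact rational arithmetic avoids floating-point underflow for large shards.
    tail = sum(
        comb(shard_size, k) * p**k * q ** (shard_size - k)
        for k in range(threshold, shard_size + 1)
    )
    return float(tail)


quarter = Fraction(1, 4)  # assumed share of malicious nodes in the whole network
for size in (100, 300, 600):
    print(size, prob_shard_compromised(size, quarter))
```

Under these assumptions the probability falls steeply as the shard grows, and for 600 nodes it lands within a small factor of the one-in-a-million figure quoted above.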

Whenever a transaction enters the network, we perform a computation on the address of the transaction's sender and assign the transaction to one of the shards accordingly. This process is called transaction sharding. It is worth noting that a transaction cannot choose for itself which shard to enter, because the nodes in each shard refuse to execute transactions that do not belong to their own shard, which also keeps transaction processing secure.
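A minimal sketch of this idea follows. The interview only says that the shard is derived from the sender's address; hashing the address with SHA-256 and taking the result modulo the shard count is an assumption made here for illustration, and both function names are hypothetical.

```python
import hashlib


def shard_for_sender(sender_address: str, num_shards: int) -> int:
    """Map a sender address to a shard index (assumed rule: hash, then modulo)."""
    digest = hashlib.sha256(sender_address.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def accept_transaction(node_shard: int, sender_address: str, num_shards: int) -> bool:
    """A node only processes transactions whose sender maps to its own shard."""
    return shard_for_sender(sender_address, num_shards) == node_shard


home = shard_for_sender("0xabc", 10)
print(home, accept_transaction(home, "0xabc", 10))
```

Because every node can recompute the mapping, a sender cannot steer a transaction into a shard of its choosing; nodes elsewhere simply reject it.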

At present, we have successfully implemented network sharding and transaction sharding, and will release version 2.0 of the public test network at the end of this month. This version of the public test network will allow ordinary users to join the network as nodes and become miners.

Odaily: We know that sharding alone cannot guarantee high transaction throughput, because throughput also depends on how long each shard takes to reach consensus and on the block generation speed. What is the consensus mechanism used by Zilliqa?

Jia Yaoqi: Zilliqa uses a PoW+PBFT hybrid mechanism.

In public chains, an attacker may try to disrupt the system by controlling many nodes and so influence any decision that depends on a majority of nodes. This is what is commonly called a Sybil attack. There are several ways to make a Sybil attack costly or difficult, for example requiring nodes to deposit a considerable amount of money (or tokens) as collateral, or requiring them to perform a computationally intensive task such as PoW.

To ensure the security of the Zilliqa network, we require all nodes joining Zilliqa to do PoW. We also know that computation-intensive PoW takes a long time, may slow down the consensus protocol, and consumes a lot of energy. Therefore, on Zilliqa, PoW runs only at long intervals: nodes do PoW when they join the network and then again every once in a while. For the rest of the time, Zilliqa reaches consensus using the pBFT consensus mechanism.

Odaily: It is often said that the PBFT consensus protocol is generally run in a small consensus group, for example fewer than 50 nodes, so it is better suited to consortium chains. How does Zilliqa see this issue?

Jia Yaoqi: As we just mentioned, Zilliqa ensures the security of the network through sharding and PoW. However, PoW has weaknesses: it is time-consuming, slow to confirm, and energy-hungry. Therefore, Zilliqa chose pBFT for consensus. The main considerations are: 1. It is not computationally intensive and consumes less energy than PoW; 2. It is more efficient because it can work with a small consensus group; 3. It does not require repeated confirmations, which gives transactions finality. In other words, unlike PoW-based Nakamoto consensus, which usually requires several confirmation blocks (Bitcoin, for example, requires at least 6 confirmations), pBFT needs no further confirmations because its consensus protocol ensures that temporary forks do not occur.

In many people's minds, a big reason why pBFT is mainly used in consortium chains is that communication between nodes in pBFT consensus is expensive. For example, in a network of n nodes, the total number of communications required to reach consensus with pBFT is n(n-1)/2, which is quadratic in n. Once the number of nodes exceeds 50, this is already a very large number, and the larger n is, the faster the communication cost rises. To solve this problem, Zilliqa adopts a multi-signature algorithm and some other performance optimizations to reduce the communication cost of pBFT.
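The growth described here is easy to tabulate. The first function is the n(n-1)/2 count from the answer above. The second is a simplified assumption about what signature aggregation buys: each non-leader sends one signature share to a leader and receives one aggregated signature back, giving 2(n-1) messages per round. That model is an illustration only, not a description of Zilliqa's actual protocol.

```python
def pairwise_messages(n: int) -> int:
    """All-to-all communication count quoted in the interview: n(n-1)/2."""
    return n * (n - 1) // 2


def leader_aggregated_messages(n: int) -> int:
    """Assumed model: one share up to the leader and one aggregate back per node."""
    return 2 * (n - 1)


for n in (50, 100, 600):
    print(n, pairwise_messages(n), leader_aggregated_messages(n))
```

For a 600-node shard the pairwise count is 179,700, while the aggregated model needs 1,198 messages, which is why aggregation matters at the shard sizes discussed in this interview.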

2. Difficulties and implementation of sharding technology

Odaily: What problems or difficulties do you think may need to be solved in the practice of sharding?

Jia Yaoqi: The principle of sharding technology sounds simple, but the following key issues should be paid attention to in the actual implementation process:

1. Defending against Sybil attacks. This was already mentioned above: PoW is used to prevent them, so I will not go into details here.

2. Creating shards and assigning nodes and tasks to them. For example: how each node determines which shard it goes to; how to handle the dynamic exchange of old and new nodes, since over time old nodes will leave the network and new ones will join; and how the nodes in each shard can both process transactions and run the protocol efficiently. Where each node is assigned certainly cannot be controlled by a specific person or group, because if those people decided to act maliciously, they could compromise the security of the network by concentrating all malicious nodes in a single shard. As mentioned before, Zilliqa uses random sharding and dynamic shuffling to keep the network fluid and secure.

3. Shard size. The fewer nodes a shard has, the faster it reaches consensus and the more efficient it is. But if a shard has too few nodes, it becomes easier for an attacker to control. And whenever a node in a shard goes offline or cannot be reached for a long time, the number of nodes in the shard shrinks further and security can no longer be guaranteed. As mentioned earlier, we have shown in our paper that when each shard has no fewer than 600 nodes, security and efficiency are well balanced.

4. Cross-shard transactions. Technical experts and engineers in the blockchain field generally believe that cross-shard transactions carry high overhead because they require locking protocols. As the number of cross-shard transactions grows, the throughput and economics of the entire network suffer.

We currently deal with this problem in two ways. On the one hand, we try to avoid cross-shard transactions from the start of the sharding design. On the other hand, the original Zilliqa sharding paper mentions atomic commit protocols, which is one of the directions we have been researching over the past few years. We are also studying a number of other alternatives, and we will share the details once we have a solution we consider good enough.
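Point 2 above, random assignment with periodic reshuffling, can be sketched as follows. This is a toy under stated assumptions: `epoch_randomness` stands in for whatever unbiased shared randomness the real protocol derives, and Python's `random.Random` is not cryptographically secure. The point is only that every node can recompute the same assignment and that no party picks its own shard.

```python
import hashlib
import random


def assign_shards(node_ids, num_shards: int, epoch_randomness: bytes):
    """Deterministically shuffle nodes with a shared seed, then deal them into shards."""
    seed = int.from_bytes(hashlib.sha256(epoch_randomness).digest(), "big")
    rng = random.Random(seed)
    shuffled = sorted(node_ids)  # canonical order, so every node starts from the same list
    rng.shuffle(shuffled)
    return [shuffled[i::num_shards] for i in range(num_shards)]


nodes = [f"node-{i}" for i in range(1000)]  # hypothetical node identifiers
epoch1 = assign_shards(nodes, 10, b"epoch-1")
epoch2 = assign_shards(nodes, 10, b"epoch-2")  # fresh randomness reshuffles membership
print([len(s) for s in epoch1], epoch1[0][:3], epoch2[0][:3])
```

Reshuffling every epoch limits how long an attacker who has slowly gathered nodes in one shard can keep them together.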

Odaily: At present, many blockchain projects are also considering the use of sharding technology. What do you think of the implementation of sharding technology in the industry?

Jia Yaoqi: Sharding is now "blossoming everywhere", which shows that it is increasingly becoming a mainstream approach to blockchain scaling. As sharding gains visibility, more and more community members have started to pay attention to it and support it.

It has to be said that sharding is a genuinely difficult technology, which is the fundamental reason why many projects on the market claim to do sharding but few actually do. The root cause is that sharding has extremely high security requirements. I think this market still has several problems:

First, falling into the trap of TPS competition and ignoring security, which matters most. Everyone knows that the peak of Taobao's Double Eleven transactions last year was 256,000 per second, and that is the processing speed of a centralized system developed over many years. Blockchain is an emerging technology, far less mature than centralized systems. At present, the TPS of the well-known Bitcoin and Ethereum does not exceed 30. So when many projects claim hundreds of thousands, millions, or even tens of millions of on-chain TPS at this stage, the aim is mainly to attract public attention, at the expense of the fundamentals of decentralization and security.

Second, having no mathematical analysis or published papers as support, so that premises and conclusions are hasty and not rigorous. Sharding is also known as "divide and conquer", and the point is not only the "divide" but also the "conquer", that is, ensuring security while sharding. Sharding has a long history in traditional systems but is new to blockchain. The two are conceptually similar but completely different in operation. If you take blockchain sharding for granted and think the job is done once a few nodes are placed in each shard, malicious nodes can easily cause a series of vulnerabilities through operations such as double-spending, and it is difficult to verify or roll back the system afterwards to limit the damage from those malicious transactions.

Third, drawing conclusions lightly without large-scale testing, which is not professional enough. When bandwidth is not a constraint, Ethereum can run tens of thousands or even a million TPS, yet it does not reach such figures in real life. The reason is that the real network consists not of dozens or hundreds of nodes but of tens of thousands. With only a few dozen or one or two hundred test nodes you can produce any number you like, but such numbers are not convincing.

Fourth, some approaches may not count as real sharding. Sharding is indeed a hot topic right now, and I personally think that what some projects are doing looks more like sub-chains, state channels, or layering than sharding.

If sharding is taken for granted without the support of mathematical analysis or published papers, and numbers obtained from a small set of nodes under ideal conditions are mistaken for what the main network will achieve, the security consequences may be serious. In milder cases the network goes through multiple hard forks; in serious cases investors suffer huge losses. That is very unfortunate both for investors and for the development of blockchain itself.

Odaily: At present, there are quite a few projects claiming to use sharding. What do you think is the difference between the subchain, sidechain and state channel solutions just mentioned and sharding?

Jia Yaoqi: Subchains, sidechains, and state channels all belong to off-chain scaling. I think their core ideas are similar: each chain processes transactions or tasks independently, without communicating with the others, and finally puts the settlement information on the main chain. As a simple analogy, chain A might be used for advertising, chain B for games, chain C for transactions, and so on. The most essential difference is that sharding is on-chain scaling: it restructures the entire blockchain network, and the nodes remain related to one another.

I think off-chain scaling and on-chain scaling do not conflict; they complement each other and can be combined in the future, because their scope and focus differ and both provide important technical support for blockchain scaling.

Odaily: In other words, sharding must be sharding of nodes or transactions in the same main chain. Since it is on the same main chain, all the nodes must participate in the consensus or verify the transactions of the whole network. So, how does Zilliqa ensure that all nodes participate in confirming or verifying the transaction records of the entire network while participating in their own shard consensus?

Jia Yaoqi: We have a separate shard, the DS Committee, which integrates the results of each shard: it collects the transaction hashes from the different shards, runs a consensus protocol, forms a hash of the hashes, and broadcasts it, after which the other nodes verify the signature. Transaction confirmation is divided into several stages. If your transaction is confirmed in a single shard, it has a high probability of being written into the blockchain. In that case we will notify you that the transaction has been initially confirmed, and once it is finally confirmed we will send another notification to say so.

In addition, because what we are doing at present is not state sharding, some people have the misunderstanding that every node in the Zilliqa network is a full node, and mistakenly conclude that storage will explode after a while. In fact, each node in the Zilliqa network needs to save only the latest state of the entire network, not the history of all transactions. Of course, a node can choose to act as a full node and store the whole history. One advantage of such a full node is that it can offer its own services, as EtherScan does by providing a block explorer and earning money through advertising. Moreover, even if the state is split across shards in the future, storage would shrink only by a constant factor, so the difference is not that big. We have also partnered with Bluzelle and Genaro, two decentralized storage projects, to support decentralized storage for smart contracts.
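The "hash of the hashes" step can be illustrated with a short sketch. It assumes SHA-256 and plain concatenation in a fixed order. The real block format, the consensus round inside the DS Committee, and the signature that other nodes verify are all left out, and the function name is hypothetical.

```python
import hashlib
from typing import Iterable


def combine_hashes(hashes: Iterable[bytes]) -> bytes:
    """Hash an ordered sequence of digests into a single digest."""
    h = hashlib.sha256()
    for item in hashes:
        h.update(item)
    return h.digest()


# Each shard summarizes the transactions it confirmed (toy transaction hashes).
shard_a = combine_hashes([hashlib.sha256(b"tx1").digest(), hashlib.sha256(b"tx2").digest()])
shard_b = combine_hashes([hashlib.sha256(b"tx3").digest()])

# The committee combines the per-shard digests into one value to broadcast.
final_digest = combine_hashes([shard_a, shard_b])
print(final_digest.hex())
```

Any node that knows the per-shard digests can recompute the final value and check that it matches what was broadcast.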

Odaily: Each node must synchronize the latest status. As the number of nodes increases in the future, will this affect the confirmation speed of the entire network?

Jia Yaoqi: In theory, the throughput of Zilliqa increases as the number of nodes increases. But in fact, there is an optimal point for the number of nodes, and the throughput increases linearly until the scale increases to this point.

For example, if bandwidth becomes the limit at 20,000 nodes and system throughput can no longer increase beyond that, then the entire network may be capped at 20,000 nodes. According to the latest published data, Ethereum currently has 16,000 nodes, and we are still locating this sweet spot through experiments.

Odaily: What is the TPS of Zilliqa's latest test network?

Jia Yaoqi: Our data is constantly being updated. In the published results, we used 1,400 nodes and 6 shards and reached about 2,800 TPS. Ideally, each shard would have 600 nodes. We currently test with 200 nodes per shard mainly for cost reasons: the nodes we rent are on AWS EC2, and running such tests costs millions of dollars per year.

Odaily: Does Zilliqa have any plans for smart contracts? Will there be a smart contract system when the mainnet goes online? There is also a view that Zilliqa's smart contract language Scilla is not Turing complete. Why is that?

Jia Yaoqi: According to our roadmap, the currently released version 1.0 of the public test network has no smart contract layer. Version 2.0 of the public test network will launch at the end of this month, and from that version on Zilliqa will support some smart contracts. The main network will launch in the third quarter, and some practical decentralized applications will launch in the fourth quarter.

On Scilla, we have published a paper to make the case. Scilla is a proof-carrying, intermediate smart contract language whose underlying computational model is based on communicating automata. We hope that with Scilla, writing smart contracts on a blockchain platform will be more convenient, simpler, safer, more reliable, and higher-performing.

The DAO attack and the Parity vulnerabilities of recent years caused huge amounts of funds to be stolen or frozen. An important reason is that Solidity has no formal verification and does not separate communication from computation clearly enough. Scilla provides several layers of separation between the communication and the computation of smart contracts, and supports formal verification. Using proof assistants such as Coq, developers can write code that conforms to the logic they intend.

Scilla's non-Turing-completeness comes from putting the security of smart contracts first. Although Solidity, Ethereum's smart contract language, is Turing complete, gas costs mean that a deployed contract cannot loop forever, so in practice it is not Turing complete either. We found thousands of vulnerabilities in smart contracts on Ethereum. Scilla aims to avoid the loopholes found in existing smart contracts, so some of the more dangerous APIs and functions have been removed, and we find that existing smart contracts do not currently need Turing completeness.

We are currently developing a compiler for the Scilla language, so that contracts written in Solidity can in future be ported to Scilla easily through the compiler. We have also announced the Zilliqa Ecological Construction Funding Program, which will spend 5 million US dollars to fund outstanding projects, teams, and individuals who build tools and applications for Zilliqa and help grow the Zilliqa ecosystem.

3. Three major problems in the domestic blockchain industry

Odaily: Do you currently have any public chains in the market that you are optimistic about?

Jia Yaoqi: Each project on the market has its own highlights and characteristics, and I personally prefer Ethereum.

Odaily: What if it's not so mature and mainstream? It may be done by some entrepreneurial teams.

Jia Yaoqi: In judging projects, I personally value innovation and rigor. Rigor means the design has been validated by published academic papers and holds up in theory. If there is no rigorous paper to prove it, the system should at least have a test network with more than a thousand nodes and public code, which makes it more convincing.

Odaily: From the perspective of the overall industry, what do you think is the biggest problem in the domestic and foreign blockchain field, or public chain projects?

Jia Yaoqi: There are three main issues. The first is achieving scalability and high throughput while preserving decentralization and security. The second is privacy. The third is technical talent, especially developers, of whom there are not enough.

I am Lu Xiaoming, editor of Odaily. I am exploring the real blockchain. Please add WeChat lohiuming for breaking news and communication. Please note your name, unit, position and reason.
