What Is This Sharding You Speak Of?
A Look At the Scaling Technique That Is Transforming Blockchain
Sharding — the process of splitting up a large computing load into smaller, manageable pieces — is widely used in traditional database technologies as a means of scaling storage capacity exponentially. The blockchain industry is actively exploring sharding as a technique to scale. This article will explore the background of sharding, the potential it has to scale blockchains, and the challenges facing developer communities looking to implement sharding.
Every student of mathematics has learned this tactic of partitioning data into small chunks in order to make a problem easier to solve. In computer science terminology, the process of splitting up data into smaller, manageable chunks is known as sharding. As more data and traffic flows through their applications, companies can scale with their growth by partitioning a single logical dataset across multiple machines.
While the blockchain industry is most familiar with sharding in the context of the upcoming Eth2.0 upgrade, sharding is not unique to blockchain. Traditional database systems have been using sharding for decades as a way to work around the limitations of a single machine and enable exponentially more storage capacity. Today, popular databases such as MongoDB, MySQL, AWS Elasticsearch and many others utilize sharding in order to achieve scale.
Sharding is a form of horizontal scaling, which seeks to scale systems by adding more machines to the pool of resources, and has the advantage of increasing capacity exponentially as more machines get added to the resource pool. This is different from adding more capacity to an existing machine, known as vertical scaling.
Scaling horizontally through sharding offers a number of key advantages:
- As the volume of traffic flowing into the application increases, sharding allows for significantly more storage capacity and faster processing by dividing the computing load across a number of machines.
- Sharding achieves security through redundancy. If an application or site relies on a single machine, that machine serves as a single point of failure that can cripple the entire application should it go down. With a sharded database, although the failure of a single machine will harm the availability of an application, the overall impact would still be less than if the entire database crashed since the rest of the database would still be functional.
While database sharding is widely used in production environments, blockchain sharding is still in an experimental phase with few early implementations working successfully that haven’t been made available for public use. Now developers can spin up custom, production-ready shards with LiquidChains technology. Furthermore, these chains can all be connected to one another, and to public blockchains, in order to enable trustless collaboration and exchange.
To appreciate the importance of sharding when it comes to scaling blockchains, we first need to understand the unique challenge blockchain’s face when it comes to increasing capacity.
The Theory Behind Blockchain Sharding
One of the more challenging issues facing blockchains is known as the ‘Scalability Trilemma.’ Every network wants to optimize decentralization, scalability, and security but faces an inherent tension that makes this extremely challenging. This is known as the ‘Scalability Trilemma’, which states that blockchains can only solve for up to two of the three problems, but not all three at the same time.
Although optimizing decentralization and security is crucial, it also means designing a system that is both hard to change and shares the maximum amount of network information with a maximum number of peer-to-peer nodes. Most blockchain networks have been designed as such, with censorship-resistance and defense against attackers in mind. Decentralization purists are extremely hesitant to make any protocol changes that would compromise the trustlessness of the network in any way.
However, by choosing to prioritize decentralization and security, these networks are sacrificing the production capacity that would allow them to service scalable applications.
On Ethereum, for instance, this trilemma takes the form of block space limitations that make it difficult for the network to process a significant amount of transactions. An example often used to drive home this constraint is the fact that Ethereum processes around 8 transactions per second, as opposed to Visa which can support up to 1,700 transactions per second.
While in theory these networks could increase the limited resource in question, block size in the case of Ethereum, in practice this would cause network operations to center around a few supercomputers and cause nodes running on consumer hardware to drop out. Scaling in such a fashion (vertical scaling) comes at the expense of harming decentralization.
With sharding, blockchains can scale horizontally without harming decentralization and security. Sharding splits a single blockchain into many different chains, or shards, each with its own set of validators and pool of network resources. Each shard processes its own transactions and keeps a record of all validated transactions locally. The ‘main chain’ serves as a ‘ledger of ledgers’ by holding proofs of all transactions that happen on every shard.
The Purpose of A Blockchain
Following on from the early successes of Bitcoin, Ethereum and other networks, blockchain enthusiasts shifted quickly to a ‘decentralize everything’ mindset that saw value in putting everything directly on public networks. This view drastically limits the extent to which these networks can accommodate the transaction throughput which can be found in centralized, legacy systems.
By shifting to sharding, mainnets could offload all the heavy lifting to a series of sidechains and begin to function as lightweight ledgers for public proofs only. With less transaction load to take care of, these networks could finally enable the scale that we’ve been waiting for.