The Merge

Even if you are not involved whatsoever in the blockchain or Web3 space, you may have heard something about an upcoming event known as The Merge. In simple terms, the technology behind Ethereum is getting an upgrade: Ethereum is switching consensus mechanisms from Proof-of-Work to Proof-of-Stake, and the change has long-term implications for the future of blockchain. I am writing this on September 14th, and The Merge is slated to take place at 1 am tonight, so let's dive in.

I am going to import previous work I have written on what consensus mechanisms are, and what PoW and PoS are. If you are not interested in the technical side of this upgrade, just go to the end, where I'll go into the impact of this swap.

Consensus Mechanisms

When we put money into existing financial institutions, we trust that our funds will be safe and responsibly used because there is (or was) a widespread consensus that financial institutions are legitimate and well-intentioned organizations. As mentioned in the Bitcoin Whitepaper, that trust was broken for many, leading people to search for financial tools that do not rely on trust in an institution. Since public blockchains have no central authority validating the authenticity of data being added, they rely on automated consensus mechanisms to validate data.

Data needs to be validated because of the double spending problem. In order for a digital currency to work, it must be ensured that people cannot spend the same token twice in separate transactions. In the case of cryptocurrencies, nodes must validate that a user does have the correct amount of tokens for a transaction, and has not spent them already. Every node in the network must validate this before the transaction can be added to the blockchain.
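To make the double spending check concrete, here is a minimal sketch of what a node might do with a batch of transactions. It assumes a simple account-balance ledger; the `Transaction` class, the `apply_transactions` function, and the names and amounts are all hypothetical and are not how Bitcoin or Ethereum actually store state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str
    recipient: str
    amount: int


def apply_transactions(balances, transactions):
    """Return (new_balances, rejected) after validating each transaction in order."""
    ledger = dict(balances)  # work on a copy so the input is left untouched
    rejected = []
    for tx in transactions:
        # Reject spends the sender cannot cover, including tokens already
        # spent by an earlier transaction in this same batch.
        if tx.amount <= 0 or ledger.get(tx.sender, 0) < tx.amount:
            rejected.append(tx)
            continue
        ledger[tx.sender] -= tx.amount
        ledger[tx.recipient] = ledger.get(tx.recipient, 0) + tx.amount
    return ledger, rejected


balances = {"alice": 10}
first = Transaction("alice", "bob", 10)
second = Transaction("alice", "carol", 10)  # tries to spend the same 10 tokens again
ledger, rejected = apply_transactions(balances, [first, second])
```

Here the second transaction ends up in `rejected`, because Alice's balance is already zero by the time it is checked.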

Byzantine Fault Tolerance

Assume you are a Byzantine general. The Emperor of Byzantium has ordered you and several other generals to attack an enemy fortress, all from different directions. If you and all the other generals attack simultaneously, the campaign will be successful. If any of the generals betray the cause, everyone will fail. There are also traitorous generals among the attacking group, some of whom will secretly retreat rather than attack in unison. Additionally, the generals can only communicate with each other via messengers, and a messenger can only be sent to one general at a time. The traitorous generals will also attempt to circulate false information. With these conditions in mind, how would you attempt to structure a coordinated attack or retreat? This scenario is known as the Byzantine generals problem, among other names, and holds significance in blockchain and computer science.

In a more general sense, the Byzantine fault is a condition in a system in which actors in the system must reach a consensus to avoid catastrophe, while also assuming a portion of the actors are unreliable. In the case of blockchain, nodes in the network must agree on data within the chain, and thus on the state of the network as a whole. This is a difficult task to accomplish, assuming malicious actors will try to validate falsified data, especially when currency is on the line.

A simple solution to the Byzantine fault would be for the Emperor to find one trustworthy general, and give him full power over the Byzantine military. For the aforementioned reasons, blockchains (and especially cryptocurrencies) are built with the intention of avoiding centralized authority. Therefore, assume the Emperor does not want to vest the power of the military in one general for fear of a coup d'état.

The Emperor decides to write up a contract that all the generals must abide by. The contract includes penalties for misbehavior, and incentives for participation and loyalty. A decision is made with majority approval, and is then executed. This allows the generals to function autonomously without constant oversight from the Emperor. In the case of blockchain, nodes must individually validate new data and vote in attestation in order for it to be added to the blockchain. Nodes in a decentralized network are governed by a ‘contract,’ and it allows blockchains to circumvent a Byzantine failure while also avoiding a central authority. There are many different types of ‘contracts’, and they are referred to as consensus mechanisms.
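The majority-approval rule can be sketched in a few lines. This is only an illustration of counting attestation votes; the function name `block_accepted`, the node names, and the default threshold of one half are assumptions, and real consensus mechanisms add the incentives and penalties described in the sections that follow.

```python
from fractions import Fraction


def block_accepted(attestations, threshold=Fraction(1, 2)):
    """Return True when the share of 'yes' votes is strictly above threshold."""
    if not attestations:
        return False
    yes_votes = sum(1 for vote in attestations.values() if vote)
    return Fraction(yes_votes, len(attestations)) > threshold


# Four hypothetical nodes; node_d is faulty or malicious and votes against a valid block.
attestations = {"node_a": True, "node_b": True, "node_c": True, "node_d": False}
accepted = block_accepted(attestations)
```

With three of four nodes in favor, the block is accepted even though one node misbehaves. Passing a stricter threshold such as `Fraction(2, 3)` models a supermajority rule.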

Proof of Work

Proof of Work is an important consensus mechanism, first employed for a cryptocurrency by Nakamoto in Bitcoin. People (miners) compete for the right to add the next block because they are rewarded with bitcoin for doing so. This is how Nakamoto incentivizes participation in the blockchain. Permission to add the next block is only granted when a miner has correctly solved an arbitrary and computationally expensive puzzle. Once a miner has solved it, the transaction data is validated by the other nodes in the network, the block is added, and the miner is rewarded with bitcoin. The “work” part of Proof of Work is solving the puzzle. The only way to solve it is to guess numbers, and there are so many possibilities that guessing a correct one in a timely manner requires an enormous amount of computing power. In order for an attacker to reliably add malicious blocks to a Proof of Work blockchain, they would need to control more than half of the network's total computing power (the so-called 51% attack), which is an enormous amount of power.

The "complex mathematical puzzle" is more of a guessing game than a math problem. Miners are not actually ‘solving’ a problem, but rather guessing answers until they get a number below a certain threshold. SHA-256 is a hash function: it is given an input, and it returns an output that looks random. Specifically, it is given a string of information, and it returns a 256-bit integer. This integer is the hash, and it is usually written as a string of letters and numbers (hexadecimal). For Bitcoin in particular, there is a certain threshold that a hash must be below in order for it to be a solution. What a miner must do is take the transaction data stored in a new block, feed it as input into SHA-256, and receive a number below the current threshold.

SHA-256 is deterministic, so if we input the same transaction data into SHA-256 multiple times, we will receive the same output every time. The probability of that one output being below the desired threshold is extremely low. To get around this, the transaction data is combined with a ‘nonce’ (number used only once), which in Bitcoin is a 32-bit integer. A miner combines the transaction data and a nonce, and submits the result as input. SHA-256 will then return a 256-bit integer. If that integer is below the threshold value, the miner wins the right to make a block and receive the reward. If it is not below the threshold, the miner moves on to the next nonce. Miners continue plugging in the transaction data and new nonces until SHA-256 returns a number below the threshold.

SHA-256 is meant to be a one-way function, so working backwards from a desired output to an input is not computationally feasible. As such, the only way to find a desired hash is to use the brute force method (guessing until you are correct). The total number of possible hashes is 2^256 (2 choices for each of 256 bits). Given that there are so many options, miners need an enormous amount of computing power to guess below the threshold in a timely manner. To sum it up, miners are trying to find a nonce such that: SHA256(nonce + transaction data) < threshold.
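The search for a nonce can be sketched with Python's standard `hashlib` module. This is a toy version under stated assumptions: the `mine` function, the sample transaction string, and the very easy threshold are all made up for illustration, and nonces are tried in order rather than at random. Real Bitcoin mining hashes an 80-byte block header with SHA-256 applied twice, against a far lower threshold.

```python
import hashlib


def mine(block_data: bytes, threshold: int, max_nonce: int = 2**32):
    """Try nonces 0, 1, 2, ... until SHA-256(nonce + data) falls below threshold."""
    for nonce in range(max_nonce):
        payload = nonce.to_bytes(4, "big") + block_data  # 4 bytes = a 32-bit nonce
        digest = hashlib.sha256(payload).digest()
        value = int.from_bytes(digest, "big")  # read the 256-bit hash as an integer
        if value < threshold:
            return nonce, value
    return None  # no nonce in the 32-bit range produced a low enough hash


# Toy difficulty: the hash must start with 16 zero bits (about 1 in 65,536 guesses).
threshold = 2 ** (256 - 16)
result = mine(b"alice pays bob 5", threshold)
```

Lowering `threshold` by one more bit doubles the expected number of guesses, which is how a Proof of Work network can tune how hard the puzzle is.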

Proof of Stake

Proof of Stake is another very important consensus mechanism, implemented by Cardano, Solana, and Avalanche, and soon Ethereum. Proof of Stake is meant to remedy many of the potential problems with Proof of Work. To explain Proof of Stake, we will follow the soon-to-be-implemented version of the mechanism employed by Ethereum. In a general sense, validators put up cryptocurrency as collateral against bad behavior, and are then admitted as nodes in the network. They are then tasked with approving new blocks and occasionally adding new blocks, based on the size of their stake. As a reward for their commitment, they are given the native cryptocurrency, with the amount being based on the size of the stake.

For Ethereum, a user becomes a validator by depositing 32 ETH into a smart contract. The new validator joins an activation queue, and, upon being activated, begins receiving block candidates from other nodes. The validator re-runs the transactions in a candidate block, ensures they are valid, then sends a vote of attestation (in favor) for adding the block across the network. Every 12 seconds, a validator is randomly chosen to propose a new block containing transaction data. A group of validators is simultaneously chosen to validate the block. The group provides their votes of attestation, and then the block is added.
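The random choice of a proposer and an attesting group can be sketched as follows. This is a simplified model, not Ethereum's actual selection algorithm: the helper names `validators_for` and `assign_slot`, the depositor names, the committee size, and the use of Python's `random` module with a fixed seed are all assumptions made for illustration.

```python
import random

SLOT_SECONDS = 12          # one proposer is chosen per 12-second slot
STAKE_PER_VALIDATOR = 32   # ETH deposited per validator


def validators_for(deposits):
    """Expand {owner: ETH deposited} into one validator entry per full 32 ETH."""
    validators = []
    for owner, eth in deposits.items():
        validators.extend([owner] * (eth // STAKE_PER_VALIDATOR))
    return validators


def assign_slot(validators, committee_size, rng):
    """Pick one proposer and a separate group of attesters for a slot."""
    proposer_index = rng.randrange(len(validators))
    others = [i for i in range(len(validators)) if i != proposer_index]
    committee = rng.sample(others, committee_size)
    return proposer_index, committee


rng = random.Random(2022)  # fixed seed so the sketch is repeatable
validators = validators_for({"alice": 96, "bob": 32, "carol": 10})
proposer, committee = assign_slot(validators, committee_size=2, rng=rng)
```

Because Alice's 96 ETH funds three validators and Bob's 32 ETH funds one, Alice is three times as likely to be picked in any given slot, which is one way to read "based on the size of their stake." Carol's 10 ETH is below the 32 ETH deposit, so she runs no validator in this model.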

A transaction is labeled as finalized when it is part of an approved block. To revert a finalized block, a hacker must commit to losing at least one third of all staked Ether ($10 billion+ USD). Finality is maintained by two-thirds of the network, so a hacker could vote against finality using one-third of the network. However, as long as a finalized block remains challenged, the amount of Ether staked by the challenging validators will slowly be burned. This decreases the total amount of staked Ether, thus giving the majority a larger percentage of the network until they cross the two-thirds threshold.
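The way burning restores the two-thirds majority can be shown with a small simulation. It assumes a constant burn rate per round, which is a simplification chosen for readability (the real penalty schedule differs), and the function name `epochs_until_finality` and the stake figures are hypothetical.

```python
def epochs_until_finality(honest_stake, challenging_stake, leak_rate=0.001):
    """Count rounds of burning until the honest share reaches two thirds."""
    if not 0 < leak_rate < 1:
        raise ValueError("leak_rate must be between 0 and 1")
    epochs = 0
    # honest / (honest + challenging) < 2/3, written without division
    while 3 * honest_stake < 2 * (honest_stake + challenging_stake):
        challenging_stake *= 1 - leak_rate  # burn a slice of the challengers' stake
        epochs += 1
    return epochs, challenging_stake


# Hypothetical network: 60 units of honest stake, 40 units blocking finality.
epochs, remaining = epochs_until_finality(60, 40)
```

In this example the loop stops once the challenging stake has been burned down to 30 units or less, at which point the honest 60 units make up two thirds of what is left.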

There are three types of bad behavior that the network automatically monitors and potentially punishes. If a validator does not validate a block when called upon (‘laziness’), they miss out on a potential Ether reward. The other two forms of bad behavior are submitting contradictory attestations for a block, or submitting multiple block proposals when called upon to submit one block. For these two actions, an amount of Ether is slashed based on the amount of simultaneous bad behavior in the network (the correlation penalty). The penalty scales with the amount of simultaneous bad behavior because hackers need a large stake in the network to have an impact. Thus, if a hacker acts up across the network all at once, an enormous amount of their Ether will be burned, resulting in a huge financial loss.
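A rough model of the correlation penalty is sketched below. It assumes the penalty is the validator's stake times a multiple of the fraction of all staked Ether slashed at about the same time, capped at the full stake. The function name and the multiplier of 3 are assumptions for illustration, and the sketch leaves out any other penalties the real protocol applies.

```python
from fractions import Fraction


def correlation_penalty(validator_stake, slashed_total, staked_total, multiplier=3):
    """Penalty that grows with the share of the network slashed at the same time."""
    share = min(Fraction(1), Fraction(multiplier * slashed_total, staked_total))
    return validator_stake * share


# Hypothetical network with 12,000,000 ETH staked in total.
lone_offender = correlation_penalty(32, 32, 12_000_000)
coordinated_attack = correlation_penalty(32, 4_000_000, 12_000_000)
```

A single misbehaving validator loses a tiny fraction of its 32 ETH under this model, while a validator that takes part in an attack involving a third of the network loses its entire stake.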

As with Proof of Work, there is the risk of a 51% attack. In order for a hacker to mount such an attack, they would need to own 51% or more of staked Ether (around $15 billion USD). The non-compromised nodes have methods of counter-attack, such as running a smaller fork. If such an attacker were to fail in taking over the network, they would face an enormous financial loss, making it a poor risk to take.

According to Ethereum documentation, Proof of Stake makes the following improvements upon Proof of Work:

Better energy efficiency - there is no need to use lots of energy on proof-of-work computations.

Lower barriers to entry, reduced hardware requirements - there is no need for elite hardware to stand a chance of creating new blocks.

Reduced centralization risk - proof-of-stake should lead to more nodes securing the network.

Issuance reduction - because of the low energy requirement, less ETH issuance is required to incentivize participation.

Cost of attack - economic penalties for misbehavior make 51% style attacks exponentially more costly for an attacker compared to proof-of-work.

Recovering from an attack - the community can resort to social recovery of an honest chain if a 51% attack were to overcome the crypto-economic defenses.

Cons of Proof of Stake include the fact that it is not a time-tested system, especially compared to Proof of Work. The system is also far more complex, both conceptually and in terms of implementation. Users must also set up and run three different pieces of software to participate in the network. Although staking is often less expensive than buying competitive mining equipment, the financial barrier to entry is still tens of thousands of dollars, which is out of reach for most people.

With that out of the way, let's discuss the impact and why it's important. ConsenSys created a beautiful breakdown of the impact into three important categories: sustainability, security, and scalability. We will use these categories as we explore the impact.

Sustainability

Right now, there is no doubt that the power used by Proof of Work blockchains is egregious. Bitcoin swallows up the same amount of energy every year as a small developed country, like Switzerland. Ethereum is giving Bitcoin a run for its money, using an equivalent amount of energy to the Netherlands every year. To break this down on a micro level, a single Ethereum transaction uses as much power as the average U.S. household uses in one week. I think we can all identify why this is an objective problem.

In switching to Proof of Stake, there is no longer a need for massive computing power to run Ethereum's consensus mechanism. Ethereum's energy usage is expected to drop by roughly 99.95% with this switch, bringing the average energy cost of a transaction down to the equivalent of making a cup of coffee. For environmental reasons, this is obviously a big deal. This also, however, has much larger implications for blockchain technology and Web3 as a whole.

The Ethereum blockchain is the bedrock of an enormous portion of Web3 applications, cryptocurrencies, DAOs, tokens, and more. Environmental considerations prevented the proliferation of larger scale and longer term applications. Removing that barrier opens the door for endless possibilities.

Security

Recall how under Proof of Work we discussed the possibility of a 51% attack, in which someone controls more than half of all computing power devoted to validation on the blockchain. The cost of such a feat is exorbitant but not impossible. With the switch to Proof of Stake, I don't want to call an attack impossible and be proven wrong, but it is practically infeasible. An attacker would need to own 51% or more of all staked Ether, which is an unfathomable amount of money.

Additionally, less money is now needed to reward block producers for their 'work.' Previously, miners had to pay upfront fixed costs to buy the hardware needed to hash competitively, plus a variable cost for electricity, which could fluctuate rapidly due to a variety of factors. Now, validators have no comparable hardware or electricity costs (their stake is collateral rather than an expense), so compensation does not need to be as high to cover costs. This disincentivizes poor behavior and encourages more sustainable growth.

The update further disincentivizes malicious behavior by requiring validators to put up Ether as collateral. If a validator begins deviating from its duties or acting suspiciously, its collateral can be slowly burned until proper behavior resumes or it leaves the network. Such disincentives make an attack an even more costly endeavor. The update also contains several other remediation avenues in the event that validation processes are hijacked.

Scalability

The Merge removes major concerns held by both developers and users around security and sustainability. They are now free to build unhindered projects that will undoubtedly change the landscape of Web3. Additionally, The Merge is the first step in a sequence of upgrades intended to promote rapid growth within the network. The groundwork will now be laid for a process called "sharding," which partitions data across the chain. Additionally, "rollups" can be used to process transactions off-chain and post the results back to Ethereum in compressed form, taking load off the main chain.

Jacob Stein

Hello! I'm a junior at Boston University studying computer science and political science. I have a strong interest in blockchain technology use-cases and implementation. This blog is just meant to document and explore my areas of interests. Feel free to comment, or contact me at jmstein@bu.edu.
