Cryptographic Asymmetry and How To Shut Down A Cosmos-Ethereum Bridge

12/27/2023

Recently, ging3r and I (Strikeout) found a vulnerability in the Gravity Bridge. Since I wrote up the last few vulnerabilities on this blog, Ging3r wanted to use his voice to tell a story. Ging3r decided to write on my blog, becoming the first guest blogger on my site. He's got a good voice and explains the bug well. Enjoy! :)

Asymmetry

I’ve been programming and hacking for about 10 years now, and one of my favorite things in this space is the asymmetry inherently created by cryptography. The most powerful governments in the history of the world have been brought low, and forced to spend billions of dollars to find side channels and make extortive deals, all because given b^k = a mod n, it’s quite difficult to figure out k.

It truly is that simple. There is some essential hardness (in the computational complexity sense of the word) that our symbolic representation of that problem latches onto that is thus far impenetrable. Even organizations with strong incentives to do so and billions of dollars to throw at legions of mathematicians have yet to figure out a simple way to solve such a simple problem. Yet little old me, here in my little old room, can conceptually latch onto and use those essentially hard problems, for any variety of fascinating ends. With real world consequences.

The set of real world consequences of our manipulations of those symbols has, thanks to Satoshi Nakamoto and others, been thoroughly brought into the economic realm. The mixing of ethereal, abstract mathematical concepts with our human flesh and blood economic incentivization schemes is part of what makes the blockchain space so fascinating to me. Where else in history has such a crossing of conceptual wires occurred? Did Diffie and Hellman know that one day they’d play a part in putting the not-so-metaphorical food on the literal table for a whole army of blockchain devs, auditors, and blackhat hackers?

In this post, I’m going to explain how I (Ging3r) and Strikeout (the owner of this blog) recently discovered a bug in a Cosmos to Ethereum bridge that, while being only a chain halt, still showed some of the remarkable signs of that inherent asymmetry that I find so fascinating. Before I explain, however, there are a few important prerequisites.

What Is Cosmos?

Cosmos Banner
Figure 1: Cosmos Ecosystem

Cosmos is one iteration of interchain technology. Instead of trying to create a whole bunch of unique chains each with their own ecosystem, Cosmos seeks to create an ecosystem where a bunch of unique chains can each do their own thing well, and use their integrated interconnectivity to communicate with all other Cosmos chains seamlessly.

Cosmos uses the idea of application specific blockchains to craft this ecosystem uniquely suited to interconnectivity. The Cosmos SDK provides all the protocols and packages and algorithms; you just build your application specific chain to do its application specific thing. Then turn on a few extra knobs and toggles and whistles and such and you can natively interact with all the other chains using the same protocols. I’ll save some ink for both you and me and keep my brief description brief, but you can also read Strikeout’s excellent introduction to the Cosmos SDK.

What is the Gravity Bridge?

Gravity Bridge Banner
Figure 2: Gravity Bridge Banner

Of course, any up and coming blockchain project, to accomplish anything, has to offer quick and efficient bridging from Ethereum. The Gravity Bridge is one such option for bridging tokens from Ethereum into the Cosmos ecosystem.

Gravity Bridge is built primarily using the Cosmos SDK, which means that the actual governance for the chain and much of the real application logic happens Cosmos-side. All the Gravity Bridge functionality is bundled up into a plug-and-play module, usable on its own but also importable into other Cosmos blockchain projects. After all, we love interoperability in the cosmoverse. This logic runs on any number of Gravity Bridge nodes that run the orchestrator and validator functionality. The validator logic is the underlying code running consensus for the Tendermint blocks powering the Gravity Bridge Cosmos chain itself, and the orchestrator runs in tandem with the validator, signing all the important messages and interacting with both the Ethereum side and the Cosmos side using the validator’s private key. The relayer functionality runs in tandem with the orchestrator and validator to do economically incentivized relaying of transaction batches, logic calls, and validator set updates (more on those later). These relationships are all shown in the overview diagram in Figure 3.

Gravity Bridge Architecture
Figure 3: Gravity Bridge Architecture

Currently, the relayer functionality is included within the orchestrator build. However, this is not required, and anyone could independently run a relayer to receive rewards. Architecturally, all of the relaying behavior is economically incentivized, with rewards given out for correct submission of each of the aforementioned sets of data. Important to note for this post is that if the relayers become unable to do their thing, the whole chain will cease to function because there’s no one actually submitting the correct messages to the Ethereum side of the bridge.

Speaking of which, on the Ethereum side there is a contract called Gravity.sol running on mainnet. This is the entry-point for Ethereum tokens to be bridged over to Cosmos. Any contract on Ethereum can send their tokens to this contract, where they will be locked and represented on the Cosmos side of the bridge, until the contract gets confirmation that the Cosmos representation of that token has been burned. Then the user can re-print their Ethereum tokens.

So the orchestrator, validator, and relayer are all running on any number of nodes (I think more than 100 currently) and built very securely, with economic incentivization in the correct places to encourage fast processing. This is a pretty robust system to attempt to bring down. We tried many routes, had many ideas, and ran down many metaphorical rabbit trails, most of which turned up nothing. After pursuing many false paths for many weeks, and understanding more and more of how the system works, we had an idea.

The Real Fun Begins

As mentioned, most of the real application logic happens Cosmos-side. This is where the legitimacy of observed events is decided upon by the validators. For example, if the Gravity.sol contract emits a SendToCosmos event that says I sent 3 million Ging3rCoin to the bridge, at least ⅔ of the delegated validators must all submit a message claiming to have witnessed the same event on mainnet Ethereum. Only then will actions be taken to mint the Cosmos representation of my Ging3rCoins. This is not just for sending coins across the bridge; this is how the legitimacy of all relevant Ethereum events is decided.
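To make the threshold idea concrete, here is a minimal sketch of that kind of tally. It is not the bridge's code: the function name, the validator names, and the choice to weight votes by voting power are all assumptions made for illustration.

```python
from fractions import Fraction

# Assumption: an event counts as observed once validators holding at least
# 2/3 of the total voting power have claimed to see it. All names here are
# hypothetical and only illustrate the tallying idea.
THRESHOLD = Fraction(2, 3)


def event_is_observed(power_by_validator, claims, event_id):
    """Return True if enough voting power has attested to event_id."""
    total = sum(power_by_validator.values())
    attested = sum(
        power
        for validator, power in power_by_validator.items()
        if claims.get(validator) == event_id
    )
    return Fraction(attested, total) >= THRESHOLD


powers = {"val-a": 40, "val-b": 35, "val-c": 25}
claims = {"val-a": "SendToCosmos#7", "val-b": "SendToCosmos#7"}

# 75 of 100 units of power attested, which clears the 2/3 threshold.
print(event_is_observed(powers, claims, "SendToCosmos#7"))
```

Using exact fractions avoids any floating point surprises right at the threshold.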

The intricacies of that process are not important to the specifics of this bug. However, it’s necessary to know this process occurs in order to understand that a record of which private keys are legitimately related to validators is required to be maintained and thoroughly synced within Gravity.sol and the Cosmos chain’s store.

So how is that set of validators (this is appropriately called a Valset) kept in sync on both sides of the chain? This is done by using the orchestrators to reach consensus on whether or not a given Valset update is allowed to pass, and then incentivizing the submission of any valid Valset updates by relayers. Anyone can trigger a request to perform a Valset update, and there are some complicated checks that happen to validate that an update is needed, but if it is, the orchestrators will then sign over the entire set of validators (which will include any new validators and drop any old inactive ones) and their respective addresses. There are more complicated checks here, but essentially this results in each orchestrator submitting a message to the Cosmos chain called a MsgValsetConfirm.

The MsgValsetConfirm contains:

  1. A nonce
  2. The orchestrator’s Cosmos-side address
  3. The ETH address associated with the orchestrator
  4. The signature over the validator set, signed by the orchestrator
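As a rough mental model, the four fields above could be represented like this. The class name, field names, placeholder addresses, and the grouping helper are hypothetical and are not taken from the Gravity Bridge source; the helper only shows why the nonce matters when signatures for one particular validator set need to be gathered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValsetConfirm:
    """Hypothetical stand-in for the MsgValsetConfirm fields listed above."""
    nonce: int          # which validator set update this confirm refers to
    orchestrator: str   # the orchestrator's Cosmos-side address
    eth_address: str    # the ETH address associated with the orchestrator
    signature: bytes    # signature over the validator set


def confirms_for_nonce(store, nonce):
    """Collect one confirm per orchestrator for the given valset nonce."""
    return {c.orchestrator: c for c in store if c.nonce == nonce}


# Placeholder addresses and signatures, for illustration only.
store = [
    ValsetConfirm(7, "gravity1aaa", "0xAAA", b"\x01"),
    ValsetConfirm(7, "gravity1bbb", "0xBBB", b"\x02"),
    ValsetConfirm(6, "gravity1aaa", "0xAAA", b"\x03"),
]

bundle = confirms_for_nonce(store, 7)
print(sorted(bundle))
```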

These messages are then kept in the chain store, and sorted by which orchestrator submitted them. Then, when it is time for a validator set update to happen, a relayer can collect all these ValsetConfirm messages, correlate them with the potential validator set they are signing on, and submit all the signatures and the respective validator set to the updateValset function on the Gravity.sol contract. If successful, this will update the Valset on Gravity.sol, which will then be observed, which will trigger the Valset being updated within the Cosmos chain.

This function does a number of things. But of note to us, it checks for signature correctness, and most importantly ends up calling ECDSA.recover on the passed in signatures.

Famously, the signature checking functions commonly used in Solidity (OpenZeppelin’s ECDSA.recover, for instance) now perform signature malleability checks to ensure that a signature can’t be used twice. I won’t go too deep into the weeds on this, but to summarize, any ECDSA signature (r, s) can be manipulated, without knowing the secret key used in the signing process, into a second valid signature (r, n - s), where n is the order of the curve. This is because of an essential symmetry in elliptic curve cryptography: a point and its mirror image share the same x coordinate, so the mathematics works on either side of the curve. This essential elliptic curve symmetry, interestingly enough, gives us the asymmetry to halt Gravity Bridge.

ECC Curve
Figure 4: Classic ECC Curve

Notice the symmetry over the x axis in Figure 4. This is what allows for two potential s values in the signing process. Of course things look a bit different when defining that elliptic curve over a finite field, but that essential symmetry holds.
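For the curious, the symmetry can be checked with a small, deliberately naive ECDSA implementation over secp256k1. This is a teaching sketch only: the private key and nonce are fixed toy values, SHA-256 stands in for the Keccak hash Ethereum actually uses, and nothing here is constant-time or safe for real signing.

```python
import hashlib

# secp256k1 domain parameters (public constants).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def sign(z, priv, k):
    """Textbook ECDSA signing with a caller-supplied nonce (demo only)."""
    r = mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * priv) % N
    return r, s


def verify(z, sig, pub):
    """Textbook ECDSA verification, with no low-s requirement."""
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = add(mul(z * w % N, G), mul(r * w % N, pub))
    return point is not None and point[0] % N == r


priv = 0x1CEB00DA          # toy private key
pub = mul(priv, G)
z = int.from_bytes(hashlib.sha256(b"valset checkpoint").digest(), "big")

r, s = sign(z, priv, k=0xC0FFEE)   # fixed nonce: never do this for real
mirrored = (r, N - s)              # computed without touching priv

# Both the original and the mirrored signature verify.
print(verify(z, (r, s), pub), verify(z, mirrored, pub))
```

Replacing s with n - s negates the point that verification reconstructs, and a point and its negation share the same x coordinate, which is all the verifier compares against r.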

Because the Solidity contract does a check to ensure that the signatures have a valid s value, any relayer can submit an invalid signature as part of the validator set, and cause the updateValset function to revert every time. This is all well and good since the relayer still wouldn’t get paid, and thus wouldn’t have any incentive to submit an invalid update again and again. However, that's not the end of the story.
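The check itself is tiny. Below is a sketch of a low-s test and of the flip an attacker would apply, written in Python rather than Solidity. The helper names are made up for this post, and the 27/28 encoding of the recovery value v is assumed; the one thing a flipped signature also needs is the other v, so that recovery still yields the same address.

```python
# Order of the secp256k1 group and its lower half.
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HALF_N = SECP256K1_N // 2


def is_low_s(s):
    """True if s is in the lower half of the group order (the accepted form)."""
    return 0 < s <= HALF_N


def flip(r, s, v):
    """Return the mirrored signature. Assumes v is encoded as 27 or 28."""
    return r, SECP256K1_N - s, 55 - v


# Toy values, not a real signature.
r, s, v = 0xABCDEF, 0x1234, 27
mirrored = flip(r, s, v)

print(is_low_s(s), is_low_s(mirrored[1]))  # True False
```

Since the group order is odd, exactly one of s and n - s is in the lower half, so a strict verifier accepts exactly one of the two forms.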

Forced Ethereum Reverts

Strikeout realized this was an issue when he checked to see whether the same s value check (i.e., a malleability check) exists in the Cosmos code where the MsgValsetConfirm messages are validated. To spare you the code-pathy details, the important checks on the signatures are performed in the EthAddressFromSignature function, shown below in Figure 5. And, fascinatingly enough, there was no check for a modified s-value in this code:

Gravity Bridge Cosmos Signature Validation
Figure 5: Gravity Bridge Cosmos Signature Validation

This means that any orchestrator could submit a MsgValsetConfirm with a modified signature, and then every relayer would pick up that signature, and use it in all subsequent attempts to update the validator set in Gravity.sol, which would fail every single time.
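The mismatch can be modeled in a few lines. This is a toy model, not the bridge's code: signature verification is replaced by a simple membership test (each signer has one canonical low s, and both s and n - s count as valid), and every function name is invented for illustration.

```python
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HALF_N = N // 2


def verifies(sig_s, canonical_s):
    """Toy stand-in for ECDSA verification: both s and N - s are valid."""
    return sig_s in (canonical_s, N - canonical_s)


def cosmos_side_accepts(sig_s, canonical_s):
    # Models the behaviour described above: no low-s requirement.
    return verifies(sig_s, canonical_s)


def ethereum_side_accepts(sig_s, canonical_s):
    # Models a strict verifier: valid AND in the lower half.
    return verifies(sig_s, canonical_s) and sig_s <= HALF_N


def update_valset(confirms):
    """Succeeds only if every bundled signature passes the strict check."""
    return all(ethereum_side_accepts(s, canon) for s, canon in confirms)


# (submitted s, signer's canonical s) pairs with toy values.
honest = [(1111, 1111), (2222, 2222)]
malicious = (N - 3333, 3333)

print(cosmos_side_accepts(*malicious))         # True: stored on the Cosmos side
print(update_valset(honest))                   # True
print(update_valset(honest + [malicious]))     # False: the whole update reverts
```

One permissive validator upstream of one strict validator is all it takes: anything the first accepts and the second rejects becomes a poison pill for whoever has to carry it across.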

If an orchestrator then at a later time wanted a Valset update to occur, they could begin using their unmodified signature to sign that message, and it would then go through. This same functionality holds for validator signing on all other Ethereum events. So, at the very least, a malicious orchestrator could gain control over what updates and transactions pass and which ones do not, and could halt key functionality of the chain if they wanted to. Thus, I proceeded to PoC this bug for submission. However, in the process of creating the PoC for this, I figured out something much worse.

Forced Relayer Crash

Deep within the Rust relayer code, a check is made against the Valset before it gets submitted, to ensure it won’t get rejected. This is to ensure that money isn’t wasted on gas. Part of this involves ensuring that the signatures are all valid. If there is an invalid signature in this set, the Rust relayer will completely halt and exit this main relaying loop. One of the checks for an invalid signature verifies that the s value for the signature is within the bottom half (so to speak) of the elliptic curve, which is shown below in Figure 6.

Orchestrator Signature Validation
Figure 6: Orchestrator Signature Validation

If this error is hit, a panic will occur within the main relaying loop and it will exit. Additionally, the check for potential Valset updates actually happens first within the main relayer loop. This means the code will always halt before relaying batches or logic calls or anything else.

So, if a single signature in the set is not valid, the relayer will panic, and completely stop its relaying. And, on the next loop, will again pull the invalid signature from the chain stores, panic on its validation, restart, pull the invalid signature from the chain stores… Repeat ad infinitum.
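Here is a simplified simulation of that loop, in Python rather than Rust. The structure (valset check first, then batches and logic calls) follows the description above, but the names, the exception standing in for a panic, and the shape of the chain store are assumptions made for the sketch.

```python
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HALF_N = N // 2


class RelayerPanic(Exception):
    """Stands in for the panic described above."""


def check_valset_confirms(confirm_s_values):
    """Pre-submission check: refuse any signature with a high s value."""
    for s in confirm_s_values:
        if not 0 < s <= HALF_N:
            raise RelayerPanic("high-s signature in valset confirms")


def relay_iteration(chain_store, relayed):
    # Valset updates are handled first, so a panic here skips everything below.
    check_valset_confirms(chain_store["valset_confirms"])
    relayed.extend(chain_store["batches"])
    relayed.extend(chain_store["logic_calls"])


def run(chain_store, iterations):
    relayed, panics = [], 0
    for _ in range(iterations):
        try:
            relay_iteration(chain_store, relayed)
        except RelayerPanic:
            panics += 1  # restart; the stored confirm is still there next time
    return relayed, panics


# One honest confirm and one flipped (high-s) confirm, using toy values.
store = {
    "valset_confirms": [12345, N - 678],
    "batches": ["batch-1"],
    "logic_calls": ["call-1"],
}

relayed, panics = run(store, 5)
print(relayed, panics)  # [] 5
```

Nothing is ever relayed, no matter how many times the loop restarts, because the bad confirm lives in the chain store rather than in the relayer.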

So putting it all together, we realized that a single orchestrator can submit a single message, and halt all relayer functionality, and thus halt all functionality on the bridge. On top of that, there is no built-in method for removing the MsgValsetConfirm once it has been accepted on the Cosmos chain. So, in the event of a malicious orchestrator performing this attack, the only recourse would be to manually update every relayer and orchestrator. Only then would any functionality be restored. That sounds like a lot of downtime.

Relayer Erroring Out
Figure 7: Relayer Erroring Out

Watching the PoC (shown in Figure 7 above) for this exploit cause the chain to stop, and seeing every relayer in the test setup throw a panic every time it began its main loop, reiterated, once again, one of my favorite things in the technological world: cryptographic asymmetry. A single mathematical calculation, changing a value that can be represented in two possible ways, causes an extremely robustly designed bridge to come to a total halt. Cool.

One other thing I want to note is that this vulnerability applied to all message types going from Cosmos to Ethereum via the relayer. Thus, it was also possible for a malicious orchestrator to drop only specific logic calls or batches from being relayed. However, the full relayer halt was by far the most impactful instance of the vulnerability.

Reporting

There are two main forks of the Gravity Bridge, one of which is the currently maintained and active bridge. The other fork is maintained by a different team (as far as we can tell) and is used by a handful of other projects, one of which is called Sommelier. This was actually how we initially found out about the Gravity Bridge. Because there were two separate forks that each required different fixes and had different impacts, we reached out to both teams independently.

The Gravity Bridge team responded very quickly, and patched the bug within a couple of days. This resulted in Strikeout and me receiving the first on-chain bug bounty proposal via governance. Since Strikeout and I are both newish to the web3 space, we thought it was awesome to see the actual technology being used to reward our bug bounty hunting efforts. Big kudos to the Gravity Bridge team and their quick response, and thank you to the GRAV community for the bug bounty!

The Sommelier team responded with a fix of this bug in the fork they use, and rewarded us for our efforts with another bug bounty as well. Big thanks to the Sommelier team for understanding the slightly different impact in their fork of the chain, and a huge thanks for the bug bounty reward!

Overall, communication with both teams went well, and both teams’ efforts to quickly understand and patch the bug are appreciated. Strikeout and I would both feel great about hacking on anything for either team in the future.

Takeaways

If you’re a web3 hacker or developer reading this, perhaps a generalizable takeaway is to check for relevant validation on both sides of all cross-chain functionality. If the Cosmos-side code already had the same checks for signature malleability as the Ethereum code inherently did, this issue would not have been possible. Anyone approaching the bridge from a purely Solidity background might assume that, because they’re using the right function in Gravity.sol, signature malleability presents no attack vector. But if you approach the bridge with first-principles thinking, reasoning independently about validation on all sides, the bug becomes obvious. I think in general, this is where the best hacking is done. Deep dives with first-principles thinking. Processes and methodology and techniques are important, but they can only take you so far.

If you’re a reader not currently in the web3 space, think about all the other wonderful examples of fascinating interactivity between mathematics and the real world that you could find, if you took the dive. After all, there are plenty of fascinating and brilliant bugs to go around.

Strikeout is working on a relevant blog post about some of the defense in depth measures that we discovered that proactively prevented attacks that we were working towards on Gravity Bridge. So definitely keep an eye out for that!

And never forget: think asymmetrically.

Until next time,
ging3r