Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Supporting the Smart Contract Vulnerability Research Community - 1287

Chainlink Labs    Reference → Posted 2 Years Ago
  • Chainlink is a network used by many blockchains for many things: it provides price oracles for tokens, verifiable random numbers and much more.
  • As such a major part of the ecosystem, they take security very seriously. They have the best of the best audit their software and have a very big bug bounty program on HackerOne and Immunefi. They've gotten audits from Code4rena and other top firms.
  • Trust (OG auditor) and another researcher, Zach (LSR at Spearbit), found a very niche flaw in the Verifiable Random Function (VRF) system. When generating random numbers, the flow works as follows:
    1. Request randomness from the Chainlink contract. This emits an event that will be acted upon.
    2. A callback from Chainlink is made to deliver the random number with a proof.
  • A subtle but important detail is that the random number delivered should be the only one delivered. Why is this important? If a user can force a redraw arbitrarily, then the system becomes unfair: if a user doesn't like a number, they can just re-request the randomness until it's favorable. With bad setups, this can be an issue with Chainlink.
  • The issue is that the subscription owner role within a Chainlink subscription can block randomness from coming in and then force a redraw. This role is typically reserved for a member of the hosting DApp, making it a very privileged position.
  • The hackers were given a 300K bounty from Immunefi for the critical finding. To me, a privileged role being able to redraw randomness doesn't feel like that big of a finding. However, considering this is Chainlink, which supports many use cases, they want to ensure that even in a completely decentralized application a single role cannot abuse Chainlink. Good write up!
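
The unfairness of a forced redraw is easy to see in a toy model (my own sketch, not Chainlink's actual interface):

```python
import random

class ToyVRF:
    """Toy model of the request/callback flow above (not Chainlink's API)."""
    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def request(self):
        # request + oracle callback collapsed into one step for the sketch
        return self.rng.randrange(100)

def honest_consumer(vrf):
    # fairness means: one request, one delivered number, no do-overs
    return vrf.request()

def censoring_owner(vrf, favorable):
    # a privileged subscription owner who can block the callback just
    # censors unfavorable deliveries and requests again: a free redraw
    while True:
        value = vrf.request()
        if favorable(value):
            return value  # let this callback land on-chain
        # else: block the callback and loop

vrf = ToyVRF(seed=42)
value = censoring_owner(vrf, lambda v: v >= 95)
assert value >= 95  # the "random" outcome is effectively chosen
```

An honest consumer gets one draw and lives with it; a censoring owner effectively picks the outcome.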

lateralus (CVE-2023-32407) - a macOS TCC bypass - 1286

Gergely    Reference → Posted 2 Years Ago
  • macOS has too much going on for its own good; there are way too many things to analyze statically. So, the author created a tool to pick out apps with the FDA (Full Disk Access) entitlement and run a syscall trace on them. When looking for reads of files and environment variables, he noticed some scary hits. The article is about a scan that led to a bug.
  • The environment variable MTL_DUMP_PIPELINES_TO_JSON_FILE is a Metal framework variable honored by various macOS programs. When set, the framework opens the specified file in the current application and writes pipeline data to it. Pretty simple!
  • How does this work? Courtesy of the fs_usage command:
    1. A file will be opened using the open() syscall on a temporary file.
    2. write() is called to write to this file.
    3. rename() is called on the temporary file to move it to the path we control.
  • rename() in place like this is not safe. But why? There's a race condition between the open and the copying of data: a classic time-of-check to time-of-use (TOCTOU) bug. By changing the file to a symlink to something else at the right time, we can cause major havoc!
  • Even better, we can control the log data being written by catching the tempfile creation when it occurs. So, when the renaming occurs, we control the data being written in the file. Between the data controlling and the renaming TOCTOU issue, we can write to an arbitrary location with arbitrary data. Pretty neat!
  • How does the author go about exploiting this?
    1. Create a symlink that points to the Apple TCC directory.
    2. Create a directory at an attacker controlled location.
    3. Set the vulnerable ENV var to a file in our temporary directory with the vulnerable app running.
    4. Catch the open() of the temporary file in the directory and write our malicious TCC database to it.
    5. Switch the information in the symlink over and over again until the execution occurs.
    6. Wait and see if we successfully won the race.
  • With some luck, the TCC.db file was overwritten with our own! It's a pretty slick bug that exploits complexity within the rename syscall. Apple fixed this by removing most of the Metal ENV variables.
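
The directory-symlink trick at the heart of the race can be demonstrated deterministically (no actual race below; all paths and file names are made up for the demo):

```python
import os
import tempfile

# open()/write()/rename() resolve the target path at *use* time, so
# retargeting a symlinked directory between those steps redirects the
# final write somewhere else entirely.
base = tempfile.mkdtemp()
harmless = os.path.join(base, "harmless")
protected = os.path.join(base, "protected")  # stand-in for the TCC dir
os.mkdir(harmless)
os.mkdir(protected)

link = os.path.join(base, "out")  # the directory the victim writes into
os.symlink(harmless, link)

# Victim behavior (modeled): create a temp file, write "log" data to it...
tmp = os.path.join(base, "pipelines.json.tmp")
with open(tmp, "w") as f:
    f.write('{"attacker": "controlled"}')  # we also control this content

# ...attacker "wins the race": retarget the symlink before the rename...
os.remove(link)
os.symlink(protected, link)

# ...so the victim's rename() now lands inside the protected directory.
os.rename(tmp, os.path.join(link, "TCC.db"))

assert os.path.exists(os.path.join(protected, "TCC.db"))
```

In the real bug the swap has to beat the victim's rename() in a tight window, which is why the exploit loops until it wins.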

Reptar - 1285

Tavis Ormandy    Reference → Posted 2 Years Ago
  • The rep movsb instruction is a super common way to move around memory in x86. The destination, direction and amount are all set in this call, but the processor does stuff under the hood.
  • In x86, instruction decoding is very relaxed. Sometimes, compilers use redundant prefixes to pad a single instruction to a nice alignment boundary. There are several prefixes that can be used, such as rex, vex and evex. On i386, there are only 8 registers, encoded with 3 bits in the instruction. When this was doubled to 16 registers in x86-64, there was nowhere for the extra bit to go.
  • So, the rex prefix adds an additional byte to the beginning of the instruction to encode this information. If it's found before an instruction like movsb that can't use it, it's silently ignored. Well, in most cases. The exception involves fast short repeat move (FSRM), a feature all about moving small (less than 128 byte) strings quickly.
  • To test for architecture-level issues, the author of this post uses Oracle Serialization: generate two copies of a program, one transformed to include micro-architectural changes like fencing instructions. If the state of the program differs after serializing it, then something weird has happened.
  • While fuzzing with this technique, they noticed that adding redundant rex.r prefixes to an FSRM-optimized operation caused unpredictable results: branches to random locations, branches being ignored and many other weird things. Somehow, this had corrupted the state.
  • Within a few days, they found out that triggering this on multiple cores led to exceptions and halts. Within an unprivileged guest VM, this could be used to crash the computer! So, what's going on?
  • The CPU has two main components: the frontend and the backend. The frontend fetches and decodes instructions and generates the ops for the backend to execute. The authors of the post think there is a miscalculation of the movsb instruction size, which leads to extra backend entries being processed.
  • Is this exploitable? Probably! However, there is no insight into what's being processed under the hood. So, the information above is just a guess from the author. Awesome post once again!
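
For reference, the shape of the trigger is just a redundant rex prefix squeezed between rep and movsb. Sketching the encoding by hand (an illustration of the byte layout, not the published PoC):

```python
# Legacy prefixes come first; a rex byte (0x40-0x4f) must sit immediately
# before the opcode. For movsb, which has no register fields to extend,
# rex.r carries no meaning and should simply be ignored by the decoder.
REP = 0xF3    # legacy rep prefix
REX_R = 0x44  # rex.r: redundant in front of movsb
MOVSB = 0xA4  # movs byte [rdi], [rsi]

trigger = bytes([REP, REX_R, MOVSB])
assert trigger.hex() == "f344a4"
```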

Retrospecting Unhealthy Order Allowance Vulnerability in Perpetual Protocol - 1284

ChainLight    Reference → Posted 2 Years Ago
  • Perpetuals are instruments for speculating on the price of an asset at some point in the future; the price can be bet on going either up or down. The vulnerability here is in how the index price is calculated and used.
  • Typically, the index price refers to the average price of the underlying asset, while the mark price is the current price offered by the exchange on the future being traded. Within Perpetual Protocol, these are different though: the index price is the current value of the spot asset, using a TWAP, and the mark price is the most recently traded price value.
  • In futures exchanges, the size of a position is limited by the user's initial margin (the collateral deposited). Otherwise, the user could accrue bad debt, leading to insolvency in the protocol. So, any vulnerability that can create loads of bad debt is bad.
  • Perpetual Protocol did not control this with the method above. Instead, they calculated the value of all positions and allowed orders based upon the index price. Since the index price is somewhat manipulable, this becomes a problem! Raising the price, shorting, then dropping the price could lead to large losses in the protocol.
  • How feasible is it to manipulate a pool? They looked at many of the pools and determined that the vMATIC-vUSD was likely the most manipulatable. The process for hitting this issue is fairly complicated with four accounts. Here's how it goes.
  • First, account 0 creates a massive sell of spot tokens to drive the mark price down to 0.8A; the maximum allowed price change is 20%, due to some existing defense-in-depth measures.
  • Second, account 1 opens a short position at 1.2A, again at the maximum 20% manipulation. At this point, account 2 places a long position from 0.8A up to the maximum 1.2A through a massive purchase of the spot token. On this step, a very large unrealized profit is generated for account 2.
  • Account 3 opens a long taker order at the price of 1.2A as a counterparty for account 1, executing the malicious short taker order at this price. Account 2 then closes its long position to realize its profits. To me, the key is that, since the price is manipulable, both the long and the short can be closed at a profit. Doing this over and over again (once per minute) could have stolen most of the money from the protocol.
  • On Immunefi, the mediation process went south. The reasoning from Perpetual Protocol didn't make any sense, and they offered 5K for a medium instead of 250K for a critical. Eventually, after months of work, they moved this to a critical with a 10K bounty. It seems like specific market conditions had to be met for this to work, but I don't fully understand them.
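
A toy model of why both legs close green once the reference price is manipulable (my own simplified numbers, not Perpetual's actual margin math):

```python
def max_position(margin, price, leverage=10):
    # toy sizing rule: notional allowance depends on the reference price
    return margin * leverage / price

margin = 1_000       # attacker's collateral, in USD
fair_price = 1.0     # "A" in the writeup's notation

# Push the price 20% up before opening the short, and 20% down before
# opening the long (20% being the protocol's max allowed deviation):
short_entry = 1.2 * fair_price  # short opened at an inflated price
long_entry = 0.8 * fair_price   # long opened at a depressed price

size = max_position(margin, fair_price)

# When the price reverts to fair value, *both* legs are profitable:
short_pnl = size * (short_entry - fair_price)
long_pnl = size * (fair_price - long_entry)
assert short_pnl > 0 and long_pnl > 0
```

Repeat the cycle and the "profit" is simply extracted from the protocol's solvency.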

LayerZero's Cross-Chain Messaging Vulnerability - 1283

Heuss    Reference → Posted 2 Years Ago
  • LayerZero is a universal cross chain messaging (CCM) protocol. By having a LayerZero smart contract deployed on a chain, assets can be transferred between chains. A relayer is an entity that submits cross chain transactions. A particular user application can choose the relayer and oracle they would like to use.
  • The author goes through a previous vulnerability within LayerZero. They noticed that, when emitting an event, the relayer address was not being included, and were curious whether this had implications.
  • By setting the fee to 0 on their own oracle/relayer and then switching to the default LayerZero versions, zero fees would be incurred. Being able to not pay fees is bad. Interesting bug by itself!
  • While trying to address this issue, a modification was made to the protocol: when the setConfig() function is called to update the oracle/relayer information, the relayer refrains from relaying messages in that same block.
  • However, the relayer failed to check who sent the update to the configuration. If an attacker sends a message to a regular UA and then has a malicious UA call setConfig() within the same transaction, the message will not be relayed.
  • This is very severe! The outbound nonce is incremented with each message, making it impossible to get the dropped message relayed later. Originally, they reported this as a critical vulnerability. However, the development team had a way to force-send transactions that were not originally relayed, downgrading this to a medium severity bug.
  • A good design from the LayerZero team! I like to think they modeled their threats and came up with solutions for many of them. In this case, they probably thought about a relayer dropping a message entirely when building out this solution. Smart protocol design creates situations where critical vulnerabilities are recoverable, which is super cool.
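
A sketch of why one censored nonce is so severe in an ordered channel, and why the force-send escape hatch matters (my own toy model, not LayerZero's contracts):

```python
class OrderedChannel:
    """Toy nonce-ordered message channel."""
    def __init__(self):
        self.inbound_nonce = 0
        self.delivered = []

    def relay(self, nonce, payload):
        # messages must arrive strictly in order
        if nonce != self.inbound_nonce + 1:
            return False  # everything behind the gap is stuck
        self.inbound_nonce = nonce
        self.delivered.append(payload)
        return True

    def force_receive(self, nonce, payload):
        # privileged recovery path: deliver a dropped message out of band
        self.inbound_nonce = max(self.inbound_nonce, nonce)
        self.delivered.append(payload)

ch = OrderedChannel()
assert ch.relay(1, "a")
# nonce 2 is censored by the attacker, so nonce 3 can never be relayed
assert not ch.relay(3, "c")
# the escape hatch turns a permanent freeze into a recoverable incident
ch.force_receive(2, "b")
assert ch.relay(3, "c")
assert ch.delivered == ["a", "b", "c"]
```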

Optimism Censorship Bug Disclosure - 1282

iosiro    Reference → Posted 2 Years Ago
  • Optimism is an L2 blockchain. The idea is that Ethereum is too slow and too expensive, so if we roll up a large number of transactions into a single transaction sent to Ethereum, the gas cost can be shared between them, making it much cheaper.
  • A sequencer is a program that takes in the proposed transactions and submits them to Ethereum. Of course, lots of proofs and things are done prior to this. In front of the sequencer is a load balancer that rate limits the traffic coming in.
  • To detect the number of attempts, rate limiting is calculated based upon the signed transactions per account within a given time window. To prevent censorship via replayed transactions, transactions are discarded if the nonce is lower than the account's current nonce. Source-IP rate limiting is done as well.
  • Rate limiting is a great feature to prevent network spamming. However, its logic can be flawed as well; if developers are not careful, the feature can be used against the system. In this case, the program was not checking the chain id!
  • So, if a transaction from another chain had a nonce higher than the account's nonce on Optimism, it was valid for the rate limiting. Down the road, EIP-155 would reject the transaction, but it would still trigger the rate-limiting functionality. By taking transactions from another chain, a user could rate limit an arbitrary account indefinitely.
  • Specific accounts in the network have special permissions or are really important to other parts of the ecosystem. LayerZero being taken down, censoring of bridges, ProxyAdmin changes: many, many things could be broken. Additionally, this could create strange edge cases in the system by choosing when transactions go through and when they don't.
  • The authors rated this as critical. Considering that any user could block any transaction, I understand that. However, it would be identified and fixed within a few days after reviewing the logs of the proxy. The Optimism team decided this was a medium risk finding in the end. Sadly, it was also marked as out of scope, which I hate.
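
The missing chain-id check can be sketched like this (my own reconstruction of the counting logic, not Optimism's actual code):

```python
from collections import defaultdict

LIMIT = 3  # toy: max counted transactions per account per window

class RateLimiter:
    def __init__(self, expected_chain_id, check_chain_id):
        self.expected = expected_chain_id
        self.check = check_chain_id
        self.counts = defaultdict(int)
        self.nonces = defaultdict(int)

    def accept(self, sender, nonce, chain_id):
        if self.check and chain_id != self.expected:
            return True  # foreign tx: not counted (EIP-155 rejects it later)
        if nonce < self.nonces[sender]:
            return True  # stale nonce: discarded, not counted
        self.nonces[sender] = nonce
        self.counts[sender] += 1  # the bug: foreign txs land here too
        return self.counts[sender] <= LIMIT

victim = "0xabcd"
buggy = RateLimiter(expected_chain_id=10, check_chain_id=False)
# attacker replays the victim's txs from another chain (higher nonces)
for nonce in range(100, 100 + LIMIT):
    buggy.accept(victim, nonce, chain_id=1)
# the victim's own, valid Optimism tx is now censored
assert not buggy.accept(victim, 5_000, chain_id=10)

fixed = RateLimiter(expected_chain_id=10, check_chain_id=True)
for nonce in range(100, 100 + LIMIT):
    fixed.accept(victim, nonce, chain_id=1)
assert fixed.accept(victim, 5_000, chain_id=10)
```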

Oh-Auth - Abusing OAuth to take over millions of accounts - 1281

Aviad Carmel - Salt Labs    Reference → Posted 2 Years Ago
  • OAuth (Open Authorization) is a standard authorization protocol. It is used all over the place with SSO providers to allow for a trusted entity, like Google or Facebook, to authenticate you to other sites. However, there are many footguns with this.
  • The flow for OAuth is as follows:
    1. A user tries to log in to some site. The site wants proof of their identity via an SSO provider, like Facebook, so a redirect is made there.
    2. If the user is logged in with the SSO provider, a secret is passed to them upon the redirect. They are then redirected back to the main website.
    3. The website takes the secret and communicates with the SSO provider to establish who the user is. Now, the identity can be used to get information like their email and other things from the SSO provider.
  • On Vidio, there was an issue with the verification of the access token on the redirect back to the main website. When Vidio makes a request to Facebook (the SSO provider in this case), there is an app identifier (AppID) for each app. However, it is the responsibility of the website (not Facebook) to ensure that the token belongs to its app.
  • So, by providing a token from another Facebook app that the attacker controls, they can return an arbitrary email, which results in an account takeover. This same attack worked on Bukalapak as well. Grammarly used an auth flow that was not vulnerable to this issue by default, but by brute forcing parameters the researchers found a flow that was vulnerable to the method above.
  • Each SSO provider has its own variants of these attacks, which is super interesting. To me, it makes sense to force the app developer to specify the AppID instead of requiring manual verification; this is already done on one of the Facebook flows. Considering this, I'm sure many other providers and websites are vulnerable to this attack. Good vulnerability description!
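
The missing check boils down to a one-line comparison. Here's a toy model (not Facebook's real API; names are made up) of the vulnerable versus fixed login handler:

```python
# token -> (user_email, app id the token was issued to)
TOKENS = {
    "tok_vidio": ("victim@example.com", "app_vidio"),
    "tok_attacker": ("victim@example.com", "app_attacker"),  # harvested via attacker's own app
}

def inspect_token(token):
    """Provider-side: who is this user, and which app got this token?"""
    return TOKENS[token]

def login_vulnerable(token):
    email, _app = inspect_token(token)  # app id ignored!
    return email  # any app's token for the victim = account takeover

def login_fixed(token, my_app_id="app_vidio"):
    email, app = inspect_token(token)
    if app != my_app_id:
        raise PermissionError("token was issued to a different app")
    return email

assert login_vulnerable("tok_attacker") == "victim@example.com"  # the bug
assert login_fixed("tok_vidio") == "victim@example.com"
try:
    login_fixed("tok_attacker")
    raise AssertionError("should have been rejected")
except PermissionError:
    pass
```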

A short note on AWS KEY ID - 1280

Tal Be'ery    Reference → Posted 2 Years Ago
  • With AWS access keys, there are two mandatory parts: the key id and the secret key. The format of the AWS access key is actually predictable, which is super interesting!
  • The first four characters are a prefix for the type of key. This depends on whether it's for a role, a certificate, a regular access key or something else.
  • After this, there are 16 characters. If you base32 decode them, you end up with 10 bytes. The account ID is encoded within the first 5 bytes, shifted by one bit. The author wrote a script that decodes the account ID given the key.
  • The purpose of the remaining 5 bytes is still unknown. I'm guessing it's random data to ensure that the key is unique.

Helping Secure BNB Chain Through Responsible Disclosure - 1279

Felix Wilhelm    Reference → Posted 2 Years Ago
  • The BNB Beacon Chain is the governance and staking layer of the BNB Chain. They use a fork of the Cosmos SDK with many modifications.
  • One of the more sensitive parts is the coin type. The original Cosmos SDK uses a safe bigInt wrapper instead of native types. In the fork, however, the int64 type is used for efficiency reasons. Because of this, integer overflows and underflows are possible when unchecked.
  • The MsgSend message is used for simple token transfers and supports multiple outputs. To prevent theft, a loop ensures that the amounts being sent do not exceed what the user possesses, and verification ensures that the inputs of the transfer match the outputs.
  • Using integer overflows, the verification above is trivial to bypass. In particular, we can send out way more tokens than we own by making the inputs and outputs match after the overflow. This results in the ability to create tokens out of thin air, breaking the blockchain's security.
  • The solution was to patch their fork of the library to not allow overflows in the future. Overall, a fairly simple vulnerability in a popular project.
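
The bypass can be reproduced in miniature by emulating int64 wraparound (a toy version of the flawed check, not the actual Cosmos-fork code):

```python
INT64_MAX = 2**63 - 1

def wrap_i64(x: int) -> int:
    """Two's-complement int64 wraparound, as unchecked native arithmetic would do."""
    return (x + 2**63) % 2**64 - 2**63

def unchecked_sum(amounts):
    total = 0
    for a in amounts:
        total = wrap_i64(total + a)  # the bug: no overflow check
    return total

def transfer_is_valid(inputs, outputs):
    # the verification described above: input total must equal output total
    return unchecked_sum(inputs) == unchecked_sum(outputs)

inputs = [1]                         # attacker really owns 1 token
outputs = [INT64_MAX, INT64_MAX, 3]  # ...but receives ~2^64 tokens
assert transfer_is_valid(inputs, outputs)  # check passes after wraparound
assert sum(outputs) > INT64_MAX            # tokens out of thin air
```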

Rate manipulation in Balancer Boosted Pools — technical postmortem - 1278

Juani    Reference → Posted 2 Years Ago
  • Balancer V2 is a key DeFi protocol with lots of interesting functionality. Within V2, an arbitrary contract is capable of being a pool; this is to maximize innovation and flexibility. The batchSwap() function can be used to perform multiple swaps atomically to get the best path. This also enables flash swaps, since the funds only have to be paid for at the end.
  • Balancer tries to be as capital efficient as possible. In a pool, the ratio of tokens is what determines the price. To have price stability, lots of tokens are required; this ends up with a large amount of idle tokens doing nothing useful. They tried a few different things to fix this but settled on linear/boosted pools.
  • Lending protocols give out a platform/LP token, such as aTokens on Aave, in exchange for the underlying token. From this, users earn yield for providing liquidity. Since these are always rebasing, a wrapped version of these tokens is used in a Balancer vault. To prevent the constant wrapping and unwrapping of assets, they created linear pools.
  • Within a linear pool, there is a third token: the Balancer Pool Token (BPT). Pools that contain tokens that are also pools themselves are called composable or recursive. What's cool about this is that the BPT can be swapped like any other token in the pool itself.
  • Now, for the vulnerability! The issue existed in the common library ScalingHelpers. In DeFi security, getting the rounding direction right is critical: any rounding error should always favor the pool. To be efficient, they decided to always round down and expected the consequences to be minimal.
  • This was true for everything except linear pools. A bunch of crazy things coming together made the rounding error significant:
    • Linear pools have zero fees when balanced and no minimum balances.
    • Initialized with pre-minted BPT, creating a near infinite supply. This is available for flash swap operations.
  • Why is this significant? Batch swaps settle at the end. Individual swaps perform calculations on scaled balances (including rates) which depend on the intermediate pool state. Since this doesn't operate on the vault's actual pool balances, the math is deeply affected. Here is a quick and dirty attack path:
    • Borrow BPT via a flash swap with the rate slightly greater than 1 and trade it for the main and wrapped tokens to reduce the token balances to near zero.
    • Craft a trade that exploits the rounding error from above to make the total balance equal the virtual supply. This resets the rate to 1.
    • Repay the flash swap at a new lower rate for a profit.
    This can be done with the main and wrapped tokens as well.
  • This is where the story gets wild. While trying to get people to pull funds out of the paused and affected pools, a different vulnerability was found! The exploit from before dropped the rate; this one used a rounding error to increase the rate, exploiting the same decimal precision and rounding issues described above. When the rate is high, the BPT trades at a premium within the Composable Stable pool.
  • Raising the rate was considered fine because attackers couldn't bring the rate back down to profit from it; this was discussed during the design. However, the attacker found a way to drop the rate back down: during the initialization stage of a pool, the check is that the total supply is zero, and by using the methods from above this condition can be hit, recreating the initialization scenario. With this, the attacker could profit from the attack.
  • Trying to mitigate this was interesting: it's tough being a "fully decentralized" protocol. You want to be able to shut stuff off, but with full decentralization you shouldn't be able to. Some components had a pause function, some were upgradeable, and some had a recovery mode. But these weren't implemented in everything.
  • At the end of the article, the author reflects on many things. First, they had several audits done, and this bug lurked for over 2 years without being discovered by a whitehat. The complexity of the protocol became too much; in particular, the bootstrapping of functionality over and over again in strange ways. Overall, a fascinating postmortem about an immensely important protocol.
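
The rounding-direction principle itself fits in a few lines (generic fixed-point helpers in the spirit of mulDown/mulUp, not Balancer's actual ScalingHelpers code):

```python
ONE = 10**18  # 18-decimal fixed point, as used across DeFi math

def mul_down(a, b):
    # rounds toward zero: fine when the pool is paying out
    return a * b // ONE

def mul_up(a, b):
    # rounds away from zero: required when the pool is charging
    product = a * b
    return 0 if product == 0 else (product - 1) // ONE + 1

rate = ONE + 1  # a rate a hair above 1.0, as in the attack
amount = 10**6  # a tiny trade

# Rounding down everywhere silently eats the rate's extra wei:
assert mul_down(amount, rate) == amount
# Pool-favoring rounding charges it instead:
assert mul_up(amount, rate) == amount + 1
```

One lost wei per operation sounds harmless, which is exactly why it took near-zero balances, zero fees and pre-minted BPT lining up to make it exploitable.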