Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Why Careful Validation Matters: A Vulnerability Originating in Inline Assembly - 1707

Sherlock    Reference →Posted 7 Months Ago
  • Solidity adds a lot of safety checks, such as integer overflow protection, at the compiler level. Because of this, there is a special lower-level language called Yul to help write more performant code. These blocks, commonly referred to as "inline assembly", are complicated and hard to get correct.
  • The code in the blog post contains an inline assembly call() to a function. It has some Yul code that is worth discussing further:
    1. Copy the calldata of the program into EVM memory.
    2. Patch the calldata that was placed into memory. The user shouldn't be able to control this parameter directly unless it's zero.
    3. Call the function target with the pointer data.
  • To prevent craziness from happening, the callee contract and function selector being specified are validated. The patching code is used to overwrite the swapAmount in the calldata via a user-controlled index. This is where the fun begins!
  • To calculate the address to write, the following calculation is used, where swapAmountInDataIndex is a user-controlled uint256:
    ptr + 36 (0x24) + swapAmountInDataIndex * 32 (0x20)
    
  • The offset of 36 is used to prevent overwrites to the function selector of the call, which is wise. For some reason, the swapAmountInDataIndex variable is a uint256. Unfortunately, there's an integer overflow in this calculation: the multiplication on the index can overflow. With a specially crafted value, it's possible to wrap back around and modify the function selector that had previously been verified. An arbitrary call in the context of a Solidity smart contract is effectively game over.
  • The solution is to limit the index by the size of the calldata. This prevents the overflow and any other out of bounds access. Overall, a solid and subtle bug in a Yul optimization.
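The wraparound above can be sketched in a few lines of Python. The value of ptr and the crafted index are illustrative (the post doesn't give concrete numbers), but the arithmetic mirrors the ptr + 0x24 + index * 0x20 calculation mod 2^256:

```python
MOD = 2**256          # EVM word arithmetic wraps at 2^256
WORD = 0x20           # 32-byte word
ptr = 0x80            # hypothetical location of the copied calldata in memory

def write_addr(index):
    # The calculation from the post: ptr + 0x24 + index * 0x20 (all mod 2^256)
    return (ptr + 0x24 + index * WORD) % MOD

assert write_addr(0) == ptr + 0x24        # honest index: lands just past the selector

evil_index = 2**251 - 2                   # evil_index * 0x20 wraps around to -64
assert write_addr(evil_index) == ptr - 28
# A 32-byte store at ptr - 28 covers bytes ptr-28 .. ptr+3, clobbering the
# already-verified 4-byte function selector at ptr .. ptr+3.
```

Bounding the index by the calldata size, as the fix does, makes the multiplication too small to ever wrap.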

Pwning Solana for Fun and Profit - Exploiting a Subtle Rust Bug for Validator RCE and Money-Printing - 1706

Anatomist    Reference →Posted 8 Months Ago
  • This post details the discovery and exploitation of a vulnerability in the Anza Solana validator, written in Rust, to achieve full remote code execution (RCE). Besides the vulnerability description, they discuss a lot of background (which I will skip) and the process of hunting for the bug, which is probably the most interesting part to me. Solana is extremely optimized, which comes at a cost in terms of security: significant speed enhancements, such as memory address translation, conceal substantial security risk. For this reason, the team was closely monitoring new changes and features that looked dangerous.
  • In Solana, all account data previously had to be copied directly into the VM, which is very slow. So, the Direct Mapping feature was born, allowing account buffers to be mapped directly into VM memory rather than copying the entire data into the execution runtime. Dealing with raw pointers is very scary, which is why the authors of this post decided to keep looking at it. New code is the most likely to have bugs in it.
  • In Solana, these accounts have very important permission boundaries, such as only the owner of an account being able to modify it. Originally, these were validated post-execution, or when creating a call for a CPI, against a local copy of the data before it was written to the global account cache. This is an important invariant to consider later.
  • Direct Mapping made account.data point directly at the host's buffer for the Solana memory region. Because everything now uses a shared pointer instead of a private copy, validation must happen on each write, via copy-on-write (CoW). This changes a key invariant of the system: originally, all data was read from the underlying DB implementation into a copy in memory, which was only updated once it was written to.
  • Direct Mapping also needs to handle account sizes changing. To make reallocation infrequent, buffers are overallocated. When CoW operations relocate data buffers during CPI execution, the original MemoryRegions structure for the previous call still points to the old buffer. To fix this up, the code uses vm_data_addr to find the memory region of the original mapping and eventually update it.
  • The vulnerability lies in bad input validation in this process: CallerAccount.vm_data_addr is stored entirely in the VM's heap memory. By modifying the AccountInfo.data pointer in VM memory before triggering a CPI call, an attacker can forge an arbitrary vm_data_addr value. This causes the wrong memory region to have its host address updated, mapping it to an arbitrary location in virtual memory.
  • Their original exploitation method was to break memory write authorization checks - a core invariant of Solana. Their first PoC made an account writable that should not have been writable, leading to a major loss-of-funds bug affecting all Solana programs. Upon pulling the newest code, the attack no longer worked because of a fix to an unrelated bug. The exploit was killed because the CPI account was now set to ReadOnly.
  • The core issue still wasn't patched - they just needed a new exploit strategy, and they decided to approach it like a binary exploitation problem! Accounts with different sizes could now be mapped over the top of each other. Using this fact, it's possible to read or write out of bounds on the directly mapped buffers in host memory. This is limited to the range of the other account being used, though, limiting the size of the read/write.
  • They came up with a way to turn this into an arbitrary write using 3 accounts: a small buffer (SWAP), a large buffer (LEVERAGE), and the exploitation account (POINTER). Here are the steps:
    1. Trigger the vulnerability to map the SWAP address over the top of the LEVERAGE address. This will allow us to read/write OOB on SWAP.
    2. Hunt for the POINTER account within the MemoryRegion. By setting a simple value on the account data, we can locate it after some searching.
    3. Replace the POINTER host_addr and set its state to writable.
    4. Write to POINTER at any location in memory that is desired. With arbitrary read/write, this is a basic exercise to get RCE.
  • I really really really like this post. It contains methodology on why and when they were looking at this section of code. It describes a deep understanding of the core invariants of the software and how they found a bug that breaks the invariants. Finally, their multiple attempts at exploitation were nice to see as well. This is one of the more "real" write ups that I have read on bug hunting. Solid work!
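The forged-vm_data_addr primitive can be modeled as a toy (the structures, addresses, and buffer layout below are all made up for illustration — this is not the validator's real code): regions translate VM addresses to offsets in one flat host buffer, and the CoW relocation trusts a VM-supplied vm_data_addr to pick which region to repoint.

```python
# Flat "host" memory: SWAP's 16-byte buffer, then data belonging to another account.
host = bytearray(128)
host[0:4] = b"SWAP"
host[16:22] = b"SECRET"          # adjacent data SWAP should never be able to read

regions = [
    {"name": "SWAP",     "vm_addr": 0x1000, "host_off": 0,  "len": 16},
    {"name": "LEVERAGE", "vm_addr": 0x2000, "host_off": 64, "len": 64},
]

def update_region(vm_data_addr, new_host_off):
    # CoW relocation: find the region by vm_data_addr and repoint it.
    # vm_data_addr comes from VM-controlled heap memory, so it is forgeable.
    for r in regions:
        if r["vm_addr"] == vm_data_addr:
            r["host_off"] = new_host_off

def read(region, off, n=6):
    # The bounds check uses the region's own len, not the backing buffer's size.
    assert off + n <= region["len"]
    base = region["host_off"] + off
    return bytes(host[base:base + n])

# Forge vm_data_addr = LEVERAGE's VM address while relocating SWAP's buffer:
# the large region now aliases the small buffer's host memory.
update_region(0x2000, 0)
leverage = regions[1]
# Reads through LEVERAGE (len 64) sail past SWAP's 16-byte buffer: OOB in host memory.
assert read(leverage, 16) == b"SECRET"
```

In the real exploit this overlap is what lets them hunt for the POINTER account's MemoryRegion and rewrite its host_addr, turning the limited OOB into a fully arbitrary read/write.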

LET ME COOK YOU A VULNERABILITY: EXPLOITING THE THERMOMIX TM5 - 1705

Baptiste Moine - SynAckTiv     Reference →Posted 8 Months Ago
  • The Thermomix TM5 is a multifunctional kitchen appliance. In previous research, Jean-Michel Besnard found a directory traversal flaw in BusyBox's tar implementation. This article describes a lot of reverse engineering of the device and a firmware downgrade vulnerability they found in the process.
  • The device contained a strangely formatted NAND flash; after considerable effort, they found a tool to decode it. There are Cook Key sticks that hold recipes for use with the Thermomix. Upon looking at the USB flash drives, they found out they are encrypted. Upon examining the NAND flash firmware, a file named /opt/cookey.txt is found to contain the encryption key. After some reviewing of a customized kernel driver, they are able to decrypt locally, but cannot modify the data because it is signed.
  • The Cook Key is a special device that enables users to connect the Thermomix to WiFi, download firmware updates, and access additional recipes from a cloud service. It contains a WLAN module, an LED controller, and a USB flash drive for the file system. The first partition contains a cs.tar and the second partition contains recipes, cloud settings, and a recovery firmware image.
  • The author wanted to emulate the Cook Key completely. Therefore, they created a custom board with all the pre-signed information. This isn't an issue by itself but it allows for a larger attack surface.
  • The version section is what we're after. It contains three values: date, comment, and force_flag, with the first two being arrays. The original handling of this contained a security issue: the firmware could be downgraded by swapping firmware update file sections between versions. A classic replay issue - but that was the past, and a new vulnerability was needed beyond swapping these individual sections around.
  • When verifying the firmware update file, each section is encrypted using AES-EAX mode. This combines AES-CTR for encryption and an OMAC-based tag for integrity. Each section is RSA-signed, but the nonce and tag are excluded from the signature, meaning they can be tampered with. We know the encryption algorithm in this case, but we're unable to modify anything because of the signature. Or can we?
  • In AES-EAX decryption, the keystream is generated from the nonce N and the AES key K: P0 = C0 XOR S0, where S0 is the first keystream block. This can be rearranged to S0' = C0 XOR P0'. If we XOR the ciphertext with our desired plaintext, we know which keystream it must be decrypted with. Since we control the nonce and know the key, we can search for a nonce that produces a matching keystream. Neat!
  • In practice, we control the first date string but everything else will be a garbled mess. We want the force_flag set to 1, though. By brute forcing enough nonces, it's possible to set it to 1. All of this works because A) the nonce is not verified and B) the header information with the date, comment, and force_flag is a single encrypted piece of data with nothing else in it. I find it weird that the signature is unique per section, personally.
  • Cryptography, particularly AES-CTR mode, is hard to use properly. With both encryption and signatures, this scheme looked perfect, but it gave the author just enough room to work around it. Awesome post!
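The nonce search can be demonstrated with a toy model. The keystream function below is a stand-in PRF, not real AES-EAX, and the key, header layout, and flag position are invented - but the XOR rearrangement and brute force are the same idea:

```python
import hashlib

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Stand-in PRF keystream -- NOT real AES-EAX, just enough to show the math.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

key = b"device-key"                 # recovered from the firmware, so known
orig_nonce = bytes(8)
header = b"2024-01-01" + b"\x00"    # date string followed by force_flag = 0
ct = bytes(p ^ k for p, k in zip(header, keystream(key, orig_nonce, len(header))))

flag_off = len(header) - 1
# Forged plaintext = ct XOR ks, so we need a nonce whose keystream satisfies
# ks[flag_off] == ct[flag_off] XOR 1.
for i in range(1, 1 << 16):
    ks = keystream(key, i.to_bytes(8, "big"), len(header))
    if ct[flag_off] ^ ks[flag_off] == 1:
        break
forged = bytes(c ^ k for c, k in zip(ct, ks))
assert forged[flag_off] == 1   # force_flag set; the date bytes come out garbled
```

One controlled byte costs ~256 attempts on average, which is why the author could realistically target just the force_flag while letting the rest of the header turn to mush.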

Apps shouldn’t let users enter OpenSSL cipher-suite strings - 1704

Frank DENIS    Reference →Posted 8 Months Ago
  • TLS allows for a lot of configuration: which encryption algorithms and key exchanges can be used, hashing algorithms, and more. The author of this post asks whether this is the proper user experience. Their claim is that many admins "fix" (notice the double quotes) problems by changing the ciphers, only to make the situation worse. For instance, when the BEAST and POODLE attacks were in the news, people switched to RC4. Sadly, RC4 had its own issues that few people knew about.
  • The author claims that checkboxes are better than cryptic strings. These checkboxes could contain items like FIPS 140-3 approved or post-quantum, or negative options like disable TLS 1.0. Each checkbox is a union of settings to apply. A set of simple presets would be very useful too.
  • Why is this nice? Compared to the cryptic strings, this provides future-proof algorithms, easy-to-understand options, and super-easy compliance. To do this correctly, the creators of the checkboxes would need to be very careful about the mappings.
  • Sometimes, the real strings are necessary. FIPS 140-3 requires NIST-approved algorithms, which aren't always possible. Forward secrecy may be a requirement that may not be doable on the checkbox approach. There are likely other edge cases as well. Overall, it's a great post on making defaults more secure.
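The checkbox-to-policy mapping could look something like this sketch (all names and values here are made up for illustration - this is not any real library's API, and a production mapping would need the care the author describes):

```python
# Hypothetical checkbox-driven TLS policy: each checkbox expands into concrete
# settings so admins never type OpenSSL cipher-suite strings by hand.
POLICIES = {
    "disable_tls10": {"min_version": "TLSv1.1"},
    "post_quantum":  {"groups": ["X25519MLKEM768"]},   # hybrid PQ key exchange
}

def build_config(checked):
    cfg = {"min_version": "TLSv1.0", "groups": []}
    for box in checked:
        for key, val in POLICIES[box].items():
            if isinstance(val, list):
                cfg[key] = cfg[key] + val   # union of settings across checkboxes
            else:
                cfg[key] = val
    return cfg

cfg = build_config(["disable_tls10", "post_quantum"])
assert cfg["min_version"] == "TLSv1.1"
assert "X25519MLKEM768" in cfg["groups"]
```

The win is that the mapping can be updated centrally when an algorithm falls out of favor, instead of every admin editing a cipher string they half understand.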

Corruption via MathSpace on Firefox Browser - 1703

Manfred Paul - ZDI post by Hossein Lotfi    Reference →Posted 8 Months Ago
  • Browsers need to be fast - I mean, really fast. So, running JavaScript isn't always fast enough. Modern browsers perform Just-in-Time (JIT) compilations of JavaScript to native code, making it faster. This introduces an interesting yet incredibly complicated set of vulnerabilities to consider. This post is a Firefox JIT bug in the Pwn2Own competition.
  • The Ion JIT compiler uses a function called ExtractLinearSum to convert a value into a linear sum expression. For instance, (x+(2+3)) - (-3) can be transformed into x+8. The function takes three parameters:
    1. Value node
    2. MathSpace - an enum with three values: Modulo (wrap around the integer space), Infinite (bail if wrapping would be needed), and Unknown (a default value)
    3. Recursive counter for stack depth exhaustion issues
  • The function ExtractLinearSum is used in multiple places in the Ion compiler, one of which is folding, or simplifying, linear expressions. The function TryEliminateBoundsCheck tries to merge bounds checks on the same object to simplify things. For instance, array[i+4]; array[i+7] will generate two bounds checks. To merge them, it creates a bounds check object that keeps track of what's going on, eventually leading to a single check of the value 7 against the length.
  • Although the MathSpace is useful, it's not rigorously verified. In the case of bounds checks, this seems pretty important! Modulo makes sense in some math cases but doesn't make sense for bounds checks - Infinite does. So, what if we can find a way to make the numbers used in this operation Modulo on a bounds check?
  • The following code triggers the bug when i is slightly less than 2^32: array[(i+5)|0]; array[(i+10)|0]. The |0 is used to force the value to 32 bits. The check overflows because the MathSpace is set to Modulo, leading to a faulty bounds check. This is only possible with really large arrays, requiring typed arrays to be practically feasible.
  • Getting the write to happen in the proper location only requires fiddling with the minimum and maximum sizes in funky ways to trick the minimum/maximum tracking for the bounds. To make this useful for an OOB read or OOB write, a useful object must be found in the huge address space. They found that Map objects were nice for building addrOf and fakeObj primitives. Once there, exploitation is trivial.
  • It appears that this bug was found via manual source code review. Even though JavaScript engines are heavily fuzzed and reviewed, there are still great bugs lurking in unusual places. Overall, great write-up for somebody who knows nothing about browser engines!
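The 32-bit wraparound driving the bug can be sketched directly (the concrete value of i and the merged-check commentary are illustrative, but the |0 arithmetic is exact):

```python
def to_int32(x):
    # JavaScript's |0 coercion: truncate to a signed 32-bit integer.
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x

i = 2**32 - 6            # "slightly less than 2^32"
a = to_int32(i + 5)      # wraps to -1
b = to_int32(i + 10)     # wraps to 4
assert (a, b) == (-1, 4)
# Reasoning in Modulo space, the compiler treats the two accesses as i+5 and
# i+10 and keeps only a single check derived from the larger constant offset;
# the actually-wrapped index -1 slips past that merged check.
assert a < 0             # an out-of-bounds index the eliminated check never sees
```

This is why the attack needs huge (typed) arrays: i has to get near the 32-bit boundary for the additions to wrap at all.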

Boredom Over Beauty: Why Code Quality is Code Security - 1701

John Saigle - Asymmetric Research     Reference →Posted 8 Months Ago
  • The Web3 space is innovative yet financially risky at the same time, due to attackers' ability to directly steal money. This innovative aspect has led to many hard-won lessons in security that need to be relearned in Web3. This post is about one of them: overall code quality. Code quality is code security.
  • NASA famously implemented their Power of Ten rules for clear guidelines in coding. NASA implemented these specifically because projects with extreme consequences for failure require rigorous code quality standards. The curl project maintains very serious coding guidelines as well.
  • When code is well-structured and adheres to clear patterns, security vulnerabilities become easier to identify and harder to introduce. Codebases characterized by inconsistency, complexity, and poor organization create fertile ground for security flaws.
  • Now comes the reason for the name: chase boredom instead of beauty. Most secure code is boring and simple - the JC of our company has talked about this extensively as well. Security thrives in predictability and not novelty. Besides the code, this includes docs, standards, linting, and review processes.
  • Why should we take code quality so seriously? Problems cost more to fix later. Whether it's re-architecting something, a major hack, or something else, it just costs much more later. Additionally, when developers trust their foundation and execute without fear, they can build systems that will last forever. Good read!

Uncovering the Query Collision Bug in Halo2: How a Single Extra Query Breaks Soundness - 1700

Suneal Gong - ZK Security     Reference →Posted 8 Months Ago
  • Halo2 is a zero-knowledge (ZK) proof framework based on the PLONK protocol that was originally used for Zcash. Circuits, the flow of operations and verification in a ZK proof, are structured as tables. In these tables, each column holds a sequence of values and each row represents a step in the computation. Constraints, or the limits of the circuit, are defined by querying values in these columns at specific offsets, known as Rotations.
  • Each column is encoded as a polynomial over a finite field. Querying a column at a certain Rotation corresponds to evaluating the polynomial at a specific point. Constraints among columns are enforced using gates. The prover commits to the columns using polynomial commitment schemes like KZG. The verifier receives these commitments and verifies them for correctness via black magic math.
  • Circuits have multiple columns and gates, resulting in the evaluation of polynomials at multiple commitment openings. To make this efficient, Halo2 uses a multi-point opening technique, allowing for the verifier to batch many queries into a single proof. In practice, they batch the evaluations, compute a linear combination of all values and check a single equation to ensure it's been satisfied.
  • Alright, enough of the math! What's the vulnerability!? The multi-point opening system keys each evaluation by the pair (Commitment, QueryPoint), mapping it to a Value. This key isn't unique enough! It's possible for a "query collision" to occur, where two independent queries share the same key even though their values are expected to differ. In the context of Halo2, the consequence is horrible: one evaluation can silently overwrite the other. This means that it's possible to forge proofs in many situations.
  • From what I can gather, this vulnerability appears to be somewhat theoretical as no live protocols could have been exploited. Regardless, the bug was super cool and entertaining to look at, even though I don't fully understand ZK math.

Inside the GMX Hack: $42 Million Vanishes in an Instant - 1698

SlowMist     Reference →Posted 8 Months Ago
  • GMX is a very large decentralized trading platform. Although it has a $5M bug bounty, it was exploited for $42M after over 2 years of being live and multiple audits. There are several reasons this likely wasn't found earlier, such as the need to chain multiple vulnerabilities.
  • There were two design flaws. The first is a financial manipulation vulnerability within the GLP token: opening a short position instantly increases the Assets Under Management (AUM), which in turn increases the price of GLP in a controllable fashion that can reasonably be undone. This is pretty straightforward.
  • The second issue is less simple. When creating short positions, it was possible to call increasePosition, which did NOT update the globalShortAveragePrices in the ShortsTracker contract. The value is only updated later, when a decrease is executed. Decreases update the value; increases do not. This is not really a vulnerability by itself but a quirk of the protocol.
  • The real vulnerability is very subtle. GMX had a PositionManager contract that controlled a lot of settings and was only callable via a GMX-controlled key. This contract called enableLeverage on the core code before performing any trades; a backend off-chain service (the Keeper) would trigger this functionality. While the Keeper made this call, it was possible to redirect execution and call the GMX contract while leverage was still enabled. This is the vulnerability that makes everything possible.
  • With all of that in mind, the attack can be broken down into preparation and triggering. First, the attacker creates a long position via a smart contract (used for reentrancy later) and a reduce-order that the Keeper would later execute. When the Keeper received the reduce-order, it would call the PositionManager to enable leverage. The Orderbook would then execute executeDecreaseOrder(), update the attacker's position, and pass execution to the attacker's contract because the collateral token was WETH.
  • The attacker's smart contract, triggered by ETH being sent to its fallback function, would transfer 3000 USDC to the vault and open a 30x leveraged short against WBTC using increasePosition. Because of the second design flaw, globalShortAveragePrices was not updated. During a future call to the ShortsTracker contract, it would be updated, dropping the tracked price of WBTC to about 57x less than it should have been.
  • To exploit this price discrepancy, they used the GLP token. They would first take a large flash loan of USDC and call mintAndStakeGlp to mint a lot of GLP. Next, the attacker would call increasePosition to deposit a large amount of USDC against WBTC. This would update globalShortSizes, resulting in AUM increasing dramatically. Finally, the attacker would call unstakeAndRedeemGlp to redeem way more tokens than they were entitled to. But why?
  • The AUM was updated but the globalShortSizes was not. When performing calculations on the trades, the manipulated value of the trade was far above the market price, making the trade appear deeply unprofitable. Naturally, this increases AUM by a lot. By doing this over and over, they got more funds from the trade of GLP than they actually should have.
  • This is a pretty crazy exploit in a popular protocol - it makes me wonder what other big protocols are hiding huge bugs. Exploiting vulnerabilities such as the manipulation of financial instruments is pretty complicated. I'm guessing that the attacker found the financial manipulation first and then needed to find a way to turn on leverage.
  • Eventually, all of the funds were returned to the protocol. So, why didn't they just claim the bug bounty? Since the keeper functionality was "privileged" and the offchain infra is blackbox, there's a major risk of getting rugged. SlowMist recommends better reentrancy locks be added. In reality, I feel like these were reentrancy issues across contracts (in the case of enableLeverage), making this not a great solution. In the case of the discrepancy in the price updates, I do agree, though. Great write-up explaining this super complex set of issues!
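The core economics of the GLP leg can be sketched with toy numbers (all values invented for illustration; the real exploit involved flash loans, fees, and repeated rounds):

```python
# GLP price is roughly AUM / GLP supply.
aum, supply = 1_000_000.0, 1_000_000.0    # price starts at 1.0
deposit = 100_000.0
minted = deposit / (aum / supply)         # mint GLP at the fair price
aum += deposit
supply += minted

# Manipulated short accounting inflates AUM with no real assets behind it.
aum += 500_000.0

price = aum / supply                      # GLP price is now inflated
redeemed = minted * price
assert redeemed > deposit                 # attacker redeems more than deposited
```

Repeating this mint-inflate-redeem loop is how the discrepancy between AUM and the real backing assets was drained into the attacker's pocket.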

The CPIMP Attack: an insanely far-reaching vulnerability, successfully mitigated - 1692

YANNIS SMARAGDAKIS - Dedaub    Reference →Posted 8 Months Ago
  • This report is an in-the-wild story of attackers compromising many contracts in a subtle way. The name says it all: Clandestine Proxy In the Middle of Proxy (CPIMP).
  • Smart contract deployment of upgradable contracts typically works in two steps: deploy the code, then call an initialization function. Unless specifically checked, the initialize function can be called by attackers before the real owner, setting malicious settings. In reality, if this happened, a legitimate developer should recognize the failure and just try again. At least, that's the argument I've been hearing for a long time. So, what's different here?
  • Attackers were able to backdoor the contracts without being noticed - real value was being accrued in these contracts for several weeks as well. The malicious actors were monitoring the intended implementation and deployment procedures. Instead of the normal flow of going from the proxy to the proxy implementation, a contract was added in the middle, similar to a MitM attack.
  • To make matters even scarier, most blockchain explorers could not tell the difference! The implementation was shown as the correct one in the explorer. Events and storage slot contents even looked correct. Even the deployment showed the incorrect events. Developers just weren't being careful enough upon review.
  • Many project contracts were backdoored, such as EtherFi and Pendle. The malicious actors were waiting for the right moment to profit, but it was caught first. The authors of the post contacted SEAL 911 to start a war room. To not freak out the attackers and get them to exploit things now, it had to be coordinated. This meant getting all affected protocols into a war room at once. Although every remediation was custom, most of the funds were recovered!
  • So, how did the backdoor work? It was sophisticated, with persistence, detection evasion, and more. First, it added functionality to become the "Super-Admin" to override ownership for upgrades, drains, and executions. This allowed the malicious owner to do whatever they wanted.
  • To make it more persistent, it restored itself in the implementation slot - this meant that not even upgrades could remove it. On L2s, if the Super Admin account had been denylisted, they had signed executions that still worked. Even crazier, they added batched direct storage writes in calls as well.
  • Some implementations contained anti-recovery protection. By reviewing balance checks before/after a call, it would prevent 90% of funds from being taken at once. That's pretty devious!
  • The coolest part about this is by far the reason Etherscan finds the wrong implementation contract - the main reason most developers were tricked. Etherscan's detection consults multiple storage slots, defined by various proxy implementations, IN ORDER. By placing the legitimate implementation address at an older proxy standard's implementation slot, the attackers made Etherscan find the incorrect address! Amazing work, and it makes me think Etherscan should have a fat bug bounty program.
  • Overall, great research in detecting, documenting, and mitigating this vulnerability. In the future, I will be more hesitant about allowing initialization functions to be front-run. Neat!
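The explorer confusion can be modeled as an ordered slot probe (slot names, their ordering, and the addresses below are illustrative assumptions, not Etherscan's actual detection code):

```python
# An explorer probes several well-known proxy implementation slots in a fixed
# order and trusts the first non-empty one it finds.
LEGACY_SLOT = "legacy.proxy.implementation"    # older standard's slot (name made up)
EIP1967_SLOT = "eip1967.proxy.implementation"  # slot the proxy actually dispatches on

storage = {
    LEGACY_SLOT:  "0xLegitImpl",   # decoy the attacker plants for explorers
    EIP1967_SLOT: "0xCPIMP",       # the clandestine middle proxy really in use
}

def explorer_detect(storage):
    for slot in (LEGACY_SLOT, EIP1967_SLOT):   # ordered probe, first hit wins
        if storage.get(slot):
            return storage[slot]

assert explorer_detect(storage) == "0xLegitImpl"   # explorer shows the decoy
```

Meanwhile the proxy itself only reads the later slot, so every real call routes through the CPIMP contract while reviewers see the legitimate implementation.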

Break into any Microsoft building: Leaking PII in Microsoft Guest Check-In - 1691

Bribes    Reference →Posted 8 Months Ago
  • While browsing Shodan one day, they noticed a subdomain associated with Microsoft - guest.microsoft.com. Once logged in via a phone number, no information was given. This seemed like it wasn't meant to be publicly accessible.
  • Looking at the Burp Suite logs, they found an interesting API relating to their previous stays: /api/v1/config/ with a JSON parameter called buildingIds. Since they had not visited any buildings, no information was provided - the array of buildings was empty. By providing an ID of 1, they were able to see some building information.
  • Surprisingly, a lot of building information was provided: access codes in some of them, address/building name, parking info, GPS coordinates, QR code data, Microsoft employee emails, etc. After iterating over more IDs, they found buildings from Israel to the United States.
  • They wanted to increase the impact some more. After some more effort reversing the JavaScript, they found the API /api/v1/host. By providing an email, PII about the employee, such as phone number, office location, mailing address, and more was provided. The same issue existed on guests based upon their email as well.
  • They couldn't find any exposed APIs around explicit visits, so they tried digging further. They tried for some path traversals via secondary-context vulnerabilities. After using ..%2f..%2f..%2f (URL-encoded ../../../), they were able to reach an Azure Functions page. But why!? The proxy was decoding the URL-encoded / and the decoded path was used by the actual Azure function. Neat!
  • After some directory brute forcing, they got a 500 error at /api/visits/visit/test. Eventually, they managed to get this working to retrieve a wide range of invitation and meeting information. Sadly, they got nothing for the vulnerability: it was moved to review/repro, fixed, and no payment was ever made. Regardless, it was a good set of vulns!
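The double-decoding traversal works like this sketch (the paths are illustrative, not the actual Microsoft endpoints):

```python
from urllib.parse import unquote
import posixpath

# Front-end routing inspects the raw path and sees no "../", so it forwards it.
request_path = "/api/v1/config/..%2f..%2f..%2fapi/secret"
assert "../" not in request_path

# The proxy/backend then decodes once more before resolving the path.
decoded = unquote(request_path)          # "/api/v1/config/../../../api/secret"
resolved = posixpath.normpath(decoded)   # traversal applies: "/api/secret"
assert resolved == "/api/secret"
```

The mismatch - one layer matching on the encoded form while a later layer decodes and resolves - is the classic secondary-context traversal pattern the authors exploited.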