Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Halting Cross-chain: Axelar Network Vulnerability Disclosure- 1619

Macro Nunes    Reference →Posted 1 Year Ago
  • Axelar is a cross-chain bridging protocol. To come to an agreement on whether a cross-chain event has happened or not, 60% of the stake on Axelar has to approve it. The voting is performed on the Axelar blockchain, which is a Cosmos SDK chain. The off-chain listener is called vald.
  • There are uptime requirements for all of the voters, called chain maintainers, of a particular chain. Validators who miss votes will be deregistered and lose rewards.
  • Once the ContractCall on an Axelar gateway contract is made, the vald program will see this and vote on the Cosmos Axelar chain. To do this, it calls ConfirmGatewayTxs.
  • While reviewing the configuration settings of the Axelar chain validators, they noticed the max_body_bytes setting. This is the maximum number of bytes that can be in a request - 1MB. If this limit is exceeded, then the Axelar node will drop the request. This setting is seldom changed and is the default value in the official setup instructions. By forcing the ConfirmGatewayTxs voting transactions to be larger than 1MB, with excessive amounts of logs, the votes from vald would be rejected!
  • By itself, this isn't a huge deal. However, considering the voting penalties is where this becomes interesting. Remember, if a certain amount of votes are missed, then the chain maintainers are deregistered. The voting has no minimum quorum check. If there's a poll to vote on, even if nobody can vote or does vote, the chain maintainers will get slashed for this.
  • The author of the post does some cost analysis. To take down all major chains, it would cost $5K. In the future, it would cost another $11 per chain to do this again. Here's the flow of the attack:
    1. Create 2 malicious transactions that make 2000 ContractCall logs on the Axelar Gateway.
    2. Call ConfirmGatewayTxs on Axelar with the txids from the events listed. At this point, vald detects the polls and tries to vote but fails because of the size of the HTTP request.
    3. Do step 2 over and over again until the chain maintainers are deregistered.
  • The impact of this is pretty severe - it stops Axelar in its tracks on all chains. Initially, Axelar rated this as medium and offered to pay $5K, considering this a "liveness vs. security" issue; the offer was increased to $20K but this was still rejected. After multiple follow-ups from Immunefi, this was upgraded to a critical at $50K. You gotta love Immunefi and their help towards hackers!
  • This vulnerability is a great win for public disclosure. The bug had not been fixed when this was published. However, within a few days of publishing the post, the issue was fixed by disabling auto-deregistration altogether. Without the public disclosure, this vulnerability may not have been fixed.
  • To me, the real vulnerability was the missing minimum vote check before slashing. It's weird that Axelar fixed this by removing the deregistration. I personally love how this post takes a "small thing" and abuses it into a fundamental design issue.
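The design flaw called out above - slashing with no minimum-participation check - can be sketched in a few lines of Python. This is a toy model, not Axelar's actual tally code; the names and set-based vote tracking are illustrative.

```python
def tally_poll(poll_votes, maintainers):
    """Toy model of the flaw: the poll has no minimum-participation
    check, so every maintainer who missed it is penalized - even when
    *nobody* could vote because vald's vote transactions exceeded the
    node's max_body_bytes limit (~1MB) and were dropped."""
    return {m for m in maintainers if m not in poll_votes}

# Attacker bloats ConfirmGatewayTxs with thousands of ContractCall logs,
# so every vote request is rejected and zero votes arrive.
penalized = tally_poll(poll_votes={}, maintainers={"val1", "val2", "val3"})
# Every maintainer is penalized; repeat until deregistration triggers.
```

A minimum-quorum check (e.g. "if fewer than N votes arrived, void the poll instead of slashing") would have broken the attack even with the HTTP-size DoS intact.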

DoubleUp Roll: Double-spending in Arbitrum by Rolling It Back- 1618

Hong Kong Polytechnic    Reference →Posted 1 Year Ago
  • Arbitrum and Optimism are Optimistic Rollups. This means that they are L2 blockchains that inherit the security of the L1 by posting all of the L2 data to the L1. There are several roles within these blockchains:
    • Sequencer: Creates the blocks from the submitted transactions to its private mempool.
    • Batcher: Posts the transaction information to the Ethereum L1. This allows deriving the state of the L2 from Ethereum alone, making it a rollup instead of a sidechain.
    • Publisher: Posts the L2 consensus state to Ethereum. There is also a Challenger that can submit fraud proofs if something is done wrong by the Publisher.
  • The transaction lifecycle of an L2 is as follows:
    1. User submits transaction to RPC on L2 or to specific contract on L1.
    2. Sequencer creates a block and broadcasts it to the L2. The block is soft finalized.
    3. The block information is posted to Ethereum via the Batcher role for the Data Availability part of a rollup.
    4. The posted batch is seen by the L2 nodes. This makes the block hard-finalized.
    5. If the batch is different than the current state of the L2 then a rollback is done on the soft-finalized transactions.
  • There are three scenarios where rollbacks can occur that are relevant:
    • If the block time gap between the L1 and the L2 is too large, then the block will be rolled back. This is a security mechanism to prevent too much manipulation of block.timestamp in Solidity.
    • In case of the Sequencer going down, transactions can be forced through the L1, after a 24 hour delay. If the transaction was never included in the L1 but was in an L2 block then this forced inclusion will trigger a rollback.
    • Invalid batch information posted for the L2 that cannot be processed on the L1. This leads to a rollback on the L2.
  • The goal of this paper is to force-trigger rollbacks to achieve a double spend via deposits/withdrawals, or to break bridges after a large number of blocks have passed. Using the rollback mechanisms above, this is what they do. Of course, this requires some tomfoolery to get transactions delayed properly.
  • In the Overtime attack, the goal is to change the time on a deposit that has already been used. It works as follows:
    1. Cause a major delay in the batch processing. There is no "one" way to do this.
    2. Submit deposit to the L2.
    3. L2 accepts the deposit.
    4. Batcher is unable to submit the deposit to the L1 because of the batching delays caused in step 1.
    5. Attacker initiates and gets a withdrawal accepted for their recent deposit.
    6. Time-bound mechanism springs into action! The L2 block is rolled back.
    7. L1 deposit is processed in the new L2 block. Now, we have two sets of funds from the same deposit. Personally, I'm confused on how the withdrawal gets processed before the redoing of the finalization on the L1.
  • In the QueueCut attack, the liveness-preservation mechanism is abused.
    1. Introduce a delay in the L2-to-L1 comms by adding a bunch of transactions with incompressible data.
    2. Trigger a deposit from the L1. This is soft-finalized immediately.
    3. Trigger a withdrawal on the L2 with that deposit. This will be processed quickly.
    4. Because of the delay and queue, the force inclusion feature can now be used. Use this to trigger a deposit.
    5. Force inclusion negates the L2 block that had the original deposit. Hence, we have a double spend.
  • The ZipBomb attack makes data incompressible from the perspective of the Sequencer. This leads the L1 to refuse to process the L2 block, causing a rollback. To me, the asynchronous processing is weird. I thought that the blocks would build on each other and require a perfect order. In reality, this isn't the case, which allows for the weird ordering of things. I imagine that many of these ideas came from noticing a bad state machine vs. the rollback mechanics being able to be DoSed.
  • The fixes to these bugs are not just a couple lines of code - they are design-level fixes because of the asynchronous processing. Both Optimism and Arbitrum now have a streaming mechanism to ensure that blocks can always be handled. To mitigate the ZipBomb attack, Arbitrum now only cares about the plaintext size on submission and not the compressed size.
  • On Optimism, they added a fee prioritization structure instead of a first-come-first-served queue. Additionally, instead of the more complicated one-block-per-transaction strategy, they now use a 2-second fixed interval. As far as cross-chain bridges go, there are still some concerns. They urged products to wait for L1 inclusion instead of L2 block confirmations.
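A toy ledger makes the Overtime double-spend concrete. This is purely illustrative - real rollback and state re-derivation are far more involved - but it shows why replaying the L1 deposit after honoring the withdrawal doubles the funds.

```python
class ToyL2:
    """Minimal sketch of the Overtime double-spend: the soft-finalized
    block holding the deposit is rolled back after the withdrawal has
    already been honored, and the L1 deposit is replayed on the new chain."""

    def __init__(self):
        self.balance = 0
        self.pending_deposits = []  # deposits only soft-finalized so far

    def deposit(self, amount):
        # Steps 2-3: deposit accepted into a soft-finalized L2 block.
        self.balance += amount
        self.pending_deposits.append(amount)

    def withdraw(self, amount):
        # Step 5: withdrawal accepted against the soft-finalized balance.
        assert self.balance >= amount
        self.balance -= amount
        return amount  # funds leave the rollup

    def rollback_and_replay(self):
        # Steps 6-7: the time-bound rollback discards the old block, then
        # the L1 deposit is re-processed in the replacement block.
        for amount in self.pending_deposits:
            self.balance += amount
        self.pending_deposits = []

l2 = ToyL2()
l2.deposit(100)
out = l2.withdraw(100)      # 100 extracted off the rollup
l2.rollback_and_replay()    # balance is 100 again: two sets of funds
```

Waiting for L1 inclusion before honoring the withdrawal, as the paper recommends for bridges, breaks this flow at the `withdraw` step.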

Yul Calldata Corruption — 1inch Postmortem- 1617

Omar Ganiev    Reference →Posted 1 Year Ago
  • 1Inch is a limit order swap DeFi platform. 1Inch Fusion is a gasless swap protocol built on top of the core Limit Order Protocol. This version was deprecated in 2023 but was kept alive for backwards compatibility reasons. The original implementation had over 9 audits.
  • Recently, the V1 protocol was hacked. Looking at the exploit transactions, it starts off looking very normal. Then, came a red flag: both the taker and the maker on the swap were the same. Additionally, the attacker had made over 1M USDC on the trade.
  • Upon settlement of the limit order, a call to resolveOrders is made to a contract. This code appeared to be very similar to the example integration and looked pretty safe. Upon closer inspection, the victim contract had not been updated even when the protocol's interfaces had changed.
  • The vulnerability appears to be a lack of validation on resolver. This is NOT intended to be a controlled element of settleOrder. In fact, it's passed in as new bytes(0) since only the contract itself should be able to set it. So how was this controlled? What gives? The woes of multiple versions!
  • Much of this code is written in Yul, Solidity's assembly language. In the Yul code, there is a pointer ptr. When writing a value called the suffix, it's written at an offset depending on the passed-in interactionLength. interactionLength is a full 32-byte word, so ptr + interactionOffset + interactionLength can overflow. Because of this overflow, the pointer can be decreased to write the user-controlled suffix to any location! Sounds like memory corruption!
  • Here's the full flow of the attack:
    1. Create a swap order that swaps a few wei for millions. Normally, this would obviously be rejected.
    2. Specify an oversized interactionLength value that overflows the pointer to point at the resolver address.
    3. Add a fake suffix structure to overwrite the resolver address.
  • The authors of this post did the internal investigation but also did several of the audits on the protocol. So, what happened? Initially, the resolver contract code wasn't in scope for audits, so it was ignored. In March of 2023, these auditors actually found the integer overflow while assessing the scope. However, shortly after, the code was completely rewritten so they didn't feel it was necessary to call it out.
  • Here's the interesting twist: this previous version of the contract had already been deployed and the auditors didn't know it. Additionally, they were unsure about the impact of the vulnerability, so they moved on from it.
  • How can we prevent this type of thing from happening in the future?
    1. Clearly define what code is being used. It's acceptable to have multiple versions, but all of them need to be audited.
    2. Informational findings are helpful. There may be more impact of something than an auditor initially realizes.
  • Overall, a super cool postmortem on the exploitation of this vulnerability. The vulnerability is unique and I love the analysis on how this slipped through the cracks.
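The pointer-arithmetic overflow is easy to model: EVM words wrap modulo 2^256, so an attacker-chosen interactionLength can pull the write target backwards. The concrete offsets below are made up for illustration, not taken from the 1inch contracts.

```python
WORD = 2**256  # EVM words wrap modulo 2**256

def suffix_write_offset(ptr, interaction_offset, interaction_length):
    # Sketch of the Yul math: ptr + interactionOffset + interactionLength,
    # computed in 256-bit wrapping arithmetic.
    return (ptr + interaction_offset + interaction_length) % WORD

ptr = 0x200
# Honest length: the suffix lands after the interaction data.
assert suffix_write_offset(ptr, 0x40, 0x20) == 0x260
# Malicious length near 2**256: the sum wraps, and the suffix is written
# *before* ptr - e.g. on top of where the resolver address is stored.
evil_len = WORD - 0x100
assert suffix_write_offset(ptr, 0x40, evil_len) == 0x140
```

A bounds check like `interactionLength < calldatasize()` before the addition would have closed the wraparound.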

Insomnihack - Pioneering Zero Days at Pwn2Own Automotive 2024- 1616

NCC Group    Reference →Posted 1 Year Ago
  • This is just a bunch of slides, but a ton can still be learned from it. The target is an In-Vehicle Entertainment system with features like Amazon Alexa built into it. The first part of the process was getting the code off of it to reverse engineer and getting a debuggable environment.
  • The flash chip was a BGA eMMC chip. So, they used a hot air rework station to remove the chip, popped it into an adapter, and got the firmware off of it. To get a debuggable environment, they reverse engineered a bunch of settings for a secret debug menu and added a missing 0-ohm resistor, but were shut out by a good password they couldn't crack.
  • Eventually, they just live patched the running memory of the chip to change /etc/shadow to get a shell. They also tried reprogramming the chip but bricked one of their devices trying to do this.
  • The device had insecure HTTPS certificate handling on a request to api.sports.gracenote.com. By hosting a malicious DHCP server with attacker-controlled DNS, they could interact with this service. On this server, there was a directory traversal vulnerability that allowed writing arbitrary files.
  • Most of the system contained a read-only filesystem. Many of the mounts were even noexec. They found that the file pkcs11.txt allowed for the configuration of shared objects with a file path. Additionally, there was a mounted USB that was missing the noexec flag.
  • Using this configuration file, it was possible to load arbitrary .so libraries. Of course, they could write this library to the USB. The only problem was that this wasn't picked up right away; the configuration was read at boot time of the device. Eventually, they found that by writing to /usr/local/bin/Media in a particular way, the device would reboot. To escalate to root, they used an n-day kernel exploit.
  • The rules of Pwn2Own are weird to me. How would the car be listening to this malicious DNS server in the first place to launch this exploit? It doesn't seem very realistic to me... Then, they have to perform the reboot themselves because no user interaction is allowed once the exploit starts. Regardless, a cool bug and a fun story on reverse engineering!
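If the pkcs11.txt in question is NSS's module database (the slides describe a file configuring PKCS#11 shared objects by path, which matches NSS's `library=`/`name=` format), the planted entry would look something like this. The path and module name here are hypothetical:

```
library=/mnt/usb/payload.so name="planted module"
```

Since the USB mount was missing noexec, pointing `library=` at a writable, executable path is enough for the loader to dlopen attacker code at the next boot.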

Sign in as anyone: Bypassing SAML SSO authentication with parser differentials- 1615

Peter Stockli - Github    Reference →Posted 1 Year Ago
  • The initial research for this post started with an attack vector as opposed to a real issue: ruby-saml used two different XML parsers. During signature verification, the element is first read using the REXML parser and then with Nokogiri's XML parser. If a difference could be used to trick the XPath query, then it may be possible to bypass the signature verification.
  • Security Assertion Markup Language (SAML) is a framework for transporting signed-in user information from an identity provider to a service provider in XML. The main part of the SAML response is the Assertion. This contains a DigestValue, a SignatureValue, and a Subject field for the user. Normally, the entire assertion portion is canonicalized and then compared against the DigestValue. Then, the SignatureValue is verified over the signed data containing this digest. These are important later for exploitation.
  • Before looking into parser differentials, they first needed to see if there was a path to exploitation. After carefully analyzing which parser makes what query, they came to the following conclusions:
    • The SAML assertion is extracted and canonicalized with Nokogiri. The hash is then compared with a value from REXML.
    • The SignedInfo element is extracted and canonicalized with Nokogiri - it is then verified against the SignatureValue, which was grabbed with REXML.
  • What's the path for this exploit? First, make it so that REXML doesn't find the Signature XML object but Nokogiri does. This SignedInfo is then compared against the SignatureValue extracted via REXML. Later on, the DigestValue needs to be compared with the one that was signed. By getting Nokogiri to canonicalize the assertion but get REXML to extract a different DigestValue than the one used on signature validation, it will bypass the check without being signed.
  • There were two known PoCs. The issue was initially found via a HackerOne submission using techniques from XML round-trip vulnerabilities, since GitHub had added this library to their bounty. The author of the article found a different parsing issue using the Trail of Bits Ruby fuzzer ruzzy.
  • There is a nice image that explains this, but I'm unsure whose exploit it is. In the image, the Signature field found by Nokogiri is NOT under an Assertion. When REXML needs the digest, it uses the one under Assertion instead. It seems like the traversal of the tree was funky between the libraries.
  • Relying on two parsers is error prone. Exploitability isn't automatic in these cases but it's a good place to start. To fix this issue, they decided to use only a single library, which is great. Solid article but I wish both exploits were explained in more detail.
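The class of bug can be demonstrated with a single parser and two different queries. Element names are simplified and these are not the actual ruby-saml XPath expressions - the point is only that two lookups for "the" DigestValue can return different nodes.

```python
import xml.etree.ElementTree as ET

# A toy SAML-ish response with two DigestValue elements: one inside the
# signed Assertion and one smuggled elsewhere in the document.
doc = ET.fromstring("""
<Response>
  <Extensions><DigestValue>EVIL</DigestValue></Extensions>
  <Assertion>
    <DigestValue>LEGIT</DigestValue>
    <Subject>admin</Subject>
  </Assertion>
</Response>""")

# Query A: the digest scoped under the Assertion.
digest_a = doc.find("./Assertion/DigestValue").text
# Query B: the first DigestValue anywhere, in document order.
digest_b = doc.find(".//DigestValue").text
# digest_a == "LEGIT" but digest_b == "EVIL": whichever value the
# signature check consumes may not be the one the digest check consumed.
```

With two *different parsers* instead of two queries, the divergence can be even subtler, since each library has its own tree-building quirks - which is exactly what the fuzzing surfaced.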

How We Hacked a Software Supply Chain for $50K- 1614

Lupin    Reference →Posted 1 Year Ago
  • This blog is run by two brothers who like to hunt for bugs together. They had each found several criticals on this target but wanted an Exceptional Vulnerability - what I'd call a super-critical - on this company. They decided to look at a recent acquisition of the company; the scope was simply "anything owned by the company". Since acquisitions may not have the same security controls in place, they were hoping for some low hanging fruit.
  • To go deeper, they were curious about supply chain vulnerabilities. If you want a super-crit, this is a good way to go. Things like dependency confusion, artifact registry access and other things are great attack methods. Using a mix of these methodologies, they were hopeful of finding a super-crit.
  • They did a bunch of recon around NPM and found what appeared to be a private NPM package. If you have a license, you can set up an organization that has private packages. They were hopeful of source code leakage or dependency confusion here but didn't find anything. With nothing on Github, they turned to Docker and found several unsecured Docker images. Once they pulled the Docker images, they found backend source code for the application. Gnarly!
  • One of the images even had the .git folder still intact, giving them access to the complete git history of it. Under .git/config, they found an authorization bearer token. After some research, they realized that this was for Github Actions! If the token was too permissive, they may be able to manipulate the pipelines or artifacts themselves.
  • GitHub Action tokens are commonly generated automatically to allow workflows to interact with the repository (pushing code, for example), but they expire once the workflow completes, limiting exploitability. If the artifact containing the token is uploaded before the workflow ends, then the token can be accessed while it's still active. The Docker push was the third-to-last step, meaning that it may just barely be possible to use the token before it expires. A month after they did this research, some other folks used a similar method to abuse a GitHub Actions token.
  • The Dockerfile had a package.json that contained a private package from the npm organization they mentioned before. To pull these, the image would have needed an npm token within the .npmrc, but this wasn't there. This was because the Dockerfile deleted the file in the last build step.
  • Is this file gone forever? No! Docker images are built from layers that are used for efficient caching. It turns out that these layers can be accessed individually! They found a sick tool called dive for reviewing the filesystem layers of Docker images. Using this, they found the private npm token that granted them read/write access to the packages.
  • With the ability to write to this internal npm organization, it was game over. Developers who ran these internal packages were now compromised. The backend web service they mentioned was also compromised. This is a super-crit! Super fun blog post!
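The layer-scanning trick can be reproduced without dive in a few lines of Python. This is a hypothetical helper, assuming the classic `docker save` layout with one `layer.tar` per layer; it walks every layer, so a file deleted in a later build step (like the .npmrc) still shows up in the layer that created it.

```python
import fnmatch
import tarfile

def files_in_layers(image_tar_path, pattern):
    """Scan every layer of a `docker save` tarball for files matching
    `pattern`. Deleting a file in a later Dockerfile step only masks it;
    the layer that created it still carries the bytes."""
    hits = []
    with tarfile.open(image_tar_path) as image:
        for member in image.getmembers():
            if not member.name.endswith("layer.tar"):
                continue
            # Each layer is itself a tar archive embedded in the image.
            with tarfile.open(fileobj=image.extractfile(member)) as layer:
                for name in layer.getnames():
                    if fnmatch.fnmatch(name, pattern):
                        hits.append((member.name, name))
    return hits

# e.g. files_in_layers("image.tar", "*/.npmrc")
```

Newer OCI-layout exports name the layer blobs differently, so the `layer.tar` suffix check would need adjusting there.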

Sanitize Client-Side: Why Server-Side HTML Sanitization is Doomed to Fail- 1613

Yaniv Nizry - Sonar Source    Reference →Posted 1 Year Ago
  • Cross-site scripting (XSS) is a super common web vulnerability. If a user can include HTML in the page, then they can commonly add their own JavaScript to perform malicious actions. Sometimes, some HTML should be allowed for styling. Because of this, HTML sanitizers are super important for preventing security issues in these cases.
  • These sanitizers work by parsing the HTML input to create a structured DOM tree object, then walking this DOM to ensure that nothing defined as malicious exists. This HTML sanitizing should be done on the client side in order to prevent parser differential issues. In reality, it's done on the server side quite a bit.
  • SonarSource has found a lot of sanitizer bypasses in the past. They noticed that a group of them worked on almost every sanitizer written in PHP. All of the bypasses related to comments, math, RCDATA and RAWTEXT. All of these are new HTML5 features!
  • The built-in PHP HTML parser uses an out-of-date HTML 4 specification from libxml2. If the parser used for cleaning were the same as the one used for execution (the browser's), this issue wouldn't have existed. It's just a standard though, how hard can this really be?
  • HTML is a constantly evolving language. New elements, attributes and features are regularly introduced. Different users are also running different versions of browsers, which causes some complications here. The author claims that the parser configuration can make a big difference as well. Whether scripting is enabled or not can determine how some elements are parsed.
  • Another issue surrounds parsing weird HTML. If something goes through a parser multiple times in a loop, the output may differ between passes. Additionally, mutation XSS can be used too.
  • The issue in PHP was never fixed. Instead, this big PHP library now has a big red warning label saying it shouldn't be used for sanitizing because it doesn't support HTML5 very well. Overall, a good post about a bad practice and an interesting vulnerability in the improper usage of a library.
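The multi-pass point is the classic non-idempotent-sanitization trap. A deliberately naive one-pass sanitizer (a toy, not any real library) shows how stripping can assemble the very payload it tried to remove:

```python
import re

def naive_sanitize(html):
    # Hypothetical one-pass sanitizer: delete literal <script> tags.
    return re.sub(r"</?script>", "", html, flags=re.I)

payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
once = naive_sanitize(payload)
# Removing the inner tags splices the halves back together:
# once == "<script>alert(1)</script>"
```

Running `naive_sanitize(once)` would catch it, but real sanitizers and real parsers disagree in subtler ways (mutation XSS), which is why output should be parsed the same way the browser will parse it.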

x/group can halt when erroring in EndBlocker - 1612

Interchain Foundation    Reference →Posted 1 Year Ago
  • A vulnerability in the Cosmos SDK group module led to a chain panic. It's well known that an error or panic in either the begin blocker or the end blocker in Cosmos results in a chain halt.
  • From reading the patch, it appears that the only real change that was made was around error handling. If a call to k.Tally was made with an error, then an error used to be returned. If you follow this up the call chain, then this results in an error being returned to the EndBlocker call.
  • I'm unsure exactly what error could have caused this. If this were me, I would have seen the potential for a DoS in the EndBlocker and then looked for ways to trigger an error within the processing of a group.
  • To remediate the issue, the function no longer returns an error. Instead, it just prunes the votes, sets the status to rejected, and emits an event.
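The shape of the fix can be sketched like this. The real module is Go inside the Cosmos SDK; the names and dict-based proposals here are hypothetical stand-ins.

```python
class TallyError(Exception):
    pass

def end_blocker_unpatched(proposals, tally):
    # Any exception escaping here propagates up to EndBlocker
    # and halts the chain - the DoS described above.
    for p in proposals:
        p["result"] = tally(p)

def end_blocker_patched(proposals, tally):
    for p in proposals:
        try:
            p["result"] = tally(p)
        except TallyError:
            # The fix: swallow the error, prune the votes, mark the
            # proposal rejected (event emission omitted in this sketch).
            p["votes"] = []
            p["status"] = "REJECTED"
```

One bad proposal now fails closed on its own instead of taking consensus down with it.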

Shattering the Rotation Illusion: Part 4 - Developer Forums- 1611

Clutch Security    Reference →Posted 1 Year Ago
  • These researchers intentionally put credentials into Stack Overflow, Reddit and many other places. Most of these were exploited within a day, which is pretty interesting.

Zen and the Art of Microcode Hacking- 1610

Google    Reference →Posted 1 Year Ago
  • Microcode is code that runs during instruction execution. Much of this is implemented in hardware, but some is small RISC instructions stored in small storage on the chip itself. This makes bugs in the code, such as the famous Intel FDIV bug in 1994, patchable. The microcode exists in on-die ROM, but the patches live in on-die SRAM. The authors of this post were interested in how the patching process works!
  • Both Intel and AMD encrypt the patches to prevent reverse engineering and use digital signatures to prevent the loading of bad microcode patches. The update contains a lot of RSA key information, the encrypted data of the patch itself, and an array describing where it should be patched.
  • To perform the upgrade, the following is done once the fix has been shipped:
    1. Software performing the upgrade will write the patch blob to MSR 0xc0010020 to start the microcode upgrade process.
    2. Copy the patch to the internal memory of the CPU.
    3. Checks for improper rollbacks and validates that this is indeed the proper RSA key.
    4. RSA PKCS #1 signature is decrypted using the RSA public key. The result is a padded AES CMAC hash of the patch contents.
    5. Signed hash content must match the calculated hash. Otherwise, the wrong data was sent. The CPU microcode patch version is updated.
  • The verification of the patch signatures effectively uses the RSASSA-PKCS1-v1_5 algorithm. The only difference is that a non-standard hash algorithm that is prone to collisions was selected. When hashing the data to sign, it's typically padded with a constant value. The recipient can use the public key to validate the signature. To break this scheme, a complete break of RSA or of the hash function would need to be found, which would normally make it secure.
  • In situations where storage is an issue (like the on-die storage of a chip), a hash of the key can be stored instead. Then, when doing the signature verification, the public key is provided and hashed to validate that it's the same one stored on-die. This system works because it's nearly impossible to find a collision between two plaintexts.
  • The key vulnerability is that the hash function used is AES-CMAC. Although this works as a message authentication code, it does NOT function as a secure hash function. That's not its goal! In practice, this algorithm looks like a CRC that XORs the bits. Since the verification of the RSA key is required for security, being able to submit our own RSA key via a hash collision would completely compromise the microcode update process. In practice, this only requires that we know the AES key of the signature, which has to be in every CPU - it's a bad assumption to make that this will be secret forever.
  • To test this, they used old Zen 1 CPU example keys that were used up through Zen 4 CPUs. Using this key, they could break the two usages of AES-CMAC: the RSA public key and the microcode patch contents. They were able to forge new public keys that generated the same hash as the authentic AMD key. In order to pull off this attack, a compensating block of random data was needed to align to 16 bytes. Since this is microcode, corrupting patch contents could cause crashes. So, they chose to attack the public key instead of the microcode patches themselves.
  • To do this, they had to generate a second preimage of the public key. They generated a large list of candidate RSA public keys that collided with the expected public key. From there, they checked if the values were easy to factor, such as being divisible by 2. After some attempts, they found a suitable key!
  • Ironically, the fix is a microcode patch that adds a custom secure hash function. Using this bug, they were actually able to extract and reverse engineer previous microcode patches, such as the patch for the Zenbleed vulnerability. They created a tool that allows for the exploitation of this vulnerability on all unpatched Zen 1 through Zen 4 CPUs, which will enable research into previously undocumented microcode.
  • This blog post is pretty amazing. It required a deep understanding of cryptography and a ton of reverse engineering. The blog claims it will release a post on the reverse engineering aspect in the future.
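To see why a keyed MAC fails as a hash, here's a toy stand-in for the construction. The post describes the effective behavior as a CRC-like XOR of blocks once the AES key is known; this sketch is that simplification, not real AES-CMAC, and the "compensating block" mirrors the alignment block the researchers needed.

```python
def xor_hash(data, block=16):
    # Toy stand-in for a MAC whose key the attacker knows: with the key
    # public, the construction degrades to XOR-folding the blocks.
    acc = bytes(block)
    for i in range(0, len(data), block):
        chunk = data[i:i + block].ljust(block, b"\0")
        acc = bytes(a ^ b for a, b in zip(acc, chunk))
    return acc

original = b"authentic AMD public key bits...".ljust(32, b"\0")
forged_prefix = b"attacker-chosen key material!!!!".ljust(32, b"\0")
# Compensating block: XOR of the two running hashes cancels the difference.
comp = bytes(a ^ b for a, b in zip(xor_hash(original), xor_hash(forged_prefix)))
forged = forged_prefix + comp
assert xor_hash(forged) == xor_hash(original)  # second preimage, instantly
```

With a real second-preimage-resistant hash (SHA-256 etc.), computing `comp` is infeasible - which is exactly what AMD's fixed verification adds.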