Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Hash What You Mean - 1649

Giuseppe Cocomazzi    Reference →Posted 10 Months Ago
  • The Horton Principle is "mean what you sign and sign what you mean". The name comes from a Dr. Seuss character, but the idea has profound impact: when signing or hashing data, there should be no ambiguity about what is covered. Although this sounds simple, even a single item is difficult to handle correctly. With multiple items, it becomes very complex.
  • For multiple items, it's common to serialize the data into a string and hash that. But this just moves the problem to the serialization. Using something like ASN.1, JSON, or Protobuf mostly solves the problem, since these formats can make the representation unambiguous, which isn't so simple to do yourself.
  • When doing this for a Merkle tree, a load of problems come into play. The depth, the data in the nodes, and the levels all matter for the tree. In a binary Merkle tree, the way we hash the leaves matters: what happens if the leaf count isn't a perfect power of two? Do we pad from the left or the right? All of these choices matter for the Horton Principle.
  • Domain separation is important: prefix the hashed data with distinct bytes depending on what kind of node it is. For instance, prepending 0x00 for a leaf entry. In CometBFT, the hashed byte slices are ordered and the tree is order-preserving. Ethereum does something different: all unpopulated leaves are simply empty, which means the empty hash is used a lot, making the tree perfectly balanced in theory. Both of these satisfy the Horton Principle.
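A minimal sketch of the ambiguity problem (illustrative only; this is not CometBFT's or Ethereum's actual encoding): naive concatenation lets two different input lists hash to the same value, while a length prefix plus a domain byte makes the representation unambiguous.

```python
import hashlib

def naive_hash(items):
    # Ambiguous: just concatenates the items before hashing.
    return hashlib.sha256(b"".join(items)).hexdigest()

def separated_hash(items):
    # Unambiguous: a 0x00 domain byte, then each item length-prefixed.
    out = b"\x00"
    for item in items:
        out += len(item).to_bytes(8, "big") + item
    return hashlib.sha256(out).hexdigest()

# Two different input lists collide under the naive scheme...
assert naive_hash([b"ab", b"c"]) == naive_hash([b"a", b"bc"])
# ...but not under the length-prefixed one.
assert separated_hash([b"ab", b"c"]) != separated_hash([b"a", b"bc"])
```

Serialization formats like ASN.1 or Protobuf effectively do this length-prefixing for you, which is why they mostly solve the problem.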
  • The OpenZeppelin library takes an input of leaves of length L. It starts pairing from the tail and uses a double-hash concatenation to disambiguate internal nodes from single leaves. A major deviation is that the leaves of each pair are sorted prior to hashing. This means the tree does not preserve the order of the input sequence. In the documentation, they say it's assumed that the inputs are ordered.
  • A cross-chain protocol called Omni Network ported the OpenZeppelin version to Golang for their own use. Naturally, they missed that the pair sorting is optional and always sort. The sorting is done by the blockchain for transmission but isn't actually required by the verifier. Great.
  • The Omni Network Merkle tree data contains a LogIndex, which is a monotonically increasing value. However, this value is NOT part of the data used for hashing. Combining the unconditional leaf-pair sorting with the omission of this value from the hash means the ordering of messages is not preserved. Practically speaking, differently ordered xchain.Msg sequences lead to the same Merkle root.
  • Even crazier, you can change the LogIndex values as well: {"Ping", 1} and {"Pong", 2} are just as valid as {"Pong", 1} and {"Ping", 2}. The ordering is not kept, as we can see. The author includes a PoC.
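The order-insensitivity is easy to reproduce with a simplified sorted-pair tree (a sketch of the idea, not Omni's actual Go implementation; it assumes a power-of-two leaf count and, as described above, leaves LogIndex out of the hashed data):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def hash_pair(a: bytes, b: bytes) -> bytes:
    # Sorting each pair makes the hash commutative: order no longer matters.
    return h(min(a, b) + max(a, b))

def root(leaves):
    # Simplified tree; assumes a power-of-two number of leaves.
    nodes = [h(leaf) for leaf in leaves]
    while len(nodes) > 1:
        nodes = [hash_pair(nodes[i], nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

# LogIndex is not part of the hashed data, so only the payload matters...
msgs_a = [b"Ping", b"Pong"]   # {"Ping", 1}, {"Pong", 2}
msgs_b = [b"Pong", b"Ping"]   # {"Pong", 1}, {"Ping", 2}
assert root(msgs_a) == root(msgs_b)  # same Merkle root, different order
```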
  • How did this slip through the cracks of the project? The author has a great quote on this: "As seen too often with cryptographic constructions, too many moving parts and options become the ingredients for the proverbial 'hazardous material'." With the port copied from OpenZeppelin, a library that makes a weird assumption in the first place, it was bound to lead to an issue. Great write-up!

Google Cloud Account Takeover via URL Parsing Confusion - 1648

Mohamed Benchikh    Reference →Posted 10 Months Ago
  • On the Google Cloud CLI gcloud, the authentication process works using OAuth and a server that is quickly set up on the computer at localhost:50000. This means that http://localhost is actually a valid redirect_uri for OAuth! Given that a browser parser performs the redirect and a backend parser validates it, this becomes the perfect chance to find an account takeover via an evil redirect.
  • At first, this didn't make sense to me. Most of the time, these checks are static string comparisons against an allowlist, giving very little wiggle room. The author of the post tried 127.0.0.1 and noticed that it worked. This meant they were parsing the URL on the backend rather than just doing a static string check. With two parsers in play, it's time to find a difference!
  • They wrote a Python script that performed a large number of URL mutations. Encoding tricks, private IPs, weird schemes, IPv6, etc. After running their fuzzing script for a while, they found a match:
    http://[0:0:0:0:0:ffff:128.168.1.0]@[0:0:0:0:0:ffff:127.168.1.0]@attacker.com/
    
  • This URL is super weird. The @ symbol is used as the separator between the username and the password on a URL. It's actually invalid to have two of them. Chrome mitigates this edge case by encoding all non-reserved characters and earlier occurrences of reserved characters. However, the parser on the backend likely ignored the attacker.com part of the URL and grabbed the proper data from the set positions. Neat!
  • What's interesting is that this only happened when using IPv6. When using IPv4, this didn't work. A working redirect_uri is as follows: http://[::1]@[::1]@attacker.com. The server would parse the second [::1] as the server information and skip the attacker.com entirely. However, Chrome would parse attacker.com as the host.
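The confusion boils down to which @ a parser treats as the userinfo separator. A hypothetical illustration (these two functions are made up for the example; they are not the actual Chrome or Google backend parsers): one strategy takes the host after the last @, the other skips userinfo up to the first @ and then reads only the bracketed IPv6 literal.

```python
AUTHORITY = "[::1]@[::1]@attacker.com"

def host_last_at(authority: str) -> str:
    # Browser-style: everything before the LAST '@' is userinfo.
    return authority.rsplit("@", 1)[1]

def host_bracket_scan(authority: str) -> str:
    # Hypothetical backend-style: skip userinfo up to the FIRST '@',
    # then read a bracketed IPv6 literal and stop at ']', ignoring the rest.
    rest = authority.split("@", 1)[1]
    if rest.startswith("["):
        return rest[: rest.index("]") + 1]
    return rest

assert host_last_at(AUTHORITY) == "attacker.com"  # where the browser navigates
assert host_bracket_scan(AUTHORITY) == "[::1]"    # what the server validates
```

Any time two parsers disagree like this, the validator approves one host while the browser actually sends the user (and the OAuth code) to another.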
  • Mixing this with OAuth gave the author an arbitrary redirect to steal the OAuth code and log into the user's account. That's a pretty rad bug with good visuals and background.

Exploiting the Synology DiskStation with Null-byte Writes - 1647

Jack Dates - Ret2    Reference →Posted 10 Months Ago
  • Pwn2Own is a hacking competition with fairly large prizes. In 2023, no compromises of the Synology DiskStation had been found. So, they decided to add a few non-default but first-party packages to the scope. Packages are add-ons for the device that can be installed.
  • One of the services they analyzed was the Replication Service. It runs with very high privileges and is easy to communicate with from the outside world. The synobtrfsreplicad service listens on port 5566 and is just a forking server that continually accepts connections from a remote client.
  • Each request takes a cmd, sequence, length, and a complete data section. If the length of the data is larger than 0x10000, an error is returned from the cmd-receiving function. However, there is a case of bad error handling here: the code returns the error value from a previous function call instead of setting a real error. This leads to the error being ignored!
  • Directly after the error check is a null-byte write into a buffer at an offset based on the length of the packet. This creates a relative write to anywhere past the buffer, but only of a null byte. This really does look like a CTF challenge! The device has all mitigations enabled, so this was going to be tricky.
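The bad error handling follows a classic stale-error pattern. A sketch with hypothetical names (the real service is compiled C; this just models the control flow): the length check returns a variable that still holds the success value from an earlier call.

```python
MAX_LEN = 0x10000

def read_header(buf: bytes):
    # Hypothetical header parse: returns (err, length); 0 means success.
    return 0, int.from_bytes(buf[:4], "little")

def recv_cmd(buf: bytes) -> int:
    err, length = read_header(buf)
    if length > MAX_LEN:
        # BUG: 'err' still holds 0 (success) from read_header, so the
        # caller never sees a failure and keeps processing the packet.
        return err
    return 0  # process the request normally

oversized = (0x20000).to_bytes(4, "little")
assert recv_cmd(oversized) == 0  # the oversized packet is silently accepted
```

Per the article, the patch was exactly this one-liner: return a real error value (1) instead of the stale 0.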
  • To break ASLR, they abused two key points: this is a fork server that reuses the same address space in each child process, and a crash in one child didn't have any effect on the rest of the service. Instead of brute forcing it straight up, they do some crazy pointer shenanigans to create useful oracles for leaking the offsets. This part is worth a read :)
  • Using the primitive from before, they are able to corrupt a heap pointer in the .bss section. Since they control this address and can force it to be freed, they are able to corrupt this chunk to perform tcache poisoning techniques. Now, they can add arbitrary contents to the tcache, giving them an arbitrary write primitive.
  • With the arbitrary write, they overwrote the GOT entry for delete to point to system. When the call to delete is made with a controlled pointer, it executes their bash command. This gives them RCE on the box! The patch was simply to return 1 instead of returning 0. Nice!

Exacerbating Cross-Site Scripting: The Iframe Sandwich - 1646

Cooper Young    Reference →Posted 10 Months Ago
  • In bug bounty, it's not just about finding the vulnerability - it's about exploiting the vulnerability to create as much impact as possible. In the author's situation, they found XSS on a simple static website that wasn't connected very well to the rest of the application. This meant that session hijacking, account takeovers, and sensitive API calls were unlikely to work.
  • Their first exploit attempt was adding a login form to the page to trick the user into signing in and stealing the credentials. However, this requires too much interaction, making it a solid medium severity bug on its own.
  • To add more impact, they created an iFrame sandwich. In most cases, an iFrame cannot access its parent frame's contents. One exception: it can if they're on the same domain or a subdomain. Since this subdomain was for maps and was shown on the main website, it could access the contents of the page, bypass SOP, use cookies, etc.
  • One question I had was how to get the main page to embed the vulnerable version of our page, since it is reflected XSS. To get around this, the subdomain can be embedded in an attacker-controlled website where they specify the URL. But that alone doesn't give access to the top-level site we're trying to get data from.
  • The other trick is getting the parent of the iFrame to have access to the other page. To do this, an important order of operations is followed:
    1. Attacker website opens up the page to do the exploitation via window.open().
    2. The attacker sets the window.location to be the target page. The parent window of the page opened in step 1 is STILL this window, even though it now shows a different page.
    3. The page opened in step 1 contains an iFrame with the exploit payload in it targeting the subdomain page.
    4. The iFrame accesses the parent reference of the page, now on the website we want to exfiltrate data from. Cookies can be shown, the DOM edited... this is super powerful!
  • The end of the article discusses the security team of the product and the security researcher. The researcher's job is to write a powerful and impactful exploit; the researcher bears the burden of proof. To the security team, the PoC is the minimum impact.
  • Unfortunately, the security team deemed this out of scope since the subdomain was out of scope. They fixed the vulnerability though. Personally, I think that if you affect an in-scope item with a vulnerability outside of the scope, you should be rewarded. Attackers do not care about "scope" - they care about impact. Fantastic blog post!

Scroll Mainnet Emergency Upgrade - 1645

Scroll Security Council    Reference →Posted 10 Months Ago
  • Scroll is a ZK EVM blockchain. Recently, they made some changes to the code that led to some pretty serious issues. One was reported by an individual and another came through Immunefi from a user named WhiteHatMage.
  • The first bug was a soundness issue in the zkEVM circuit for the auipc opcode. The code used an iterator that skipped the first element, which led to the bits of the PC being range-checked to 8 bits instead of 6 bits. This would have allowed a malicious prover to fill in arbitrary values in the higher 2 bits of the PC, changing the flow of execution.
  • Any ZK soundness issue is bad, but the exploitable impact is unknown. Since the prover and sequencer are operated by Scroll, this wasn't exploitable in practice. The fix for this vulnerability is literally swapping the order of skip(1) and enumerate(). Neat!
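The skip(1)/enumerate() ordering bug is easy to model in Python with itertools (an analogy for the Rust iterator chain, not the actual circuit code): enumerating after skipping restarts the indices at zero, so every remaining element gets paired with the wrong position.

```python
from itertools import islice

bits = ["b0", "b1", "b2", "b3"]

# skip(1) then enumerate(): indices restart at 0, so each remaining bit
# is checked against the wrong position.
buggy = list(enumerate(islice(bits, 1, None)))

# enumerate() then skip(1): the original positions are preserved.
fixed = list(islice(enumerate(bits), 1, None))

assert buggy == [(0, "b1"), (1, "b2"), (2, "b3")]
assert fixed == [(1, "b1"), (2, "b2"), (3, "b3")]
```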
  • The second vulnerability was a message spoofing issue on the bridge. For the Euclid phase-2 update, they made some big changes and had a full audit done that did not uncover the issue. From being in a Discord with the author of the bug, I know they had automation set up to notify them of changes to contracts on Scroll. While reviewing this change, hours after release, they immediately saw the issue.
  • When going from an L1, like Ethereum, to an L2 such as Scroll, there is typically a bridge between them. When going between the L2 and the L1, there was an application-level permission issue that had not been noticed. Only one end of the bridge had an authorization check. By crafting a malicious withdrawal from the L2 to the L1, the L1ScrollMessenger entity's permission could be abused to make a call back into the main bridge. Since this caller is considered trusted by L2ScrollMessenger, access controls on the L2 could be bypassed, leading to an infinite mint. This was effectively a confused deputy problem.
  • This wasn't exploitable in the past because EnforcedTxGateway did not allow calls from smart contract accounts. With the change to the code, this property changed, making it possible to trigger this path. The explanation is somewhat short and without context, so I don't fully understand the bug. As more details come out, I'll try to update.
  • Overall, two good bugs! The second one led to a 1M payout because of the damage it could have caused; monitoring for the win. The stark difference between the Scroll DoS from last week and this second crazy vulnerability is fascinating.

DoubleClickjacking: A New Era of UI Redressing - 1644

Paulos Yibelo    Reference →Posted 10 Months Ago
  • Clickjacking, also known as a UI redress attack, is a mechanism to steal clicks to perform sensitive actions on a website. This is done by iFraming the victim website in the attacker's website and tricking the user into clicking on particularly sensitive parts of the page. With SameSite: Lax cookies, the framed website becomes unauthenticated, making this much harder to exploit. This article presents a new variant called DoubleClickjacking.
  • The main idea is some sleight-of-hand trickery that exploits the small gap between the start of a click and the end of a click across multiple windows. By quickly swapping between pages, it's possible to get a user to click on something in an unintended fashion. The video is the best demonstration of it, but it's very fast. There are some more complications to how this works though.
    1. The attacker creates an initial webpage. This opens a window.
    2. When the new window opens up, they ask the user to "double click" on it.
    3. Upon going to this page, the new window changes the parent window's location to the target page. This means that the parent window is now sitting on the target page while the top window shows the double-click prompt.
    4. When the user does the double click, the mousedown causes the top window (the current page) to close.
    5. The second click lands on the exposed authorization button on the parent window. With this, access has been granted.
  • The reason this works is that each click has multiple parts. We can use one part of the click and then force the rest to land somewhere else. Any sort of one-click permission can be abused with this, such as OAuth permissions or data sharing on Google Drive. This bypasses traditional clickjacking protections like CSPs. This also isn't just about websites - it can affect Chrome extensions as well.
  • To mitigate this, the author suggests disabling critical buttons unless a gesture is detected on that page. This ensures that the actions were meant for that particular page. As a longer-term solution, a header could be implemented that resets all gestures. I really like that they thought of a good protection, which many folks wouldn't do.
  • The attack is really cool! I personally don't fully understand why each step happens, but it's interesting nonetheless.

The Hidden Risks of Cosmos SDK: Unmetered Functions - 1643

Oak Security    Reference →Posted 10 Months Ago
  • Blockchains have a concept known as gas. Like the gas in your car, it measures how far you have gone; the only difference is that it measures computational complexity rather than distance on a road. In the Cosmos SDK, a framework for building application-specific blockchains, some of this gas handling becomes complicated and difficult to handle securely.
  • The Cosmos SDK has handlers that run at the beginning and the end of each block - BeginBlock and EndBlock respectively. Since these do not run inside a particular transaction, they have unlimited gas. So, it's essential to be mindful of what gets executed in these functions when building your project.
  • The authors of the post created a Cosmos SDK blockchain locally to test how delays in these functions affected the blockchain's uptime. By adding sleeps at deterministic points within these functions, they noticed some funky things happening: consensus would commonly time out, and validators would miss voting windows, leading to slashing of stake.
  • A common way to exploit this is to increase the number of items being processed in a list. Both of their examples involve linear processing of a list - of messages in one case and of denoms in the other. I do wonder about the practicality of exploiting these types of issues; however, other ways exist to make these functions take too much time.
  • They have a few suggestions to work around this. First, try to make all operations O(1). In reality, this isn't always possible. So, having hard upper bounds on iteration counts or run-time limits is the way to go. Another option is custom gas-metered contexts that will eventually expire. Using custom gas meters has its own consequences, such as having to roll back the state when the meter expires, because of the potential for partial operations.
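The hard-upper-bound suggestion can be sketched as follows (illustrative Python with made-up names, not actual Cosmos SDK APIs): cap the work done per block and carry the remainder over to the next block, so EndBlock time stays bounded no matter how long the queue grows.

```python
MAX_ITEMS_PER_BLOCK = 100  # hypothetical hard upper bound

def process(item):
    pass  # stand-in for real per-item work

def end_block(queue: list) -> list:
    """Process at most MAX_ITEMS_PER_BLOCK items; defer the rest."""
    batch, remainder = queue[:MAX_ITEMS_PER_BLOCK], queue[MAX_ITEMS_PER_BLOCK:]
    for item in batch:
        process(item)  # bounded work, regardless of total queue length
    return remainder   # picked up again in the next block

queue = list(range(250))
queue = end_block(queue)  # 150 items deferred
queue = end_block(queue)  # 50 items deferred
assert len(queue) == 50
```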
  • Overall, a good post on gas meters going wrong in Cosmos.

Proof of Nothing - 1642

Giuseppe Cocomazzi    Reference →Posted 10 Months Ago
  • The term proof is used loosely in the blockchain industry. Originally, with Bitcoin, proof of work was used as an anti-spam technique. It relies on the probabilistic assumption that it takes a certain amount of time to find a correct pre-image of a hash. Hashes are sufficiently random, so this is fairly reasonable: based on all previous data, there's no reason it won't keep working in the future. That makes it more inductive than deductive.
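A toy version of the anti-spam idea (a simplified difficulty encoding, not Bitcoin's actual target format): finding a nonce takes many hash attempts on average, while verifying takes exactly one hash.

```python
import hashlib

def pow_search(data: bytes, difficulty_bits: int) -> int:
    """Find a nonce so that sha256(data || nonce) starts with zero bits."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce  # expected ~2**difficulty_bits attempts to get here
        nonce += 1

def pow_verify(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = pow_search(b"block header", 12)  # cheap toy difficulty
assert pow_verify(b"block header", nonce, 12)
```

The "proof" is the probabilistic claim that, since hashes behave randomly, whoever produced the nonce must have spent the work.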
  • Proof of Stake was popularized by Tendermint. The "proof" relies on a supermajority of cryptographic validators (two-thirds of voting power). With proof of stake, the current block is grounded in the previous one, making all of the blocks sequential. Although the name implies some mathematical deduction, this is NOT a regular proof.
  • Light clients follow the same logic, except they start with a trusted block that is provided when the light client is initialized. Additionally, they can skip blocks with "non-adjacent block verification", assuming that 2/3 of the voting power from the most recent trusted block has signed the new block. Giuseppe doesn't like this. Why?
  • Two large induction leaps are being made:
    • Any validator holding 1/3+ of the voting power at block height H continues to behave honestly for N blocks.
    • The validators from the first point may no longer be trusted at H+N.
  • Because the validator set is trusted from height H for N blocks, if they decide to be malicious for a period of time, their proof is still technically valid! It doesn't matter that they were slashed on the main chain; it's still valid from the perspective of the light client. Given this argument, the consequences of not having perfectly sequential block validation are not great. But, to my knowledge, no hacks exploiting this have happened yet.
  • According to the author, skipping verification for non-adjacent blocks "might very well be named 'Proof of Faith' or, better, 'Proof of Nothing'". Interesting post on the design of Tendermint light client verification!

Scroll Chain DoS via CCC Overflows in Single User Transactions + Drama - 1641

Pavel Shabarkin    Reference →Posted 10 Months Ago
  • In Scroll zkEVM rollups, transactions occur in two main steps:
    1. The EVM executes all transactions, performs state transitions, and then sends the transaction traces to the provers.
    2. zkEVM prover proves the traces.

    This second step is known for being time-consuming, so there is a limited capacity. Scroll imposes a row consumption limit on transactions per block and rejects and reorgs blocks before finalizing if it is exceeded.
  • If transaction traces exceed the row capacity, the zk prover will fail. Obviously, an unprovable block prevents the chain from finalizing and wastes resources. Scroll knew this and implemented the Circuit Capacity Checker (CCC) in l2-geth to validate transactions before they enter the zkEVM circuit. Wow! We're looking at remediations within remediations for bugs, crazy!
  • The mining process is as follows:
    1. Transactions enter the mempool, where they're picked up by a worker.
    2. Transactions are processed based upon gas price.
    3. Each transaction is executed one by one.
    4. Commit the block. If the CCC calculations failed, then roll back the block.
  • The CCC functions as a post-sealing check rather than a pre-sealing check. This enables an attacker to send a lot of malicious transactions that exceed the CCC limit but are otherwise totally valid. This means that a lot of computation is done (wasting time and resources) only for a reorg to happen.
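The post-sealing problem can be modeled in a few lines (a toy simulation, not l2-geth's actual code path; names and numbers are made up): all the execution work happens before the capacity check, and the rollback means nobody is ever charged for it.

```python
def mine_block(txs, row_limit=10):
    """Toy model: execute first, check circuit capacity after sealing."""
    executed, rows_used = [], 0
    for tx in txs:
        rows_used += tx["rows"]     # expensive execution happens here
        executed.append(tx)
    if rows_used > row_limit:       # post-sealing CCC check
        return None, len(executed)  # reorg: block dropped, no gas charged
    return executed, len(executed)

# One fat transaction blows the row limit after all the work is done.
block, wasted = mine_block([{"rows": 4}, {"rows": 20}])
assert block is None  # block rolled back
assert wasted == 2    # yet every transaction was still fully executed
```

A pre-sealing check would instead reject the oversized transaction before executing it, so the attacker would pay for the attempt.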
  • The cool part is that since there's a reorg, there's no gas cost to the attacker! So, they can create an infinite amount of high gas price transactions to always be at the front of the queue and permanently stall the blockchain. Technically, it's interesting how they abused the issue.
  • Now... for the drama. The vulnerability was reported to Scroll who decided not to fix the issue. They had received multiple similar types of vulnerabilities in CCC and understood there were likely more. So, they had completely redesigned the feature but were just waiting until the next release to upgrade it.
  • Scroll and Immunefi agreed the vulnerability was legit. For the time being, Scroll was okay accepting this risk for the users and moving on, though. Because nothing was changed, no bug payout. The author of the report published the report on Twitter and a blog-like website, including full Immunefi chat logs. Some folks believe that the whitehat is in the right, while others think they are in the wrong. It's a sticky situation.
  • From the perspective of the bug hunter, I get it - there's a live vuln and a program you did research on. From the perspective of the project, I get it - you recognized a design weakness and already fixed it. In my opinion, there's nothing on the Scroll side to do besides push their new code. So, if they're okay with the risk they're taking on with a DoS until the upgrade, that's their decision to make.
  • When pushing for remediation on a bug bounty report, we have to be patient. Reading the communications, the bug hunter was pushing for comms faster than the Immunefi SLA required and was very aggressive about it. Additionally, after being offered a bounty of 1K, they pushed back and asked for 200-300K. They even pushed up the severity of the bug, citing the primacy of impact and constructing reasons why the DoS was so bad.
  • Personally, I feel like the denial-of-service risk commonly cited on projects gets more credit than it should. Realistically, if this attack were launched, the chain would be back up in less than a day with a hacky solution for the issue. The numbers and impact the author cites for "Token Sell-Off & Investor Confidence Crisis" are a little ridiculous to me. Sure, a DoS on the chain has impact, but not 200K bug bounty worth of impact. Besides my gripes with how it was handled, the blog post is very thorough and well explained.

How Go Mitigates Supply Chain Attacks - 1640

Filippo Valsorda - Golang    Reference →Posted 10 Months Ago
  • Most modern software has a large amount of open-source code. Because the code is constantly used and downloaded, it opens up the potential for supply chain attacks. Despite good process and technical chops, every dependency is an unavoidable trust relationship. This article discusses how Golang tries to mitigate these risks with very explicit design decisions.
  • In Golang, all builds are locked. The version of every dependency is set in the go.mod file. Only explicit updates to this file can change the dependencies, such as go get or go mod tidy. This is super important for security - the code in the repository should be the source of truth and nothing else.
  • There is no concept of latest for dependencies. This prevents a compromised dependency from quickly backdooring all of its users. Everything said above is also transitive for dependencies of dependencies. If a dependency is compromised, picking up the malicious version requires a specific update, giving folks time to see what's going on.
  • Version contents are guaranteed to never change - module versions are immutable. This property ensures that an existing package cannot be modified to compromise code that depends on it. So, if something is safe to run currently, we can be confident it will still be safe to run later. A lot of cryptographic verification goes into this.
  • Another issue that I have with NPM is related to hooks or builds running code. Golang has no post-install hooks. Additionally, the built code cannot do anything until it is actually run. In all likelihood, if you installed something, you're probably going to run it, but this does add another security boundary.
  • Overall, a good look into supply chain security in Golang. I like to see that the developers put a lot of thought into the package manager of Golang.