Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Starknet Incident Report – January 5, 2026 - 1893

Starknet    Reference → Posted 1 Month Ago
  • Starknet is an L2 that utilizes a ZK prover. The Blockifier is the component responsible for creating blocks and proofs. I imagine that they have a centralized sequencer, but I'm not sure. Recently, they experienced an outage during which new blocks could not be built.
  • The Blockifier had an issue with a complicated set of contract calls:
    1. F1 calls F2 and F2 calls F3, where F2 and F3 are the same contract.
    2. F3 changes a variable in storage.
    3. After F3 has finished, F2 changes the same variable from step 2.
    4. F2 panics. F1 catches the revert and continues execution.
  • In this case, the value of the variable should have been the original value prior to calling F2 at all. In reality, the value from F3 was used! Since this is just block production, the ZK prover still got it right. So, no weird writes after reverts in the execution layer. Even though this led to an outage, it's cool to see the prover do its job.
  • To ensure this doesn't happen again, they have some new initiatives internally. They are re-architecting the prover-compliant execution to run directly after transaction execution; if the results don't match, an auto-halt will occur. Although a crash is bad, it's better than a deep reorg. They will also add better fuzzing, though in reality, I doubt fuzzing would have found this bug.
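The call sequence above comes down to revert-journal semantics: a caught revert must roll a storage slot back to its pre-call value, not to an intermediate write. Here's a minimal sketch of journaled storage (my own illustration, not the Blockifier's actual code) showing the behavior the incident violated:

```python
# Journaled storage with per-call-frame revert semantics. The incident:
# in F1 -> F2 -> F3 (F2 and F3 being the same contract), both wrote one
# slot; when F2 panicked and F1 caught the revert, the Blockifier kept
# F3's intermediate value instead of the value from before F2 was called.

class JournaledStorage:
    def __init__(self):
        self.storage = {}
        self.journal = []      # stack of (slot, previous_value) undo entries
        self.frames = []       # journal length at the start of each call frame

    def begin_call(self):
        self.frames.append(len(self.journal))

    def write(self, slot, value):
        # Record the previous value so a later revert can undo this write.
        self.journal.append((slot, self.storage.get(slot)))
        self.storage[slot] = value

    def commit_call(self):
        self.frames.pop()      # keep the writes; undo entries stay for outer frames

    def revert_call(self):
        # Undo every write made since this frame began, newest first.
        depth = self.frames.pop()
        while len(self.journal) > depth:
            slot, prev = self.journal.pop()
            if prev is None:
                self.storage.pop(slot, None)
            else:
                self.storage[slot] = prev

s = JournaledStorage()
s.write("x", 1)        # original value before F2
s.begin_call()         # F1 calls F2
s.begin_call()         # F2 calls F3
s.write("x", 2)        # F3 changes the variable
s.commit_call()        # F3 finishes
s.write("x", 3)        # F2 overwrites the same variable
s.revert_call()        # F2 panics; F1 catches the revert and continues
s.storage["x"]         # back to 1, the value before F2 was ever called
```

The bug amounted to restoring to F3's committed write (2) rather than unwinding the whole frame back to 1.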

Ticket Tricking OpenSSL.org with Google Groups - 1892

Space Raccoon    Reference → Posted 1 Month Ago
  • Ticket Tricking is a technique for getting OTPs or verification emails sent to a public forum so that you can "prove" you have access to a domain when you really don't. Google Groups carries this risk and is the focus of this post.
  • The author of the post found a tool for scraping Google Groups. Unfortunately, it was somewhat outdated and only looked for a single hard-coded group. So, they wrote a vibe-coded application to find Google Group URLs, filter them, and check for public read access. After scanning 32K raw URLs, they were left with 150+ publicly readable groups.
  • One of the vulnerable instances was the OpenSSL.org Slack group. The author logged in to the group using an OTP leaked on the forum. The implications are serious. Many applications (except Slack) have patched vulnerable-by-default mechanisms. However, GitHub email verification, auto-join SaaS tenants, and many other flows are still vulnerable. Good post!
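The scanning step can be sketched in a few lines. This is my own rough approximation of what such a scanner might do, not the author's tool; the URL and the "private" markers are assumptions:

```python
# Flag candidate Google Groups URLs that load without requiring a login.
# Heuristic only: a redirect to accounts.google.com or a sign-in
# interstitial is treated as "not publicly readable".
import urllib.request

def looks_private(final_url: str, body: str) -> bool:
    """Return True if the response suggests a login is required."""
    return "accounts.google.com" in final_url or "sign in" in body.lower()

def is_publicly_readable(url: str) -> bool:
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return not looks_private(resp.geturl(), body)
    except OSError:
        return False   # unreachable or blocked: treat as not readable

# hypothetical candidate list, e.g. scraped from search results
candidates = ["https://groups.google.com/g/example-public-group"]
public = [u for u in candidates if is_publicly_readable(u)]
```

Filtering 32K raw URLs down to the publicly readable ones is then just this check in a loop, plus deduplication and rate limiting.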

Architecting Security for Agentic Capabilities in Chrome - 1891

Nathan Parker - Chrome Security    Reference → Posted 1 Month Ago
  • Agentic browsing appears to be the future of Chrome and other web browsers. Unlike other types of attacks, prompt injection is not something that can be fully "solved" in the traditional sense. This article details how the Chrome browser is attempting to prevent indirect prompt injection from hijacking the user's browser. After reviewing built-in protections from Gemini and other agent security principles, they are adding a new feature called the user alignment critic, along with better origin isolation.
  • The main planning model in Gemini uses page content in Chrome to determine the next action. Naturally, this is a great place for prompt injection because it may contain attacker-controlled content. They use spotlighting and train Gemini against attacks, but this still isn't enough.
  • The user alignment critic is a separate model that evaluates the output of each action and checks that it serves the user's end goal. So, if the user is trying to view a store's address and the planning model attempts to initiate a bank transfer, that will obviously be rejected. The critic model is only allowed to see metadata about the result, never unfiltered page content; in practice, this makes the critic immune to prompt injection. This helps prevent both goal hijacking and data exfiltration.
  • The next protection is around site isolation. Agents can operate across websites, which violates this key principle. So, a prompt injection from site A could compromise site B. To address this, they are adding Agent Origin Sets, which limit the domains an Agent can access to those strictly required for the task.
  • For each task, there is a gating function that decides whether the domains requested by the planner are relevant to the task. The design has two types: read-only origins and read-write origins. As with the alignment critic, the gating functions are not exposed to prompt-injection risks. Users can also add origins as needed to complete the task.
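The origin-set idea can be illustrated with a small sketch. This is my own abstraction of the concept described in the post, not Chrome's actual API; all names here are illustrative:

```python
# Agent Origin Sets, abstracted: each task carries an allow-list split
# into read-only and read-write origins, and a gate rejects any planner
# action whose target origin is outside the set.
from urllib.parse import urlparse

class AgentOriginSet:
    def __init__(self, read_only, read_write):
        self.read_only = set(read_only)
        self.read_write = set(read_write)

    def allows(self, url: str, mutates: bool) -> bool:
        origin = urlparse(url).netloc
        if mutates:
            return origin in self.read_write          # writes need the stricter set
        return origin in self.read_only or origin in self.read_write

# Task: "find this store's address" - the bank is in neither set.
task = AgentOriginSet(read_only={"maps.example.com"},
                      read_write={"store.example.com"})
task.allows("https://maps.example.com/address", mutates=False)   # permitted
task.allows("https://bank.example.com/transfer", mutates=True)   # rejected
```

The point of the design is that this gate is a plain allow-list check over metadata, so no amount of injected page text can talk it into widening the set.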
  • Part of the security belongs to the user. If you give a bot access to your bank and they steal your money, that's on you. The origins being used still need to be verified by the user. Some domains require explicit approval, such as banks and Google Password Manager, while others only require permission for the gating functions.
  • On the reactive side, they have realtime scanning of pages to detect prompt injection attacks. An additional classifier detects prompt injections and will reject the page if an attack is found. They even have persistent red-team bots that try to derail the agentic browser.
  • This article is great and echoes a great principle: design with security in mind. With site isolation and the built-in alignment critic, derailing the agent into performing malicious actions will be much harder. Great post!

A White Mage’s Guide to Web3 Bug Hunting - 1890

WhiteHatMage    Reference → Posted 1 Month Ago
  • WhiteHatMage was in the top 3 on both Immunefi and HackenProof for web3 bug bounties last year. This post explains how they identify projects and the realities of finding vulnerabilities in live projects with impact.
  • What makes vulnerabilities more likely? First, bugs hide within complexity. Most serious issues they find are simple mistakes, but in layered, complicated systems; many fixes are just one-line changes. Next, innovation creates space for new bugs: when projects adopt a new approach, they are unlikely to have considered all the attack paths. Implementation-level innovation matters too; with new implementation experiments come subtle bugs. As ecosystems mature, there are fewer bugs. New chains, uncommon languages, etc. tend to have more basic issues simply because fewer people have looked at them.
  • Optimizations are at the root of a lot of evil. Heavy assembly usage, manual memory management, rewritten math expressions... optimizations often obscure edge cases that developers did not anticipate. Next, code quality tells a story. When developers rush a feature or lack attention to detail, bugs are more common, and poor code is hard to secure. Even non-functional issues, such as sloppy comments, are noteworthy. Ignored best practices and missing basic security patterns like checks-effects-interactions (CEI) are all warning signs. Projects with poor code quality are high-risk for vulnerabilities.
  • Audit reports are useful context. Multiple critical findings are a serious red flag, and every fix carries the risk of introducing another issue or being an improper patch. Depending on the quality of the codebase and the auditing company, they look for different types of issues:
    • Good codebase with good audits: novel or very complex bugs.
    • Good codebase with average audits: complex paths and known security pitfalls.
    • Average codebase with good audits: review audit fixes and leftover weak design.
    • Average codebase with average audits: missed but not extremely complex exploit paths.
  • Being first is super important. Right after a big launch, there's a large chance of vulnerabilities, so the author will often speedrun basic security checks on a project when they hear about a launch. In the first few weeks, more complex attack paths may be discovered that auditors didn't identify. When a project first launches its bug bounty program, the competition is intense, so they tend to check early and then come back for a deeper pass later.
  • The approach for finding critical bugs is very similar to my process. They focus only on critical paths; they ask themselves which invariants must hold for this to be secure. Over time, this builds intuition. After a while, they come back to a codebase. This is beneficial because they may have new techniques or the system may have changed. It's important to note that time is limited, so every decision matters.
  • They add a list of bug bounty archetypes:
    • The Digger. Goes super deep into a single program.
    • The Differ. Compares one mechanism across many different projects.
    • The Speedrunner. Reviews new programs quickly.
    • The Watchman. Monitors deployments and upgrades.
    • The Lead Hunter. Develops ideas around lesser-known attack vectors.
    • The Scavenger. Inspired by obscure writeups or little-known incidents.
    • The Scientist. Builds major tooling for analysis.
  • When choosing a bounty program, they also consider the project's reputation. If it is well-known for lowballing or not paying, it's not worth your time. Do they have the money to pay you in the first place? Are the rewards and scope clear? Do they take security seriously via audits, or is it just a checkbox? Once they find a single bug, they report it and see how the process goes; only after this do they look for more. For them, red flags are vague rules, low caps, prior disputes, or a lack of response.
  • A fantastic article from a fantastic security researcher. Thanks for taking the time to write up!

What AI Security Research Looks Like When It Works - 1889

Stanislav Fort    Reference → Posted 1 Month Ago
  • Aisle, the company whose blog authored the post, builds an AI security tool. Recently, Anthropic reported finding 500 vulnerabilities across various products. This has a problem, though: they don't discuss the severity breakdown, target selection, or maintainer response at all. At Aisle, they test ONLY against the most secure software projects, with no retrospective comparisons.
  • The Aisle tool recently found twelve new vulnerabilities in OpenSSL. One of these was a buffer overflow in the CMS message parsing that could have been remotely exploitable without valid key material, rated 9.8 out of 10. In five of the twelve cases, the AI system even proposed the fix.
  • Daniel Stenberg, the creator of curl, recently closed the project's bug bounty program due to LLM spam. He noted that AI can be effective for open-source security when used responsibly, which is an interesting perspective given his history with the slop on his own bug bounty program. Aisle previously identified three vulnerabilities in curl, which were reported and fixed.
  • A great quote: "There's a temptation in this space to lead with big numbers. Five hundred vulnerabilities sounds impressive. But the number that actually matters is how many of those findings made the software more secure." The failure mode is now drowning maintainers in noise and declaring victory rather than actually improving the security posture. AI is collapsing the median via slop and raising the ceiling; it just depends on what side you're on.
  • Aisle has a PR review tool that appears to routinely find bugs; Daniel Stenberg even uses it on his own pull requests. It found a buffer overflow in a curl PR recently, as well as two UAFs in OpenSSL changes. The goal is to prevent vulnerabilities before they can occur. Good report on what good AI security looks like!

Median time-past as endpoint for lock-time calculations (BIP 113) - 1888

BIPS    Reference → Posted 1 Month Ago
  • Bitcoin transactions can include a lock time, meaning they cannot be mined until a given time or block height. Curiously, Bitcoin block timestamps are not required to be monotonically increasing.
  • This creates bad incentives: miners want to make money, while the user doesn't want their transaction included yet. By creating a block with a timestamp far in the future, a miner can get other clients to accept a time-locked transaction as valid, regardless of the actual wall-clock time.
  • The BIP proposes using the median timestamp of the last 11 blocks to determine whether a time-locked transaction can be spent. Since it's the median, a few outlier timestamps can't skew it the way an average could. This is important for the CHECKLOCKTIMEVERIFY opcode. Overall, a good fix to a bizarre issue affecting Bitcoin.
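The median-time-past rule is simple enough to sketch directly. A minimal illustration (timestamps made up for the example):

```python
# BIP 113 median time-past (MTP): use the median timestamp of the last
# 11 blocks as "now" for lock-time checks, so a single miner's inflated
# timestamp cannot unlock time-locked transactions early.

def median_time_past(block_timestamps):
    """block_timestamps: timestamps of the most recent 11 blocks."""
    last11 = sorted(block_timestamps[-11:])
    return last11[len(last11) // 2]

def can_include(tx_locktime, block_timestamps):
    # A time-locked tx is includable only once the MTP reaches its lock time.
    return median_time_past(block_timestamps) >= tx_locktime

# One miner lying with a far-future timestamp barely moves the median:
stamps = [1000, 1010, 1020, 1030, 1040, 1050, 1060, 1070, 1080, 1090, 99999]
median_time_past(stamps)   # 1050 - not skewed by the 99999 outlier
```

To shift the median, an attacker would need to control the timestamps of six of the last eleven blocks, which is why the median works where an average would not.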

A successful DOUBLE SPEND US$10000 against OKPAY in 2013 - 1887

Bitcoin Forums    Reference → Posted 1 Month Ago
  • In March of 2013, an unexpected Bitcoin fork occurred, as documented in BIP 50. This was because a block with many transactions was mined. Bitcoin 0.8 nodes could process it, but pre-0.8 nodes could not. This caused a fork because pre-0.8 Bitcoin nodes accounted for about 60% of the mining power.
  • Version 0.8 switched from BerkeleyDB to LevelDB. BerkeleyDB had a limit on the number of transactions that could be in a block due to DB locks, and this had unintentionally become a consensus rule on the network. The switch to LevelDB removed this limitation.
  • A user deposited $10K in BTC to OKPAY, which was included in the 0.8 fork. After some analysis, they realized that the TX was never confirmed on the 0.7 fork. They then created two conflicting transactions spending the OKPAY deposit and broadcast them on the pre-0.8 fork.
  • It's a double spend because the payment provider was following one fork while the funds were re-spent on the other. In reality, once the fork was detected, the payment processor should have stopped accepting Bitcoin transactions until the issue was resolved. Overall, a really interesting case of a double spend leading to stolen funds.

On the clock: Escaping VMware Workstation at Pwn2Own Berlin 2025 - 1886

synacktiv    Reference → Posted 2 Months Ago
  • The authors competed at Pwn2Own Berlin 2025 in the VMware Workstation category. The vulnerability exists within the PVSCSI (Paravirtualized SCSI) controller emulation code. This is responsible for handling SCSI commands and forwarding them to the proper device on the machine. The guest OS splits the data into variable-sized chunks, each specifying a guest physical address to use.
  • The code copies entries from the guest driver into an internal array, then compacts it by combining nearby entries. To begin with, it has 512 segments, totaling 0x2000 bytes. If there are more than 512 entries, it allocates a 0x4000-byte buffer to store all entries and reallocates it for each newly added entry. The intended design is to double the size of the internal buffer when it needs to grow; the vulnerability is that the reallocation size is statically set to 0x4000 instead of doubling each time. This leads to a very large out-of-bounds heap write: with more than 1024 entries, every additional entry is an OOB write.
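A toy model makes the growth bug concrete. This is my own abstraction of the flaw as described, not the actual PVSCSI code; the 0x10 entry size is inferred from 512 entries filling 0x2000 bytes:

```python
# Fixed-size realloc bug, abstracted: the buffer is supposed to double
# when it fills, but the flawed code always reallocates at 0x4000 bytes,
# so every entry past 1024 (1024 * 0x10 == 0x4000) lands out of bounds.
ENTRY_SIZE = 0x10           # 512 entries exactly fill the inline 0x2000 bytes

def writes_out_of_bounds(num_entries):
    """Count how many entry writes land past the end of the buggy buffer."""
    if num_entries <= 512:
        return 0            # the inline 0x2000-byte array suffices
    capacity = 0x4000       # buggy realloc: always 0x4000, never doubled
    oob = 0
    for i in range(num_entries):
        if (i + 1) * ENTRY_SIZE > capacity:
            oob += 1        # entry i is written past the allocation
    return oob

writes_out_of_bounds(1024)   # 0: exactly fills the 0x4000 buffer
writes_out_of_bounds(1040)   # 16: every entry past 1024 is an OOB heap write
```

The attacker controls the entry count from the guest, so the amount of out-of-bounds data written is effectively attacker-chosen.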
  • The Windows 11 Low Fragmentation Heap (LFH) is where this chunk is placed. Typically, the strategy is to target different size classes to shift allocation to a less hardened allocator, but that's not possible here. Notably, the LFH heap has strict checks on chunk metadata and shuffles around allocations.
  • To exploit this vulnerability, they needed to find an object of size 0x4000 that can be directly allocated from the guest. They ended up using shaders to spray the heap, since they can be freed, kept alive, or created at will. The URB objects have a length field that is used for writing to host memory directly, which makes them a great primitive for memory corruption.
  • Exploitation required a great deal of knowledge about the heap algorithm. They first filled two buckets, B1 and B2, with 16 shaders each. After this, they freed all but one shader in B1 to create a hole and allocated 15 URBs around it. Finally, a hole is created in B2, and the exploit is ready. The allocator will bounce between the two available slots in the two buckets. B2 is used to eat the bad write so that the heap-chunk metadata of another object isn't corrupted, while B1 holds an object that can now be corrupted safely. This circumvents the mitigations and allows for controlled OOB corruption of chunks on the heap.
  • This bug can be used to leak ASLR. Once ASLR is leaked, a fake URB structure can be created to cause havoc. For an arbitrary read, overwrite the URB.data_ptr. For an arbitrary write, corrupt URB.pipe and use a writeback mechanism to write those bytes. From there, they corrupted a callback function on a USB pipe object structure to call WinExec() because it's a CFG-whitelisted gadget.
  • The exploit was unreliable because it assumed knowledge of the heap at startup. They used some tricks to make it more predictable: allocations that create a new bucket take longer, so they used this as a timing side channel to understand the current LFH state. Luckily for them, it worked first try during the contest.
  • They conducted this research over three months of evenings and weekends. The first month was spent on reverse engineering and identifying the vulnerability; exploitation took two months because of the LFH mitigations. Overall, a good post on the discovery and exploitation that won some money at Pwn2Own!

Minting Fees Out of Thin Air in zkSync Lite - 1885

Ehsan    Reference → Posted 2 Months Ago
  • zkSync Lite is a zkRollup L2 blockchain. The operator submits a proof attesting to the transition from the old root to the new root via state transitions. The L1 does not re-execute every transaction; the L1 just verifies the proof. Practically, this means that any bug that allows an invalid state transition to satisfy the circuit becomes the on-chain truth once proven.
  • zkSync Lite processes operations in chunks, and the circuit iterates over these chunks for verification. The first check is for state mutations, such as balances and nonces. The middle chunk is for pubdata consistency, and the final chunk handles fee accounting. From these separate locations in code came two definitions of valid: one for mutation validity (chunk 1) and another for tx validity (signatures, timestamps, etc.).
  • This discrepancy in "valid" is what causes the bug. The ChangePubKey function sets the account's L2 signing key. On the L1, the contract verifies that the pubkey change uses the nonce from the pubdata. pub_nonce equality is NOT checked by tx validity, but IS checked within mutation validity. When handling fees, validity was checked via tx validity alone, not both.
  • Putting this all together, it's possible to craft a ChangePubKeyOffchain transaction that the tx-validity check accepts but the mutation-validity check rejects. In the fee accrual chunk, fee accounting then adds more fees than it should without increasing the user's debit or nonce. In practice, this attack could be repeated with a malicious proof to mint unlimited funds in fees, though it appears to have been permissioned because of the reliance on the prover/sequencer/operator.
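The mismatch boils down to two predicates that should agree but don't. A stripped-down sketch of the logic (my abstraction of the writeup; the field names and fee flow are illustrative, not the circuit's actual structure):

```python
# Two notions of "valid" for the same operation: tx validity (signatures,
# timestamps) versus mutation validity (which also checks pub_nonce). The
# fee chunk consulted only tx validity, so fees could accrue to the
# operator without any matching debit against the user.

def tx_valid(op):
    # Signature/timestamp-style checks; does NOT compare pub_nonce.
    return op["sig_ok"]

def mutation_valid(op):
    # State-mutation checks; DOES require the pubdata nonce to match.
    return op["sig_ok"] and op["pub_nonce"] == op["account_nonce"]

def process_fee_chunk(op, operator_fees, user_balance):
    if tx_valid(op):               # the bug: should also require mutation_valid
        operator_fees += op["fee"]
    if mutation_valid(op):         # only then is the user actually debited
        user_balance -= op["fee"]
    return operator_fees, user_balance

# A crafted op: valid signature, but pub_nonce deliberately mismatched.
crafted = {"sig_ok": True, "pub_nonce": 7, "account_nonce": 3, "fee": 100}
process_fee_chunk(crafted, 0, 1000)   # fees grow; user balance untouched
```

Repeat the crafted operation and the operator's fee balance inflates with no corresponding debit anywhere, which is exactly the "fees out of thin air" result.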
  • With ZK vulnerabilities, the most common issues are around missing constraints. In this case, it was a control-flow issue with a semantic meaning mismatch that led to the vulnerability. So, the next time a complicated set of operations confuses you, maybe it confused the devs, too!

Building Agentic Infrastructure for Zero-Day Vulnerability Research - 1884

kritt.ai    Reference → Posted 2 Months Ago
  • Security research involves long hours of staring at code and is done only by a specialized group of people. With the rise of LLMs comes the ability to use AI tools to find vulnerabilities. They built a bot to think as security engineers do.
    1. Identify suspicious behaviour
    2. Prove reachability of the code
    3. Prove controllability. Can the attacker influence the relevant data/state?
    4. Determine real world impact
  • If any of the steps above goes wrong, the bug won't be found, because this is long-form reasoning with compounding errors. Intuitive reasoning works locally but is bad globally; precision decays the longer the chains get. The key insight is that you need checkpoints to enforce correctness, not just more tokens.
  • Instead of using better prompts, they created harnesses. This is a set of constraints, scaffolding and checks to force an agent to be systematic in its approach. They do this with the following steps:
    1. Generate hypotheses explicitly.
    2. Collect evidence before escalating confidence.
    3. Use deterministic tools when possible.
    4. Fail fast and prune dead ends.
    5. Produce artifacts a reviewer can trust.
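The harness steps above can be condensed into a checkpointed loop. This is my own sketch of the idea, not kritt.ai's implementation; the checkpoint functions stand in for deterministic tooling:

```python
# Checkpointed hypothesis verification: confidence only rises when a
# verifiable subtask passes, any failed checkpoint prunes the branch
# (fail fast), and every pass leaves an artifact a reviewer can inspect.

def run_harness(hypothesis, checkpoints):
    """Return (confidence, evidence) after forcing checkpointed verification."""
    confidence, evidence = 0.0, []
    for name, check in checkpoints:
        ok, artifact = check(hypothesis)
        if not ok:
            return 0.0, evidence            # fail fast: prune this dead end
        evidence.append((name, artifact))   # reviewer-trustable artifact
        confidence = min(1.0, confidence + 1.0 / len(checkpoints))
    return confidence, evidence

# Illustrative checkpoints mirroring the four steps: reachability,
# controllability, then real-world impact (which fails here).
checkpoints = [
    ("reachable",    lambda h: (True, "call path from public entrypoint")),
    ("controllable", lambda h: (True, "attacker controls length field")),
    ("impact",       lambda h: (False, "write lands in padding only")),
]
run_harness("oob write in parser", checkpoints)   # pruned at the impact check
```

The shape matches the post's "shark tooth" graph: confidence is only allowed to climb one verified notch at a time, and a single failed checkpoint drops it back to zero.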
  • The post includes a great graph that explains their reasoning. One curve decays exponentially with reasoning length: the longer a chain, the worse it does. The other curve is a shark tooth: for each verifiable subtask, the confidence is regained. After this, they have some good insights into what has worked for them.
  • First, use deterministic tools when possible. Using CodeQL to find sinks is better than asking an LLM to do so, because it's deterministic and only requires the LLM to drive CodeQL. Another point is that native tools work better with their home model; for instance, Claude Code works best with Opus.
  • Scanners have multiple issues; from multi-step flow identification to boundary issues, they do fail. The authors use static analysis tools as much as possible and then rely on agentic reasoning to bridge the gap, invoking LLMs only when necessary and keeping things deterministic.
  • When reviewing code, not all lines are equal in terms of threat. Some repos/components only need shallow checks, while others need deep investigation. By directing spend only at difficult and promising areas, costs stay lower and you will find more bugs.
  • The final major benefit is testing. If the code has a bug, this should be provable: run the simulation, execute the PoC, and check whether the expected outcome occurred. This tends to remove false positives and improve confidence in an issue. Not all tests are created equal, though; there's a major difference between an isolated unit test and a full simulation.
  • This bot recently found a critical with a max payout of $250K on Immunefi. No word on what the bug is, but it's very interesting. They have other bugs on their profile as well.