Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Vibe engineering - 1748

Simon Willison    Reference → Posted 5 Months Ago
  • Vibe Coding is the practice of using an AI-assisted programmer to write all of the code without paying attention to whether it's correct or not. So, what's the term for a seasoned professional who uses AI to accelerate their work? Productivity with LLMs on non-toy projects is actually very hard to do correctly. The author proposes that we call this vibe engineering.
  • LLMs actively reward existing top-tier software engineering. By forcing the LLM to write tests, you can quickly figure out whether something works. A good test suite really benefits both the AI and your sanity when reviewing the code. Another way to do better is to Plan in Advance: providing the AI with a detailed plan of how you want a particular task done will significantly enhance the output.
  • Comprehensive Documentation greatly helps AI. Whether it's just a subset of a codebase, documentation for a library or something else, allowing the LLM to read relevant documentation goes a long way.
  • LLMs are apparently amazing at Git. They can navigate the history to find the origin of bugs. Effective automation provides good impact as well: linters and CI/CD help find bugs quickly.
  • The most important thing to me is a culture of code review at the company. This makes sure bad code isn't added at the organizational level. From a personal level, this means reviewing the code that the LLM writes for bugs and correctness.
  • A good article on what needs to be done in order to use LLMs well when programming.

Ratchet effects determine engineer reputation at large companies - 1747

sean goedecke    Reference → Posted 5 Months Ago
  • Building a reputation within a company is hard. Many people think it's through being talented but this probably isn't the case. Many strong engineers are unrecognized and many bad engineers are very well recognized. Why is this? The author discusses the natural Ratchet Effect.
  • Initially, you have a starting point that is not the lowest, but it is low - similar to how you join Chess.com at a 1200 rating. Judgements are made quickly at this stage. At first, you're only given regular JIRA tickets and bug fixes, nothing that stands out. Over time, the team that sees your work gets to know you, and you gain status.
  • Since you did a good job on that work, you are given more work with visibility from other teams. Doing a good job there creates even more positive visibility, which gives you a higher status within the organization. As you repeat this process with more and more teams, you are assigned higher and higher-profile projects until you reach your limit. All of a sudden, the CEO is choosing you by name to lead a project. Why?
  • Reputation is quick to form but very slow to change. Once someone outside of your immediate circle has a good or bad opinion about you, it's likely to stay. Individual teammates may know your skills more accurately but the skip managers only see the original perspective. It's hard to get out of this.
  • Some people try to jump straight into high-profile work to prove themselves, whether they are new hires or those with a bad reputation. According to the author, this usually doesn't work. First, most of the time, big projects have a lead chosen ahead of time. Second, executives have a clear picture of who is good and bad; if it's a key project, they won't risk putting a bad person on it.
  • What's the best way to gain clout? Focus on small pieces of work to build a reputation, then gradually increase the importance of what you take on to transition to higher-profile, more visible work. Your first high-visibility project is critical to forming a reputation with more senior management, so you better do well on that one. If you have a failure, slowly build back with small successes.
  • Overall, a good piece on building yourself up as a good engineer at a company.

How I influence tech company politics as a staff software engineer - 1746

sean goedecke    Reference → Posted 5 Months Ago
  • Many software engineers avoid dealing with company politics because they consider it pointless to get involved. The main reasons why this is the case are as follows:
    • Technical decisions are made for selfish reasons that cannot be influenced.
    • Many powerful stakeholders are incompetent.
    • The political game depends on private information that software engineers do not have. This sucks.
    • You play a different role. If you're not spending your time playing politics and others, such as your manager, are, then you're unlikely to beat them at it with less effort.
  • This Hacker News comment describes the first point really well. They claim a few interesting things. First, devs/supervisors want the hot new thing on their resume. Another reason is to push the newer version of something by proving that the older version is bad, often at the expense of others. Third, the organization needs the new buzzword to show to VCs or win an opportunity. Another comment claims that proposals are often accepted by everyone but then rejected by the C-Suite because they played golf with somebody who changed their opinion of it.
  • Given that politics are present in the workplace, how do we address them? We're clearly not well-equipped to do it. The first piece of advice is to deliver a high-profile project successfully. If you kicked ass in your own area, then you will have respect from others. Making this success known to those people is a challenge in its own right, though. This gives you rewards like bonuses and more clout.
  • The hard way is to drum up support for your own project. Since this doesn't align with others and you don't have the political clout, this is very hard to do.
  • Instead, let other people fight for you. When the next political initiative that lines up with your project comes out, push it hard. For example, imagine your project is pulling some existing functionality into its own service. If there's a mandate at the company for reliability, then push your project. The org will get behind your project because it aligns with the initiative, without you taking on much political debt.
  • These waves come and go, but the executives are always excited to be doing something. So, always have an important thing lined up that matches their flavor of the month. A good quote: "Having the right idea handy at the right time is your responsibility."
  • Overall, a good post on politics and how to navigate them as engineers.

Comprehension Debt: The Ticking Time Bomb of LLM-Generated Code - 1745

codemanship    Reference → Posted 5 Months Ago
  • Before making a change to legacy code, you must understand the code. This often requires understanding why it does the things it does, which may not be obvious 10+ years after the code was written. Even for the code behind this website, written by a single individual (me) 7 years ago, this can take a long time.
  • With LLMs, teams are writing code at an unprecedented rate. Some of them will review the code and make changes to what the LLM did, offsetting some of the downstream effort.
  • Others have gone with a different approach, though. Some folks are checking in code that nobody has read and that is barely tested. This means that teams are producing code faster than they can understand it - the author coins this "comprehension debt". If the software gets used, the odds are high that at some point the generated code will need to change.
  • Comprehension debt is the extra time it's going to take us to understand the code. Of course, when working with somebody else's code, you always had to pay this cost; with your own code, it's a new tax that will slow you down. Not even LLMs can save you from the ever-growing mountain of "comprehension debt" in many companies.

SP1 and zkVMs: A Security Auditor's Guide - 1744

Kirk Baird - Sigma Prime    Reference → Posted 5 Months Ago
  • SP1 is a zero-knowledge virtual machine (zkVM) that enables developers to prove the execution of arbitrary programs that can be compiled to RISC-V. Most of the code that uses this is written in Rust though. The ZK circuits enable devs to write standard Rust code to generate their cryptographic proofs, instead of domain-specific languages. The goal of this post is to prime security auditors to review code that uses SP1.
  • The SP1 architecture is as follows:
    1. Compile the code into a RISC-V ELF binary.
    2. Execute the program in a zkVM. This will generate the STARK proof to be used later.
    3. Optimize and verify the proof. This is the mathematical verification that the code ran as intended.
  • The system consists of two components: the prover and the verifier. The prover executes the guest program and generates the ZK proof. The verifier takes in the proof and validates its cryptographic assumptions. The proof should come from the prover, but a malicious actor can submit whatever they want. If the verification succeeds, the claimed computation has occurred.
  • The system is separated into Host and Guest systems. The Host is the standard machine that executes code, such as the machine you're using to view this website. The Guest program runs inside a VM that is completely separate from everything else: no Internet access, no databases, no nothing. When reading the code, host and guest code are somewhat intertwined, which makes the distinction an important one.
  • The first security note is that all input data is untrusted. If input is coming from the HOST to the GUEST, then the inputs must be validated: range checks, length checks, business logic constraints, etc. should all be performed. On this note, only GUEST code is proven - not code running on the HOST. So, if there's a check in the HOST that's missing from the GUEST, you probably have a bug.
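As a toy illustration of the kind of guest-side checks described above (plain Rust rather than SP1 code; the `Transfer` struct, field names, and limits are all hypothetical):

```rust
// Hypothetical input type read from the host; in a real SP1 program these
// checks would run inside the guest so they are covered by the proof.
struct Transfer {
    amount: u64,
    recipient: [u8; 32],
    memo: Vec<u8>,
}

fn validate_inputs(t: &Transfer) -> Result<(), &'static str> {
    // Range check: reject zero or absurdly large amounts (limit is made up).
    if t.amount == 0 || t.amount > 1_000_000_000 {
        return Err("amount out of range");
    }
    // Length check on the variable-sized input.
    if t.memo.len() > 256 {
        return Err("memo too long");
    }
    // Business-logic constraint: recipient must not be the zero address.
    if t.recipient.iter().all(|&b| b == 0) {
        return Err("zero recipient");
    }
    Ok(())
}

fn main() {
    let bad = Transfer { amount: 0, recipient: [0u8; 32], memo: Vec::new() };
    assert!(validate_inputs(&bad).is_err());
    let good = Transfer { amount: 100, recipient: [1u8; 32], memo: b"hi".to_vec() };
    assert!(validate_inputs(&good).is_ok());
    println!("ok");
}
```

The point is simply that every field crossing the host/guest boundary gets an explicit constraint before the guest acts on it.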
  • SP1 uses 32-bit RISC-V. When coming from 64-bit systems, this can cause issues. For instance, integer truncation and overflow should always be checked when dealing with usize values. On top of this integration issue, many dependencies pulled into SP1-compiled code were never meant to run in that environment. This can lead to similar integer issues, operating system calls, unsafe code, and many other weird quirks.
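A small sketch of the truncation hazard (plain Rust, not SP1-specific): on a 32-bit target, a length that was a comfortable `u64`/`usize` on the 64-bit host silently truncates under an `as` cast, while a checked conversion surfaces the problem.

```rust
use std::convert::TryFrom;

// Prefer checked conversions over `as` casts when a 64-bit value must fit
// into the 32-bit usize of the RISC-V target.
fn checked_len(len: u64) -> Result<u32, &'static str> {
    u32::try_from(len).map_err(|_| "length does not fit in 32 bits")
}

fn main() {
    // `as` silently truncates: 2^32 + 5 becomes 5.
    let big: u64 = (1u64 << 32) + 5;
    assert_eq!(big as u32, 5);
    // try_from surfaces the problem instead of hiding it.
    assert!(checked_len(big).is_err());
    assert_eq!(checked_len(500), Ok(500));
    println!("ok");
}
```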
  • When using SP1, data can be committed to become a public output. Naturally, if we're doing zero-knowledge proofs, the public information should be carefully audited; disclosing someone's age, for instance, would be inappropriate. Another issue that is weird to me is Verification Key Management. In SP1, each program generates two keys: one for the prover and another for the verifier. Each guest program must have a unique verification key derived from its binary, and older key versions must not be accepted.
  • There are cases where information cannot be validated within the proof and must instead be checked externally against the output. For instance, a Merkle proof can be generated whose validity depends on the block hash associated with it, so the block hash must be validated separately from the program. For SP1, you would want to make this a committed value as an output for external validation.
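A toy sketch of that pattern (not SP1 APIs; `h` is a stand-in mixing function, not a real cryptographic hash): verify the Merkle inclusion inside the program, then expose the root so it can be checked against the actual block hash outside the proof.

```rust
// Stand-in hash combining two nodes. A real program would use a
// cryptographic hash such as SHA-256.
fn h(a: u64, b: u64) -> u64 {
    a.wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(17) ^ b
}

// `left` flags whether the sibling sits on the left of the current node.
fn verify_inclusion(leaf: u64, proof: &[(u64, bool)], root: u64) -> bool {
    let computed = proof.iter().fold(leaf, |acc, &(sib, left)| {
        if left { h(sib, acc) } else { h(acc, sib) }
    });
    computed == root
}

fn main() {
    // Four-leaf tree: root = h(h(1, 2), h(3, 4)).
    let n01 = h(1, 2);
    let n23 = h(3, 4);
    let root = h(n01, n23);
    // Prove leaf 3: its sibling 4 is on the right, then n01 is on the left.
    assert!(verify_inclusion(3, &[(4, false), (n01, true)], root));
    // A different leaf fails against the same root.
    assert!(!verify_inclusion(5, &[(4, false), (n01, true)], root));
    // The program would then commit `root` as a public output so the caller
    // can check it against the real block hash outside the proof.
    println!("root = {root:#x}");
}
```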
  • The most common vulnerability class is "underconstrained circuits". This is simply insufficient validation of state transitions in a program - basic logic validation, like most other things. According to the post, practical knowledge of STARKs/SNARKs isn't necessary for auditing SP1 programs, unlike other cryptographic primitives.
  • A solid introduction to reviewing SP1 programs. I feel like this demystified a lot of terminology as well, which I really appreciated.

Hacking with AI SASTs: An overview of 'AI Security Engineers' / 'LLM Security Scanners' for Penetration Testers and Security Teams - 1743

Joshua Rogers    Reference → Posted 5 Months Ago
  • The author of this post was curious about the various AI-native security scanners. They wanted to find a product on the market that could identify vulnerabilities in code during a code review today. So, they tried numerous products, learned how they worked, and came up with this blog post. Surprisingly, AI security auditors are advertised everywhere but can actually be found nowhere.
  • All of the products tested had a very similar set of offerings: full code, branch, and PR/R scans. ZeroPath has a SOC2 report generator. Some of them have hooks for things like GitHub Actions, bot guidance for developers, responses to PRs and, naturally, IDE plugins. Finally, they all support auto-fix/remediation guidance as well.
  • The first step is to ingest all the code and index it appropriately. Once it's uploaded, the context necessary for the LLM to scan and understand the code can be assembled. Extra context about the types of issues to find can be added for scans as well.
  • The next part is more of the "secret sauce". Asking an LLM to find all vulnerabilities won't be very helpful, so how does it find the particular code to focus on? The tool could ask for function-by-function or file-by-file analysis. Some use permissive CodeQL queries, opengrep, or other AST traversal of the application. Once it has a candidate vulnerability, it performs more detailed analysis to determine whether it's real.
  • The final stage involves reporting vulnerabilities, which includes detecting false positives and de-duplication. According to them, the tools didn't report as many false positives as traditional SAST tools. Some of them were better or worse at specific languages. Some were better at particular vulnerability classes.
  • Gecko and Amplify performed very badly, with no real bugs found. Almanax was very inconsistent - it would sometimes find basic bugs and other times it wouldn't, though it was very good at deliberately backdoored code. Corgea found about 80% of the purposely vulnerable code that was scanned, with roughly a 50% false positive rate, which isn't really that bad. The language made a huge difference in the quality of this tool's results.
  • ZeroPath, according to the author, found 100% of the vulnerabilities in the Corpora. Additionally, it identified legitimate bugs in real-world codebases, including curl and sudo. Most of the real-world bugs weren't security issues, but bugs nonetheless. This was the best tool of the bunch.
  • Some takeaways:
    • The biggest benefit is around surfacing inconsistencies between developer intent and the actual implementation.
    • The tools were good at finding business logic issues.
    • They may replace pentesters in the long term, or at least supplement them. For things without millions of dollars on the line, they are already a good fit.
  • I really like the tone of the article and the perspective of seeing the AI as a helper. For instance, mentioning that while the AI does miss bugs, so do humans. The comparisons are realistic, which I appreciate. Good article!

This House is Haunted: a decade-old RCE in the AION client - 1742

himazawa    Reference → Posted 6 Months Ago
  • Massively Multiplayer Online video games are still huge. One of them, made in South Korea, is AION, the focus of this post. In the game, a player could purchase and customize a house. The Butler, who managed your house, allowed users to write custom scripts to play sounds and automate actions. Neat!
  • The scripting engine under the hood is some version of Lua. It runs in a sandbox with many functions stripped out. After some debugging, they were able to enumerate all of the available functions defined in _G.
  • After reviewing the list, they found several functions that were useful for code execution. load() and loadstring() are two easy ones. Using these functions, it's possible to load Lua bytecode that bypasses the bytecode verifier to cause memory corruption. Luckily enough, io wasn't disabled, which can be used to spawn arbitrary processes very easily; io.popen("calc.exe") is enough to do this, for instance.
  • There are several mechanisms to make this "no-click" beyond entering the house: OnInit() will run whenever somebody enters the house. Interestingly enough, this gives you code execution on the user's client and not the game server. Still pretty neat!

Taming 2,500 compiler warnings with CodeQL, an OpenVPN2 case study - 1741

Trail of Bits    Reference → Posted 6 Months Ago
  • The authors of this post were reviewing OpenVPN2 when faced with a difficult challenge: it had over 2.5K compiler warnings. Could some of these be security issues, though? Their goal was to narrow the warnings down to only the ones that matter. They decided to tackle a single class of issues: numerical conversions.
  • C's relaxed type system allows for implicit numerical conversions. Not all conversions are security issues but some of them can be. Signedness, truncation and overflows are all issues that can arise from this. With this problem defined, they decided to build a CodeQL query to identify potentially problematic areas.
  • After performing all of this analysis, they determined that none of the conversions led to real issues. It's interesting to see the usage of more niche CodeQL queries to perform useful flow analysis. Good blog post!

Introducing V12 Vulnerability Hunting Engine - 1740

Zellic    Reference → Posted 6 Months Ago
  • This blog post delves into the results of an autonomous Solidity auditor called "V12". It has a UI that makes it easy to interact with via a website. According to them, it performs at or exceeds the level of junior auditors at some firms. It can find many basic programming mistakes, some even missed by various companies. It integrates with C4 and Zellic/Zenith audits, and is available as a standalone application and a GitHub Action.
  • The mission is sane - security is a continuous battle, not a commit hash in time, and products/services should reflect this. Naturally, this doesn't replace an auditing company, but it can help the service team in the long term. Finding even simple issues, like access control vulnerabilities, improves security as a whole.
  • I appreciate that they include an Evaluation section for bugs they have found. They show several vulnerabilities from previous hacks, such as the 1Inch bug, the MonoX hack and a couple of others. The 1Inch bug is slightly deceptive - it was caused more by a scoping issue and had actually been found by auditors.
  • The tool has competed in several live Cantina/HackenProof auditing contests. I find these most impressive, since there was no "taint" potential on the model. These are unique vulnerabilities that others found in a contest.
  • They also list several historical contests, which could potentially be tainted in the data set. For proper evaluation, the training and test sets must be completely distinct. For the other contests they list, they claim V12 found enough bugs that it would have placed well in the competition. 2 out of 2 highs and 4 out of 6 issues are highlights from this section. I'm slightly skeptical about this; was there some tainting of the training data set vs. the testing data set? If so, how come it didn't perform as well on the live contests they posted?
  • They also use this on their live audits. Many of the bugs are fairly simple, such as access control issues, reentrancy and bad error handling. They even mention this themselves, which is an interesting analysis. All of these are great things that would work well in a CI setting and as an assistant to a security researcher. As LLMs get better, I think the vulnerabilities that remain will become harder and harder to find, but also more valuable.
  • Their perspective on who should use the tool is wise. V12 can enhance the capabilities of a great researcher but should only be used at the end. It's more of an additional layer of assurance and a source of inspiration than anything else. To inexperienced researchers, it's mostly a crutch. I'm curious to see how this plays out.

Wrong Offset: Bypassing Signature Verification in Relay - 1739

Felix Wilhelm    Reference → Posted 6 Months Ago
  • Relay is a cross-chain bridge on Solana. The original design had simple relayers, but the newer version introduces more smart contracts for managing funds. The idea is to transfer funds on one chain and receive funds on another through order fulfillment via LPs.
  • To initiate a transfer, users must create a transfer on the source chain. On the destination chain, a TransferRequest is signed by a privileged off-chain entity known as the allocator, which releases the funds to the user.
  • To perform signature validation, the native ed25519 program is used together with instruction introspection. The program first reads the index of the current instruction, then fetches the previous instruction to validate the signature. The native program's instruction data contains offsets describing exactly which data is being verified. When validating the instruction itself, the bridge checks that the program is correct and that the signature count is one.
  • The arbitrary offsets and indexes are a powerful feature of the Solana Ed25519 program, but the offsets used for validation are hardcoded into the relay bridge program. In practice, this means an attacker can place the proper public key at the hardcoded offset while pointing the actual verification at a different offset! By doing this, data can be signed with a different key but still be viewed as valid.
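A simplified sketch of the mismatch (hypothetical field and function names, not the real Solana or Relay code; the real Ed25519SignatureOffsets struct also carries signature and message offsets): the bridge reads the key at a hardcoded offset, while the native program verifies whatever the embedded offsets point at.

```rust
// Only the public-key offset matters for this demonstration.
struct Ed25519Offsets {
    public_key_offset: usize,
}

// What the native ed25519 program does: read the key at the offset
// encoded in the instruction data itself.
fn key_actually_verified<'a>(data: &'a [u8], offs: &Ed25519Offsets) -> &'a [u8] {
    &data[offs.public_key_offset..offs.public_key_offset + 32]
}

// What the vulnerable bridge did: read the key at a hardcoded offset.
const HARDCODED_KEY_OFFSET: usize = 16;
fn key_bridge_checks(data: &[u8]) -> &[u8] {
    &data[HARDCODED_KEY_OFFSET..HARDCODED_KEY_OFFSET + 32]
}

fn main() {
    // Attacker crafts instruction data: the trusted allocator key sits at
    // the hardcoded offset the bridge inspects, while the offsets struct
    // points the actual signature check at an attacker-controlled key.
    let mut data = vec![0u8; 128];
    data[16..48].copy_from_slice(&[0xAA; 32]); // key the bridge sees
    data[48..80].copy_from_slice(&[0xEE; 32]); // key actually verified
    let offs = Ed25519Offsets { public_key_offset: 48 };

    assert_eq!(key_bridge_checks(&data), &[0xAA; 32]);
    assert_eq!(key_actually_verified(&data, &offs), &[0xEE; 32]);
    assert_ne!(key_bridge_checks(&data), key_actually_verified(&data, &offs));
    println!("offset mismatch demonstrated");
}
```

The fix is for the bridge to derive the expected key location from the same offsets struct the native program uses, rather than assuming a fixed layout.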
  • The bridge didn't have much money at risk; since this is a solver protocol and not an actual bridge, only in-flight funds were held at the time. Another great find by Felix, in a major footgun for the Solana ecosystem.