Cloudflare got early access to Mythos and wrote a post about the experience. First, they say it was much better at constructing exploit chains and at generating proofs that a bug can actually be triggered. It was also better at doing both of these on its own.
They had issues with the model refusing requests. The same request, phrased differently, would sometimes succeed. Inconsistent refusals are a really bad property for a tool.
Mythos shows improvement on the signal-to-noise problem, but false positives remain. In memory-unsafe languages, it seemed to report memory corruption bugs that didn't exist. The models like to hedge by saying something might be an issue, but they can't give a good confidence estimate the way a human can.
At first, they simply pointed the model at a repository and told it to find bugs. This didn't work very well, because A) there's too much context and B) it's hard to get a legitimate review of each section at any real throughput. So, they decided to write a harness to manage the overall execution. They have a few tips for this:
- Narrow the scope. "Search for command injection in this function" is much better than "find bugs".
- Adversarial review. Put a second agent between the initial finding and the queue, using a different prompt and a different model.
- Specialized agents. Have an agent find the bug and a separate agent see if the bug can be triggered legitimately.
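
The three tips above can be sketched together in a few lines. This is only an illustration, not Cloudflare's harness: `Task`, `Finding`, `hunt_prompt`, `review_prompt`, and `triage` are hypothetical names, and a "model" is assumed to be any function that takes a prompt string and returns a response string.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Assumption: a model is just a function from prompt text to response text.
Model = Callable[[str], str]


@dataclass(frozen=True)
class Task:
    """One narrowly scoped unit of work: a single function and a single bug class."""
    function_name: str
    source: str
    bug_class: str


@dataclass(frozen=True)
class Finding:
    task: Task
    claim: str


def hunt_prompt(task: Task) -> str:
    # Narrow scope: name the bug class and the function instead of "find bugs".
    return (
        f"Search for {task.bug_class} in the function {task.function_name}.\n"
        f"Reply NONE if you find nothing.\n\n{task.source}"
    )


def review_prompt(finding: Finding) -> str:
    # Adversarial review: a different prompt that asks the reviewer to disprove the claim.
    return (
        f"Another agent claims this about {finding.task.function_name}:\n"
        f"{finding.claim}\n"
        "Try to show the bug cannot be triggered. "
        "Answer CONFIRMED or REJECTED.\n\n"
        f"{finding.task.source}"
    )


def triage(tasks: Iterable[Task], hunter: Model, reviewer: Model) -> List[Finding]:
    """Return only the findings that survive the second agent's review."""
    queued: List[Finding] = []
    for task in tasks:
        claim = hunter(hunt_prompt(task)).strip()
        if not claim or claim.upper() == "NONE":
            continue  # the hunter found nothing in this scope
        finding = Finding(task, claim)
        verdict = reviewer(review_prompt(finding)).strip().upper()
        if verdict.startswith("CONFIRMED"):
            queued.append(finding)
    return queued
```

Passing `hunter` and `reviewer` as separate callables makes it easy to back them with different models, which is the point of the second tip.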
Their harness has 8 steps: recon, targeted bug hunting, validation, gapfill (the hunter goes back to identify more attack surface), deduplication, tracing, feedback, and reporting. These steps make sense and seem to be highly effective at finding bugs.
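
One way to picture a harness like this is as an ordered list of stages that each take the current state and return an updated one. The sketch below is a guess at the shape, not the article's implementation: the stage order is assumed to match the order listed above, `passthrough` is a stand-in for stages that would call a model, and the finding fields (`file`, `function`, `bug_class`) are hypothetical. Only deduplication is filled in, since it needs no model.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, list]
Stage = Callable[[State], State]


def passthrough(state: State) -> State:
    """Placeholder for a stage that would call a model in a real harness."""
    return state


def dedupe(state: State) -> State:
    """Keep the first finding for each (file, function, bug_class) key."""
    seen = set()
    unique = []
    for finding in state["findings"]:
        key = (finding["file"], finding["function"], finding["bug_class"])
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    return {**state, "findings": unique}


# Assumption: stages run in the order the summary lists them.
PIPELINE: List[Tuple[str, Stage]] = [
    ("recon", passthrough),
    ("targeted bug hunting", passthrough),
    ("validation", passthrough),
    ("gapfill", passthrough),
    ("deduplication", dedupe),
    ("tracing", passthrough),
    ("feedback", passthrough),
    ("reporting", passthrough),
]


def run_pipeline(state: State) -> State:
    """Run every stage in order and record which stages have completed."""
    for name, stage in PIPELINE:
        state = stage(state)
        state = {**state, "completed": state.get("completed", []) + [name]}
    return state
```

Keeping each stage behind the same small interface means a stage can be swapped for a different prompt or model without touching the rest of the pipeline.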
They have a section at the end about patch timelines being a problem: if an attacker can find and exploit bugs faster, does the SLA for patches need to be faster too? Yes, in theory, but this can turn into a maintenance problem as well. Overall, a good article on writing agentic systems and the implications this will have for security.