Polygon is an EVM-compatible L2 solution with several bridges to and from Ethereum. One of these is the Plasma bridge, which uses transaction receipts to prove token transfers in either direction. To withdraw a token, the user only needs to provide a proof that a withdraw event for that token was emitted on Polygon. The event itself contains the amount and the receiver of the transfer.
The proof is checked against a Merkle Patricia Trie (MPT), the authenticated, searchable key-value structure used all over the EVM ecosystem. With a valid root hash and an existence proof within the trie, it's possible to prove that the event happened. This requires first proving the checkpoint hash for the given block.
The first vulnerability is an issue within the MerklePatriciaProof verification library. When verifying data against an MPT root hash, the library takes the trie's nodes and a path. The path is the RLP-encoded transaction index. Trie nodes carry prefixes that encode the node type (leaf or extension) and path-length information. The library allowed verification to stop early at an extension node when given a shorter proof path, so the extension node's child hash was treated as if it were the stored value. This creates a parsing differential between what the trie actually contains and what the proof verifier accepts. By doing this, the 7th parameter in the exit payload is completely controllable!
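The early-stop behaviour can be illustrated with a toy model. This is only a sketch under loud assumptions: nodes are plain tuples rather than RLP, SHA-256 stands in for Keccak-256, branch nodes are left out, and the names `node_hash`, `verify`, and `require_leaf` are hypothetical and not taken from the real library.

```python
import hashlib

def node_hash(node):
    # Stand-in for keccak256(rlp(node)); SHA-256 over repr() is an assumption.
    return hashlib.sha256(repr(node).encode()).digest()

def verify(root, key, proof, value, require_leaf):
    """Walk `proof` from `root`, consuming nibbles of `key` at each node."""
    expected, pos = root, 0
    for kind, path, payload in proof:
        if node_hash((kind, path, payload)) != expected:
            return False                      # node does not match parent hash
        if tuple(key[pos:pos + len(path)]) != path:
            return False                      # key diverges from this node
        pos += len(path)
        if pos == len(key):                   # whole key consumed
            if require_leaf and kind != "leaf":
                return False                  # the check the buggy variant lacks
            return payload == value
        if kind != "ext":
            return False                      # leaf reached with key left over
        expected = payload                    # follow the extension's child hash
    return False

# A two-node trie: extension (nibbles 1,2) -> leaf (nibbles 3,4).
leaf = ("leaf", (3, 4), b"receipt")
ext = ("ext", (1, 2), node_hash(leaf))
root = node_hash(ext)

honest = verify(root, (1, 2, 3, 4), [ext, leaf], b"receipt", require_leaf=True)
# Shortened path: stop at the extension and claim its child hash is the value.
buggy = verify(root, (1, 2), [ext], node_hash(leaf), require_leaf=False)
fixed = verify(root, (1, 2), [ext], node_hash(leaf), require_leaf=True)
```

In this model the shortened proof is accepted only when the leaf check is missing, and the "value" it proves is a hash rather than real receipt data.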
The RLPReader library had a memory corruption issue during parsing. The code keeps a simple memory pointer and loops over each element. For each element, it reads the length value, advances by it, and copies the data to get the entry. The library blindly trusts the encoded length of a value, without considering that it could extend into unrelated memory of the program. As a result, the parser can read past the end of the encoded data into adjacent memory.
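A minimal sketch of this class of bug, assuming a flat byte buffer as "memory" and handling only single bytes and short RLP strings (prefix `0x80` to `0xb7`). The function `read_item` and its `check_bounds` flag are hypothetical, written for illustration rather than copied from RLPReader.

```python
def read_item(memory, ptr, end, check_bounds):
    """Decode one RLP item at `ptr`; `end` marks where the real payload stops."""
    prefix = memory[ptr]
    if prefix < 0x80:                 # a single byte encodes itself
        start, length = ptr, 1
    elif prefix <= 0xB7:              # short string: length = prefix - 0x80
        start, length = ptr + 1, prefix - 0x80
    else:
        raise ValueError("long strings and lists are out of scope for this sketch")
    if check_bounds and start + length > end:
        raise ValueError("item overruns the enclosing payload")
    return bytes(memory[start:start + length]), start + length

# The payload claims a 5-byte string but only carries 2 bytes.
payload = bytes([0x85]) + b"ab"
leftover = b"XYZ"                     # unrelated bytes that happen to follow
memory = payload + leftover

value, _ = read_item(memory, 0, len(payload), check_bounds=False)
# value now contains b"ab" followed by the unrelated bytes
```

With `check_bounds=True` the same call raises instead of returning bytes from outside the payload, which is the kind of validation the paragraph above describes as missing.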
Now we have two issues: an out-of-bounds (OOB) read and a parser differential. Without the OOB read, the parser differential is unexploitable, because the hash that ends up being treated as the value is effectively random and is unlikely to parse as a valid receipt. To get this to work, the extension node hash must be parsable as a valid receipt, which has a lot of requirements. The researchers wrote a script to search the whole chain history for extension node hashes that would work in this case and did find a few.
Finding a usable differential is a separate step from the rest of the exploit; a single suitable transaction is enough for the exploit to work. Once we have the differential, we can use the out-of-bounds read to perform the exploit. To do so, we need Solidity to read from dirty memory (data that hasn't been cleaned up yet) that holds data we control. Because of the order of operations, only the ERC20PredicateBurnOnly code was affected; it performs a CALL before the parsing, and that CALL writes controllable data to memory. By having the parser read this data, we can control the logs that are processed.
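The role of dirty memory can be sketched with a toy bump allocator. This is a deliberately simplified model and not the EVM's actual memory layout: it assumes an earlier call leaves attacker-chosen bytes at the free pointer without reserving them, and `ToyMemory`, `scratch_write`, and `alloc` are hypothetical names.

```python
class ToyMemory:
    """Flat memory with a bump pointer; nothing is zeroed on allocation."""

    def __init__(self, size=256):
        self.data = bytearray(size)
        self.free = 0

    def scratch_write(self, blob):
        # Models an earlier call leaving bytes in memory without reserving them.
        self.data[self.free:self.free + len(blob)] = blob

    def alloc(self, blob):
        # Copies `blob` at the free pointer and advances it by len(blob) only.
        start = self.free
        self.data[start:start + len(blob)] = blob
        self.free = start + len(blob)
        return start

    def read(self, start, length):
        return bytes(self.data[start:start + length])

mem = ToyMemory()
# 1. An earlier call leaves attacker-chosen bytes behind (4 filler bytes first).
mem.scratch_write(b"\x00" * 4 + b"ATTACKER-DATA")
# 2. The parser's input is allocated on top: prefix 0x90 claims a 16-byte
#    string, but only 3 bytes of real data follow the prefix.
start = mem.alloc(bytes([0x90]) + b"abc")
# 3. An unchecked read trusts the declared length and runs into stale bytes.
declared = mem.data[start] - 0x80
leaked = mem.read(start + 1, declared)
```

In this model the bytes read past the 3 real data bytes are exactly what the earlier write left behind, which is why the ordering of the call and the parsing matters.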
This exploit is super clever! It required a small parsing discrepancy to begin with, and then chained that into a bug in the RLPReader library. The developers likely made the reasonable assumption that all data processed by the library would already have been validated. That assumption holds in principle, but it doesn't account for a bug in the validation itself. Once you get past the initial validation, the code typically becomes much softer.