Poly Network Hack Postmortem

On July 2nd, 2023, at 06:47:20 PM UTC, Poly Network suffered what was initially reported to be a notional $34B hack (the realized amounts were far smaller, since most of the tokens involved were illiquid). The Poly team paused their EthCrossChainManager smart contracts on several chains, most notably Metis, BSC, and Ethereum. After reconstructing the attack, our team concluded that the root cause was not a logical bug in the smart contracts but, most likely, stolen (or misused) private keys for 3 of Poly Network's 4 keepers (off-chain systems controlled by the team). To understand how the attack took place, we first need to understand the architecture of Poly's cross-chain managers.

Poly runs a network of cross-chain management contracts, allowing tokens to be “transferred” from an origin chain to a destination chain. These contracts accept proofs of token transfer changes on the origin chain, together with encoded arguments for a transaction that withdraws these tokens on the current chain.

function verifyHeaderAndExecuteTx(bytes memory proof, bytes memory rawHeader, bytes memory headerProof, bytes memory curRawHeader, bytes memory headerSig) whenNotPaused public returns (bool)
// proof = Poly chain tx merkle proof
// rawHeader = The header containing crossStateRoot to verify the above tx merkle proof
// headerSig = The converted signature variable for solidity derived from Poly chain's keepers

Main entry point allowing users to “unlock” tokens on the “destination” chain that were “locked” on the origin chain.

In Poly, the operation that transfers tokens from the origin chain is referred to as “lock” and the function that retrieves the tokens on the destination chain is referred to as “unlock”. Poly employs a system of so-called “consensus nodes”: essentially EOAs that sign off on the “unlock” event on the destination chain by including relevant entropy from the origin chain confirming the lock event. This entropy consists of a state root reflecting the locked tokens on the origin chain. Here is the relevant code, which checks that the “header” structure was correctly signed by the “consensus nodes”. The header contains the state root of a Merkle tree. Since the entire header is signed, so is the state root and, by extension, the entire state as witnessed by the Merkle tree.

function verifySig(bytes memory _rawHeader, bytes memory _sigList, address[] memory _keepers, uint _m) internal pure returns (bool){
        // (Dedaub comment)        
        //_rawHeader = 0x0000000000000000000000001e8bb7336ce3a75ea668e10854c6b6c9530dab7...
        //_sigList = // List of 3 signatures from 0x3dFcCB7b8A6972CDE3B695d3C0c032514B0f3825,0x4c46e1f946362547546677Bfa719598385ce56f2,0x51b7529137D34002c4ebd81A2244F0ee7e95B2C0
        //_keepers = ["0x3dFcCB7b8A6972CDE3B695d3C0c032514B0f3825","0x4c46e1f946362547546677Bfa719598385ce56f2","0xF81F676832F6dFEC4A5d0671BD27156425fCEF98","0x51b7529137D34002c4ebd81A2244F0ee7e95B2C0"]
        //_m = 3
        
        bytes32 hash = getHeaderHash(_rawHeader);

        uint sigCount = _sigList.length.div(POLYCHAIN_SIGNATURE_LEN);
        address[] memory signers = new address[](sigCount);

        // (Dedaub comment)
        //   signers = [
        //     0x4c46e1f946362547546677Bfa719598385ce56f2,
        //     0x3dFcCB7b8A6972CDE3B695d3C0c032514B0f3825,
        //     0x51b7529137D34002c4ebd81A2244F0ee7e95B2C0
        // ]
        bytes32 r;
        bytes32 s;
        uint8 v;
        for(uint j = 0; j < sigCount; j++){
            r = Utils.bytesToBytes32(Utils.slice(_sigList, j*POLYCHAIN_SIGNATURE_LEN, 32));
            s = Utils.bytesToBytes32(Utils.slice(_sigList, j*POLYCHAIN_SIGNATURE_LEN + 32, 32));
            v = uint8(_sigList[j*POLYCHAIN_SIGNATURE_LEN + 64]) + 27;
            signers[j] = ecrecover(sha256(abi.encodePacked(hash)), v, r, s);
            if (signers[j] == address(0)) return false;
        }
        return Utils.containMAddresses(_keepers, signers, _m);
    }

Function to verify signed header, which contains the very-important state root. Comments added by Dedaub.
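
The m-of-n check at the end of verifySig (Utils.containMAddresses) can be modeled as follows. This is our own Python sketch of the logic, not Poly's code; note that counting distinct matched keepers prevents one compromised key's signature from being counted multiple times.

```python
# Hedged sketch (not Poly's actual implementation) of the containMAddresses
# check: the signature list passes iff at least m distinct keepers appear
# among the recovered signers.

def contains_m_addresses(keepers, signers, m):
    """Return True iff at least m DISTINCT keepers appear among signers."""
    matched = {k for k in keepers if k in set(signers)}
    return len(matched) >= m

# The keeper set from the attack transaction (see the comments above):
keepers = [
    "0x3dFcCB7b8A6972CDE3B695d3C0c032514B0f3825",
    "0x4c46e1f946362547546677Bfa719598385ce56f2",
    "0xF81F676832F6dFEC4A5d0671BD27156425fCEF98",
    "0x51b7529137D34002c4ebd81A2244F0ee7e95B2C0",
]
signers = [keepers[1], keepers[0], keepers[3]]  # the 3 recovered signers

assert contains_m_addresses(keepers, signers, 3)      # 3-of-4 satisfied
assert not contains_m_addresses(keepers, signers, 4)  # 0xF81F... never signed
```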

Our team verified that the code above was correctly invoked and that the header was indeed signed by 3 of the centralized keepers, satisfying the (k-1)-out-of-k keeper signature scheme. We also checked that the list of keepers was not modified prior to the attack. Indeed, over the span of 2 years, the list of keepers remained unchanged and consists of 4 EOAs. It is common for decentralized protocols to employ “keepers”, i.e., external systems controlled by the development team that feed vital information into the smart contracts. This is sometimes necessary, since smart contracts cannot operate autonomously and need to be invoked externally. What is less common, however, is to rely on 3 keepers for the end-to-end security of a high-TVL cross-chain bridge.

Continuing with our investigation: assuming the attacker did not control 3 of the EOAs, the most likely remaining cause would have been a logical bug in the smart contracts' Merkle prover. We therefore looked into this next.

/* @notice               Verify whether a Poly chain transaction exists
    *  @param _auditPath        Poly chain merkle proof
    *  @param _root             Poly chain root
    *  @return                  The verified value included in _auditPath
    */
    function merkleProve(bytes memory _auditPath, bytes32 _root) internal pure returns (bytes memory) {
        uint256 off = 0;
        bytes memory value;
        //_auditPath = 0xef20a106246297a2d44f97e78f3f402804011ce360c224ac33b87fe8b6d7b7e618c306000000000000002000000000000000000000000000000000000000000000000000000000000382fc20114c912bcc8ae04b5f5bd386a4bddd8770ae2c3111b7537327c3a369d07179d6142f7ac9436ba4b548f9582af91ca1ef02cd2f1f03020000000000000014250e76987d838a75310c34bf422ea9f1ac4cc90606756e6c6f636b4a14cd1faff6e578fa5cac469d2418c95671ba1a62fe14e0afadad1d93704761c8550f21a53de3468ba5990008f882cc883fe55c3d18000000000000000000000000000000000000000000
        (value, off)  = ZeroCopySource.NextVarBytes(_auditPath, off);

        bytes32 hash = Utils.hashLeaf(value);
        uint size = _auditPath.length.sub(off).div(33);
        bytes32 nodeHash;
        byte pos;
        for (uint i = 0; i < size; i++) {
            (pos, off) = ZeroCopySource.NextByte(_auditPath, off);
            (nodeHash, off) = ZeroCopySource.NextHash(_auditPath, off);
            if (pos == 0x00) {
                hash = Utils.hashChildren(nodeHash, hash);
            } else if (pos == 0x01) {
                hash = Utils.hashChildren(hash, nodeHash);
            } else {
                revert("merkleProve, NextByte for position info failed");
            }
        }
        require(hash == _root, "merkleProve, expect root is not equal actual root");
        return value;
    }

Poly Network Hack | Merkle prover of the Poly chain

The Merkle prover above takes as input a byte sequence (_auditPath) containing the leaf node, followed by a path through the Merkle tree that proves the existence of the leaf node given the state root (_root). Remember, this state root has already been signed by the keepers. In case you are unfamiliar with how Merkle trees work, the picture below depicts one. A Merkle tree is a cryptographically secure data structure whose key property is that the root contains (transitively) entropy from all the leaf nodes in the tree. Therefore, a proof (often called a “witness”) can be easily constructed and cheaply verified. We only need to trust the root of the tree; if the root is trusted, so is anything else verified by a Merkle tree witness.

Poly Network Hack | Illustration of a Merkle tree

The kicker here is that, to simplify the exploitation, the attacker made full use of the flexibility afforded by the verifier's implementation. It turns out that the verifier allows zero-length witnesses. The attacker passed in the leaf node, exactly 240 bytes in this case, followed by an empty path as the proof. In that case, the hash of the leaf node itself must equal the state root for the proof to succeed. This further supports the hypothesis that the Poly chain keepers were compromised and signed an artificially constructed state root. The only information it contained was an unlock command sending tokens to the attacker.
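
The degenerate case can be sketched in a few lines of Python. This is a simplified model: the hashing details differ from Poly's actual Utils.hashLeaf/Utils.hashChildren, but the control flow mirrors the prover's loop, and it shows that with an empty path, verification reduces to hash(leaf) == root.

```python
import hashlib

# Hedged model of the Merkle prover's loop. Leaf/node domain-separation
# prefixes are illustrative, not Poly's exact encoding.

def hash_leaf(value: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + value).digest()

def hash_children(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_prove(leaf: bytes, path, root: bytes) -> bytes:
    """path is a list of (position, sibling_hash) pairs; it MAY be empty."""
    h = hash_leaf(leaf)
    for pos, sibling in path:
        # pos == 0: current node is the right child; pos == 1: the left child
        h = hash_children(sibling, h) if pos == 0 else hash_children(h, sibling)
    assert h == root, "expect root is not equal actual root"
    return leaf

# With an EMPTY path, ANY leaf verifies against root == hash_leaf(leaf):
evil_leaf = b"unlock tokens to attacker"
signed_root = hash_leaf(evil_leaf)   # what the compromised keepers signed
assert merkle_prove(evil_leaf, [], signed_root) == evil_leaf
```

Nothing here is a bug per se: a one-leaf tree legitimately has an empty witness. The attack only works because the keepers signed a root equal to the hash of a malicious leaf.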

It is unfortunate to note that Poly Network had previously been attacked by a greyhat hacker, almost two years ago.

Finally, it took Poly Network 7 hours to react to today's attack; in the meantime, the attacker had orchestrated several transactions on multiple chains to exploit it.

Despite the narrative above, there is no definitive proof so far that the keys were stolen. It could have been a rugpull, or compromised off-chain software running on 3 of the 4 keepers; the effect is the same, as far as we can observe. What appears unequivocal in the Poly Network hack is that no logical bug was exploited in the smart contracts carrying out the token transfers, and that the keepers signed a maliciously crafted proof. If the Poly Network developers confirm that the attack involved compromised signature keys, as seems likely, this calls into question the suitability of centralized bridges controlling so many funds. The attack also suggests less-than-perfect monitoring of the underlying bridge by the Poly Network team. Had the protocol been set up with a fast monitoring solution, such as Dedaub Watchdog, reaction time would have been significantly reduced, possibly saving some funds.

Uniswap Reentrancy Vulnerability Disclosure

By the Dedaub team


Uniswap Labs recently advertised a boosted $3M bounty program for bug reports over their smart contracts, and especially the new UniversalRouter and Permit2 functionality. We submitted a bug report and received a bounty — thank you! To our knowledge, ours was the only bug report that Uniswap acted upon (i.e., the only one to have apparently resulted in a commit fixing smart contract code and a new deployment of the UniversalRouter).

The explanation of the issue is fairly straightforward, so we begin with the verbatim text of our bug report.

Clearly, the UniversalRouter should not hold any balances between transactions, or these can be emptied by anyone (e.g., by calling dispatch with a Commands.TRANSFER, or Commands.SWEEP).

This property is dangerous when combined with any possibility of untrusted parties gaining control in the middle of a user transaction. This can happen, e.g., with tainted ERC20 tokens in Uniswap pools, or with callbacks from token transfers (e.g., 721s). E.g., simple attack scenario: a user sends to the UniversalRouter funds and calls it with 3 commands:

1) Commands.V3_SWAP_EXACT_IN, swap to a specific token
2) Commands.LOOKS_RARE_721 (or any of a large number of commands that trigger recipient callbacks), purchase NFT, send it to recipientX
3) sweep amount remaining after purchase to original caller.

In this case, recipientX can easily reenter the contract (by calling transfer or sweep inside its onERC721Received handler) and drain the entire amount, i.e., also the amount of step 3.

In essence, the UniversalRouter is a scripting language for all sorts of token transfers, swaps, and NFT purchases. One can perform several actions in a row, by supplying the right sequence of commands. These commands could include transfers to untrusted recipients. In this case, it is natural to expect that the transfer should send to the recipient only what the call parameters specify, and nothing more.

However, this is not always what would happen. If untrusted code is invoked at any point in the transfer, the code can re-enter the UniversalRouter and claim any tokens already in the UniversalRouter contract. Such tokens can, for instance, exist because the user intends to later buy an NFT, or transfer tokens to a second recipient, or because the user swaps a larger amount than needed and intends to “sweep” the remainder to themselves at the end of the UniversalRouter call. And there is no shortage of scenarios in which an untrusted recipient may be called: WETH unwrapping triggers a callback, there are tokens that perform callbacks, and tokens could be themselves untrusted, executing arbitrary code when their functions get called.
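
To make the attack concrete, here is a toy Python model of our own (not Uniswap's code, and all names are illustrative): a command dispatcher with no reentrancy lock, where an untrusted recipient callback re-enters and sweeps tokens the user meant to reclaim in a later command.

```python
# Toy model of a lockless command router. The attacker's NFT-receipt
# callback re-enters execute() and sweeps the router's balance before the
# user's own sweep command runs.

class ToyRouter:
    def __init__(self):
        self.balance = 0                    # tokens temporarily held

    def execute(self, commands):
        for cmd in commands:
            cmd(self)                       # no guard against re-entry!

def sweep_to(who):
    def cmd(router):
        who.received += router.balance
        router.balance = 0
    return cmd

def send_nft(recipient):
    def cmd(router):
        recipient.on_nft_received(router)   # untrusted external callback
    return cmd

class Attacker:
    def __init__(self):
        self.received = 0
    def on_nft_received(self, router):
        # Re-enter the router and sweep everything to ourselves.
        router.execute([sweep_to(self)])

class User:
    def __init__(self):
        self.received = 0

user, attacker = User(), Attacker()
router = ToyRouter()
router.balance = 100                        # funds meant for the user
router.execute([send_nft(attacker), sweep_to(user)])
assert attacker.received == 100 and user.received == 0
```

By the time the user's sweep command executes, the router's balance is already zero.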

Our proof-of-concept demonstrated an easy scenario:

Attached are two files. One is a replacement (slight addition) to your foundry test file, universal-router/test/foundry-tests/UniversalRouter.t.sol. The other should be dropped into the “mock” subdirectory. If you then run your standard `forge test -vvv` you will see for the relevant test the output:
======

[PASS] testSweepERC1155Attack() (gas: 126514)
Logs:
1000000000000000000

======

The last line is a console.log from the attacker showing that she got a balance in an erc20 token, although she was only sent an ERC1155 token.

The test case is simply:

function testSweepERC1155Attack() public {
    bytes memory commands = abi.encodePacked(bytes1(uint8(Commands.SWEEP_ERC1155)));
    bytes[] memory inputs = new bytes[](1);
    uint256 id = 0;
    inputs[0] = abi.encode(address(erc1155), attacker, id, 0);

    erc1155.mint(address(router), id, AMOUNT);
    erc20.mint(address(router), AMOUNT);

    router.execute(commands, inputs);
}

That is, the attacker is being sent an erc1155 token amount, but the router also has an erc20 token amount. (In reality, this would probably be due to other transfers, to happen later.) The attacker manages to steal the erc20 amount as well.

The remedy we suggested was easy:

… add a Uniswap reentrancy lock, although inelegant: no dispatching commands while in the middle of dispatching commands.
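
The lock idea can be sketched as follows (a Python model of the "no dispatching while dispatching" rule, not Uniswap's actual Solidity change; names are ours):

```python
# Sketch of a dispatch-level reentrancy lock: execute() refuses to run
# while an earlier execute() on the same router is still in progress.

class LockedRouter:
    def __init__(self):
        self.balance = 0
        self._dispatching = False

    def execute(self, commands):
        if self._dispatching:
            raise RuntimeError("reentrancy: already dispatching")
        self._dispatching = True
        try:
            for cmd in commands:
                cmd(self)
        finally:
            self._dispatching = False       # always release the lock

# A malicious command that tries to re-enter is stopped cold:
blocked = []
def malicious(router):
    try:
        router.execute([])                  # attempted re-entry
    except RuntimeError:
        blocked.append("blocked")

r = LockedRouter()
r.execute([malicious])
assert blocked == ["blocked"]
```

In Solidity this would typically be a storage (or transient-storage) flag checked and set at the top of the dispatch entry point, in the spirit of a standard reentrancy guard.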

We got immediate confirmation that the issue is being examined and will be assessed for the bounty program. A couple of weeks later, we received the bounty assessment:

we would like to award you a one-time bounty of $40,000 (payable in USDC) for your contribution. This amount includes a $30,000 bounty for the report, as well as a 33% bonus for reporting the issue during our current bonus period. We have classified the issue as medium severity and appreciate your assistance in helping us improve the safety of our users and the web3 ecosystem as a whole.

Further communication clarified that the bug was assessed to have High impact and Low likelihood: the possibility of a user sending NFTs to an untrusted recipient directly (although present in the UniversalRouter unit tests) is considered “user error”. Therefore, only much more complex (and less likely) scenarios were considered valid for Uniswap reentrancy, resulting in the “low likelihood” rating.

We want to thank Uniswap Labs for the bounty and are happy to have helped!

Latent Bugs in Billion-plus Dollar Code

You are probably safe, but be aware…

Daniel Von Fange pinged me last week:

Hey, I just realized that the xSushi reward distribution contract that’s commonly cloned around would be vulnerable to complete theft if the deposit token used was an ERC777 style that allowed reentrancy.

The message set in motion the close examination of just ~15 lines of code, handling funds in the billions.

We found not one but two latent bugs. Both have pretty specific conditions for becoming vulnerabilities, and we did our best to ascertain that current deployments are not at risk. (There was a time when an attacker could have stolen $60M, though.) However, this does not mean there is no risk: there may be tokens that, if you stake them in such a contract, allow someone to attack you right now, to say nothing of future deployments.

Take-home message:

  • be extra careful with the initial stakes of xSushi-like reward contracts
  • never deploy such contracts with ERC777 underlying tokens.

and generally:

  • be aware of reentrancy threats when interacting with ERC777 tokens.

The Code

Here is an instance of the code in question, a common snippet of a staking token contract. Such code was originally used in the xSushi staking contract and has since been extensively cloned.

contract VulnerableXToken {
    // ..

    // Pay some tokens and earn some shares.
    function enter(uint256 _amount) public {
        uint256 totalToken = token.balanceOf(address(this));
        uint256 totalShares = totalSupply();
        if (totalShares == 0 || totalToken == 0) {
            _mint(msg.sender, _amount);
        } else {
            uint256 what = _amount.mul(totalShares).div(totalToken);
            _mint(msg.sender, what);
        }
        token.transferFrom(msg.sender, address(this), _amount);
    }

    // Claim back your tokens.
    function leave(uint256 _share) public {
        uint256 totalShares = totalSupply();
        uint256 what = _share.mul(token.balanceOf(address(this))).div(totalShares);
        _burn(msg.sender, _share);
        token.transfer(msg.sender, what);
    }
    
    // ..
}

The enter function just accepts an investment in an underlying token (token, in the above) and issues shares by minting staking tokens (VulnerableXToken). The staking tokens accrue rewards, and upon leave the investor can claim back their rightful proportion of the underlying token.
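
The share arithmetic can be restated in a few lines of Python (a model of the code above, using the same integer division as Solidity):

```python
# The enter/leave math from the contract above, modeled in Python.

def shares_on_enter(amount, total_shares, total_token):
    if total_shares == 0 or total_token == 0:
        return amount                            # first depositor: 1:1
    return amount * total_shares // total_token  # floor division, as in Solidity

def tokens_on_leave(share, total_shares, total_token):
    return share * total_token // total_shares

# A pool holds 200 tokens backing 100 shares (rewards accrued over time):
assert shares_on_enter(50, 100, 200) == 25   # 50 tokens now buy only 25 shares
# After that deposit: 125 shares, 250 tokens. Leaving redeems proportionally:
assert tokens_on_leave(25, 125, 250) == 50   # the 25 shares are worth 50 tokens
```

Both bugs below are attacks on exactly this arithmetic: one manipulates total_shares mid-transaction, the other manipulates total_token so the floor division rounds to zero.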

Bug #1

The code looks reentrancy-safe at first glance. The external call (transferFrom) happens after all state updates (_mint). So, it seems that nothing can go wrong.

Back on Feb. 24, well-known Ethereum security engineer t11s had tweeted a warning about ERC777 tokens.

A killer feature of ERC777 is its receive hooks, allowing contracts to react when receiving tokens. What is mentioned less often is the corresponding hooks called on the senders of the funds, which MUST be called before updating the state.

Implementation Requirement:
The token contract MUST call the tokensToSend hook before updating the state.
The token contract MUST call the tokensReceived hook after updating the state.

This means that any ERC777 token is a reentrancy death trap! The token itself violates the “effects before external calls” rule: it calls out to the sender before it commits the effects of a transfer. Any caller into the token may be maintaining the effects-before-calls rule, but it may not matter, if the token itself does not. The caller should either be agnostic to the token’s effects (i.e., never read balances) or should use reentrancy locks.

What Daniel had realized regarding xSushi-like code is that the PRE-transfer hook in an ERC777 token’s transferFrom would allow an attacker to reenter (literally: re-enter, in the above code) before any funds had been transferred to the staking contract, taking advantage of any state changes made before the call. In our case, upon reentering, the VulnerableXToken balance has changed (by the internal _mint call) but the underlying token’s balance has not: there are more shares but the same funds, so, when re-entering, shares appear to be cheaper!

Of course, the underlying token does not necessarily need to be an ERC777, as this exploit could be possible for any token that implements similar callback mechanisms to the ones described.

To summarize:

  • IF the underlying token (token, in the above code) of an xSushi-like rewards contract calls back a hook (e.g., is an ERC777, which calls tokensToSend on the sender)
  • AND the underlying token does this callback before adjusting the balance,
  • THEN one can reenter and get shares cheaper, all the way to full depletion of everyone else’s funds.
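
The effect on the share price can be simulated numerically. This is an illustrative model with round numbers, not the exact exploit flow: it shows the state as seen by a reentrant enter() fired from the tokensToSend hook, after shares were minted but before any tokens arrived.

```python
# Numeric model of Bug #1: inside the pre-transfer hook, total_shares has
# already grown (via _mint) while total_token has not, so shares are
# momentarily cheaper for the re-entering attacker.

total_token, total_shares = 1000, 1000      # honest pool: 1 share per token

# Outer enter(1000): shares are minted first, tokens have NOT yet landed.
minted_outer = 1000 * total_shares // total_token   # 1000 shares
total_shares += minted_outer

# Reentrant enter(1000), fired from the ERC777 tokensToSend hook:
minted_inner = 1000 * total_shares // total_token   # 1000 * 2000 // 1000
total_shares += minted_inner
assert minted_inner == 2000     # double the shares for the same token amount

# Only now do the two 1000-token transfers actually arrive:
total_token += 2000

# The attacker paid 2000 tokens but their 3000 shares redeem for more:
attacker_shares = minted_outer + minted_inner       # 3000 of 4000 shares
assert attacker_shares * total_token // total_shares == 2250
```

The 250-token profit comes directly out of the honest stakers' holdings; repeating the re-entry deepens the theft, up to full depletion.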

Bug #2

When I shared the code in the Dedaub internal channels, Konstantinos (one of our senior engineers) immediately commented:
“I see the bug — haven’t we encountered this in an audit before?”

Indeed we had…

… but he wasn’t seeing Daniel’s bug!!!

It was an entirely different bug, based on making the division _amount.mul(totalShares).div(totalToken) round down to zero when another user is depositing. In this way, the depositor would get zero shares, but the old holders of shares would keep the newly deposited funds.

A simple attack scenario with only two depositors (attacker and victim) would go as follows:

  • The attacker is the first person to call enter and deposits amount1 of token, getting back an equal amount of shares.
  • The next depositor comes in and tries to deposit amount2 of token. The attacker front-runs them and directly transfers to the contract any amount of token greater than (amount2-1)*amount1.
  • The attacker gets no shares for this direct transfer, but they already hold all the shares. Now amount2*totalShares/totalToken rounds down to zero, leaving the next depositor with nothing, while the attacker can withdraw all the deposited token by calling leave, as they own all the shares.
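
Plugging small concrete numbers into the scenario above (our illustration; any amounts with the same relationship work):

```python
# Concrete numbers for Bug #2's front-running attack.
amount1 = 100                    # attacker's initial deposit -> 100 shares
amount2 = 10                     # victim's intended deposit

# Attacker donates directly (no enter call), just over (amount2-1)*amount1:
donation = (amount2 - 1) * amount1 + 1       # 901 tokens
total_token = amount1 + donation             # 1001 tokens in the pool
total_shares = amount1                       # still only the attacker's 100 shares

# Victim's enter(amount2) now mints zero shares (floor division):
victim_shares = amount2 * total_shares // total_token   # 10*100 // 1001
assert victim_shares == 0

# The victim's tokens land anyway; the attacker owns 100% of the shares
# and leave() hands them the entire pool, donation plus victim's deposit:
total_token += amount2
assert amount1 * total_token // total_shares == total_token
```

Scaled up to the $60M xDVF initial transfer discussed below, the same arithmetic holds unchanged.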

To see how big an impact this bug can have, consider the first transfer to the xDVF staking token:

This was a transfer for 12.5 million DeversiFi tokens, currently valued at $5 each. An attacker could have front-run that transfer and stolen all $60M worth of tokens!

Checks for Live Vulnerabilities

To determine if there are funds under threat right now, we used the Dedaub Watchdog database to query all currently-deployed contracts on Ethereum, together with their current balances.

  • There are 239 deployed Ethereum contracts with the xSushi-like enter/leave code in publicly posted source.
  • 13 of those have staked funds right now. The highest-value are xShib ($960M), xSushi ($233M), and xDVF ($72M).
  • None of those has an ERC777 as an underlying.
  • The largest value that would have been at risk of a front-running attack during the initial deposit is the $60M in xDVF, as discussed earlier.
  • We also checked Polygon, found only 4 xSushi-like contracts, none with staked funds.

Although the above numbers should be fairly complete, it’s worth noting there may still be threats we are missing. The same code could be deployed in networks other than Ethereum; vulnerable contracts may have no published source, so our search may have missed them; our balances query may be incomplete for tokens that don’t emit the expected events upon transfers; and our Polygon scan was not exhaustive — just considered the last 200K contracts or so.

And, of course, any initial staking in any xSushi-like contract, among the current 200+ deployed or future ones, is vulnerable to front-running attacks.

Conclusion

The code we looked at is very simple and problematic only in very subtle ways, not all under its control (e.g., the ERC777 reentrancy is arguably not the xSushi code’s problem). It is likely to keep getting cloned, or to have independently arisen in other settings. (In the latter case, please let us know!)

Either way, we repeat our message for awareness:

  • be extra careful with the initial stakes of xSushi-like reward contracts
  • never deploy such contracts with ERC777 underlying tokens.

and generally:

  • be aware of reentrancy threats when interacting with ERC777 tokens.

These are attack vectors that the community should know about, lest we see one of them being exploited for heavy damages.

Mass Disclosure of Griefing Vulnerabilities

This week, with the help of @drdr_zz and @wh01s7 of SecuRing, we tackled a backlog of warnings from the Dedaub Watchdog tool, notifying around 100 holders of vulnerable accounts, with some $80M in funds exposed. (@_trvalentine had earlier produced proof-of-concept code to demonstrate that the attack is valid.)

The warnings concern griefing vulnerabilities: cases where an attacker can move the victim’s funds to a contract, but this does not confer the attacker any direct benefit — only makes life harder for the victim, up to possible loss of funds.

Although there’s not much technically novel in the vulnerabilities themselves, we decided to write this report to describe the events and the mass-disclosure method, via etherscan chat messages.

A Word of Introduction

The Dedaub Watchdog tool (built over our public contract-library.com infrastructure) continuously analyzes all deployed contracts on Ethereum. It implements some 80 different analyses and combines warnings over the code with the current state of the chain (e.g., balances, storage contents, approvals). It is our main workhorse for discovering vulnerabilities — in the past year, we have disclosed several high-impact vulnerabilities with exposure in the billions and received 9 bug bounties totaling ~$3M. (Bounties by DeFi Saver, Dinngo/Furucombo, Primitive, Armor, Vesper, BT Finance, Harvest, Multichain/Anyswap, and Rari/Tribe DAO.)

The griefing vulnerabilities of this report are a bit below this bar: they typically represent a simple, direct oversight that could be lethal, though thankfully the attacker cannot gain much by exploiting it. It is the kind of vulnerability that we might typically silently report and then forget about.

In this case, however, the number of potential victim accounts started mounting. We had a list of 18 vulnerable contracts that exposed a transferFrom (from any victim) to untrusted callers, while holding approvals for high-value tokens from many hundreds of victims. When we actually ran the query to cross-reference vulnerable contracts with the victims that had approved them AND had balances in the exposed token, we were stunned to find 564 threatened accounts, with a total of $80M at risk, including one account with $76M exposed!

Griefing Vulnerability

There were 2–3 different vulnerable code patterns, but the main one is quite simple: it allows anyone to transfer a victim’s funds to a bridge contract using a publicly callable function.

Griefing Vulnerabilities | The vulnerable code pattern

How serious this is depends on the bridge protocol’s specifics. In the best case, it is just messy: a human needs to be involved, verify that the victim’s funds were indeed transferred inadvertently and are still at the bridge, and authorize a transaction to return the funds to the victim. In the worst case, the funds remain stuck forever. Also, such vulnerability greatly increases the griefing attack surface: the bridge contract may itself be vulnerable.
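
The pattern can be modeled abstractly as follows. This is a Python sketch with entirely hypothetical names (VulnerableBridgeHelper, deposit_for), capturing only the essence: a public function that pulls from an arbitrary, caller-chosen victim instead of from the caller itself.

```python
# Abstract model of the griefing pattern: transferFrom is driven by a
# caller-supplied `victim` address rather than the caller (msg.sender).

class Token:
    def __init__(self):
        self.balances = {}
        self.allowances = {}                 # (owner, spender) -> amount
    def transfer_from(self, spender, owner, to, amount):
        assert self.allowances.get((owner, spender), 0) >= amount
        assert self.balances.get(owner, 0) >= amount
        self.allowances[(owner, spender)] -= amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount

class VulnerableBridgeHelper:
    def __init__(self, token, bridge):
        self.token, self.bridge = token, bridge
    def deposit_for(self, caller, victim, amount):
        # BUG: pulls the VICTIM's approved funds, regardless of who calls.
        self.token.transfer_from(self, victim, self.bridge, amount)

token = Token()
helper = VulnerableBridgeHelper(token, "BRIDGE")
token.balances["victim"] = 100
token.allowances[("victim", helper)] = 100   # victim's standing approval

helper.deposit_for("attacker", "victim", 100)  # anyone can trigger this
assert token.balances["victim"] == 0
assert token.balances["BRIDGE"] == 100
```

The attacker gains nothing directly; the victim's funds simply end up at the bridge, which is exactly the griefing condition described above.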

Responsible Disclosure

We contacted the main bridge protocol with vulnerable contracts, but the contracts were in active use. The right course of action was therefore to also contact potential victims directly, especially those with current exposure (i.e., a combination of balance and approval over the same token).

We had a list of 564 addresses in front of us. But we did not know the identity of the account owners.

The best-practices playbook in this case seems to be:

  • Check if the address is the owner of an ENS domain and if there is a contact there.
  • In the case of addresses with significant funds, direct contact with projects (i.e., the protocol with the vulnerable contract), which very often know the identity of their largest clients/holders.
  • Use the etherscan chat feature.

Another standard concern when notifying victims is that each victim, after securing their own funds, could become a potential attacker. After discussion, we considered this risk to be small. We lowered it to a minimum by starting with the addresses with the most funds (thus reducing the potential reward / satisfaction of the attacker).

The plan was as follows:

  1. Manually verify and try to establish the identity of the addresses with the most funds, through ENS or through official channels of communication with project teams in which they hold a large share.
  2. Automatically notify the remaining holders at risk using the etherscan chat.

Manual verification turned out to be relatively simple. The address with $76M threatened was the 3rd largest holder in the XCAD Network project. @drdr_zz spoke directly to the CEO via the official channel and after only 45 minutes, the funds were safu again.

However, there were still a lot of addresses whose identities were unknown. Even with the etherscan chat feature, it would be very time-consuming to write to each holder separately. As far as we know, etherscan chat does not currently offer a simple API that we could hook into. However, after a little research, we determined that it works over a WebSocket connection.

The mechanism is as follows:

  1. A wallet owner connects her wallet (e.g., via Metamask) to the chat application.
  2. The application assigns the user a valid session cookie and a temporary token to authenticate the WebSocket connection.
  3. A WebSocket connection is opened using the cookie, and the token is sent to the server.
  4. The user can now send chat messages over the WebSocket.

We wrote a script to automate the whole process and 98 messages were sent to inform users that had over $1k exposed:

The message sent was:

Hello,

I’m a security researcher (from SecuRing, collaborating with Dedaub) and I’m writing to let you know that your account has exposed funds to contract ADDRESS, for token “TOKEN”.

Any attacker can call a function on this contract that will cause it to transferFrom your funds to the contract. The attacker does not stand to gain from this, so the risk is perhaps not critical.

However, it is a threat, and you may have significant trouble getting your funds back if it happens.

We strongly recommend removing approvals to contract ADDRESS for the “TOKEN” token. You can use the etherscan tool to do so, or any other trusted approval remover.

(This threat was automatically identified using Dedaub’s Watchdog analysis, but the threat was confirmed manually with a proof-of-concept implementation.)

Thank you.

After half a day, we started getting back “thank you” messages.

Lessons learned

  • The etherscan chat feature is a great tool for direct contact with unknown account holders. But the response time may be unsatisfactory since the holder needs to poll their account page on etherscan, and the chat feature is not too widely known and used.
  • Token approvals of this form are arguably more of a UI problem than a smart contract problem, which is why it is important to be aware of standing approvals as an account holder (just as it is to verify transactions before signing them).

It was a pleasure to have this collaboration between Dedaub and SecuRing, and hopefully we helped improve Ethereum security a bit.

Rari Capital Vulnerability

Security researchers actively participating in Tribe DAO’s Discord security channel raised concerns about a security issue relating to Fuse pools. The Rari Capital team executed our pre-established emergency response plan and immediately fixed the vulnerability. Because the vulnerability was identified, and because of the quick response, no funds were lost. This article addresses the nature and identification of the vulnerability, as well as the remediation steps executed by the Rari Capital team.

While security has always been a priority in Rari Capital’s projects, Rari Capital’s capabilities have come a long way since its inception. Rari started as a fair launch project with next to no resources and a team of scrappy yet talented engineers. Today, as a highly experienced group of contributors and a part of the Tribe DAO, the Rari Capital team has exceptional resources enabling it to implement the most extensive security measures. Rari now has contributors who are some of DeFi’s top smart contract engineers, a robust network of security auditing professionals and organizations, and a thriving relationship with the white hat community. However, some of the initial Fuse contracts were written prior to having access to these resources and were not scrutinized like contracts are today.

The Vulnerability

At around 4:30 PM PT on March 3rd, security researchers including @samczsun, @hritzdorf, and @YSmaragdakis (of @Dedaub) identified a vulnerability across multiple Fuse pools. Pools 0 to 32, with the exception of Pool 6, were at risk. The Rari Capital team was informed, and the admin multisig immediately paused borrowing across all Fuse pools.

The vulnerability was in an old version of the cToken and Comptroller implementations. Specifically, it was in the cEther contract, which used .call.value to transfer ETH instead of .transfer as Compound does. This is one of a handful of audited changes from the Compound codebase. Unfortunately, audits aren’t a silver bullet, and this vulnerability slipped through the cracks. It allowed for a cross-asset reentrancy upon cEther redemption, through which all assets in vulnerable pools could be borrowed for free. This was because the cEther state had not been fully updated with the effects of the redemption before the ETH transfer. As a result, all borrowable assets could have been stolen from those pools.
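To make the mechanics concrete, here is a toy Python model of such a cross-asset reentrancy. This is an illustrative sketch only, not the actual Compound/Fuse code: the pool pays out ETH before updating its collateral accounting, and the external transfer hands control to the redeemer, who re-enters and borrows against collateral that is about to disappear. All names here are hypothetical.

```python
# Toy model of a cross-asset reentrancy: redeem_vulnerable() pays out ETH
# BEFORE updating collateral accounting, so a malicious receiver can
# re-enter and borrow against stale collateral.

class Pool:
    def __init__(self):
        self.collateral = {}     # user -> ETH collateral deposited
        self.borrows = {}        # user -> amount of the other asset borrowed
        self.other_asset = 1000  # borrowable liquidity of a second token

    def deposit(self, user, amount):
        self.collateral[user] = self.collateral.get(user, 0) + amount

    def redeem_vulnerable(self, user, amount):
        assert self.collateral.get(user, 0) >= amount
        # BUG: the external call (the ETH transfer) happens before the state
        # update, handing control to `user` while collateral still looks intact.
        user.receive_eth(self, amount)
        self.collateral[user] -= amount

    def borrow(self, user, amount):
        # Naive solvency check: recorded collateral must cover total debt.
        assert self.collateral.get(user, 0) >= self.borrows.get(user, 0) + amount
        self.borrows[user] = self.borrows.get(user, 0) + amount
        self.other_asset -= amount

class Attacker:
    def __init__(self):
        self.stolen = 0

    def receive_eth(self, pool, amount):
        # Re-enter during the ETH transfer: collateral is not yet reduced.
        pool.borrow(self, amount)
        self.stolen += amount

pool = Pool()
mallory = Attacker()
pool.deposit(mallory, 100)
pool.redeem_vulnerable(mallory, 100)
# Mallory got her 100 ETH back AND borrowed 100 of the other asset
# against collateral that no longer exists.
print(mallory.stolen, pool.collateral[mallory])  # prints: 100 0
```

A pool-wide reentrancy guard (as in the upgraded Fuse contracts), or simply following the checks-effects-interactions pattern, closes this hole.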

In late 2021, the Fuse contracts were upgraded to a version which contained a pool-wide check for reentrancy. All pools deployed after pool 32 used the upgraded contracts. Pool admins who deployed prior to the update had the option to upgrade to the latest Fuse contracts from the UI. Of the pools deployed with the older version of Fuse, only pool 6 was upgraded by its admin.

The Rari Capital Response

Upon being contacted by the researchers about the possible vulnerability in the platform, the Rari Capital team initiated processes to identify, confirm, and validate the issue. Per the incident response playbook, it was determined from a set of available actions to immediately pause borrowing globally. An extensive review of the vulnerability took place, which included a proof of concept (PoC) and a review of available remediation options. Once the solution was developed, validated, and tested, the team moved quickly to deploy the fix.

Each at-risk Fuse pool was upgraded to the latest cToken and Comptroller implementations, which prevent this or similar reentrancy vulnerabilities from being exploited. All pools were re-tested to confirm the vulnerability was remediated, and borrowing was re-enabled the next morning. All of this occurred within 16 hours of identifying the vulnerability.

Future Security Measures

In the past, the Rari team has ensured that all code in production is scrutinized and goes through an extensive auditing process. In response to the identified vulnerability, Rari will be taking a series of enhanced security measures. First, Rari Capital engineers are currently conducting extensive internal reviews of the Fuse codebase. Rari Capital and Fei Protocol have also merged their respective Immunefi bug bounties into one joint Tribe DAO bug bounty.

Rari greatly appreciates the relationship and collaboration with the white hat community and with many of DeFi’s top security engineers. Thank you to all who assisted in identifying and nullifying this vulnerability and to the community who continues to contribute and support as we move forward stronger than ever.

Elipmoc: Advanced Decompilation of Ethereum Smart Contracts

Smart contracts on the Ethereum blockchain greatly benefit from cutting-edge analysis techniques and pose significant challenges. A primary challenge is the extremely low-level representation of deployed contracts. We present Elipmoc, a decompiler for the next generation of smart contract analyses. Elipmoc is an evolution of Gigahorse, the top research decompiler, dramatically improving over it and over other state-of-the-art tools, by employing several high-precision techniques and making them scalable. Among these techniques are a new kind of context sensitivity (termed “transactional sensitivity”) that provides a more effective static abstraction of distinct dynamic executions; a path-sensitive (yet scalable, through path merging) algorithm for inference of function arguments and returns; and a fully context sensitive private function reconstruction process. As a result, smart contract security analyses and reverse-engineering tools built on top of Elipmoc achieve high scalability, precision and completeness.

Elipmoc improves over all notable past decompilers, including its predecessor, Gigahorse, and the state-of-the-art industrial tool, Panoramix, integrated into the primary Ethereum blockchain explorer, Etherscan. Elipmoc produces decompiled contracts with fully resolved operands at a rate of 99.5% (compared to 62.8% for Gigahorse), and achieves much higher completeness in code decompilation than Panoramix—e.g., up to 67% more coverage of external call statements—while being over 5x faster. Elipmoc has been the enabler for recent (independent) discoveries of several exploitable vulnerabilities on popular protocols, over funds in the many millions of dollars.

Read more

The Dedaub Watchdog Service

The Dedaub Watchdog is a technology-driven continuous auditing service for smart contracts.

What does this even mean? “Technology-driven”? Is this a buzzword for “automated”? Do you mean I should trust a bot for my security? (You should never trust security to just automated solutions!) And “auditing” means manual inspection, right? Is this really just auditing with tools?

Let’s answer these questions and a few more…

Watchdog brings together four major elements for smart contract security:

  • automated, deep static analysis of contract code
  • dynamic monitoring of a protocol (all interacting/newly deployed contracts, current on-chain state, past and current transactions)
  • statistical learning over code patterns in all contracts ever deployed in EVM networks (Ethereum, BSC, Avalanche, Fantom, Polygon, …)
  • human inspection of warnings raised.

All continuously updated: if a new vulnerability is discovered, the most natural question is “am I affected?” Watchdog queries are updated to detect this and warn you.

Is it effective? Let’s just say, it is exactly the technology that we have been using internally for a little over a year. It has resulted in many disclosures of vulnerabilities in deployed contracts and 9 high-value bug bounties totaling over $3M. (Bounties by DeFi Saver, Dinngo/Furucombo, Primitive, Armor, Vesper, BT Finance, Harvest, Multichain/Anyswap, Rari/Tribe DAO.)

Analysis

At Dedaub, we have audited thousands of smart contracts, spanning tens of high-value DeFi protocols, numerous libraries, and Dapps. Our customers include the Ethereum Foundation, Chainlink, Immunefi, Nexus Mutual, Liquity, DeFi Saver, Yearn, Perpetual, and many more. Since 2018, we’ve been operating contract-library, which continuously decompiles all smart contracts deployed on Ethereum (plus testnets, Polygon, and soon a lot more).

But our background comes from deep program analysis. The Dedaub founders are top researchers in this space, with tens of research publications. (Here are a few recent ones, specifically on our smart contract analysis technology — including the main paper on the technology behind the Watchdog analyses.)

The Watchdog service brings together all our expertise: it captures much of the experience from years of smart contract auditing as highly-precise static analyses. It is an analysis service that goes far beyond the usual linters for mostly-ignorable coding issues. It finds real issues, with high fidelity/precision.

So, what does Watchdog analyze for? There are around 80 analyses at the time of writing, in mid-2022. By the time you read this, there will likely be several more. Here are a few important ones for illustration.

DeFi-specific analyses

  • Is there a swap action on a DEX (Uniswap, Sushiswap, etc.) that can be attacker-controlled, with respect to token, timing manipulation, or expected returned amount? Such analyses of the contract code are particularly important to combine with the current state of the blockchain (e.g., liquidity in liquidity pools) for high-value vulnerability warnings. More on that in our “dynamic monitoring”, later.
  • Are there callbacks from popular DeFi protocols (e.g., flash loans) that are insufficiently guarded and can trigger sensitive protocol actions?
  • Are there protection schemes in major protocols (e.g., Maker) that are used incorrectly (or, more typically, with subtle assumptions that may not hold in the client contract code)?

Cryptographic/permission analyses

  • Does a permit check all sensitive arguments?
  • Does cryptographic signing follow good practices, such as including the chain id in the signed data? (If not, is the same contract also deployed on testnets/other chains, so that replay attacks are likely?)
  • Can an untrusted user control (“taint”) the arguments of highly sensitive operations, such as approve, transfer, or transferFrom? If so, does the contract have actual balances that are vulnerable?
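The chain-id point above is easy to illustrate. The sketch below is a hypothetical simplification of an EIP-712-style domain separator, not any specific contract’s scheme: mixing the chain id into the signed data binds a signature to a single chain, so a signature captured on a testnet (or another chain) cannot be replayed on mainnet.

```python
import hashlib

# Hypothetical simplification of an EIP-712-style domain separator:
# mixing the chain id into the signed payload binds a signature to one chain.
def digest(payload, chain_id=None):
    domain = b"" if chain_id is None else chain_id.to_bytes(8, "big")
    return hashlib.sha256(domain + payload).digest()

msg = b"approve spender=0xabc amount=100"

# Without a chain id, the signed digest is identical on mainnet and a
# testnet, so a signature captured on one chain replays on the other.
assert digest(msg) == digest(msg)

# With the chain id included, digests (and thus signatures) differ per chain.
assert digest(msg, chain_id=1) != digest(msg, chain_id=5)
```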

Statistical analyses

  • Compare the contract’s external API calls to the same API calls over the entire corpus of deployed contracts. Does the contract do something unusual? E.g., does it allow an external caller to control, say, the second argument of the call, whereas the majority of other contracts that make the same call do not allow such manipulation?
    Such generic, statistical inferences capture a vast array of possible vulnerabilities. These include some we have discussed above: e.g., does the contract use Uniswap, Maker, Curve, and other major protocols correctly? But statistical observations also capture many unknown vulnerabilities, use of less-known protocols, future patterns to arise, etc.

Conventional analyses

  • Watchdog certainly analyzes for well-known issues, such as overflow, reentrancy, unbounded loops, wrong use of blockhash entropy, delegatecalls that can be controlled by attackers, etc. The challenge is to make such analyses precise. Our technology does exactly that.

Yet-unknown vulnerabilities

We continuously study every new vulnerability/attack that sees the light of day, and try to derive analyses to add to Watchdog to detect (and possibly generalize) the same vulnerability in different contracts.

Monitoring

No matter how good a code analysis is, it will nearly never become “high-value” on its own. Most of the above analyses become actionable only when combined with the current state of the blockchain(s). We already snuck in a couple of examples earlier. An analysis that checks if a contract allows an untrusted caller to do a transferFrom of other accounts’ funds is much more important for contracts that have allowances from other accounts. A warning that anyone can cause a swap of funds held in the contract is much more important if the contract has sizeable holdings, so that the swap is profitable after tilting the AMM pool. An analysis that checks that a signed message does not include a chain id is much more important for contracts that are found to be deployed on multiple chains.

Combining analysis warnings with the on-chain state of the contract (and of other contracts it interacts with) is precisely the goal of Watchdog, and how it can focus on high-promise, high-value vulnerabilities.
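As a sketch of this combination (with entirely hypothetical warning kinds, weights, and thresholds, not Watchdog’s actual scoring), a static warning’s priority could be scaled by the relevant on-chain state:

```python
# Hypothetical sketch: rank a static-analysis warning by combining it with
# live on-chain state. All warning kinds, field names, and weights here are
# made up for illustration.
def priority(warning, state):
    score = warning["base_severity"]
    if warning["kind"] == "tainted_transferFrom":
        # Matters in proportion to outstanding allowances to the contract.
        score *= min(state.get("allowances_to_contract_usd", 0) / 1e6, 10)
    elif warning["kind"] == "uncontrolled_swap":
        # Matters in proportion to the contract's own holdings.
        score *= min(state.get("contract_holdings_usd", 0) / 1e6, 10)
    elif warning["kind"] == "missing_chain_id":
        # Matters far more if the same contract is deployed on several chains.
        score *= 5 if state.get("deployed_on_multiple_chains") else 1
    return score

w = {"kind": "tainted_transferFrom", "base_severity": 1.0}
print(priority(w, {"allowances_to_contract_usd": 2e6}))  # prints: 2.0
print(priority(w, {}))                                   # prints: 0.0
```

The same warning thus surfaces at the top of the inspection queue only when the chain state makes it exploitable in practice.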

Inspection

Automation is never the final answer for security. Security threats exist exactly because they can arise at so many levels of abstraction: from the logical, protocol, financial level, all the way down to omitting a single token in the code. Only human intelligence can offer enough versatility to recognize the potential for sneaky attacks.

This is why Watchdog is a technology-driven continuous auditing service. It can issue warnings that focus a human’s attention to the most promising parts of the code. By inspecting warnings, the human auditor can determine whether they are likely actionable and escalate to protocol maintainers.

We call Watchdog auditors “custodians”. The custodian of a protocol is not just a code auditor, but the go-to person for all warnings, all contacts to Dedaub and to other security personnel. By subscribing to Watchdog, a project gets its designated custodian who monitors warnings, knows the contact points and how to escalate reports, coordinates with any incident response team (either in place, or ad hoc, either external, or as part of Dedaub services), and ultimately advises on the project’s security needs.

In terms of software alone, Watchdog integrates two ideas to help a custodian inspect and prioritize warnings:

  • The concept of protocols: all contracts monitored are grouped into protocols, based on deployers and interactions. Any new contracts that get deployed are automatically grouped into their protocol and monitored. Reports and watchlists are easy to define to match the project’s needs.
  • Flexibility in the amount of warnings issued: Watchdog comes with different levels of service. The minimum level gets roughly a couple of hours per week of a custodian’s time. At this level, the custodian will likely only issue the highest-confidence warnings and inspect them very quickly.
    The next level of support, intended to be the middle-of-the-road offering, covers roughly two auditor-days per month. At that level, the custodian can spend significant time, at least every couple of weeks, to inspect a broader range of warnings. Watchdog supports this configurability seamlessly: it lets the custodian select warning kinds and mix them with many filters, to produce an inspection set that is optimal for covering in a given amount of time.

Contact Us … Soon

The Watchdog service has already had a handful of early institutional adopters (such as Nexus Mutual and the Fantom blockchain, both securing multiple protocols). We are currently enhancing our infrastructure and organizational capability, to launch Watchdog to broad availability (for individual protocols and not just institutional clients) by the end of 2022. You will be able to make inquiries and book a demo or live technical presentation with our team on the Dedaub Watchdog page.

Phantom Functions and the Billion-dollar No-op

By the Dedaub team

On Jan. 10 we made a major vulnerability disclosure to the Multichain project (formerly “AnySwap”). Multichain has made a public announcement that focuses on the impact on their clients and mitigation. The announcement was followed by attacks and a flashbots war. The total value of funds currently lost is around 0.5% of those directly exposed initially.

[ADVISORY: If you have ever used Multichain/Anyswap, check/revoke your approvals for vulnerable tokens. Make sure to check all chains and read the full instructions if anything is unclear.]

We will document the attacks and defense in a separate chronology, to be published after the threat is fully mitigated. This brief writeup instead intends to illustrate the technical elements of the vulnerability, i.e., the attack vector.

The attack vector is, to our knowledge, novel. The Solidity/EVM developer and security community should be aware of the threat.

In the particular case of Multichain contracts, the attack vector led to two separate, major vulnerabilities, one mainly in the WETH (“Wrapped ETH”) liquidity vault contract (an instance of AnyswapV5ERC20) and one in the router contract (AnyswapV4Router) that forwards tokens to other chains. The threat was enormous and multi-faceted — almost “as big as it gets” for a single protocol:

  • On Ethereum alone, $431M in WETH could have been stolen in a single, direct transaction, from just 3 victim accounts. We demonstrated this on a local fork before the disclosure. (Balances and valuations are as of the time of original writing of this explanation, on Jan. 12. The main would-be victim account, the AnySwap Fantom Bridge, was holding over $367M by itself. At the time of publication of this article, the same contract held $1.2B.)
  • The same contracts have been deployed for different tokens and on several blockchains, including Polygon, BSC, Avalanche, Fantom. (Liquidity contracts for other wrapped native tokens, such as WBNB, WAVAX, WMATIC are also vulnerable.) The risk on these other networks was later estimated at around $40M.
  • The main would-be victim account, the AnySwap Fantom Bridge, escrows tokens that have moved to the Fantom blockchain. This means that an attacker could move any sum to Fantom and then steal it back on Ethereum, together with the current $367M of the bridge (and the many tens of millions from other victims, separately). The moved tokens would still be alive (and valuable) in Fantom, or anywhere else they have since moved to. This makes the potential impact of the attack theoretically unbounded (“infinite”): any amount “invested” can be doubled, in addition to the $431M amount stolen from Ethereum victims and however much on other chains.
  • Close to 5000 different accounts had given infinite approval for WETH to the vulnerable contracts (on Ethereum). This number has since dropped substantially (especially among accounts with holdings), but there is still a threat: any WETH these accounts ever acquire is vulnerable, until approvals are revoked.

Given the above, the potential practical impact (had the vulnerability been fully exploited) is arguably in the billion-dollar range. This would have been one of the largest hacks ever—given the theoretically unbounded threat, we are not getting into more detailed comparisons.

Phantom Functions | Attack Vector

Briefly:

Callers should not rely on permit reverting for arbitrary tokens.

The call token.permit(...) never reverts for tokens that

  • do not implement permit, and
  • have a (non-reverting) fallback function.

Most notably, WETH — the ERC-20 representation of ETH — is one such token.

We call this pattern a phantom function— e.g., we say “WETH has a phantom permit” or “permit is a phantom function for the WETH contract”. A contract with a phantom function does not really define the function but accepts any call to it without reverting. On Ethereum, other high-valuation tokens with a phantom permit are BNB and HEX. Native-equivalent tokens on other chains (e.g., WBNB, WAVAX) are likely to also exhibit a phantom permit.

In more detail:

Smart contracts in Solidity can contain a fallback function. This is the code to be called when any function f() is invoked on a contract but the contract does not define f().

In current Solidity, fallback functions are rather exotic functionality. In older versions of Solidity, however, including fallback functions was common, because the fallback function was also the code to call when the contract received ETH. (In newer Solidity versions, an explicit receive function is used instead.) In fact, the fallback function used to be nameless: just function(). For instance, the WETH contract contains fallback functionality defined as follows:

function() public payable {
      deposit();
}
function deposit() public payable {
      balanceOf[msg.sender] += msg.value;
      Deposit(msg.sender, msg.value);
}

This function is called when receiving ETH (and just deposits it, to mint wrapped ETH with it) but, crucially, is also called when an undefined function is invoked on the WETH contract.

The problem is, what if the undefined function is relied upon for performing important security checks?

In the case of AnySwap/MultiChain code, the simplest vulnerable contract contains code such as:

function deposit() external returns (uint) {
    uint _amount = IERC20(underlying).balanceOf(msg.sender);
    IERC20(underlying).safeTransferFrom(msg.sender, address(this), _amount);
    return _deposit(_amount, msg.sender);
}
...
function depositWithPermit(address target, uint256 value, uint256 deadline, uint8 v, bytes32 r, bytes32 s, address to) external returns (uint) {
    IERC20(underlying).permit(target, address(this), value, deadline, v, r, s);
    IERC20(underlying).safeTransferFrom(target, address(this), value);
    return _deposit(value, to);
}

This means that the regular deposit path (function deposit) transfers money from the external caller (msg.sender) to this contract, which needs to have been approved as a spender. This deposit action is always safe, but it lulls clients into a false sense of security: they approve the contract to transfer their money, because they are certain that it will only happen when they initiate the call, i.e., they are the msg.sender.

The second path to depositing funds, function depositWithPermit, however, allows depositing funds belonging to someone else (target), as long as the permit call succeeds.

For ERC-20 tokens that support it, permit is an alternative to the standard approve call: it allows an off-chain secure signature to be used to register an allowance. The permitter is approving the beneficiary to spend their money, by signing the permit request. The permit approach has several advantages: there is no need for a separate transaction (spending gas) to approve a spender, allowances have a deadline, transfers can be batched, and more.

The problem in this case, as discussed earlier, is that the WETH token has a phantom permit, so the call to it is a non-failing no-op. Still, this should be fine, right? How can a no-op hurt? The permit did not take place, so no approval/allowance to spend the target’s money should exist.

Unfortunately, however, the contract already has the approvals of all clients that have ever used the first deposit path (function deposit)!

All WETH of all such clients can be stolen, by a mere depositWithPermit followed by a withdraw call. (To avoid front-running, an attacker might split these two into different transactions, so that the gain is not immediately apparent.)
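The whole attack can be captured in a small Python model. This is an illustrative simulation with hypothetical names, not the actual AnySwap/WETH code: the `__getattr__` method plays the role of WETH’s non-reverting fallback, so the vault’s `permit` “check” silently succeeds, and the vault then spends its pre-existing allowance from the victim.

```python
# Minimal model of the phantom-permit attack. __getattr__ stands in for
# WETH's fallback: any undefined method (e.g. permit) is a non-reverting no-op.

class WETH:
    def __init__(self):
        self.balances = {}
        self.allowances = {}  # (owner, spender) -> amount

    def approve(self, owner, spender, amount):
        self.allowances[(owner, spender)] = amount

    def transfer_from(self, spender, owner, to, amount):
        assert self.allowances.get((owner, spender), 0) >= amount
        assert self.balances.get(owner, 0) >= amount
        self.allowances[(owner, spender)] -= amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount

    def __getattr__(self, name):
        # Phantom function: calls to undefined methods fall through to a
        # no-op instead of reverting, just like on-chain WETH's fallback.
        return lambda *args, **kwargs: None

class Vault:
    def __init__(self, token):
        self.token = token
        self.shares = {}

    def deposit_with_permit(self, caller, target, amount, sig):
        # The "security check": for real EIP-2612 tokens this reverts on a
        # bad signature. For WETH it is a silent no-op.
        self.token.permit(target, "vault", amount, sig)
        self.token.transfer_from("vault", target, "vault", amount)
        # Shares are credited to the CALLER, not the owner of the funds.
        self.shares[caller] = self.shares.get(caller, 0) + amount

weth = WETH()
vault = Vault(weth)
weth.balances["victim"] = 431
# The victim once used the normal deposit path, leaving a standing approval:
weth.approve("victim", "vault", 10**18)
# The attacker "deposits" the victim's WETH with a garbage signature:
vault.deposit_with_permit("attacker", "victim", 431, sig=b"junk")
print(vault.shares["attacker"], weth.balances["victim"])  # prints: 431 0
```

A follow-up `withdraw` of the attacker’s shares completes the theft. The fix is equally simple to model: verify that the token actually implements `permit` (or check the allowance after the call) instead of relying on the call reverting.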

Phantom Functions | Notes:

Two separate vulnerabilities are based on the above attack vector. The first was outlined above. The second, on AnySwap router contracts, is a little harder to exploit, as it requires impersonating a token of a specific kind. We do not illustrate it in detail because the purpose of this quick writeup is to inform the community of the attack vector, rather than to illustrate the specifics of an attack.

We have exhaustively searched for other services with similar vulnerable code and exposure. This includes vulnerable contracts with approvals over tokens with phantom permits other than WETH. Although we have found other instances of the vulnerable code patterns, the contracts currently have very low or zero approvals on Ethereum. (This kind of research is exactly what our contract-library.com analysis infrastructure lets us do quickly.) On other chains, our search has not been as exhaustive, since we have no readily indexed repository of all deployed contracts. However, our best indicators suggest that there is no great exposure outside the AnySwap/Multichain contracts.

Concluding

We have been awarded Multichain’s maximum published bug bounty of $1M for each of the two vulnerability disclosures. (Thank you for the generous recognition of this extraordinary threat!)

This was an attack discovered by first suspecting the pattern and then looking for it in actual deployed contracts. Although in hindsight the attack vector is straightforward, it was far from straightforward when we first considered it. In fact, our initial exchange, at 2:30am on a Sunday, was literally:

I had a crazy idea for a vulnerability. Want to sanity check the basics?

Crazy, indeed, how this could lead to one of the largest hacks in history.

Etheria | A Six-year-old Solc Riddle

By the Dedaub team

The Assignment

A few weeks ago, we were approached with a request to work on a project unlike any we’ve had before.

Cyrus Adkisson is the creator of Etheria, a very early Ethereum app that programmatically generates “tiles” in a finite geometric world. Etheria has a strong claim to being the first NFT project, ever! It was first presented at DEVCON1 and has been around since October 2015 — six years and counting. It is as much Ethereum “history” as it gets.

Cyrus heard of us as bytecode and decompilation experts. His request was simple: try to reproduce the six-year-old deployed bytecode of Etheria from the available sources. This is a goal of no small importance: Etheria tiles can be priced in the six digits, and the history of the project can only be strengthened by tying the human-readable source to the long-running binary code.

Easy, right? Just compile with a couple of different versions and settings until the bytecode matches. Heck, Etherscan verifies contracts automatically, why would this be hard?

Maybe for the simple fact that Cyrus had been desperately trying for months to get matching bytecode from the sources, to no avail! Christian Reitwiessner, the creator of Solidity and solc, had been offering tips. Yet, after much, much effort, no straightforward solution was in sight.

To see why, consider:

  • The version of solc used was likely (but not definitely) 0.1.6. Only one build of that version is readily available in modern tools (Remix) but the actual build used may have been different.
  • The exact version of the source is not pinned down with 100% confidence. The source code available was committed a couple of days after the bytecode was deployed.
  • Flags and optimization settings were not known.
  • The deployed code was produced by “browser-solidity”, the precursor of Remix. Browser-solidity is non-deterministic with respect to (we thought!) blank space and comments. (“Unstable” might be a better term: adding blank space seems to affect the produced bytecode. But we’ll stick with “non-deterministic”, since it’s more descriptive to most.)
  • Solc was not deterministic even years later.

If you want to try your hand at this, now is a good time to pause reading. It’s probably safe to say that after a few hours you will be close to convinced that this is simply impossible. Too many unknowns, too much unpredictability, produced code that is tantalizingly close but seemingly never the same as the deployed code!

Dead Ends

Our challenge consisted of finding source code or a compilation setup that would produce compiled code identical to the 6-yr old deployed Etheria bytecode.

The opening team message for the project set the tone: “We are all apprehensive but also excited about this project. It looks like a great mystery, but also a very hard and tedious search that will possibly fail.” Intellectual curiosity soon overtook all involved. People were putting aside their regular weekly assignments, working overtime to try to contribute to “the puzzle”. Some were going down a rabbit hole of frantic searches for days — more on that later.

Some encouraging signs emerged at the very first look at the compiled code: when reverse-engineered to high-level code (by the lifter used in contract-library.com), it produced semantically identical Solidity-like output.

But our hopes were dashed upon more careful inspection. The low-level bytecode would always have small but significant differences. Some blocks were re-used (via jumps) in the deployed version but replicated (inlined) in the compiled version. The ordering of blocks would always be a little different. Even matching the bytecode size was a challenge: with manual tries of the (non-deterministic) compilation process, we would almost never match the deployed bytecode size to the byte; we were always 2–4 bytes off.

Dead ends started piling up, but every one was narrowing the search space.

  • The version of solc used was definitely 0.1.6, based on the timeline of releases. However, the exact build might have made a difference. And, in fact, our compiler was not solc but solc-js, the JavaScript wrapper of solc. There are 17 different versions of solc-js v0.1.6. There are even different versions with the exact same filename — e.g., there are 4 different builds (different md5 hashes) all called soljson-v0.1.6-2015-11-03-48ffa08.js. However, no optimizations or compilation gadgets that would explain the difference were introduced in the different builds. We could see no correlation between the compiled code artifacts and the exact build of solc-js, just the occasional non-determinism.
  • Different optimization settings made too drastic a difference. Browser-solidity did not even allow configuring optimization runs, so the only question was whether optimization was on at all, and it very clearly was, based on the deployed bytecode.
  • Non-determinism seemed to creep in, even for changes as simple as the filename used.

With so little left to try, frustration started building up. Was this just a random search? And over what? Blank space in the compiled file? Reordering of functions? Small source code changes that yielded equivalent code? Removing dead code from the source?

We tried many of these, ad hoc or systematically and the tiny but persistent differences from the deployed bytecode never went away. Private function reordering looked very promising for a little while. But a full match was nowhere to be seen.

A Breakthrough

Although still several days away from the solution, an important insight arose, after lots of trial and error.

Non-determinism was due to solc-js, the JavaScript wrapper, not to individual invocations of the solc executable itself. Solc-js uses Emscripten to run the native solc executable inside a JavaScript engine. Back in the day, Emscripten translated a binary into asm.js (not yet WASM). Something in this pipeline was introducing non-determinism.

But what kind? Since the solc executable was itself deterministic when invoked freshly, the insight was that the apparent non-determinism of solc-js depended on what had been compiled before, and not only on no-op features of the compiled file (e.g., comments, blanks, filename)! In fact, we saw blank space in the compiled file rarely make a difference in the output bytecode. However, earlier compiled files reliably affected the later output.

Christian Reitwiessner later confirmed that non-determinism was due to using native pointers (i.e., machine addresses) as keys in data structures, so that a re-run from scratch was likely to appear deterministic (objects were being allocated to the same positions on an empty address space) whereas a followup compilation depended on earlier actions.

We now had a more reliable lever to apply force to and cause shuffles in the compiled bytecode. And we could get systematic — we would basically be fuzzing the compiler! Our workhorse was the functionality below, which you can try for yourself:

https://dedaub.com/etheria/fuzzer.html

Open the dev console on your browser (F12) and hit “go”. It starts compiling random (unrelated) files before it tries the main file we aim to verify. The (pseudo-)randomization process is controlled by a seed, so that if a match is found it can be reproduced. The seed gets itself updated pseudo-randomly and the process repeats. The output looks like this:

current seed 449777 automate.js:116:11
...
compile 0 NEW soljson-v0.1.6-2015-10-16-d41f8b7.js etheria-1pt2_nov4.sol(1968d2bc81cfd17cd7fd8bfc6cbc4672) 1700cdc5e2c5fbb9f4ca49fe9fae1291 -4 5796 automate.js:72:11
--------- NEW BEST ------------- 5796 1700cdc5e2c5fbb9f4ca49fe9fae1291

Notice the highlighted parts: this compilation output is “new” (i.e., not seen with previous seeds), has a length distance of -4 from the target, and an edit distance of 5796, which is a new best.
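The search loop can be sketched in a few lines of Python. This is an illustrative sketch only, not the actual automate.js harness: `compile_with_history` stands in for a real solc-js run whose output depends on what was compiled before (which the seed controls), and candidates are scored by edit distance against the target bytecode.

```python
import random

def edit_distance(a: bytes, b: bytes) -> int:
    # Classic Levenshtein DP, used to score candidate bytecode vs. the target.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[-1] + 1,          # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def compile_with_history(seed: int) -> bytes:
    # Stand-in for the real solc-js invocation: the output depends on the
    # seed, i.e. on which unrelated files were "compiled" beforehand.
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(32))

def fuzz(target: bytes, iterations: int, seed: int = 0):
    # Seeded search: compile, score, evolve the seed, repeat. Because the
    # seed chain is deterministic, any "new best" is reproducible.
    best = (float("inf"), None)  # (edit distance, seed that produced it)
    for _ in range(iterations):
        out = compile_with_history(seed)
        d = edit_distance(out, target)
        if d < best[0]:
            best = (d, seed)
        seed = random.Random(seed).randrange(2**31)  # pseudo-random update
    return best
```

Re-running `fuzz` with the same starting seed reproduces the same best candidate, which is exactly why a winning seed, once found, verifies the bytecode for anyone.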

If you observe the execution, you can appreciate how much entropy is revealed: new solutions arise with high frequency, even many hours into the execution.

This got our team tremendously excited. Our channel got filled with reports of “new bests”. 0–605 (same size, edit distance 605) gave way to 0–261. We requisitioned our largest server to run tens of parallel headless browser jobs using selenium. The 0–261 record dropped to 0–150. On every improvement we thought “it can’t get any closer!” And with many parallel jobs running for hours, we let the server crunch for the night.

Finale

The next morning found the search no closer to the goal. 0–150 was derived from several different seeds. This is just a single basic block of 5 instructions in the wrong place, which also causes some jump addresses to shift. But still, not a zero.

By the next evening, it was clear that our fuzzing was running out of entropy. A little disheartened, we tried an “entropy booster” of last resort: adding random whitespace to the top and bottom of all compiled programs. (In fact, this improved-entropy version is the one at the earlier link.) Within hours, the “new best” became 0–114! And yet the elusive zero was still to be seen. Could it be that we would never find it?

Nearly 24 hours later, with the server fuzzing non-stop during the weekend, the channel lit up:

GUYS

WE GOT IT

All that was left was cleanup, tightening, and packaging. We soon had a solution that required merely compiling two unrelated files before the final compilation of the Etheria code. We repeated the process with the compiler running in multi-file mode. We found similar seeds for present-day Remix. Everything became streamlined, optimized, easy to reproduce and verify. You can visualize a successful verification (for one compiler setup) here:

https://dedaub.com/etheria/verify.html

We notified Cyrus a couple of hours later. It was great news, delivered on a Saturday, and the joy was palpable. We had a Tuesday advising call with Christian that was quickly repurposed into storytelling and a victory celebration. Within a few days, Etherscan verified the contract bytecode manually, since solc 0.1.6 is too old to be supported in the automated workflow.

Looking back, there are a few remarkable elements in the whole endeavor. The first is the amount of random noise amplified during a non-deterministic compilation. For a very long time, the sequence of “new” unique compiled bytecodes seemed never-ending. A search that now clearly seems feasible long appeared hopeless. Another interesting observation is how quickly people got wrapped up in a needle-in-a-haystack search. The challenge of a tough riddle does that to you. Or maybe it was some ancient magic from the faraway Etheria land?