Monday, October 18, 2021

CONTINUITY: Fat Finger—More Than 12,000 Ethereum Lost to Typos

Unlike Bitcoin public addresses, which incorporate a 32-bit checksum, Ethereum public addresses as originally specified were simply strings of 40 hexadecimal digits, for example 0xc9b83ab54c84aac4445b56a63033db3d5b017764. If somebody attempts to send funds to such an address and accidentally mistypes or transposes even a single digit, the funds will be sent to an address whose private key is unknown and computationally intractable to discover (there are 16⁴⁰ ≈ 10⁴⁸ possible Ethereum addresses), and are thus lost forever. Obviously, it is a poor idea to type in such an address by hand, and optical scanning, text editors, and cut-and-paste mechanisms all pose risks of error as well.
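
For readers who want to check the arithmetic, here is a small Python sketch of the size of the address space and of the number of addresses that are exactly one mistyped digit away from any given address. The variable names are my own and purely illustrative.

```python
# Size of the Ethereum address space: 40 hexadecimal digits, 16 values each.
ADDRESS_HEX_DIGITS = 40

total_addresses = 16 ** ADDRESS_HEX_DIGITS

# Each of the 40 digits can be mistyped as any of the 15 other hex digits,
# and every such result is still a syntactically valid address.
single_digit_typos = ADDRESS_HEX_DIGITS * 15

print(f"possible addresses: {total_addresses:.3e}")
print(f"equivalent to {total_addresses.bit_length() - 1} bits")
print(f"addresses one substituted digit away: {single_digit_typos}")
```

Without a checksum, nothing distinguishes the intended address from any of its one-digit neighbours.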

In 2018, Johannes Pfeffer decided to estimate the quantity of Ether (the name for the currency of the Ethereum system) lost by having been sent to mistyped addresses. The methodology was clever and simple: search the blockchain for pairs of addresses, both of which had received funds, but which differed by only one character. An address of such a pair which had no outgoing transactions was almost certainly the result of a typographical error made in entering the other, because the probability of two such similar addresses being generated from independent known private keys is comparable to that of guessing the private key from a public address. He reported the results in “Over 12,000 Ether Are Lost Forever Due to Typos”.
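
The search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Pfeffer's actual code: the function name `find_typo_candidates` and the shape of the `accounts` dictionary are assumptions of mine, and a real analysis would have to extract this information from the blockchain first.

```python
from collections import defaultdict


def find_typo_candidates(accounts):
    """Return sorted (suspected_typo, neighbour) address pairs.

    `accounts` is a hypothetical mapping from a lowercase 40-digit hex
    address (without the 0x prefix) to a tuple
    (has_received_funds, outgoing_transaction_count).
    """
    # Bucket addresses by every "one digit wildcarded" form. Two distinct
    # addresses land in the same bucket exactly when they differ only in
    # the wildcarded position.
    buckets = defaultdict(list)
    for address, (has_received, _outgoing) in accounts.items():
        if not has_received:
            continue
        for i in range(len(address)):
            buckets[address[:i] + "*" + address[i + 1:]].append(address)

    candidates = set()
    for group in buckets.values():
        for address in group:
            # No outgoing transactions: nobody appears to hold the key.
            if accounts[address][1] == 0:
                for other in group:
                    if other != address:
                        candidates.add((address, other))
    return sorted(candidates)


# Toy data: the second address differs from the first in its last digit.
example = {
    "ab" * 20: (True, 5),
    "ab" * 19 + "ac": (True, 0),
    "cd" * 20: (True, 0),
}
for typo, intended in find_typo_candidates(example):
    print(f"0x{typo} looks like a typo of 0x{intended}")
```

Wildcarding each digit in turn avoids comparing every address against every other one, which matters when there are many millions of them. Note that if neither address of a pair has outgoing transactions, this sketch reports the pair in both directions and leaves the ambiguity to the reader.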

As of the date of his study, 2,674 typos were found, affecting 2,053 accounts, with total funds lost amounting to ETH 12,622, which at this writing has a value in excess of US$ 47 million (when he did his study, it was “only” US$ 8.84 million). All of these funds have gone to the great bit bucket in the sky, never to be seen again.

It's odd that Ethereum addresses weren't designed from the outset to incorporate a checksum, especially since International Bank Account Numbers (IBAN) and Bitcoin addresses, both of which pre-date Ethereum, include checksums. The reasoning appears to have been that the hexadecimal addresses would not be used directly by humans, who would instead work with encoded forms such as the IBAN-compatible ICAP or with a domain-name-like system such as now exists in the Ethereum Name Service. But, in fact, Ethereum wallets and individuals went ahead and used the hexadecimal addresses without checksums, and the consequences were predictable.

In 2016, this situation became sufficiently embarrassing that Ethereum Improvement Proposal EIP-55, “Mixed-case checksum address encoding”, specified a checksum of sorts, in which bits from a hash of the original address are encoded in the case of the hexadecimal digits between “A” and “F”: each such digit is written in upper case if the corresponding hexadecimal digit of the hash is 8 or greater, and in lower case otherwise. Since a hexadecimal digit is a letter with probability 6/16, this provides an average of 40 × 6/16 = 15 check bits per address, which reduces the probability of a randomly mistyped address passing the check to (13/16)⁴⁰ ≈ 0.0247%, which is around fifty times better than the two-digit IBAN checksum. Almost all Ethereum clients now express addresses in this form and check any submitted address which contains mixed-case hexadecimal digits. For compatibility, however, un-checksummed addresses with uniform-case hexadecimal digits continue to be accepted.
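
To make the mixed-case scheme concrete, here is a sketch in Python of how such a checksummed address can be produced and checked. EIP-55 hashes the lowercase hexadecimal text of the address with Keccak-256, which is not the same function as the standard library's `hashlib.sha3_256` (the padding differs), so this sketch carries its own small, slow Keccak-256. The function names `keccak256`, `to_checksum_address`, and `is_acceptable` are my own, and the acceptance rule simply follows the behaviour described above; treat it as an illustration, not as any client's actual code.

```python
MASK64 = (1 << 64) - 1


def _rol64(value, shift):
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to lanes[x][y]."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into the neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the only non-linear step, applied along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: add the round constant, generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data):
    """Original Keccak-256 (pad byte 0x01), as used by Ethereum."""
    rate = 136  # bytes absorbed per permutation call
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        for i in range(rate // 8):
            start = offset + 8 * i
            lanes[i % 5][i // 5] ^= int.from_bytes(padded[start:start + 8], "little")
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[i][0].to_bytes(8, "little") for i in range(4))


def to_checksum_address(address):
    """Return the mixed-case (EIP-55 style) form of a hexadecimal address."""
    digits = (address[2:] if address[:2] in ("0x", "0X") else address).lower()
    if len(digits) != 40 or any(ch not in "0123456789abcdef" for ch in digits):
        raise ValueError("expected 40 hexadecimal digits")
    digest = keccak256(digits.encode("ascii")).hex()
    return "0x" + "".join(
        ch.upper() if ch in "abcdef" and int(digest[i], 16) >= 8 else ch
        for i, ch in enumerate(digits)
    )


def is_acceptable(address):
    """Accept uniform-case addresses unchecked; verify mixed-case ones."""
    body = address[2:]  # assumes a leading "0x"
    if body == body.lower() or body == body.upper():
        return True  # no checksum information present
    return address == to_checksum_address(address)


print(to_checksum_address("0xc9b83ab54c84aac4445b56a63033db3d5b017764"))
```

The compatibility rule is visible in `is_acceptable`: an address typed entirely in one case carries no check bits at all, so a typo in it will still go undetected.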

It would be interesting to repeat the typo analysis and see what effect the introduction and widespread use of checksummed addresses has had on the rate and magnitude of losses to typos.

Posted at October 18, 2021 12:00