De-anonymize Bitcoin

How to De-anonymize Bitcoin

Well that’s a funny name for a topic given that the last lecture basically declared that bitcoin was just pseudoanonymous and even then reading through Freedom-to-Thinker posts, it made me think the ecosystem would bank on this fact. The amount of information that can be gleaned if all ecommerce companies use Bitcoin could be an advertiser’s dream. This lecture is once again an overview but substantially shorter than last time.

Questions answered in this Post:

  • How can all your transactions be linked together?
  • What is shared addresses?
  • What does “Idioms of use” mean?
  • How else can people be de-anonymized?
  • References to Research Papers from Lecture

Linking all transactions?

The gist of this section is that one can try to prevent having someone link all your transactions by just generating a new address each time. It is easy to recreate a new public key based on your public key. However, it is not unlinkable. If for example the multiple bitcoin from different address get combined together and spent in a single transaction, someone can infer that they are coming from the same private key. Shared spending is then evidence of joint control since the addresses can be linked transitively.

The example given by the research paper is pretty scary. This was done in a paper by Reid and Harrigan, Analysis of Anonymity in the Bitcoin System. The paper combined two networks of both the transaction and user networks from Bitcoin’s public transaction history and were able to investigate a Bitcoin theft that occurred June 2011. Looking at the paper, it’s interesting that they are able to paint a clear story where 60 transaction involving 441.83 BTC were moved on a 70 days period and it total amounted to 25,000 BTC. Based on their analysis, it seems that one is unable to hide or be anonymous in Bitcoin. It is called transaction graph analysis. From the lecturer though, there is some level of probability to determine which address is actually mapped to the user. Thus, this means a second tool is also needed.

“Idioms of Use”

Idioms of use refers to particular features used in wallet software such as each address is used only once as change. This technique was used by Meiklejohn in Fistful of Bitcoin: Characterizing Payments Among Men with No Names. Within their paper, they used two main user heuristics to do account clustering. First was to treat different public keys used as inputs to a single transaction as being controlled by the same user. (This was what the lecturer used in the first case with the tea pot for 8 BTC.) The second was the change addresses tend to be used only once and likely unknown from the actual user. This allowed them to collapse users clusters. Additionally, from 344 transactions to mining pools, wallet services, exchanges, vendors, and gambling sites, they were able to cluster s the main providers/users of the system. From their analysis, they claim that agencies with subpoena power would be capable of determine who is paying money to whom in the Silk Road wallet and other Bitcoin thefts. According to the lecturer, this was a tedious and sometimes manual process. One way to determine owners is that people would self-label them manually in Bitcoin forums. Thus they were able to find the cluster of keys associated with Mt. Gox based on their account clustering.

From these crypto currencies to mapping to real-life identities

Now the lecturer poses another thought of how to find real world users based on their addresses. One could be the self-assigned people who just give up their address in the Bitcoin forum. Likely, if there is a single address posted in the forum, they are not using a new address for every transaction. Also, there is high centralization in service providers meaning that every flow will likely pass through one of them. Thus, as Meiklejohn’s paper mentioned, if someone has the subpoena power, they are able to request from the service provider that actual real-world user identity. However, there is a second layer of information which comes from networking. Specifically, “the first node to inform you of a transaction is probably the source of it”. There is a simpler way to make it more difficult which is just use Tor. Tor is used for low-latency work like just web browsing. Thus the lecturer suggests to use Mix nets, routing protocols that make communication pass through multiple proxies thus breaking a link between the source of a request and the destination. Interestingly, the concept of mix networks also came from David Chaum. Tor is one application of this which is onion routing.

References to Research Papers from Lecture

Wrap-up

Which of the following observations would you suggest that addresses A and B may be controlled by the same user/ entity?

  • There is a transaction as input addresses.
  • Combined Shared spending and idioms of use
  • Coin join works – violates this assumption
PHP Code Snippets Powered By : XYZScripts.com