Protecting Your Privacy: Understanding Traces You Leave For An Adversary
Whenever you’re online, you’re leaving small markers of identity, of several different kinds and on several different levels. This is necessary for everyday communication on the Internet – for example, if you ask a web server for a web page, that web server needs a way to send the page back to you, so it needs some sort of address to send it to. However, people who have made it a specialty to violate other people’s privacy are experts in coupling – that’s a key word here – connecting different identity points and data points with each other, and so may know a whole lot about you the very first time you visit a web page.
This coupling can be compared to an anonymous travel card for public transit. Most cities with public transportation will allow you to purchase cards or tokens with pre-filled travel credit – tokens that can be topped up later. But the first time you do so with a personal credit card, your travel token is not anonymous any more. When doing so, you create a log of who is using this travel token – you’re allowing for a coupling between your previously-anonymous travel token and your personal credit card – and you’re not only traceable with all future travels on that particular token, but also on any travels you’ve done in the past, before you made the mistake of using a personal credit card to top up.
This mindset is key: What data is available for an adversary to connect dots about me?
In order to have a chance at defeating such an adversary’s connecting dots, you need to first know about the dots – traces – you leave behind for connecting. You’re leaving traces in three different principal layers.
Your MAC address identifies your computer.
The first trace you leave behind is your MAC address. This is unique to your computer, or more specifically, to your network circuit. (If you have a computer with both wired and wireless networking, the two will have different MAC addresses.) But as your network circuits are generally embedded into your computer, for all intents and purposes, this can be used to identify your computer.
Just as a curiosity, should you ever see one, a MAC address looks like twelve hexadecimal digits in groups of two, written like 12:45:78:0A:BC:DF.
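As a minimal sketch of what those twelve digits encode, the following Python snippet parses a colon-separated MAC address and reads two flag bits from its first byte. The function names parse_mac and describe_mac are made up for this illustration; the bit meanings follow the standard MAC address layout.

```python
import re

# Six groups of two hexadecimal digits, separated by colons.
MAC_PATTERN = re.compile(r"([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")

def parse_mac(text):
    """Return the six bytes of a colon-separated MAC address."""
    if not MAC_PATTERN.fullmatch(text):
        raise ValueError(f"not a colon-separated MAC address: {text!r}")
    return bytes(int(part, 16) for part in text.split(":"))

def describe_mac(text):
    """Summarize the prefix and the two flag bits of a MAC address."""
    octets = parse_mac(text)
    return {
        # First three bytes: the manufacturer prefix on factory-assigned addresses.
        "prefix": octets[:3].hex(":").upper(),
        # Bit 0x02 of the first byte: address was set locally (in software).
        "locally_administered": bool(octets[0] & 0x02),
        # Bit 0x01 of the first byte: group (multicast) address.
        "multicast": bool(octets[0] & 0x01),
    }

print(describe_mac("12:45:78:0A:BC:DF"))
```

Incidentally, the example address above has the locally-administered bit set, which a factory-assigned address normally would not.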
The MAC address is used to connect your computer to an IP address in all kinds of places with public wi-fi. But wait a minute – don’t a lot of these public hotspots ask you to identify yourself and create an account in order to get online? Yes, they do. And that means they now know the coupling between your computer’s MAC address and the name and address you just entered. (There’s a point to not giving out your real identity here, and to entering made-up details instead, if you can.)
There are more traps to the MAC address – it’s also broadcast all the time from your computer while wi-fi is active. This means somebody who knows your MAC address and has access to a number of wireless access points along your typical path will be able to track your movements and know your location. Here, “computer” also means “phone with wi-fi activated”.
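To make the tracking scenario concrete, here is a small Python sketch. It assumes an observer has already collected sighting records from several access points; the Sighting record, the access point names, and the timestamps are all invented for illustration.

```python
from collections import namedtuple

# Hypothetical record: one access point noticing one device at one time.
Sighting = namedtuple("Sighting", ["timestamp", "access_point", "mac"])

def movement_trail(sightings, target_mac):
    """Return (timestamp, access_point) pairs for one MAC, oldest first."""
    target = target_mac.lower()
    hits = [s for s in sightings if s.mac.lower() == target]
    hits.sort(key=lambda s: s.timestamp)  # ISO-style strings sort chronologically
    return [(s.timestamp, s.access_point) for s in hits]

sightings = [
    Sighting("2024-01-01T08:05", "train-station", "12:45:78:0a:bc:df"),
    Sighting("2024-01-01T07:40", "home-street-cafe", "12:45:78:0A:BC:DF"),
    Sighting("2024-01-01T08:10", "train-station", "aa:bb:cc:00:11:22"),
    Sighting("2024-01-01T08:45", "office-lobby", "12:45:78:0A:BC:DF"),
]

for when, where in movement_trail(sightings, "12:45:78:0A:BC:DF"):
    print(when, where)
```

Nothing here is sophisticated: once the MAC address is known, a filter and a sort are enough to turn scattered logs into a daily route.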
The MAC address stays the same even between reformats and reinstallations of operating systems, even between switches to completely different types of operating systems on the same machine. It’s in the network integrated circuits. This means that if you’ve used a computer for something where anonymity is required – whistleblowing, for example – and you wipe the computer clean of all traces, reinstall, maybe even replace the hard drive – if you identify as yourself on that computer ever again in the same network environment, maybe by logging onto Facebook or checking your mail, you’ll have created a coupling between your identity and the previously-anonymous activity.
It’s also possible to locate a computer by its wi-fi broadcast after a complete wipe. Consider, for example, a national telco with a grid of hotspots in a hostile regime, where documents have been leaked showing human rights abuses, and the telco is ordered to report any presence of a certain MAC address. It would usually mean an instant ping, as long as that computer was merely on and had its wi-fi activated – regardless of whether it was in a previously used network setting. This is the modern equivalent of the CSI’s typical “any use of their credit card”, only much faster and wider in scope. For the record, I’m not aware of this ever having been used, but it’s technically possible, and should therefore be considered.
(It’s sometimes technically possible to change or forge the MAC address, but if you get it wrong in any way, shape, or form, the game is up, so let’s leave that out for now.)
Your IP address is your address on the net.
Your IP address is the layer above the MAC address. It is sometimes thought of as the “network” address, compared to the MAC address which is the “physical” address.
This layer is where the coupling to your identity usually takes place. An IP address is assigned to you by your Internet Service Provider, be it at a public wi-fi or via an Internet subscription of some sort. In any case, it’s worth noting that little detail – the IP address reveals the subscriber, and not the actual person communicating. When you’re in a café, what’s visible outward is typically the café’s IP address, which you share with all the other current patrons. When in a household, you typically share the address with other members of that household.
This link – the coupling between your IP address and something close enough to your actual identity – is usually where you want to aim to get anonymity and privacy. Websites and various three-letter agencies who have no business digging around in your private correspondence have two pieces of data to try tracking you down: they have an IP address and they have a timestamp. (IP addresses can be changed, reassigned, and rotated on a swift basis, so the timestamp is important.)
When this IP address leads to a public wi-fi hotspot, if the agencies are embarrassed enough by the documents you just leaked, the timestamp can often be correlated with the hotspot’s IP address assignment logs – IP addresses are assigned to MAC addresses, remember? – giving a small number of candidate MAC addresses which were active during the time when the irritating leak took place. (This leads us back to an earlier observation: if you allowed the hotspot to couple your real identity with your MAC address, the game is up at this point and you’ve been identified.)
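The correlation described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the lease log format (MAC address, lease start, lease end), the sign-up table, and every value in them are hypothetical, and a real hotspot would keep its records in whatever form its software uses.

```python
from datetime import datetime

def active_macs(leases, event_time):
    """MAC addresses holding a lease at event_time: the candidate set."""
    return sorted({mac for mac, start, end in leases if start <= event_time <= end})

# Hypothetical lease log of a hotspot: (mac, lease_start, lease_end).
leases = [
    ("12:45:78:0A:BC:DF", datetime(2024, 1, 1, 14, 0), datetime(2024, 1, 1, 15, 0)),
    ("AA:BB:CC:00:11:22", datetime(2024, 1, 1, 14, 30), datetime(2024, 1, 1, 16, 0)),
    ("DE:AD:BE:EF:00:01", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0)),
]

# Hypothetical sign-up records: MAC addresses already coupled to a name.
signups = {"AA:BB:CC:00:11:22": "name given at the sign-up page"}

leak_time = datetime(2024, 1, 1, 14, 45)
for mac in active_macs(leases, leak_time):
    print(mac, "->", signups.get(mac, "no identity on file"))
```

The timestamp narrows three devices down to two, and the sign-up table names one of them outright. The other remains only a MAC address until some later coupling turns up.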
The IP address may also lead to a regular Internet Service Provider, which may or may not – depending on legislation and their policy – surrender subscriber data for the IP address and timestamp to said agencies.
The trick, therefore, is to use an IP address that has no coupling at all to your identity. None. (We’ll look at tools for this in the next article.)
Last, there’s all the data you enter yourself – about yourself.
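Websites don’t see your MAC address, but they see your IP address. (Unless you’re connecting using the newer IP address, version six, and your device builds its address from its MAC address, in which case they can see your MAC address too. Many current operating systems use randomized addresses instead, so this is rather rare.)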
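One way a MAC address can end up inside a version-six IP address is the EUI-64 scheme, where the device inserts the bytes ff:fe into the middle of its MAC address and flips one bit of the first byte. The Python sketch below reverses that; the function name mac_from_eui64 and the sample addresses (from the documentation range 2001:db8::/32) are chosen for illustration. A randomly generated address could contain ff:fe by coincidence, so a match is a strong hint rather than proof.

```python
import ipaddress

def mac_from_eui64(address):
    """Return the MAC embedded in an EUI-64 style IPv6 address, or None."""
    # The last 8 bytes of the address are the interface identifier.
    interface_id = ipaddress.IPv6Address(address).packed[8:]
    # EUI-64 identifiers carry the marker bytes ff:fe in the middle.
    if interface_id[3:5] != b"\xff\xfe":
        return None
    mac = bytearray(interface_id[:3] + interface_id[5:])
    mac[0] ^= 0x02  # undo the flipped universal/local bit
    return mac.hex(":").upper()

print(mac_from_eui64("2001:db8::21a:2bff:fe3c:4d5e"))
print(mac_from_eui64("2001:db8::1"))
```

The first call recovers a MAC address from the address alone, with no cooperation needed from the network; the second address carries no such marker and yields nothing.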
When you’re ordering something, you fill in your name and shipping address for the merchant to ship the goods, yes? So now that site has a coupling between your IP address and your identity. If you tried to hide that coupling carefully all along, now somebody knows anyway. And if somebody knows, you should assume the data isn’t safe anymore – the most sensitive records leak by the million. If one company knows, it’s best to assume it’s public knowledge.
However, there’s also the possibility your merchant is willingly sharing your data with plenty of other merchants, even before any leak, along with some tracking beacon they just put onto your browser, so the next merchant you shop from will have your name and address filled in already. You know, “for convenience”.
And that doesn’t even start getting into the ad networks. Have you ever noticed that you search for something once on Amazon, and then ads for that item show up on pretty much every single web page with ads that you visit? These ad networks don’t just know who you are, they know what you didn’t just buy and maybe still want.
The point here is that sites share what they know about you. One has your email, one has something else. It adds up. Every single piece of data you give away, you need to consider if it can be used against you in some manner. (For example, if you’re using an anonymous connection to log on to a whistleblower forum, and then registering there using a GMail address that looks like [email protected], you’re giving out far more data than you should be giving out – and people have been tracked and put in jail for life in hostile regimes because of that exact mistake with their GMail address on a forum.)
It’s really important to understand that any statement, promise, or obligation about how data you enter about yourself will be used isn’t worth anything at all. The company giving the promise may go bankrupt, in which case the promise is dead and the bankruptcy liquidator is trying to see everything as assets to be monetized. An agency could make a lawful seizure of the data, for example with a subpoena regarding something mildly related. Or they could just get a new CEO who doesn’t care about promises. (Assuming their current CEO does!)
Also, no matter what the purpose initially was, your data was intercepted by at least three governmental surveillance agencies as you entered it, who will use it for whatever purpose they like. They need not have intercepted the transmission itself if you were using https; they may just as well have placed a bug on the servers behind the point where the connection is decrypted, or tapped the private lines of that merchant’s network. They’ve been known to do so.
The only thing that matters is whether you have given an adversary an ability to connect dots in a straight line between a piece of data they have through multiple other dots and all the way up to your identity. And it’s you who must keep track of what traces you have left for adversaries to use for that purpose.
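One way to keep track of this is to treat every trace as a dot and every coupling as a line between two dots, then ask whether a path exists from something an adversary holds to your identity. The Python sketch below does that with a breadth-first search. The labels and the couplings listed are invented for illustration, and the IP address comes from a range reserved for documentation.

```python
from collections import deque

def coupling_path(couplings, start, target):
    """Breadth-first search for a chain of couplings from start to target."""
    neighbours = {}
    for a, b in couplings:
        # A coupling works in both directions.
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of couplings connects the two

couplings = [
    ("ip:203.0.113.7", "mac:12:45:78:0A:BC:DF"),          # hotspot lease log
    ("mac:12:45:78:0A:BC:DF", "account:hotspot-signup"),  # sign-up page
    ("account:hotspot-signup", "identity:you"),           # name you typed in
]

print(coupling_path(couplings, "ip:203.0.113.7", "identity:you"))
```

Removing any single coupling from the list breaks the chain and the search returns None, which is the practical goal: you rarely can erase every dot, but you can refuse to draw one of the lines.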
Comments are closed.
Then how are you supposed to trust a VPN provider if you shouldn’t trust anyone? They have your IP address. Do they have your MAC address when it tunnels straight through routers and firewalls, where it otherwise stops? I suppose so. And, most importantly, they have free access to all transmitted and received data since it is they who communicate with web servers on your behalf!
Sounds like the first place for a 3LA to monitor, and the most lucrative place for a dishonest CEO or employee to sell data to advertisers!