How Blacklight illuminates the murky world of ad tracking, key logging, canvas fingerprinting, Facebook pixels, and more

Posted on Sep 30, 2020 by Glyn Moody

It is hardly news that we are being tracked as we visit Web sites and move around the Internet. As this blog has reported, it’s the basis of today’s main online business model: using information about where we go, and what we view, in order to allow advertisers to offer highly targeted advertising based on the profile that can be constructed from that data. That’s despite the fact that such microtargeted advertising has real risks, is not wanted by the public, and isn’t even very effective. Nonetheless, it’s clear this kind of “surveillance capitalism” is not going away anytime soon. The question is: what can we do to minimize its harmful effects?

The first thing is to be aware of what is going on – which companies are tracking us, and how. There are already software tools and services out there that are designed to help do that. But in this sphere, you can’t have too much help, so it’s good to see a new online privacy inspector being launched. It’s called Blacklight, and it comes from The Markup, which describes itself as “a nonprofit newsroom that investigates how powerful institutions are using technology to change our society.” It adopts what it calls a “Show Your Work” philosophy: “Whenever possible, we will publish the underlying datasets and code that we use in our investigations, as well as a detailed methodology describing the data, its provenance and the statistical techniques used in our analysis.” The Markup also promises that it will not use third-party tracking, and will collect as little data as possible, which won’t be sold. The title is funded by its readers and a number of institutional donors.

The new tool, Blacklight, emulates how users might be tracked as they browse the Web. To use it, you simply type in a Web address, and the tool scans the site for the following types of privacy violations: third-party cookies, ad trackers, key logging, session recording, canvas fingerprinting, Facebook tracking, and Google Analytics’ “Remarketing Audiences”.
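
To make the idea of a “third-party” cookie or tracker concrete, here is a minimal Python sketch of how a scanner might decide whether a request made by a page goes to a different site than the one being visited. It is not taken from Blacklight’s code: the function names are hypothetical, and comparing only the last two labels of the hostname is a deliberate simplification (a real tool would need something like the Public Suffix List to handle domains such as example.co.uk).

```python
from urllib.parse import urlsplit


def naive_site(url: str) -> str:
    """Return the last two labels of the hostname.

    Assumption: this is a crude stand-in for the "registrable domain".
    It gives the wrong answer for suffixes like .co.uk.
    """
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    return ".".join(labels[-2:])


def is_third_party(page_url: str, request_url: str) -> bool:
    """True if the request goes to a different site than the visited page."""
    return naive_site(page_url) != naive_site(request_url)


def third_party_requests(page_url: str, request_urls: list[str]) -> list[str]:
    """Keep only the requests that leave the visited site."""
    return [u for u in request_urls if is_third_party(page_url, u)]


if __name__ == "__main__":
    page = "https://www.example.com/news"
    requests_seen = [
        "https://cdn.example.com/app.js",         # same site, different subdomain
        "https://tracker.example.net/pixel.gif",  # a different site
    ]
    for url in third_party_requests(page, requests_seen):
        print("third-party:", url)
```

In this sketch a subdomain of the visited site counts as first-party, while anything served from another domain is flagged as third-party.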

Key logging refers to when a site monitors the text that you type into an input box before you hit the submit button. Session recording is even worse, and allows everything you do on a Web page – mouse movements, clicks, scrolling down and input – to be monitored and captured. Canvas fingerprinting is a way to try to identify users through the unique characteristics of their browser, revealed by looking in detail at how shapes and text are rendered. It’s a way to circumvent attempts to remain anonymous online by blocking all cookies. Finally, “Remarketing Audiences” refers to a feature of Google Analytics that allows Web sites to create a custom audience list based on user behavior. The list can then be used for targeted advertising to those visitors who are included.

Having created Blacklight, The Markup team then ran it on 100,000 of the most popular Web sites in order to establish how common were techniques that could be harmful to privacy. Here’s what it found:

6 percent of websites used canvas fingerprinting.
15 percent of websites loaded scripts from known session recorders.
4 percent of websites logged keystrokes.
13 percent of sites did not load any third-party cookies or tracking network requests.
The median number of third-party cookie loads was three.
The median number of ad trackers loaded was seven.
74 percent of sites loaded Google tracking technology.
33 percent of websites loaded Facebook tracking technology.
50 percent of sites used Google Analytics’ remarketing feature.
30 percent of sites used the Facebook pixel.
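
Canvas fingerprinting, the first item in that list, can be illustrated with a small Python sketch. In a real browser, a script draws text and shapes to a hidden canvas, reads the resulting pixels back, and hashes them. Here the pixel buffers are made-up bytes standing in for two browsers that render the same drawing slightly differently, and the `fingerprint` function is a hypothetical name. This is a conceptual illustration only, not how any particular tracker or Blacklight itself is implemented.

```python
import hashlib


def fingerprint(pixels: bytes) -> str:
    """Hash rendered pixel data into a short identifier."""
    return hashlib.sha256(pixels).hexdigest()[:16]


# Assumption: invented RGBA values for the same drawing on two machines.
# Differences in fonts, graphics hardware and anti-aliasing can change
# individual pixel values by small amounts.
browser_a = bytes([255, 255, 255, 255, 34, 34, 34, 255])
browser_b = bytes([255, 255, 255, 255, 35, 34, 34, 255])

if __name__ == "__main__":
    print("browser A:", fingerprint(browser_a))
    print("browser B:", fingerprint(browser_b))
    # The same browser produces the same value on every visit.
    print("A again:  ", fingerprint(browser_a))
```

Because the identifier is recomputed from the rendering on each visit rather than stored on the device, deleting or blocking cookies does not change it.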

Another article on The Markup site provides some examples of particularly serious privacy breaches that it found in the course of its scans:

More than 100 websites serving undocumented immigrants, domestic and sexual abuse survivors, sex workers, and LGBTQ people sent data about their visitors to advertising companies.

Eighty U.S. abortion providers loaded third-party trackers on user browsers, some of them sending data to Facebook that ended up in user profiles.

Trackers from different companies were communicating with each other to confirm the identity of visitors to a website for victims of sexual violence.

Health information websites like Everyday Health and WebMD sent user data about page visits to dozens of marketing companies.

The Arizona Department of Child Safety’s page on how to report child abuse sent data about site visitors to six ad tech companies.

Helpfully, The Markup has put together a guide to what can be done to minimize the harm from these tracking techniques. It runs through the main Web browsers – Brave, Chrome, Edge, Firefox and Safari – and their various features. It points out that even though Chrome is currently the most popular browser, its privacy protections are weaker than those of the others. It also suggests installing additional software such as Privacy Badger and Ghostery, both of which aim to block and control trackers.

It can often seem a hopeless task trying to protect privacy in the digital world. The scale and power of the Internet advertising industry, which currently depends on pervasive online surveillance techniques, are daunting. Effecting change in this area is hard, and takes time. The first step is gathering information about the scale of the problem. As well as adding to the political pressure for better privacy protection online, it allows people to see how their personal data is being gathered and by whom, and allows counter-measures to be taken. Blacklight is a welcome addition to the range of tools that are available to help people do that.

Featured image by The Markup.