Weekly Feature: NU's Tips for Poisoning Your Data

April 21, 2021

We are constantly creating data

In 2020, each person generated an estimated 1.7 megabytes of data every second (source).

Every time you send an email, like a Tweet, or stream a new TV show, you are creating data. And it’s really hard to stop.

In 2019, Kashmir Hill, then a reporter for Gizmodo, famously tried to block out five major tech companies (Amazon, Facebook, Google, Microsoft, and Apple), first individually, then all at once (source). Over the following weeks, she realized just how deeply these companies are woven into our daily lives. Want to watch a new movie or listen to new music? Netflix, Hulu, and Spotify all rely on Amazon Web Services (AWS) or Google Cloud to distribute their content. Want to push new updates to your GitHub repo? That’s owned by Microsoft.

Not only are we creating massive amounts of data, but we are also generating massive paychecks for Big Tech

Every year, Google earns over $120 billion in ad revenue, and that revenue is generated by your data.

And they don’t even send you a thank you card…

So what can we do about it?

A recent paper out of Northwestern University (Go ‘Cats!) suggests that we can use our data as a collective bargaining chip. These tech giants may have powerful algorithms at their disposal, but those algorithms are worthless without enough quality data to train them.

Ph.D. students Nicholas Vincent and Hanlin Li have proposed three ways in which the public can leverage their data against Big Tech:

  • Data strikes: This involves withholding or deleting your data, for example by using privacy tools or leaving a platform completely, so that a tech firm cannot use it.
  • Data poisoning: This method is our personal favorite and involves purposefully contributing meaningless or harmful data. For example, AdNauseam is a browser extension that "clicks" on every single ad that Google sends your way, confusing their advertising algorithms.
  • Conscious data contribution: This involves giving meaningful data to a competitor of a platform you want to protest. Think Tumblr over Facebook.
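
To see why data poisoning can confuse an ad algorithm, here is a toy sketch in Python. It is only an illustration under our own assumptions, not how Google's systems or AdNauseam actually work: the ad categories, the click counts, and the `interest_profile` function are all hypothetical, and the "algorithm" simply treats each category's share of clicks as a guess at what the user cares about.

```python
from collections import Counter


def interest_profile(clicks):
    """Toy model: estimate interests as each category's share of all clicks."""
    counts = Counter(clicks)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Hypothetical ad impressions shown to one user: 10 ads in each category.
categories = ("sports", "travel", "cooking", "finance")
impressions = [c for c in categories for _ in range(10)]

# Honest behavior: the user clicks only the few ads they actually care about.
honest_clicks = ["cooking"] * 4 + ["travel"] * 1

# Poisoned behavior: a tool in the spirit of AdNauseam "clicks" every ad shown.
poisoned_clicks = list(impressions)

honest = interest_profile(honest_clicks)
poisoned = interest_profile(poisoned_clicks)

print("honest:  ", honest)
print("poisoned:", poisoned)
```

With honest clicks, the profile points clearly at cooking. Once every ad is clicked, each category ends up with the same share, so this toy model can no longer tell what the user is actually interested in.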

These methods have already seen some real-world success. In January, millions of WhatsApp users deleted their accounts when Facebook (the owner of WhatsApp) announced that WhatsApp data would be shared with the rest of the company. This mass exodus from WhatsApp eventually caused Facebook to delay its policy change.

Similarly, Google recently announced it would stop tracking individuals across the web for targeted advertising. While there is some skepticism about whether this is a meaningful change or just another Google rebrand, Vincent thinks the increased use of tools like AdNauseam may have contributed to this outcome.

“AI systems are dependent on data. It’s just a fact about how they work,” Vincent says. “Ultimately, that is a way the public can gain power.”

Here at RAISO, we want to extend our congratulations to Nicholas Vincent and Hanlin Li for their exceptional work! If you’re interested in learning more about this research, you can read their full paper here.