Big Data: The Categorizing Machine

You want to know the habits of mobile phone users? Big Data. You want to reach a targeted clientele on the Web? Big Data. You want to decode the secrets behind the latest Netflix hit, or learn where to fix potholes in a neighbourhood? Big Data! All you need is a good algorithm and a decent quantity of data, and the companies that analyze Big Data promise to find all sorts of answers to our questions. But who’s asking these questions? And can we trust algorithms to make decisions?

2015 is the year of Big Data. The concept has existed for forty years already, but according to Forbes, this is the year that marks Big Data’s entry into business and governance. Companies everywhere are retooling their business models to reap profits from a new source of wealth: our personal data.

Big Data Mashup

Statistical analysis has always been with us. By taking surveys, or by tallying selected answers on a census form, we can estimate, more or less, the probability that a candidate will be elected, the number of car accidents in a year, or even the type of individual most likely to repay a loan. Mistakes happen, but numbers help uncover trends. And based on those trends, we hopefully make the right decisions.
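To make the “more or less” concrete, here is a minimal sketch of the classic survey estimate, using entirely hypothetical numbers: a sample proportion plus the margin of error that every poll carries.

```python
import math

# Hypothetical survey: 1,000 respondents, 540 say they will vote for the candidate.
sample_size = 1000
supporters = 540

# The sample proportion is our estimate of the true level of support.
p_hat = supporters / sample_size

# A 95% confidence interval is the "more or less" every poll carries.
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sample_size)

print(f"Estimated support: {p_hat:.1%} +/- {margin:.1%}")
# Estimated support: 54.0% +/- 3.1%
```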

Nowadays, we produce trends from quintillions of data points. To the information collected by institutions and credit companies, add the browsing histories tracked by cookies (episode 2), the data from our mobile phones (episode 4), and the 50 million photos, 40 million tweets, and billions of documents exchanged daily. Now add the data produced by fitness bracelets, “smart” objects, and gadgets, and you’ll understand why “Big” is the right adjective to describe the vast expanse of available information.

However, the truly revolutionary aspect of Big Data isn’t so much its size as the way all of this data can be mixed. Beyond what it says about us (or in spite of us), it is the correlating and cross-referencing of personal information that allows user behaviour to be predicted.

Knowing what you say online? Who cares! But knowing which words you use, with whom you are exchanging them, on what network, and at what time? Now that’s a moneymaker.
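A minimal, entirely hypothetical sketch of why the metadata is the valuable part: the record below keeps exactly those four signals and throws the conversation away. The field names and the message are invented for illustration.

```python
from datetime import datetime

# Hypothetical metadata record: the conversation itself is discarded,
# but the four signals that matter commercially are kept.
def metadata_of(message: dict) -> dict:
    return {
        "words_used": sorted(set(message["body"].lower().split())),
        "with_whom": (message["sender"], message["recipient"]),
        "network": message["network"],
        "hour_of_day": message["sent_at"].hour,
    }

msg = {
    "sender": "alice",
    "recipient": "bob",
    "network": "some-chat-app",
    "sent_at": datetime(2015, 5, 20, 23, 45),
    "body": "running late see you at the gym tomorrow",
}
print(metadata_of(msg))
# {'words_used': [...], 'with_whom': ('alice', 'bob'),
#  'network': 'some-chat-app', 'hour_of_day': 23}
```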

Categorization For The Win

With something as simple as a postal code, for example, average consumer income can be predicted. The Esri and Claritas agencies even claim to be able to deduce education level, lifestyle, family composition, and consumer habits from this one piece of information. Target made headlines in 2012 when it predicted a teenager’s pregnancy, before her parents were aware, based on the types of lotions and vitamins she bought and the colours of the items she purchased.

For these algorithms to work properly, individuals have to be put into increasingly precise categories. And that is where discrimination lurks, because we don’t always fit easily into a pigeonhole, as the sketch below suggests.
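Here is a toy illustration of the pigeonhole problem. The segments, rules, and people are all invented, but the failure mode is real: anyone the rules don’t anticipate falls into a catch-all bucket, and decisions made about that bucket follow them anyway.

```python
# Hypothetical segments of the kind a data broker might define.
SEGMENTS = {
    "young_urban_professional": lambda p: p["age"] < 35 and p["income"] > 60000,
    "suburban_family": lambda p: p["household_size"] >= 3 and not p["renter"],
}

def categorize(person: dict) -> str:
    for name, rule in SEGMENTS.items():
        if rule(person):
            return name
    # The margins: everyone the rules don't anticipate ends up here.
    return "other"

alex = {"age": 29, "income": 72000, "household_size": 1, "renter": True}
sam = {"age": 67, "income": 18000, "household_size": 1, "renter": True}

print(categorize(alex))  # young_urban_professional
print(categorize(sam))   # other -- and "other" is where prediction errors pile up
```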

Predictions and Discrimination

As Kate Crawford stated when she was interviewed in episode 5, it is minorities, and those who are already discriminated against, who are most affected by prediction errors. The more an individual corresponds to the “norm”, or to a predetermined category, the easier their data is to take into account. But what happens when we are on the margins? What happens to those who don’t behave the way Amazon, Google, or Facebook predicts?

Facebook recently angered many of its users by strictly enforcing a section of its Terms and Conditions that insists people use their real names on the service. The purpose, the company says, was to provide a safer environment and limit hateful posts. What it didn’t account for was the deletion of accounts belonging to transgender people, Indigenous people, and survivors of domestic violence whose profiles weren’t registered under their “real” names. This violated not only these users’ individual rights but also their privacy.

And what about the prejudices and discrimination that algorithms only serve to reinforce? In 2013, Chicago police rang the doorbell of a 22-year-old man named Robert McDaniel. “We’re watching you,” said one of the officers. This was the result of an algorithm developed at the Illinois Institute of Technology placing him on a list of 400 potential criminals, based on crime data about his neighbourhood, the intersections where crimes had occurred in the past, and his degrees of separation from people involved in crimes. It’s like science fiction. And if the prediction is wrong, how would the mistake be corrected?
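To see how such a score could flag someone with no record, here is a deliberately crude sketch. It is not the actual CPD/IIT model, whose details are not public; the weights, factors, and data are all invented for illustration.

```python
# Toy "heat list"-style scoring -- NOT the actual CPD/IIT model.
# All weights and data below are invented.

people = {
    "A": {"neighbourhood_crime_rate": 0.8, "distance_to_offender": 1},  # knows an offender directly
    "B": {"neighbourhood_crime_rate": 0.8, "distance_to_offender": 4},  # same streets, weaker ties
}

def risk_score(p: dict) -> float:
    # Closer social ties to past offenders raise the score; so does
    # simply living where crimes have occurred. Note that neither
    # factor reflects anything the person has actually done.
    social = 1.0 / p["distance_to_offender"]
    return 0.5 * p["neighbourhood_crime_rate"] + 0.5 * social

ranked = sorted(people, key=lambda name: risk_score(people[name]), reverse=True)
watch_list = ranked[:1]  # the real list reportedly held about 400 names
print(watch_list)  # ['A'] -- flagged without any criminal record
```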

Take the Test

We’re not going to lie to you: it’s difficult, if not impossible, to find out how we are categorized – and even harder to avoid it altogether. It all depends on the company, the algorithm, and the information that they are after. However, some tools can give us a glimpse into the ways in which the Web categorizes us:

  • The Floodwatch extension (link) gives us a quick look at all of the advertisements that have targeted us personally over a long time period. Handy for retracing our browsing habits and seeing how they affect our categorization!
  • Even simpler? If you are logged in to your Google account, go to the Ads Settings page. Does this profile resemble you? It’s up to you whether to correct it, or you could just adopt this new identity as a form of camouflage.

Sandra Rodriguez

Spy agencies target mobile phones, app stores to implant spyware

Canada and its spying partners exploited weaknesses in one of the world’s most popular mobile browsers and planned to hack into smartphones via links to Google and Samsung app stores, a top secret document obtained by CBC News shows.

The 2012 document shows that the surveillance agencies exploited the weaknesses in certain mobile apps in pursuit of their national security interests, but it appears they didn’t alert the companies or the public to these weaknesses. That potentially put millions of users in danger of their data being accessed by other governments’ agencies, hackers or criminals.

Anti-NSA Pranksters Planted Tape Recorders Across New York and Published Your Conversations

A woman at a gym tells her friend she pays rent higher than $2,000 a month. An ex-Microsoft employee describes his work as an artist to a woman he’s interviewing to be his assistant—he makes paintings and body casts, as well as something to do with infrared light that’s hard to discern from his foreign accent. Another man describes his gay lover’s unusual sexual fetish, which involves engaging in fake fistfights, “like we were doing a scene from Batman Returns.”

These conversations—apparently real ones, whose participants had no knowledge an eavesdropper might be listening—were recorded and published by the NSA. Well, actually no, not the NSA, but an anonymous group of anti-NSA protestors claiming to be contractors of the intelligence agency and launching a new “pilot program” in New York City on its behalf. That spoof of a pilot program, as the prankster provocateurs describe and document in videos on their website, involves planting micro-cassette recorders under tables and benches around New York City, retrieving the tapes and embedding the resulting audio on their website: Wearealwayslistening.com.

The minority report: Chicago’s new police computer predicts crimes, but is it racist?

Chicago police say its computers can tell who will be a violent criminal, but critics say it’s nothing more than racial profiling.

When the Chicago Police Department sent one of its commanders to Robert McDaniel’s home last summer, the 22-year-old high school dropout was surprised. Though he lived in a neighborhood well-known for bloodshed on its streets, he hadn’t committed a crime or interacted with a police officer recently. And he didn’t have a violent criminal record, nor any gun violations. In August, he incredulously told the Chicago Tribune, “I haven’t done nothing that the next kid growing up hadn’t done.” Yet, there stood the female police commander at his front door with a stern message: if you commit any crimes, there will be major consequences. We’re watching you.

What McDaniel didn’t know was that he had been placed on the city’s “heat list” — an index of the roughly 400 people in the city of Chicago supposedly most likely to be involved in violent crime. Inspired by a Yale sociologist’s studies and compiled using an algorithm created by an engineer at the Illinois Institute of Technology, the heat list is just one example of the experiments the CPD is conducting as it attempts to push policing into the 21st century.

 

How to protect your smartphone

I won’t lie to you: it’s difficult to protect your smartphone. But after a few weeks of following trackers, I have learned a few things.

Check your smartphone

1 – Some of your apps need your personal data to function. Others don’t. To sort through them, I installed Clueful by Bitdefender, an app that tells you what information each of your apps uses and warns you if any of them use your information without your knowledge. You are still being tracked, but at least you know it.

2 – Before downloading anything, make sure you actually need it. Get rid of apps you no longer use, and close apps running in the background. In iOS, just double-click the Home button at the bottom of the screen and swipe the apps away. In Android, you can do it by opening “Applications” under “Settings”.

3 – If you like, you can also disable location services. In Android, just go to “Settings”, then “Location”, and disable it. At the bottom of the same page, you can tap “Google Location History” to disable that function as well. On the iPhone, go to “Settings”, then “Privacy”, then “Location Services”.

4 – Ad tracking can be limited. If you use Android, you will find “Ads” under “Google Settings”. There you can opt out of interest-based advertising and reset your “Advertising ID”, the equivalent of a computer cookie. This method is not fool-proof, since an application that already had access to your UDID will still recognise your phone, but not all apps collect it (see the sketch below). The process is similar on the iPhone: go to “Settings”, then “Privacy”, then “Advertising”.
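To see why resetting the ID helps, and why it isn’t fool-proof, here is a hypothetical sketch of an ad network treating the advertising ID like a cookie; all names and identifiers below are invented.

```python
from collections import defaultdict

# Hypothetical ad-network log: each ad request carries the device's
# advertising ID, which ties requests together like a cookie.
profiles = defaultdict(list)

def log_ad_request(advertising_id: str, app: str) -> None:
    profiles[advertising_id].append(app)

log_ad_request("ad-id-123", "weather")
log_ad_request("ad-id-123", "fitness")
# After the user resets the ID, the same phone looks like a stranger...
log_ad_request("ad-id-999", "weather")
# ...unless an app also kept a hardware identifier (e.g. the UDID),
# which survives the reset and lets the two profiles be re-linked.
print(dict(profiles))
# {'ad-id-123': ['weather', 'fitness'], 'ad-id-999': ['weather']}
```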

Be anonymous

5 – To browse completely anonymously, you can download Tor, or Orbot, an Android app developed by the Guardian Project that routes your traffic through the Tor network. These services are effective but require patience, as pages load slowly. The Duckduckgo.com search engine promises “not to spy on you” and does not store users’ personal data.

6 – Use messaging apps that support “Off The Record” (OTR) encryption. Your messages are encrypted end to end, so even when they pass through a server, there is nothing readable to snoop on. ChatSecure is a popular option.

7 – Do not connect to free Wi-Fi. If you really must use it, do not access your personal accounts (email, bank account, social networks, etc.). Alternatively, install a VPN (Virtual Private Network) app, which encrypts your connection to the Internet.

8 – If you want to take it further, stay informed via the Guardian Project website; the project develops tools that can anonymize images, encrypt communications, and more. Its new Courier tool makes it possible to access an uncensored Internet, and its “PANIC” button lets you uninstall it quickly. It is available in several languages, including English, Chinese, Tibetan, Ukrainian and Russian.

Zineb Dryef

How publishers are using Facebook’s Instant Articles

On Wednesday, nine major publishers began publishing articles straight to Facebook under the social network’s long-anticipated product, called Instant Articles. Facebook sweetened the deal by letting publishers control the ad sales, branding and content; sell ads on the articles and keep all the revenue; and get data on their readers.

Still, publishers were mixed in their embrace: BuzzFeed and NBC News were the only ones to go all in, committing to using the product. Others, like The New York Times and The Atlantic, are taking a more cautious approach. There are still plenty of unknowns, chiefly whether Facebook’s terms will remain as generous to publishers as they seem to be now.
