“I know one thing; that I know nothing.”
These words were famously attributed to Socrates in one of Plato’s accounts of the philosopher, a phrase that has come to represent the Socratic paradox. While it is contested whether Socrates actually said these words, the meaning is still poignant, indicating that the wisest people are the ones who don’t presume to know everything, who keep an open mind, and who “know when they know nothing.”
With the rise of Big Data, we now have more information at hand than ever before. Some might even say that we know more now than we ever have before — and this is dangerous thinking. Tom Goodwin, head of innovation at Zenith Media and Forbes contributor, would likely agree.
“We overestimate the importance of what we know, rather than focus on what this data makes clear we don’t actually know. The more you know, the more you know you don’t know,” he says in his post, “The Dark Side of Big Data.” “Above all I’m concerned we believe that big data is used as a cure all, we’ve somehow assumed that it will solve all our problems and I think that the reality doesn’t meet the hype and maybe it won’t for the medium term future.”
Goodwin isn’t the only person who questions whether our reliance on big data is healthy. It all boils down to the fact that we have such awesome technology in our hands, and we need to make sure we use it correctly. Big data is, after all, neutral technology. It is we, those who wield such technology, who stand at the top of a slippery slope.
When Big Data is Bad Data
On September 12, 2016, Edd Gent asked in an article on Engineering and Technology, “as we entrust more of our lives to ‘big data’, how can we protect against the gaps and mistaken assumptions used to handle the information?” In the article, he mentions a couple of examples of how big data is not necessarily better data. For example, in Boston, Massachusetts in 2011, the municipal authority released an app called Street Bump in an attempt to employ a smarter way to detect roads that needed repair. The app used GPS and a smartphone’s accelerometer to detect jolts and disturbances as a driver went down the road, and sent the location of potholes back to the authority. The only problem was that the system reported a disproportionate number of potholes in wealthy neighborhoods due to oversampling of younger, more affluent citizens who were “digitally clued up enough to download and use the app in the first place.”
The private sector has its own share of problems with Big Data. According to Villanova University’s “When Big Data Doesn’t Work”, 87 percent of companies report that “bad data” pollutes their data stores, and that the biggest cause of these inaccuracies is human error (56 percent). A good example is employers who use computer-driven algorithms to find, recruit, and hire job candidates online.
Bloomberg BNA reported in July that one unintentional side effect of the use of the above algorithms could be discrimination. “Vendors that promote the algorithms say using a neutral formula that eliminates the human element, at least in the early stages of searching for and recruiting candidates, reduces the risk of unlawful bias,” reports Kevin McGowan. “But others fear the algorithms, depending on how they are constructed and used, could create or perpetuate discrimination based on race, sex or other protected characteristics.”
This was explored as early as 2013, when Latanya Sweeney did a study of Google AdWords buys made by companies that provide criminal background checks as a service. She found that when somebody Googled a traditionally “black-sounding” name, the ads that were returned suggested an arrest record at a significantly higher rate than for traditionally “white-sounding” names, regardless of whether that person had ever been arrested. This plays into the same type of unintentional discrimination that job applicants may face because of the blunders of big data.
Big Data is Neutral: How We Use It is What’s Important
The truth (and the problem) is that Big Data is not going away. When you think about the potential for human error and skewed results due to subtle algorithmic preference, systems like the ethical frameworks of self-driving cars and even predictive policing, which are meant by their nature to protect and serve the public, start to look questionable.
What we need to remember is that Big Data is a tool, and one that can be used for great evil if its use isn’t checked or scrutinized. A recent example is the Trump Justice Department’s attempt to force an internet hosting company to turn over information about everyone who visited a website used to organize protests during President Trump’s inauguration. This type of data collection is not only unethical, but downright dangerous in the hands of a governing entity that aims to identify and possibly persecute dissidents.
On the other hand, big data can be used for extensive good. Oregon recently approved a bill that will aid in the decriminalization of drugs in that state. Part of the reason the bill passed was big data findings indicating that “African Americans were convicted of felony drug possession at a rate that doubles the convictions of white offenders and Native Americans were convicted of drug possession five times that of the rate of whites.”
This distinction between the ways data can be used is important. In Daniel Honan’s article for Big Think, titled “Big Data is Neutral: A Tool for Both Good and Evil”, he quotes Big Data expert Rick Smolan:
"Every time there’s a new tool, whether it's Internet or cell phones or anything else," Smolan points out, "all these things can be used for good or evil. Technology is neutral; it depends on how it’s used."
Honan ends his article well: “Big Data is about credit card companies making decisions on who can get credit based on who listens to rap music. That's scary. But Big Data is also about our ability to ‘measure the heartbeat of everybody on Earth simultaneously,’ as Smolan points out…”
When it comes to Big Data, perhaps we should pay more attention to the Socratic paradox. People like Smolan understand just how simultaneously dangerous and beneficial this new technology can be, and we need to be aware too. There’s no more apt statement than to say that the more data we mine, the more we realize how much we don’t know. We must tread lightly, because if there’s one thing I know...