Blog: Artificial Intelligence and Big Data

Today, Big Data is predominant in our lives, both professionally and personally. Applying artificial intelligence to the vast amounts of data generated by our society has enormous potential for public benefit. However, along with the benefits come many dangers, which raise ethical questions about how this data is used in concert with artificial intelligence. With that comes the ultimate question: can artificial intelligence ever replace human intelligence, or can both coexist to the benefit of humanity?

Due to the ubiquity of computers, the 21st century is an open digital book, and many decisions can be made far more accurately by computers than by human beings. Groups with enough knowledge and resources can obtain a large amount of information about just about everyone on the planet. Big Data has many benefits. Used in concert with big data, AI has the potential to extract valuable insights that we would otherwise be unable to obtain. These include providing unbiased opportunities and information, e.g. for loans, insurance, education, employment, and medicine. AI has the potential to make medical diagnoses, predict and prevent crime, and advance almost every domain of human knowledge. The automation of processes can revolutionize many aspects of the human experience, as computers are much better than humans at many tasks.

However, the interpretations of and decisions based on this data are not inherently free from bias, and should not be taken at face value. As bias is present in our society, it stands to reason that bias is also present in the data our society creates (Fig. 1). Therefore, it is important to look at all aspects of Big Data and to follow a code of ethics that ensures appropriate use and prevents misuse.

Fig. 1 IBM Big Data and Analytics Hub. (2019). Retrieved April 1, 2019.
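To make this concrete, here is a minimal, hypothetical sketch (all numbers invented for illustration) of how a system that naively learns historical approval rates will simply reproduce whatever bias those records already contain:

```python
# Hypothetical loan records: group B was historically approved less often.
historical_loans = (
    [{"group": "A", "approved": True}] * 80 + [{"group": "A", "approved": False}] * 20 +
    [{"group": "B", "approved": True}] * 40 + [{"group": "B", "approved": False}] * 60
)

def learned_approval_rate(records, group):
    """The approval rate the data 'teaches' a naive system for a given group."""
    in_group = [r for r in records if r["group"] == group]
    return sum(r["approved"] for r in in_group) / len(in_group)

# A system trained only on these outcomes will favor group A,
# without ever knowing *why* the historical rates differed.
print(learned_approval_rate(historical_loans, "A"))  # 0.8
print(learned_approval_rate(historical_loans, "B"))  # 0.4
```

The records show that group B was approved less often, but not why; that missing "why" is exactly the context loss Tufekci warns about.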

There are several ways to reap the benefits of the union of artificial intelligence and Big Data. One is to make sure the right people are behind the data interpretation. According to Trikha, it is important to have statisticians on hand to manage the large amounts of data being produced, as seen in Fig. 2.

Fig 2. Trikha (2015).

Dangers of Big Data

Alongside its benefits, however, artificial intelligence has limitations and dangers. Biases can enter at many different stages. There are biases in the collection of data: what data is collected, how, by whom, and to what end. Tufekci addresses this when she asks why big data is "not helping us make better decisions?" Her answer: "Big data suffers from a context loss because big data doesn't answer the question 'why?'" There are also biases in interpretation. We must ask who is interpreting the data, why it is being collected, and for what purpose. One must also be careful in interpreting outliers. An outlier is normally defined as any data point that lies far outside the expected range for a value. The most consequential, often costly outliers are called "black swans," after Nassim Nicholas Taleb popularized the expression in his 2007 book "The Black Swan." Taleb defines black swans as rare, high-impact events that seem improbable and unforeseeable but, in hindsight, are explainable, and he is a strong proponent of keeping an eye on data sets that could produce such events.
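The text does not prescribe a detection method, but a common statistical rule for flagging such outliers is Tukey's fences, based on the interquartile range (IQR). A minimal sketch in Python (the data values are invented):

```python
def iqr_outliers(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule, k=1.5 by convention)."""
    xs = sorted(values)
    n = len(xs)

    def quantile(q):
        # simple linear-interpolation quantile
        pos = q * (n - 1)
        lo, hi = int(pos), min(int(pos) + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]

daily_returns = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -9.5]  # one "black swan"
print(iqr_outliers(daily_returns))  # [-9.5]
```

Note the irony Taleb highlights: a rule fitted to past data can only flag extremes relative to what the data has already seen, which is precisely why true black swans escape it.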

Another bias lies in who has access to the data. Does Big Data reside in the hands of a few because large funds are required to gain access to it? And do individuals always have access to the data being collected about them? For example, teachers dismissed in New York on the basis of Big Data could not gain access to the data about them or to its source. Will this data be used unjustly against someone, or will artificial intelligence paired with data be used to harm or control others? Finally, the most dangerous bias lies in the application of AI to big data; this may carry the greatest risk. As Black Mirror creator Charlie Brooker says, "It's not a technology problem we have. It's a human one." Sociologist Zeynep Tufekci echoed this when she said: "We cannot outsource our moral responsibilities to machines. Hold on to human values and human ethics."

This raises the vital consideration of accountability. How do we establish accountability to ensure Big Data is accessed and used in a beneficial manner? The accountability is also on us. We need to think about how Big Data might affect us individually, and where our responsibility lies as collectors of data or as subjects of data. As subjects of data, we may become victims if we are unaware that whatever we put out there can be used 'for' or 'against' us. We are creating and updating our social and professional profiles with every action we take, passively or actively, in our daily lives. The larger picture, however, is that Big Data is here to stay. Big Brother is watching us constantly for business, civic, or legal reasons. In this context, we can no longer hide any aspect of our lives. Our past, present, and potential future are out there, whether we like it or not, or choose to ignore it. Therefore, it is important to be aware of the choices we make, and to become conscious and vigilant in our activities. As CEOs are taught in public relations, it is important to consider how our actions or words would appear as a New York Times headline. Big Data makes this question applicable to the everyday person.

Fig 3. Pals, Leon. (2010).

This has led to increasing paranoia about "Big Brother" having unprecedented access to every facet of our lives; the only way to opt out of this surveillance would be to stop using the many forms of technology required to be a part of our society.

However, as big data and the science surrounding it continue to advance at a rapid rate, it will be difficult for legislators to enact policy that effectively regulates its use. Because many policy makers do not come from backgrounds that allow them to understand these intelligent systems, it is paramount that the designers of such systems hold themselves to a high ethical standard to ensure that these technologies benefit our society. Very little is understood about the ethical implications underpinning the Big Data phenomenon. There are also significant questions of truth, control, and power in Big Data studies: researchers have the tools and the access, while social media users as a whole do not. Their data were created in highly context-sensitive spaces, and it is entirely possible that some users would not give permission for their data to be used elsewhere. Many are not aware of the multiplicity of agents and algorithms currently gathering and storing their data for future use. Researchers are rarely in a user's imagined audience. Users are not necessarily aware of all the uses, profits, and other gains that come from the information they have posted. Data may be public (or semi-public), but this does not simplistically equate to full permission for all uses. Big Data researchers rarely acknowledge that there is a considerable difference between being in public (i.e. sitting in a park) and being public (i.e. actively courting attention) (Marwick and Boyd, 2011). Finally, the difficulty and expense of gaining access to Big Data produce a restricted culture of research findings.

The future of AI is something of an unknown. One need look no further than contemporary entertainment media, which foretell an apocalyptic future brought on by nefarious, super-intelligent computers. It is not yet safe to rule out this possibility entirely, as even rudimentary AI systems often learn things or act in ways that baffle their creators. In 2017, an AI program was able to beat masters of shogi and Go, despite predictions that it would not be advanced enough to accomplish this for several years. Artificial intelligence is already being compared to nuclear weapons: a man-made construct with the potential to destroy civilization as we know it. With the increasing automation of our society, and with the massive amounts of data being generated indecipherable to any biological entity, one must wonder what an AI system could learn from it all. That is why the Code of Ethics of the Association for Computing Machinery is critical when developing such technology, specifically the principles below, which establish a standard to keep in mind when working with AI and Big Data. As Trikha suggests, it is the statistician, or what I would call the 'human element,' that is able to weigh all the ethical questions when working with such technology and information.

ACM Code of Ethics:

1.1 Contribute to society and to human well-being, acknowledging that all people are stakeholders in computing.

1.2 Avoid harm.

1.6 Respect privacy.

2.2 Maintain high standards of professional competence, conduct, and ethical practice.

2.7 Foster public awareness and understanding of computing, related technologies, and their consequences.

2.8 Access computing and communication resources only when authorized or when compelled by the public good.

3.1 Ensure that the public good is the central concern during all professional computing work.

In conclusion, Big Data and AI can advance human progress a great deal and provide benefits on many levels, medical and otherwise. However, checks and balances must be in place to ensure they are not misused. The ethical decisions made by computer science professionals could very well determine the future of our civilization. It is important that artificial intelligence systems that learn from big data are specifically designed with ethical considerations in mind. Big Data should therefore not replace the human factor, but work hand in hand with it. The code of ethics is the link back to humanity. In the end, you cannot leave it to the machine; you must have a human consciousness to maintain ethical and just practice in such a complex and fast-growing industry.


References

Basulto, Dominic. (2012, May 3). Is social profiling discrimination? Washington Post, online blog post.

Béranger, Jérôme. (2016). Ethics in Big Data: The medical datasphere. Oxford, UK: ISTE Press / Elsevier.

Blanding, Michael. (2016, February 17). Man vs. machine: Which makes better hires? Working Knowledge, Harvard Business School.

Chui, M., Harrysson, M., Manyika, J., Roberts, R., Chung, R., Nel, P., and van Heteren, A. (2018, November). Applying artificial intelligence for social good. McKinsey Global Institute.

Collmann, Jeff, and Matei, Sorin Adam. (2016). Ethical reasoning in big data: An exploratory analysis. Switzerland: Springer.

Davis, Kord. (2012). Ethics of Big Data. Sebastopol, CA: O'Reilly.

Gladwell, Malcolm. (2008). Outliers: The story of success. New York: Little, Brown and Co.

Harris, L., Lee, V. K., Thompson, E. H., and Kranton, R. (2016). Exploring the generalization process from past behavior to predicting future behavior. Journal of Behavioral Decision Making, 29, 419–436. doi:10.1002/bdm.1889.

Lane, Julia, Stodden, Victoria, Bender, Stefan, and Nissenbaum, Helen. (2014). Privacy, big data, and the public good: Frameworks for engagement. New York, NY: Cambridge University Press.

Marwick, A., and Boyd, D. (2010). I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience. New Media and Society.

Marwick, A., and Boyd, D. (2011). To see and be seen: Celebrity practice on Twitter. Convergence: The International Journal of Research into New Media Technologies, 17, 139–158.

Miller, Patrick. (2017, December 20). Black Mirror effect: Human problems, not technology problems. Medium.

Orwell, George. (1992). Nineteen eighty-four. London: David Campbell Publishers Ltd.

Pals, Leon. (2010, July 29). Do you know who's watching you? The Next Web.

Schmarzo, Bill. (2013). Big Data: Understanding how data powers big business. Indianapolis, IN: John Wiley & Sons.

Taleb, N. N. (2007). The Black Swan: The impact of the highly improbable. New York: Random House.

Trikha, Ritika. (2015, July 20). The risky eclipse of statisticians. HackerRank Blog. Retrieved April 1, 2019.

Trott, Dave. (2018, February 15). A view from Dave Trott: Big Data is blind faith. Campaign, UK.

Tufekci, Zeynep. (2016, June). Machine intelligence makes human morals more important. TED Talk.

United States. Executive Office of the President. (2014). Big Data: Seizing opportunities, preserving values. Washington: White House, Executive Office of the President.

Whittaker, M., et al. (2018, December). AI Now Report 2018. AI Now Institute.

Source: Artificial Intelligence on Medium
