Blog: What Is AI Bias, and How Does It Affect AI Systems?
The trouble with bias in AI starts with its definition: bias is a broad term, and its meaning varies with the context (domain).
So now, let’s come to the point: what is AI bias?
AI systems are only as good as the data we put into them. But what if the AI algorithm is trained on bad data containing implicit racial, gender, or ideological biases?
For example, say you’re training an image recognition system to identify the presidents of any country. The historical data reveals a pattern of males, so the algorithm concludes that only males are presidents. It won’t recognise a female in that role, even though it’s a probable outcome in future elections.
It’s important to remember, though, that AI can cement misconceptions in how a task is addressed, or magnify existing inequalities. This can happen even when no one told the algorithm explicitly to treat anyone differently.
Now, how do we deal with data-driven bias early on, at the data engineering stage?
Collecting the data. There are two main ways that bias shows up in training data: either the data you collect is unrepresentative of reality, or it reflects existing prejudices. The first case might occur, for example, if a deep-learning algorithm is fed more photos of men than of women. The resulting image recognition system would inevitably be worse at recognising female presidents. The difference is not because male presidents are easier to classify than female ones. Rather, algorithms are typically trained on a large collection of data that’s not as diverse as the overall population of male and female presidents.
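One rough way to spot unrepresentative data before training is to compare each group’s share of the dataset with the share you expect in the real world. The sketch below is only an illustration: the function name `representation_gap`, the 90/10 split, and the 50/50 reference distribution are all assumptions made up for this example, not figures from any real dataset.

```python
from collections import Counter


def representation_gap(labels, reference_shares):
    """Return (observed share - reference share) for each group."""
    counts = Counter(labels)
    total = len(labels)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical training set: 90 photos of men, 10 photos of women.
train_labels = ["male"] * 90 + ["female"] * 10

# Assumed target distribution the model is expected to serve.
reference = {"male": 0.5, "female": 0.5}

gaps = representation_gap(train_labels, reference)
for group, gap in gaps.items():
    # A positive gap means the group is over-represented.
    print(f"{group}: {gap:+.2f}")
```

A large positive or negative gap is a signal to collect more data for the under-represented group, or to re-weight examples, before any model is trained.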
Preparing the data. Finally, it is possible to introduce bias during the data preparation stage, which involves selecting which attributes you want the algorithm to consider. Choosing which attributes to consider or ignore can significantly influence your model’s prediction accuracy. But while the impact of that choice on accuracy is easy to measure, its impact on the model’s bias is not.
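A small worked example shows why a single accuracy number can hide bias. The sketch below breaks accuracy down by group; the records are invented for illustration, and the helper names `accuracy` and `accuracy_by_group` are assumptions of this example rather than part of any library.

```python
def accuracy(pairs):
    """Fraction of (true label, predicted label) pairs that match."""
    return sum(1 for y_true, y_pred in pairs if y_true == y_pred) / len(pairs)


def accuracy_by_group(records):
    """Compute accuracy separately for each group in (group, y_true, y_pred)."""
    groups = {}
    for group, y_true, y_pred in records:
        groups.setdefault(group, []).append((y_true, y_pred))
    return {group: accuracy(pairs) for group, pairs in groups.items()}


# Hypothetical evaluation set: (group, true label, predicted label).
records = (
    [("male", 1, 1)] * 85 + [("male", 1, 0)] * 5
    + [("female", 1, 1)] * 4 + [("female", 1, 0)] * 6
)

overall = accuracy([(y_true, y_pred) for _, y_true, y_pred in records])
per_group = accuracy_by_group(records)

print(f"overall: {overall:.2f}")
for group, value in per_group.items():
    print(f"{group}: {value:.2f}")
```

Here the overall figure looks healthy because the larger group dominates it, while the smaller group is served far worse. Reporting the per-group breakdown alongside the headline number is one simple way to make that gap visible.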
Does ethical data preprocessing guarantee that bias is completely mitigated in an AI system?
The answer is no. Even after following best practices during data preprocessing, AI systems are prone to becoming biased from various unexpected sources, because handling bias in an AI system differs from domain to domain and with the type of data we deal with.
What are these unexpected sources of bias in artificial intelligence? Let’s discuss them now.
Bias through interaction
While some systems learn by looking at a set of examples in bulk, other sorts of systems learn through interaction. Bias arises based on the biases of the users driving the interaction. A clear example of this bias is Microsoft’s Tay, a Twitter-based chatbot designed to learn from its interactions with users. Unfortunately, Tay was influenced by a user community that taught Tay to be racist and misogynistic. In essence, the community repeatedly tweeted offensive statements at Tay and the system used those statements as grist for later responses.
Tay lived a mere 24 hours, shut down by Microsoft after it had become a fairly aggressive racist. While the racist rants of Tay were limited to the Twitter-sphere, they are indicative of potential real-world implications. As we build intelligent systems that make decisions with and learn from human partners, the same sort of bad training problem can arise in more problematic circumstances.
Tay in most cases was only repeating other users’ inflammatory statements, but the nature of AI means that it learns from those interactions. It’s therefore somewhat surprising that Microsoft didn’t factor in the Twitter community’s fondness for hijacking brands’ well-meaning attempts at engagement when writing Tay.
Eventually, though, even Tay seemed to start to tire of the high jinks.
What if we were to, instead, partner intelligent systems with people who will mentor them over time? Consider our distrust of machines to make decisions about who gets a loan or even who gets paroled. What Tay taught us is that such systems will learn the biases of their surroundings and people, for better or worse, reflecting the opinions of the people who train them.
Sometimes, decisions made by systems aimed at personalization will end up creating bias “bubbles” around us. We need look no further than the current state of Facebook to see this bias at play. At the top layer, Facebook users see the posts of their friends and can share information with them.
Unfortunately, any algorithm that uses analysis of a data feed to then present other content will provide content that matches the idea set that a user has already seen. This effect is amplified as users open, like and share content. The result is a flow of information that is skewed toward a user’s existing belief set.
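The feedback loop described here can be sketched with a toy simulation. This is not how Facebook or any real feed works; it is a minimal model under stated assumptions: two hypothetical topics, a user who clicks content they agree with slightly more often, and a feed that nudges each topic’s share toward its share of clicks. The function `simulate_feed`, the click rates, and the learning rate are all invented for illustration.

```python
def simulate_feed(initial_shares, click_rates, rounds, learning_rate=0.5):
    """Move each topic's feed share toward its share of expected clicks."""
    shares = dict(initial_shares)
    history = [dict(shares)]
    for _ in range(rounds):
        # Expected clicks = how often a topic is shown * how often it is clicked.
        clicks = {topic: shares[topic] * click_rates[topic] for topic in shares}
        total_clicks = sum(clicks.values())
        for topic in shares:
            click_share = clicks[topic] / total_clicks
            shares[topic] = (
                (1 - learning_rate) * shares[topic] + learning_rate * click_share
            )
        history.append(dict(shares))
    return history


# The feed starts balanced; the user clicks agreeable content a bit more often.
start = {"agrees_with_user": 0.5, "challenges_user": 0.5}
rates = {"agrees_with_user": 0.6, "challenges_user": 0.4}

history = simulate_feed(start, rates, rounds=30)
for step in (0, 10, 20, 30):
    print(step, {topic: round(share, 3) for topic, share in history[step].items()})
```

Even with a modest difference in click rates, the share of content that challenges the user shrinks round after round, because each round’s output becomes the next round’s input.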
While it is certainly personalized, and often reassuring, it is no longer what we would tend to think of as news. It is a bubble of information that is an algorithmic version of “confirmation bias.” Users don’t have to shield themselves from information that conflicts with their beliefs because the system is automatically doing it for them.
The impact of these information biases on the world of news is troubling. But as we look to social media models as a way to support decision making in the enterprise, systems that support the emergence of information bubbles have the potential to skew our thinking. A knowledge worker who only gets information from people who think like him or her will never see contrasting points of view and will tend to ignore and deny alternatives.
Similarity bias
Sometimes the bias is simply the product of systems doing what they were designed to do. Google News, for example, is designed to provide stories that match user queries with a set of related stories. This is explicitly what it was designed to do and it does it well. Of course, the result is a set of similar stories that tend to confirm and corroborate each other. That is, they define a bubble of information that is similar to the personalization bubble associated with Facebook.
This model highlights certain issues related to the role of news and its dissemination, the most apparent one being the need for a balanced approach to information. The lack of “editorial control” extends across a wide range of situations. While similarity is a powerful metric in the world of information, it is by no means the only one. Different points of view provide powerful support for decision making. Information systems that only provide results “similar to” either queries or existing documents create a bubble of their own.
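To make the contrast between “most similar” and “similar but varied” concrete, here is a small sketch of one well-known re-ranking idea, maximal marginal relevance (MMR), which trades relevance to the query against redundancy with results already chosen. Nothing in this post says that Google News or any other product works this way; the word-overlap similarity, the sample headlines, and the function names are all assumptions picked to keep the example tiny.

```python
def jaccard(a, b):
    """Word-overlap similarity between two strings (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def mmr_rank(query, docs, k, trade_off=0.5):
    """Pick k docs, balancing query relevance against redundancy."""
    selected = []
    candidates = list(docs)
    while candidates and len(selected) < k:

        def score(doc):
            relevance = jaccard(query, doc)
            # Redundancy = similarity to the closest already-selected doc.
            redundancy = max((jaccard(doc, s) for s in selected), default=0.0)
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical headlines, invented for this example.
query = "election result"
docs = [
    "election result announced today",
    "election result announced tonight",
    "critics dispute election result",
]

similar_only = mmr_rank(query, docs, k=2, trade_off=1.0)  # pure similarity
diversified = mmr_rank(query, docs, k=2, trade_off=0.5)   # relevance vs. novelty
print(similar_only)
print(diversified)
```

With `trade_off=1.0` the ranking considers similarity alone and returns two near-duplicate headlines. Lowering it penalises results that repeat what has already been shown, so the dissenting headline makes it into the top two.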
Similarity bias is one that tends to be accepted, even though contrasting, opposing and even conflicting points of view support innovation and creativity, particularly in the enterprise.
According to the source of Trump’s information regarding Google News’s perceived bias, CNN had a disproportionate number of the articles returned when searching for “Trump”: nearly 29% of the total. In fact, by that source’s count, left-leaning sites comprised 96% of the total results.
Conflicting goals bias
Sometimes systems that are designed for very specific business purposes end up having biases that are real but completely unforeseen.
Imagine a system, for example, that is designed to serve up job descriptions to potential candidates. The system generates revenue when users click on job descriptions. So naturally, the algorithm’s goal is to provide the job descriptions that get the highest number of clicks.
As it turns out, people tend to click on jobs that fit their self-view, and that view can be reinforced in the direction of a stereotype by simply presenting it. For example, women presented with jobs labelled as “Nursing” rather than “Medical Technician” will tend toward the first. Not because the jobs are best for them but because they are reminded of the stereotype, and then align themselves with it.
The impact of stereotype threat on behaviour is such that the presentation of jobs that fit an individual’s knowledge of a stereotype associated with them (e.g. gender, race, ethnicity) leads to greater clicks. As a result, any site that has a learning component based on click-through behaviour will tend to drift in the direction of presenting opportunities that reinforce stereotypes.
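A short sketch shows how little it takes for a click-maximising system to end up here. The click-through rates below are made up, and `greedy_exposure` is a hypothetical helper, not code from any real job site; the only assumption carried over from the text is that the system shows whatever earns the most clicks.

```python
def greedy_exposure(observed_ctr, impressions):
    """Give every impression to the listing with the highest observed CTR."""
    exposure = {}
    for group, rates in observed_ctr.items():
        top_job = max(rates, key=rates.get)
        exposure[group] = {
            job: (impressions if job == top_job else 0) for job in rates
        }
    return exposure


# Hypothetical click-through rates: the stereotype-aligned label has only
# a slightly higher rate for each group.
observed_ctr = {
    "women": {"Nursing": 0.11, "Medical Technician": 0.10},
    "men": {"Nursing": 0.10, "Medical Technician": 0.11},
}

exposure = greedy_exposure(observed_ctr, impressions=1000)
for group, shown in exposure.items():
    print(group, shown)
```

A one-point difference in click rate turns into an all-or-nothing difference in what each group gets to see. Since people can only click on what they are shown, the next round of click data confirms the skew rather than correcting it.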
“Machine bias is made by humans.”
In an ideal world, intelligent systems and their algorithms would be objective. Unfortunately, these systems are built by us and, as a result, end up reflecting our biases. By understanding the biases themselves and the sources of the problems, we can actively design systems to avoid them.
Perhaps we will never be able to create systems and tools that are perfectly objective, but at least they will be less biased than we are.
The next post in this series will concentrate on practical ways of building ethical AI systems in different domains, focusing on real-world problems.
Thank you for reading this story; I hope you enjoyed it. Any suggestions or comments are welcome.