Blog: The significance of “edge cases” and the cost of imperfection as it pertains to AI adoption
With the advent of new tools and technologies, it’s tempting to think that the rules of work have changed and that old problems can be forgotten. Sometimes that is true, but as we adopt new technology, we also see new manifestations of ancient problems.
“For the want of a nail the shoe was lost,
For the want of a shoe the horse was lost,
For the want of a horse the rider was lost,
For the want of a rider the battle was lost,
For the want of a battle the kingdom was lost,
And all for the want of a horseshoe nail.”
— Benjamin Franklin (Poor Richard’s Almanack)
The above poem uses a dated reference, since modern militaries rarely rely on horseshoe nails. The spirit of the poem, however, remains true.
Small problems often have large effects.
We see this in many applications of AI. The degree to which this applies informs which AI applications can have widespread adoption, and which remain experimental.
AI has broad capabilities, with varying levels of adoption. Each application of AI inevitably encounters scenarios in which the systems do not perform as required or as expected. We call these scenarios “edge cases,” as illustrated below.
Friend or Food?
Deep learning and convolutional neural networks are two AI techniques used extensively to perform image classification.
Humans can look at a picture and instantly identify its subject. Machines don’t fare too badly — in fact, Google’s “Show and Tell” algorithm can caption an image with over 93% accuracy!
Image classification failures are often amusing. Below we have some images that were classified as sloths, but if you look closely, you will notice some of them are pastries.
And here we have some difficulty distinguishing between Chihuahuas and blueberry muffins.
If you’re still not hungry, have a look at these cute dogs juxtaposed with fried chicken.
These failures are harmless and even somewhat entertaining. Even with a failure rate of 7%, where roughly 1 in 14 images is misclassified, these algorithms remain useful, and their failures don’t cause real damage.
A self-driving ton of high-speed metal
Self-driving vehicles have demonstrated low failure rates. Human drivers are involved in about 4.2 crashes per million miles driven, on average. Waymo cars — self-driving cars developed by Google — were involved in just over 30 minor crashes (most of which resulted from edge case failures) after driving about 5 million miles. That works out to roughly six incidents per million miles: the same order of magnitude as human drivers.
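The comparison above is just a rate calculation. A quick sketch, using the approximate figures cited in the text:

```python
# Back-of-the-envelope comparison of the crash rates cited above.
# Figures are the approximate ones from the text, not precise statistics.
human_crashes_per_million_miles = 4.2
waymo_crashes = 30         # minor incidents reported
waymo_miles_millions = 5   # approximate miles driven, in millions

waymo_crashes_per_million_miles = waymo_crashes / waymo_miles_millions
print(waymo_crashes_per_million_miles)  # 6.0, vs. 4.2 for human drivers
```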
Why then don’t we have autonomous vehicles in widespread use? The economic case is strong, but safety remains a concern.
We hold AI-driven robotic systems to a different standard because these edge cases manifest problems in the physical world, where real damage can be done.
Autonomous vehicles, and nearly all robotics, operate as components in more complex systems. A rogue autonomous freight truck doesn’t only destroy its payload; it can potentially plow through a busy intersection, disregarding crosswalks and other traffic.
These “life and death” examples illustrate how edge cases can limit, or even jeopardize, the potential of AI technologies.
Systems of systems
The term “reliability engineering” sheds light on the significance of edge cases in any complex system. Reliability engineering focuses on the costs of failure: system downtime and the rates of mission success and mission failure.
The simplest case is a device that operates independently, not as a subcomponent of a more complex system. Examples include generating search results and classifying images.
Complex systems have more elaborate missions, with more interacting components. Determining the reliability of a complex system requires summing the failure rates of all the subsystems involved in the mission. From the combined failure rate, we can compute the Mean Time Between Failures (MTBF) — that is, the average “up-time” between two failures of a system during operation.
Imagine 50 identical warehouse robots (each with an individual mission failure rate of 3%), each performing 40 missions per 24-hour day. Each robot can then expect 40 × 3% = 1.2 failures per day, giving an individual MTBF of 20 hours. These warehouse robots are more reliable than our image classification algorithm (with its 7% mission failure rate) and may be deemed “good enough”.
However, when these 50 devices work together as part of a larger warehouse system, the failure rates add up: if each robot fails roughly once every 20 hours, the fleet of 50 suffers a failure about every 20 hours ÷ 50 = 24 minutes. The warehouse system as a whole has an MTBF of only 24 minutes!
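The arithmetic can be sketched in a few lines. The numbers are the illustrative ones from the text, not measurements from a real deployment:

```python
# Series-system MTBF arithmetic for the hypothetical warehouse fleet.
HOURS_PER_DAY = 24
num_robots = 50
missions_per_day = 40
mission_failure_rate = 0.03  # probability a single mission fails

# Expected failures per robot per day: 40 * 0.03 = 1.2
failures_per_robot_per_day = missions_per_day * mission_failure_rate

# MTBF of one robot: 24 h / 1.2 failures per day = 20 hours
robot_mtbf_hours = HOURS_PER_DAY / failures_per_robot_per_day

# In a series system, failure rates add, so the fleet fails 50x as often.
system_mtbf_minutes = robot_mtbf_hours / num_robots * 60

print(robot_mtbf_hours)     # 20.0 hours
print(system_mtbf_minutes)  # 24.0 minutes
```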
Furthermore, these AI-driven robots are likely to be working together, interdependently. We cannot ignore the effects of interdependent cascading failure, whereby a single failure is very likely to cause interdependent processes to fail as well, just as in the case of our horseshoe nail poem.
Seemingly low occurrences of these edge case failures can quickly result in significant negative outcomes. Single digit failure rates can result in maintenance and intervention costs which outweigh any benefit that is brought about by the introduction of AI technologies.
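The compounding effect of interdependence can be made concrete with a toy calculation. The numbers here are hypothetical, chosen only to show how per-step reliability decays along a chain of dependent processes:

```python
# Hypothetical chain of interdependent processes: the chain succeeds
# only if every step succeeds, so per-step reliabilities multiply.
per_step_success = 0.99  # assumed: each step is 99% reliable
steps = 10               # assumed: 10 dependent steps

chain_success = per_step_success ** steps
print(round(chain_success, 3))  # 0.904: ten 99%-reliable steps are only ~90% reliable together
```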
Just another tool?
Historically, tools have enhanced human labor. An ax without someone to swing it is not useful. An automated assembly line provides much more leverage than the ax, but it will still invariably contain processes that require human laborers.
Why should AI be thought of as an exception?
A human may be able to classify thousands of images per day, unassisted. With AI classification tools, the same human can focus on resolving edge cases, effectively classifying hundreds of thousands of images per day.
Without a human in the loop to address edge cases, any AI system may not be economically useful, and may even be dangerous.
Why not accept AI tools as they are? Not as an all-purpose solution to tackle any tedious cognitive tasks, but as another tool in our toolbox, to enhance human effort.
This is what is playing out in most domains. In aviation, for example, autopilots do a great deal of the work involved in flying, but pilots remain on board.
Amazon’s Mechanical Turk crowdsourcing platform creates a marketplace where “human intelligence tasks” are posted, so people can get paid to perform tasks that computers are currently unable to do. Tasks include data verification, labeling objects in videos, and podcast transcription.
Innovative startups, such as Phantom Auto, Ottopia, and Cognicept Systems are developing systems to allow autonomous systems, such as self-driving vehicles, to request human assistance when edge cases are encountered.
The many dire predictions of humans being supplanted by machine intelligence and machine labor may be premature.
The economist John Maynard Keynes coined the term “technological unemployment” in 1930, defining it as “unemployment due to our discovery of means of economising the use of labour outrunning the pace at which we can find new uses for labour” (Keynes, Economic Possibilities for Our Grandchildren, 1930).
There is no question that the nature of life and work has changed drastically as a result of technological advances, but humans always find new ways to be productive.
Is it really different this time? Edge cases may very well save humanity from unemployment once again.