Blog: Opinion | Artificial Intelligence cannot yet pass a high school math test – Livemint
We are bombarded almost daily with news about how much smarter Artificial Intelligence (AI) has become, and how it is being applied across a variety of potential use cases. For instance, the South American country of Ecuador has built a camera surveillance system named ECU 911 with Chinese help. Initially funded by a $240 million Chinese loan in 2012, the system has a countrywide network of 4,300 surveillance cameras, 16 regional response centres and over 3,000 government employees diligently watching video footage and responding to millions of distress calls made to the Ecuador emergency management 911 phone number every year.
ECU 911 has been credited with cutting crime and saving lives after natural disasters in a country long troubled by such issues, and so is portrayed by China as a showcase for its technological prowess and humanitarian impulses. But experts fear that ECU 911’s use of technologies such as facial recognition could normalize the sort of intrusive surveillance that is becoming increasingly common in China. More on that issue in a future column.
Meanwhile, although its advances are many, to my mind AI researchers have not really scaled its capabilities beyond increasingly sophisticated algorithms that perform the intellectually simple task of pattern recognition, whether by surveying and matching face scans or by reading medical images. With the advances in “machine learning” and “neural models”, a computer program can refine its original algorithms to become more efficient over time as it is presented with new data. These neural and machine learning programs are built to act without additional aid from computer programmers. It is this aspect of AI that frightens us the most, since the computer appears to take over tasks without the need for human intervention.
The problem, though, is that these algorithms are fixed, and no amount of machine learning-induced fine-tuning or neural modelling can change the base nature of analytical pattern recognition. A completely new algorithm would have to be written for the AI program to work on anything else, even if the new problem is similar to the old one. This is related to a phenomenon called “catastrophic forgetting”, in which a network trained on a new task loses what it had learned on an earlier one, and which I have covered before in this column.
Even so, most of us would reasonably expect, with all the hype and claims surrounding AI, that it has already reached the level of basic computational intelligence and learning that a high school student would have for similar computational and mathematical tasks.
Not so. A Google DeepMind paper published on 2 April 2019 titled Analysing Mathematical Reasoning Abilities Of Neural Models by David Saxton et al examines how the DeepMind team tried to pit its “neural network” against a high school mathematics test. The result? Well, DeepMind’s AI failed. The model it had built could only solve about 14 questions on a test designed with a total of 40 questions. This translates to 35%, which is a failing grade for students in almost any schooling system. This sort of score from a human being with average scholastic capabilities can be expected only under a few circumstances: lack of preparation, or anxiety that causes “catastrophic forgetting” at a human level.
The data that DeepMind used was based on the UK’s national school mathematics curriculums, which are very similar to India’s school level tests. The data covered algebra, arithmetic, calculus, comparisons, measurement, numbers, manipulating polynomials and probability. DeepMind first collected a data set consisting of different types of mathematics problems. Rather than crowd-sourcing (or outsourcing) to allow for manual identification of elements within the data set, DeepMind synthesized the data set to generate a larger number of training examples, to control the difficulty level and to reduce training time. The team used a free-form text format to ensure, for example, that tree diagram or graph-type questions could be accommodated in the data used to train its test-taking programme.
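To make the idea of a synthesized, free-form text data set concrete, here is a minimal sketch in Python. It is not DeepMind's generator: the function names, the single question template and the `max_coeff` difficulty knob are all assumptions made for illustration. It only shows how question and answer pairs can be produced as plain strings, in any quantity, with a dial for difficulty.

```python
import random


def make_linear_question(rng, max_coeff=9):
    """Return one (question, answer) pair as plain-text strings.

    Hypothetical template: 'Solve a*x + b = c for x.' with an integer
    solution. max_coeff is a crude difficulty knob: larger values give
    larger numbers to work with.
    """
    a = rng.randint(1, max_coeff)           # non-zero coefficient of x
    x = rng.randint(-max_coeff, max_coeff)  # the intended answer
    b = rng.randint(-max_coeff, max_coeff)
    c = a * x + b                           # right-hand side that makes x correct
    sign = "+" if b >= 0 else "-"
    question = f"Solve {a}*x {sign} {abs(b)} = {c} for x."
    return question, str(x)


def make_dataset(n, seed=0, max_coeff=9):
    """Generate n question/answer pairs, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [make_linear_question(rng, max_coeff) for _ in range(n)]


if __name__ == "__main__":
    for question, answer in make_dataset(3, seed=1):
        print(question, "->", answer)
```

Because both the question and the answer are just text, the same format could in principle hold any question type that can be written down in words, which is the point the researchers make about free-form input.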
Analyses on blogs and in articles online have pointed out several issues with DeepMind’s attempt at a school final exam. It appears that AI cannot yet mimic the cognitive skills that humans use to solve simple mathematics questions that involve substitution.
The original researchers describe the skills in which the model falls short as follows: (a) planning, for example, identifying the correct order in which to solve the functions in a mathematical equation; (b) categorizing the characters into entities such as words, which determine the question, and numbers, arithmetic operators and variables (these last three categories can together form mathematical functions); (c) exploiting working memory to store the intermediate values that are arrived at during the steps taken to solve an equation; (d) using sub-algorithms such as addition and multiplication for function composition; and (e) generally applying a test-taking student’s classroom-acquired knowledge of rules, transformations, processes and axioms.
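Skills (a) to (d) are easy to take for granted, so here is a small hand-written sketch in Python of what they amount to for plain arithmetic. It is an illustration only, not anything from the DeepMind paper: the names `tokenize` and `evaluate` are my own, and the sketch assumes non-negative numbers, the four basic operators and parentheses (no unary minus). Every rule in it is spelt out by a programmer, which is exactly the knowledge a neural model has to infer from examples.

```python
import operator
import re

# (d) Sub-algorithms: each operator symbol maps to (precedence, function).
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def tokenize(text):
    """(b) Group raw characters into numbers, operators and parentheses."""
    return re.findall(r"\d+\.?\d*|[-+*/()]", text)


def evaluate(text):
    """Evaluate an arithmetic expression; return (value, list of steps)."""
    values = []   # (c) working memory for intermediate values
    pending = []  # operators and '(' waiting to be applied
    trace = []    # human-readable record of each step taken

    def apply_top():
        symbol = pending.pop()
        right, left = values.pop(), values.pop()
        result = OPS[symbol][1](left, right)
        trace.append(f"{left} {symbol} {right} = {result}")
        values.append(result)

    for tok in tokenize(text):
        if tok in OPS:
            # (a) Planning: apply waiting operators of equal or higher
            # precedence before queueing this one.
            while (pending and pending[-1] in OPS
                   and OPS[pending[-1]][0] >= OPS[tok][0]):
                apply_top()
            pending.append(tok)
        elif tok == "(":
            pending.append(tok)
        elif tok == ")":
            while pending[-1] != "(":
                apply_top()
            pending.pop()  # discard the matching "("
        else:
            values.append(float(tok) if "." in tok else int(tok))
    while pending:
        apply_top()
    return values[0], trace


if __name__ == "__main__":
    value, steps = evaluate("2 + 3 * (4 - 1)")
    print(value)   # 11
    for step in steps:
        print(step)
```

For the input `2 + 3 * (4 - 1)` the recorded steps are `4 - 1 = 3`, `3 * 3 = 9` and `2 + 9 = 11`: the bracket first, then the multiplication, then the addition, with each intermediate value held until it is needed.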
It turns out then, according to the research and analyses, that addressing even a simple mathematics problem involves a great deal of brainpower as people learn to automatically make sense of mathematical operations, memorize the order in which to perform them, and know how to turn word problems into equations. DeepMind acknowledges that it hadn’t corrected for linguistic variation or complexity in its data set and had not extended it for problems that include visual reasoning in areas such as geometry.
We can rest easy for now in the knowledge that AI is built only to pore over manually-labelled data, scanning for patterns and analysing them. That in itself is scary enough. As far as the math test goes, better luck next time!
Siddharth Pai is founder of Siana Capital, a venture fund management company focused on deep science and tech in India.