Blog: How to Build an Artificial General Intelligence
What is Artificial General Intelligence?
Before we can try to create an Artificial General Intelligence (I’ll call it AGI from now on), we have to figure out what it is. Let’s start with some basic definitions.
Wikipedia says that an AGI is “the intelligence of a machine that could successfully perform any intellectual task that a human being can”¹.
And intelligence could be described as “the ability to perceive or infer information, and to retain it as knowledge to be applied towards adaptive behaviors within an environment or context”².
You may or may not agree with these definitions, but they are good enough for our purposes here.
If we want to create a system that can solve all the problems on Earth and beyond, it needs to be able to gain knowledge and to use it to increase its ability to solve other problems, and thus, become more and more intelligent in the process.
What’s the difference between AGI and the kind of AI that we have today? An AGI surpasses Artificial Narrow Intelligence (ANI), which is what current AI is, in the number of problems it can solve and in its ability to transfer knowledge from one task or domain to another.
If you don’t know what any of these terms mean or what I am talking about, I recommend reading this amazing series of posts by Tim Urban explaining the different kinds of AI and what it could mean for the human race to develop a fully fledged AGI.
Solving All The Problems In The World
Any problem fits into one of these categories:
- Problems we have already solved as a species
- Problems that we are trying to solve
- Problems that we’ll try to solve in the future once we have the capabilities or realize that they exist
- Problems that we will never even know exist (hint: this is the largest bucket)
Keeping this in mind, let’s see how it relates to our efforts to figure out what AGI is. Let’s start by calling the set of problems that our AGI can solve right now the “Space of Solvable Problems”; it will be a subset of the problems we are trying to solve.
Our AGI should be able to work on this initially small group of challenges, and by doing so, increase the size of the set of problems it can solve.
This Space of Solvable Problems currently corresponds to every task that can be mastered with Machine Learning today.
Expanding The Space of Solvable Problems
So, instead of thinking of AGI as an algorithm that, once turned on, devours all the information on the Internet in a matter of days or hours and becomes so intelligent that it knows everything and leaves our intellect far behind, it makes much more sense to think of it as a piece of software that iterates over a given set of problems, progressively increasing its ability to solve both the problems at hand and further ones down the road, until it can solve every problem we face on this planet.
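The iterative picture above can be sketched as a toy loop: the system repeatedly attempts problems, and each solved problem can unlock further ones. This is purely illustrative; all names and the prerequisite structure below are my own assumptions, not a real algorithm.

```python
# A toy sketch of the iterative view of AGI described above: the system
# repeatedly attempts problems, and each solved problem expands the set
# of problems it can attempt next. Everything here is illustrative.

def expand_solvable_space(solvable, frontier, can_solve):
    """Grow the set of solvable problems until no further progress is made.

    solvable  -- problems the system can already solve
    frontier  -- problems it is currently trying to solve
    can_solve -- predicate: is this problem solvable given current knowledge?
    """
    solvable = set(solvable)
    frontier = set(frontier)
    progress = True
    while progress:
        progress = False
        for problem in list(frontier):
            if can_solve(problem, solvable):
                solvable.add(problem)
                frontier.remove(problem)
                progress = True  # new knowledge may unlock further problems
    return solvable

# Example: each "problem" becomes solvable once its prerequisites are solved.
prereqs = {"arithmetic": set(), "algebra": {"arithmetic"},
           "calculus": {"algebra"}, "physics": {"calculus"}}
solved = expand_solvable_space(
    set(), prereqs, lambda p, known: prereqs[p] <= known)
print(sorted(solved))  # ['algebra', 'arithmetic', 'calculus', 'physics']
```

The point of the sketch is the feedback loop: nothing beyond arithmetic is solvable at the start, yet the whole chain eventually falls because each solution enlarges the Space of Solvable Problems.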
How can we even start to think about creating such a system? I believe a good starting point is to look at how we solve problems ourselves.
The key question in our quest to solve intelligence is: what do we do that computers and other animals don’t, that enables us to increase our Space of Solvable Problems over time? Or, to put it more simply, how do we increase our ability to solve complex problems in the long run?
The answer seems to be creative expression, culture and communication. That is, the creation, communication and preservation of useful knowledge derived from experiencing the world.
This may seem very abstract, so let’s try to see what an AI with these characteristics would look like, and how we might start building one using the technologies we have today as a starting point.
Designing an AGI
We’ll go through some of the main problems that are stopping us from building an AGI today, and try to propose a solution that could, in theory, enable us to create the world’s first Strong AI using our current knowledge and technology.
Where We Are Today: Isolated Learners
Modern AI has produced some very important achievements. These AIs are very good at what they do: they are the best players in their game, or very close to the top. But you can’t have a conversation with them, or put them to work on a different task with their current knowledge.
We can, of course, but most people nonetheless specialize and dedicate their working lives to a career that depends on a single set of skills. The difference between us and these examples is that we can cooperate and communicate what we know with each other; that’s what allows us to solve almost any problem as a society. We are demanding that our bots learn various complex tasks by themselves, when we have to spend most of our lives in a single field to become the best at it, if we can do it at all.
The structure of our neocortex (the part of the brain that enables us to think, plan, solve complex problems, understand humor and more) consists of the same “module” repeated over and over again. This could suggest that every task we learn actually needs another neural network, because each task is a unique mapping between the inputs we get from our senses and our actions. The real challenge is knowing when to use a new module/network, and how to connect several of them together. That is an infinitely complex, uncomputable problem. Our brain solves it through social interaction: it organizes the neocortex’s modules into a hierarchy, the higher, the more abstract. And to do so, it mirrors how our societies are built (and our society mirrors it back, because they are manifestations of the same phenomenon, which we call human intelligence).
The best way to understand the relationship between General and Narrow AI is this: each ANI is like a single human, a module in the brain, an individual member of our society, a single neural network that is able to learn a small part of the world. It can learn to master some tasks, if it has the capabilities/architecture to do so, and with enough practice and information.
In contrast, AGI is not something we can code into an AI, or a task we can master, just as what makes us the most powerful species on the face of Earth is not having incredible and amazing individuals mastering and advancing a complex field of research or a complicated task.
No, what makes us different and enables us to do such elaborate and intelligent things in the first place is our ability to share our findings, and to build on the conclusions and knowledge of the other members of our culture and society, be they alive or long dead. What matters is that they managed to share what they discovered, through their work and their actions, in a way that lets us learn it without repeating all the work (or, in the case of AI, computation) that they needed to arrive at their conclusions. This is where true intelligence lies.
In the case of the brain, the challenge resides in how to connect different networks to transfer knowledge between them. This problem is solved in a very straightforward manner: if you can teach it, then you understand it. And if you can teach it to other people, you can teach it to yourself in any new context.
The Feynman Technique is a learning method that relies on this fact: to master a concept, you explain it in plain terms, as if teaching it to someone else, and go back to the source material whenever your explanation breaks down.
In this sense, the human being is not a general problem solver; society is. Our brain is not a perfect machine: it relies on feedback from other brains, and builds its representation of the world, of itself and of its knowledge out of that feedback.
When we, as a global culture, confront a new challenge, we start by trying to wrap our heads around it. We share our opinions on the subject with each other and, over time and collectively, we set up the conditions for the right individual or group of individuals to connect the dots: to create something new that solves it, or to discover a possible way forward that increases the probability of solving it in the future.
So intelligence, the ability to solve any problem, is not a quality that each one of us possesses; it is something that emerges from each person’s effort to master a task and then come back to the community to share what they have found.
We can conclude, then, that if we want to build an AGI, we don’t have to build a perfect algorithm that can solve anything, anywhere, anytime, without any previous knowledge. We have to enable our current ANIs to communicate what they have learned on a specific task, in such a way that other ANIs can use that information to increase their performance on their own task or to solve a difficult problem.
Until we achieve this, our systems will always be limited. Just imagine that suddenly all the people on Earth were unable to communicate in any way. Society would rapidly come to a halt, everything would start to collapse, and we’d become, once again, one species among many on this planet, losing what once gave us the power to control everything around us. That’s where AI is today: unable to get out of its corner of the world, of the dataset on which it is trained, without the ability to share the results of its expensive computations with other learning systems.
Let’s take a look now at what is stopping us from enabling communication between ANIs, and thus the discovery and sharing of knowledge across different tasks.
The Problem Of Transfer Learning: Mastering A Task And Communicating Our Knowledge
Transfer learning is the process by which we are able to use the knowledge gained from mastering one task to improve our performance on another task or domain, or to learn a new skill faster.
In Neural Network terms, we could call the network that has the knowledge the “Master”, and the network that wants to learn from it and apply that knowledge to a similar or different task the “Student”.
Here’s what the problem looks like: the Master’s knowledge is locked up in its weights, and we have no general way of handing it to the Student.
Today, this problem remains unsolved. The best we have achieved so far is training a neural network on a task like recognizing images from a vast and varied dataset, and then applying that same network to a very similar problem.
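This partial solution is usually implemented by reusing a trained network’s early layers as the starting point for a new one. Here is a minimal NumPy sketch of the Master/Student setup under that approach; the layer shapes, names and the networks themselves are illustrative assumptions, not a real trained system.

```python
import numpy as np

# A minimal sketch of the Master/Student setup: the Student reuses the
# Master's first-layer weights (its learned "features") and only gets a
# fresh output head for the new task. All shapes here are illustrative.

rng = np.random.default_rng(0)

# Master network, assumed already trained on task A
# (random weights stand in for the result of that training).
master = {
    "hidden": rng.normal(size=(8, 16)),   # input -> hidden (learned features)
    "out_a":  rng.normal(size=(16, 3)),   # hidden -> task-A outputs
}

def make_student(master_net, n_outputs_b):
    """Student for task B: copy the Master's feature layer, new output head."""
    return {
        "hidden": master_net["hidden"].copy(),        # transferred knowledge
        "out_b":  rng.normal(size=(16, n_outputs_b))  # trained from scratch
    }

def forward(net, x, head):
    h = np.tanh(x @ net["hidden"])  # shared feature extractor
    return h @ net[head]            # task-specific head

student = make_student(master, n_outputs_b=5)
x = rng.normal(size=(4, 8))                # a batch of 4 task-B inputs
print(forward(student, x, "out_b").shape)  # (4, 5)
```

Note what this sketch cannot do, which is exactly the limitation the post describes: the transfer only works because the tasks share an input format and, hopefully, useful low-level features; nothing in the copied weights is communicated or explained, to us or to other networks.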
As we have seen, this problem is one of the main obstacles to achieving AGI, so it’s very important that we go as far as we can in finding a possible solution.
Following the metaphor of AGI as society and ANIs as the individuals within it, let’s try to see how we humans are able to extract and communicate knowledge by watching others and expressing ourselves.
We are able to teach others what we know using one of the most ancient inventions of humankind. It’s so important and fundamental to our view of the world that everything we think we know, or are trying to grasp, lives within one of these things.
I’m talking, of course, about stories.
Once Upon A Time… : Generalization As Finding Patterns Between The Solutions Of Specific Problems
Let’s take a detour from our main topic to briefly explore how stories enable us to communicate what we learn from our experience and the experiences of others.
“The shoe that fits one person pinches another; there is no recipe for living that suits all cases” said Carl Gustav Jung, one of the most influential psychologists of the 20th century.
To me, this means that each one of us is confronted with a different, unique and specific combination of problems. In a sense, every human is like a Narrow or Weak Intelligence. But because we can communicate with other human beings, who are also dealing with the uniquely individual problems they find in their lives, we are able to find what is common between our individual narratives. On that basis, we cooperate to conceptualize, grasp and solve complex problems: we identify the most important elements in each other’s experiences and contribute our unique points of view, skills and strengths towards the solution of the problem at hand, all while learning from each other.
This process is identical to that of telling a compelling story.
This may seem far-fetched or completely unrelated to AGI, but it’s exactly what will enable us to develop it.
Here’s what I’m getting at: through stories, we make our implicit, embodied knowledge of the world explicit. Not directly, not in one stroke, but by repeatedly telling stories about what we see and experience, and then telling stories about those stories. That’s how, over long stretches of time, you end up with a fabric of interconnected narratives, which becomes the foundation, the birthplace, of something like abstract thinking, math, religion or the idea of computers.
By means of storytelling, we encapsulate useful knowledge, which then gets added to the collective corpus of creative works, ideas and mythology, making it available for any human, from that moment onwards, to enjoy and learn from. This is how culture is born and expanded.
If we can enable our machines to “narrate” each other’s experiences and extract the implicit knowledge from the solutions they learn, we won’t just have the best Go player in the world; we’ll have the best player able to communicate its knowledge to us and, most importantly, to other machines. And in the process of communicating that knowledge, this player could also start to gain insight into its own inner workings (maybe the baby steps of self-reflection and consciousness?).
This sounds interesting, but how can we enable our ANIs to tell stories and share their knowledge? And how exactly could this lead to an AGI? Let’s explore it.
Communicating Knowledge: Moving From Events And Experiences To Stories
Let’s come back to our main topic and see how we can try to apply the idea of storytelling to AI.
In our metaphor, ANI is like each one of us and AGI is akin to our culture, but for that to be true we need one last ingredient. A culture is not only the individuals that form it, but the collection of all the works that have ever been produced, shared and considered worthy of preserving.
In our culture, these “works” come in the form of books, movies, stories, series, songs, sculptures, dramatic interpretations and much, much more: basically, any form in which we express ourselves.
Let’s go back to the Transfer Learning problem we mentioned before, and see how we can apply what we have learned about the way we share our knowledge of the world.
We still have the Master and the Student. The Master already knows how to solve a specific task, let’s say playing Go. But this network doesn’t know how to represent its knowledge to itself or how to communicate it to other AIs. It would be useless to try something like copying the network’s weights: that would be like trying to teach a child how to walk by copying and pasting our neural connections from our brain into theirs.
Instead, let’s look at what we do to solve this challenge. If we want to learn from someone, we can watch them in action: we receive not only the input from the environment, but also the action that person takes in response. We may even learn how it felt, what their subjective experience was.
Disclaimer: this is just a sketch of how this could be done today; none of it is tested, and it could be completely wrong. My only purpose in what follows is to illustrate what a system like this could look like if we started building it with our current AI. This part also uses more advanced machine learning concepts, so feel free to skip to the next section if it gets too dense.
If we tried to model this process in a neural network, in a reinforcement learning environment for example, we could take as input a sequence of tuples like (timestep, observation, action, reward), and maybe also add the output of a hidden layer, to capture something like the internal “experience” of the Master network.
We could then feed this data to a Recurrent Neural Network that is part of the Student, which, let’s say, exists in an environment similar to the Master’s. The Student then has to learn to find patterns in the recorded performance of the Master that maximize its reward, while optimizing the weights in the rest of its architecture to learn the task at hand. And if we were to reduce the dimensionality of this representation, as in autoencoders, we could store the resulting dense representation as a “cultural work”: densely packed information that captures what is common across the individual tasks, in a form that can be shared and stored.
This would then perform a function similar to that of a story, in that it condenses a series of experiences and “informed” behavior from different domains into a piece or sequence of information that other agents can use to solve different problems.
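The shape of that pipeline can be sketched in a few lines of NumPy: record a trace of (timestep, observation, action, reward) rows, then reduce it to a small dense summary. A trained autoencoder or RNN encoder would do the compression in practice; here a truncated SVD stands in for it, purely to show the data flow. The trace dimensions and the random data are illustrative assumptions.

```python
import numpy as np

# A rough sketch of the trace-compression idea above: a Master agent's
# episode is recorded as (timestep, observation, action, reward) rows,
# then reduced to a small dense representation, the "cultural work".
# A truncated SVD stands in for a trained autoencoder here.

rng = np.random.default_rng(1)

# One recorded episode: 50 timesteps, 6-dim observation, 1 action, 1 reward.
trace = np.column_stack([
    np.arange(50),                 # timestep
    rng.normal(size=(50, 6)),      # observations (stand-in data)
    rng.integers(0, 4, size=50),   # discrete actions
    rng.normal(size=50),           # rewards
])                                 # resulting shape: (50, 9)

def compress_trace(trace, k=3):
    """Dense summary of a trace: its top-k SVD components after centering."""
    centered = trace - trace.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return s[:k], vt[:k]           # k singular values + k directions

values, directions = compress_trace(trace)
print(values.shape, directions.shape)  # (3,) (3, 9)
```

The output is tiny compared to the raw episode, which is the point: a compact artifact that can be stored in a shared collection and consumed by other agents, instead of replaying (or recomputing) the full experience.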
I’m not saying this is easy, or that any of what I just said would work. What I mean is that this is something well worth researching, and that there’s a possibility it can be achieved with current AI, or with something very similar. If we solve it, we could solve the Transfer Learning problem, and maybe get to AGI, by enabling our machines to build a reservoir of knowledge from any given set of problems.
The Big Picture: Knowledge Extractor/Culture Generator
Let’s suppose now that we develop a way to extract the most crucial information from the combined “behavior” of our AI agents, and that we can store this information and these patterns as the “collection of works”, or “culture”, of a population of ANIs. (This takes for granted that we know how to validate and select this work, which I think is outside the scope of this article and something we could explore in future posts.)
What we’ll have, then, is a group of ANIs that starts with our current algorithms and techniques, accumulates over time a collection of knowledge about a specific set of problems, and is able to transfer this knowledge from one member to another. This enables our population of AIs to build upon its reserve of shared knowledge to tackle more complex situations, increasing what we earlier called their Space of Solvable Problems: becoming able, as a group, to solve increasingly difficult and complicated tasks.
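A minimal sketch of such a shared reservoir: agents contribute compressed knowledge about the tasks they have worked on, and any other agent can retrieve everything the population knows about a task. The class, its methods and the toy string "knowledge" below are all illustrative assumptions; in a real system the contributions would be the dense representations discussed earlier.

```python
from collections import defaultdict

# A toy sketch of the shared "culture" described above: a population of
# narrow agents contributing knowledge to a common pool, indexed by task,
# from which other agents can retrieve it. Purely illustrative.

class Culture:
    """Shared reservoir of knowledge produced by a population of ANIs."""

    def __init__(self):
        self.works = defaultdict(list)  # task -> list of (agent, knowledge)

    def contribute(self, agent_id, task, knowledge):
        """An agent adds a piece of knowledge about a task to the pool."""
        self.works[task].append((agent_id, knowledge))

    def retrieve(self, task):
        """Everything the population has learned about a task so far."""
        return [knowledge for _, knowledge in self.works[task]]

culture = Culture()
culture.contribute("go_player_1", "go", "corner openings are valuable")
culture.contribute("chess_player", "chess", "control the center early")
culture.contribute("go_player_2", "go", "trade influence for territory")

print(culture.retrieve("go"))  # both Go insights, ready for a new Student
```

The interesting property is that the pool outlives any single agent: a new Student starts from everything already contributed about its task, instead of from zero, which is exactly the role the post assigns to culture.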
Then, interacting together, they could transform a set of problems into useful, general knowledge for other ANIs. To me, this is very close to what we might call AGI, or a general problem solver: a group of machines that is able to generalize from experience and becomes more intelligent over time, just as we do.
A Culture Of Narrow Machines
The “culture” that a population of ANIs could build would be composed of the collected experiences of all the agents within it, and of the patterns found in those experiences by further ANIs.
In the same way that the structure of the human brain is a reflection of our social and hierarchical environment (the psyche can be conceived as a multitude of subpersonalities with conflicting motivations, each caring about a vital necessity for our survival), we could see AGI as the emergent property of a multitude of ANIs working on different problems and communicating their findings. In this sense, AGI could be seen as a kind of “ego” emerging out of the complex interactions between these simpler “subpersonalities” (the ANIs).
Using Culture To Think Abstractly With Scarce Resources
One more thing I’d like to point out: if the AIs start to use this processed knowledge as input, then instead of needing huge networks to think abstractly and understand complex concepts, we could have smaller networks that process the condensed knowledge and so operate in a more “abstract space”. Instead of performing huge amounts of computation at once, we would trade computing resources for storage, and build more and more complex shared representations over time.
Conclusion: ANI + Culture = AGI
Finally, thank you for making it this far!
I don’t know if everything I have talked about made sense, but I think it’s important to talk about these topics that lie at the frontier of our understanding, and to explore possible ways in which we could develop such a powerful technology, one that could change our lives forever.
I think a good way to end this post is to quickly summarize what we have gone through.
We have tried to explore:
- What AGI is and how we can conceptualize it
- The isolation of the learned knowledge of our current AIs
- The problem of transfer learning
- How we master the world’s most complex problems through storytelling
- How to turn events and experiences into stories
- How enabling our machines to tell stories creates a reservoir of shared knowledge in a population of agents
- Why and how this is similar to building an AGI and mimicking our brains and society
- And how this could be used to produce “abstract” thinking over time with relatively low amounts of computation at once
That’s a lot to cover in a single article, so hopefully it all made sense! I hope at least you may find it interesting as food for thought.
This is my attempt to explore AGI, it’s not perfect or complete in any sense.
I would love to hear what you think about the post. Whether you believe something like this could work, or you disagree and don’t believe it could, I’m curious to hear what you have to say. I’m open to any suggestions on how I could improve this exposition, so feel free to comment below.
Also, I hope this can serve as a way to establish a barebones framework for thinking about AGI, and to help create a theory of intelligence that includes important ideas from other fields, such as psychology, anthropology and other sciences.
Again, thank you for reading this! Let’s start building a better future together.