Blog: From AI to Game Master
A personal fact about me is that I love watching games more than playing them. I don’t know when or where it stemmed from, but the act of being an observer, analyzing high-level play, has always intrigued me. So when I heard about existing AI that could beat the best of the best at the games I have enjoyed the most, such as Dota 2 and Starcraft II, I was immediately fascinated.
How I Got Interested
I was recommended this documentary on YouTube a few days ago. I happened to have some free time that day and was dying to waste it, so I clicked on it. And that began my interest in AI and this topic as a whole. Before I delve into this blog, it should be noted that the technical aspects will be kept to a minimum.
For those of you who don’t know what AlphaGo is, it is an artificial intelligence (AI) that plays the game of Go at a very high level. Before this documentary, I had no background in the game of Go. I had heard of it before; however, I did not understand any of the strategies being implemented, nor was I able to truly gauge the effectiveness of each player’s moves toward the endgame. Nevertheless, I, as a bystander, was captivated.
I, as someone who had never been introduced to Go before this documentary, could not grasp the weight of AlphaGo’s famous move 37 on my own. However, as I listened to the Go analysts, commentators, and professionals, I came to understand how profound this move actually was. A supposedly “bad” move by the textbook was considered beautiful and revolutionary by the best of the best Go players. AlphaGo itself calculated the probability of a human playing move 37 at 1 in 10,000. Simply put, why would a human play a 1-in-10,000 move? Even crazier than that: why would a probability-obsessed computer play such a daring move? By its own estimate, it should have dismissed the move as next to improbable. Yet it decided that it was the right move. What did this computer learn that allowed it to create a “signature move,” as the commentators exclaimed?
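One way to make sense of this puzzle is that the network estimating how likely a *human* is to play a move is separate from the search estimating how *good* the move is. Here is a minimal toy sketch of that split; every number and name below is invented for illustration and is not AlphaGo’s actual internals:

```python
import math

# Hypothetical numbers for illustration only -- not AlphaGo's real values.
# "policy_logit": how human-like the move looks to a policy network.
# "search_value": estimated win rate after lookahead search.
moves = {
    "textbook_move": {"policy_logit": 9.0, "search_value": 0.48},
    "move_37":       {"policy_logit": 0.0, "search_value": 0.61},
}

# Softmax over logits -> prior probability that a human plays each move.
total = sum(math.exp(m["policy_logit"]) for m in moves.values())
for m in moves.values():
    m["prior"] = math.exp(m["policy_logit"]) / total

# The prior on "move_37" comes out on the order of 1 in 10,000...
print(round(1 / moves["move_37"]["prior"]))

# ...yet the move actually chosen is the one with the best searched value.
best = max(moves, key=lambda name: moves[name]["search_value"])
print(best)
```

The point of the sketch: a move can be nearly invisible as a “human” move (a vanishing prior) and still be selected, because selection ultimately follows the evaluated outcome, not the prior alone.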
Similar to move 37, move 78 was considered a 1-in-10,000 move. The commentators hailed Lee Sedol’s move as a “God” move, one that allowed him to come back from an almost impossible game and finally beat AlphaGo for the first time in their best-of-five match. The beauty of this move, from an outside point of view, isn’t how Lee Sedol chose to make it, but rather comes from seeing him overcome hurdles he had never faced before, such as the tactics employed by AlphaGo. It allowed him to improve his game and evolve his strategy.
It was all very interesting, seeing things I had never been introduced to before and seeing how others reacted when everything they knew about the game they lived for was essentially changed in the span of a few days. As mentioned earlier, I do not have any background in Go, so I did not understand the true implications of their moves. So I looked into it more and saw that people were already writing AIs for games I have taken a high interest in, games I have spent a fair portion of my time analyzing and understanding different strategies for.
AlphaStar was created to challenge one of the hardest real-time strategy games out there, Starcraft II. It was tested against very highly skilled players such as TLO, a personal favorite of mine, and MaNa, both long-time veterans of the game. Simply put, they lost. Badly, too. And in the most unorthodox of ways. In one of the games, AlphaStar created many units called “disruptors.” Many is an understatement, because professionals typically build around a quarter of the number AlphaStar created. Yet it still managed to win. In another game that it won, it committed entirely to a strategy that its opponent, MaNa, knew was coming. What makes this unorthodox and surprising is exactly that: the opponent already knew it was coming. MaNa could have taken heavy precautions to counteract the strategy, so why would AlphaStar continue with a strategy that logically shouldn’t succeed?
Separate from the unorthodox strategies that AlphaStar employs are its game mechanics. Just as I was thinking to myself that these strategies would never work against professionals, AlphaStar showed its true beauty. Its ability to micromanage individual units from its large army is inhuman. No professional has the ability to watch over and keep alive 100 different units, especially to the extent of keeping each unit alive as long as it possibly can. Maybe that is because it sees the map like this.
The white box at the top is the portion of the map that AlphaStar is focusing on. Nevertheless, it literally turned multiple situations that seemed impossible into its favor. Another amazing feature is its unwavering attacks in very unfavorable situations, such as ramps. The reason ramps are so terrifying is that you don’t know what is at the top of the ramp if you are at the bottom. Is there a wall? Will there be 100 units waiting for me up there? The ramp is a scary situation to approach, and unless you are confident, you generally stay away from it. However, AlphaStar does not waver; it charges forward without thinking too much about the repercussions. Even more terrifying than this is its ability to expend the minimum number of clicks necessary for each situation.
On average, AlphaStar has an APM (actions per minute) far below a professional’s. If one were to look at this graph not knowing it was an AI, they would say a mediocre player is playing. This shows the effectiveness of each and every action AlphaStar takes in its games.
How Did It Turn Out Like This?
I was intrigued to know how AlphaStar came to play like this. It turns out it begins with imitation learning, a method in which it watches replays of players of different skill levels, professional or not, and imitates and learns from them. It is then placed into something called the AlphaStar League, where it competes against other AlphaStar agents that have learned different strategies from different replays. As they play each other, the agents update their own strategies with the goal of defeating all other strategies as effectively as possible, through a method called reinforcement learning. What makes this portion interesting is that the AlphaStar competitors are encouraged to specialize by adapting their personal learning objectives. This means some agents favor competing against a specific race, employing a specific strategy such as a cannon proxy, or building a single type of unit, which is what we saw in the match against MaNa where far too many disruptors were built. At the end of this league, the 5 versions of AlphaStar with the least exploitable strategies were chosen to go against the professionals. To sum it up, AlphaStar plays against multiple versions of itself to update and improve its own strategy.
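To give a flavor of the league idea, here is a minimal, entirely toy sketch. The strategy names, the payoff table, and the bandit-style update rule are all my own inventions for illustration, not DeepMind’s actual algorithm: a pool of agents repeatedly play one another, and each agent nudges its strategy preferences toward whatever won.

```python
import random

# Toy rock-paper-scissors-style payoff between three made-up strategies.
# 1.0 = row strategy wins, 0.0 = it loses, 0.5 = mirror match.
STRATS = ["rush", "macro", "harass"]
PAYOFF = {
    ("rush", "macro"): 1.0, ("macro", "harass"): 1.0, ("harass", "rush"): 1.0,
    ("macro", "rush"): 0.0, ("harass", "macro"): 0.0, ("rush", "harass"): 0.0,
    ("rush", "rush"): 0.5, ("macro", "macro"): 0.5, ("harass", "harass"): 0.5,
}

class Agent:
    def __init__(self):
        # Preference weights over strategies, starting uniform.
        self.weights = {s: 1.0 for s in STRATS}

    def pick(self, rng):
        # Sample a strategy proportionally to its weight.
        total = sum(self.weights.values())
        r = rng.random() * total
        for s, w in self.weights.items():
            r -= w
            if r <= 0:
                return s
        return STRATS[-1]

    def update(self, strat, result, lr=0.1):
        # Crude bandit-style reinforcement: grow the weight of a strategy
        # that won, shrink the weight of one that lost.
        self.weights[strat] *= 1.0 + lr * (result - 0.5)

def run_league(n_agents=4, n_rounds=2000, seed=0):
    rng = random.Random(seed)
    league = [Agent() for _ in range(n_agents)]
    for _ in range(n_rounds):
        a, b = rng.sample(league, 2)   # matchmaking: any two league members
        sa, sb = a.pick(rng), b.pick(rng)
        result = PAYOFF[(sa, sb)]
        a.update(sa, result)           # the winner reinforces its choice...
        b.update(sb, 1.0 - result)     # ...the loser backs off from its choice
    return league

league = run_league()
```

Because the toy payoff is cyclic (rush beats macro beats harass beats rush), no single strategy can dominate forever, which loosely mirrors why the real league needed many specialized agents rather than one champion.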
Let’s Make Sense of It
Imagine a game just came out. You don’t know how to operate the game, you don’t know any of its mechanics, and you don’t have any strategy for tackling a given situation. That’s exactly how it was when Starcraft first came out. There were no strategies. But eventually, people discovered a method of play. Then someone lost to it and developed a countermeasure. Then someone created a counter to that, then a counter to that, and so on. Each new strategy became the new way to play the game. Over time, these strategies were introduced to the ladder, and the result is the methods and strategies we play with today.
The methods of AlphaStar are bizarre. But that is from the perspective of someone who has grown up with the Starcraft community, understanding and improving on existing strategies to better the game. From the perspective of someone who has grown an interest in AI, however, I say that in an alternate universe, AlphaStar’s style of gameplay, with its odd strategies, might just have been the new meta.
I looked further into AI specialized for games and found OpenAI Five, an AI built by OpenAI to play Dota 2, a multiplayer online battle arena (MOBA for short). Similar to AlphaStar, OpenAI Five showed exceptional game mechanics, seen in its reaction time against team-fight initiations from the human opponents and its enviable creep blocks. Combined with some questionable, unorthodox strategies, such as killing Roshan while its own base was being destroyed, OpenAI Five was a joy to watch. To anyone with an interest in any of these games, I highly recommend watching a pro compete against an AI; you will be mesmerized by its otherworldly playing style. Probably one of the most entertaining things I saw was its smack talk.
Is AI Perfect?
After being fascinated by AlphaGo, impressed by AlphaStar, and finally just entertained by OpenAI Five, I began to take an interest in the faults of these different systems. I told myself that they are good, but they aren’t perfect. AlphaGo lost because Lee Sedol made a 1-in-10,000 move. AlphaStar lost to MaNa because it failed to defend its base at the most critical of times. OpenAI Five lost because its heroes failed to stick together. In all these situations, the AI failed to react properly to extreme cases. None of these situations would have led to defeat if the AI had simply continued to play as it normally does. It failed to make rational decisions. So is AI perfect? No. But it sure is amazing at playing games already.