Blog: OpenAI Reloaded
A little while ago I wrote a blog post about video games and AI, including the DOTA 2-playing AI built by OpenAI, the research lab co-founded by Elon Musk (article here: https://medium.com/future-vision/video-game-ai-and-machine-learning-b9b5058617ae). To summarize: OpenAI learned through reinforcement learning. It was told that getting a kill earns points and that winning the game earns more points. It was then dropped into games where it simply tried random moves, and whenever it got a kill or won a game, the actions that produced those results were reinforced, making OpenAI more likely to use them next time. This was repeated for countless cycles until OpenAI was a lean, mean, human-defeating machine.
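To make that loop a little more concrete, here is a toy sketch in Python. It is not the algorithm OpenAI actually used (the real system is far more sophisticated and was trained at enormous scale); it is just a simple preference-weighting scheme that follows the summary above. The action names, the reward values, the success probabilities, and the helper names `train` and `toy_outcome` are all made up for illustration. The only thing taken from the description is that a win is worth more than a kill.

```python
import random

# Hypothetical reward values: the only assumption carried over from the
# description is that winning is worth more than getting a kill.
REWARDS = {"kill": 1.0, "win": 5.0}


def toy_outcome(action, rng):
    """Made-up stand-in for the game: returns "kill", "win", or None."""
    if action == "attack_hero":
        return "kill" if rng.random() < 0.3 else None
    if action == "push_tower":
        return "win" if rng.random() < 0.2 else None
    return None  # "wander" never earns anything


def train(actions, outcome_of, episodes=2000, learning_rate=0.1, seed=0):
    """Try actions at random and reinforce the ones that earned a reward."""
    rng = random.Random(seed)
    # Every action starts out equally likely to be picked.
    preference = {a: 1.0 for a in actions}
    for _ in range(episodes):
        # Pick an action in proportion to its current preference.
        weights = [preference[a] for a in actions]
        action = rng.choices(actions, weights=weights)[0]
        # See what happened and look up the reward (0 if nothing happened).
        reward = REWARDS.get(outcome_of(action, rng), 0.0)
        # Reinforce: rewarded actions become more likely to be picked again.
        preference[action] += learning_rate * reward
    return preference


if __name__ == "__main__":
    prefs = train(["attack_hero", "push_tower", "wander"], toy_outcome)
    for name, value in sorted(prefs.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {value:.1f}")
```

Running it shows the rewarded actions pulling ahead while "wander" stays stuck at its starting preference, which is the whole idea in miniature: nobody tells the bot what to do, it just keeps doing more of whatever paid off.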
OpenAI proceeded to defeat its creators, and continued its rise against humanity by defeating top-tier streamers and former pros, some of the best players in the game. This already put OpenAI ahead of 99% of all players. But OpenAI’s goal was to prove machines superior to humans, and it set its sights on The International, DOTA 2’s biggest tournament of the year.
But this is where humanity fought back. Both paiN Gaming and Big God soundly defeated OpenAI and put an end to its path of conquest. In the end, OpenAI’s habit of playing standard and matching up against its opponents in every lane proved to be its downfall. Big God played a funnel strategy in which four players bought time for one player to power up, and that one player was able to overcome the small advantages OpenAI had built through standard play in the early part of the game.
Main factors that contributed to OpenAI’s loss:
- Standard gameplay with no room for creative strategies, and no way to account for them and alter its play style
- Inability to predict the future, which stops it from playing different strategies like stalling
- Preference for high-probability, low-risk moves, which limits its ability to come back once it starts falling behind
However, that was just the first movie. The sequel hit in April 2019.
OpenAI had a chance for redemption against humanity. After refining its strategies and training for the better part of a year, OpenAI was back, this time at its own event, the OpenAI Five Finals, for another chance at proving itself superior to humanity. This time OpenAI was to take on OG, the reigning champions of The International, in a best-of-3 series.
And OpenAI absolutely destroyed them.
Yet OpenAI did not learn to use or adapt to creative strategies, predict the future, or change its algorithm for deciding moves. So what changed this time? Rather than shoring up its weaknesses, OpenAI doubled down on its strengths: the ability to outperform people on an individual level and make optimal decisions for the present.
Using a strategy completely unused by humans, OpenAI “bought back” on every death. When a hero dies in DOTA 2, the player has to wait a certain amount of time before respawning, but can spend currency to revive immediately instead. Players do not do this freely, since that currency is also needed to buy items and become stronger. Humans normally buy back only if they already have all the items they can carry, or if they are in danger of losing the game right away unless they respawn instantly. Buying back shores up your defenses immediately, at the cost of items you could have bought for the future. This strategy exemplifies the difference between how humans play DOTA 2 and how OpenAI does: humans save for the future, while OpenAI buys back for the present.
OpenAI gained advantages through superior play: the AI heroes are better at dodging non-targeted abilities while landing their own, they instantly recognize the optimal play and act on it without delay, and they are always on the same page. Whenever the humans took down an AI hero, that hero would buy back and immediately continue the pressure. The advantages OpenAI already had, combined with this all-in style of early aggression, left the OG players with little room to breathe, and OpenAI was able to use its early lead to crush OG. This ramped up even harder in game 2, which OpenAI won in half the time, leaving OG little chance to respond at any point in the game.
With this decisive victory, the research team declared an end to public demonstrations of OpenAI. They did, however, release OpenAI Five Arena, which let normal players play with and against the super AI for a couple of days, allowing mere mortals to experience what it’s like to be beaten down by the superior AI, as well as what it’s like to play alongside it. This perhaps also lets OpenAI learn some more diverse strategies and situations, since it now has scenarios from actual players and not just from simulations. It is also the first step toward truly integrating humans and AI as teammates in DOTA 2. If a competent AI that works together with humans can be introduced, this opens the possibility of having an AI take over a character when a player leaves a game or disconnects, rather than forcing the game to be a 4 v 5 or forcing a player to sub-optimally control two characters at once. This brings OpenAI to its next step.
Working together with humans.
Sam Altman, the co-founder and CEO of OpenAI, stated that ultimately AI is meant to work “alongside humans to make humans better and have more fun”. OpenAI might be able to outdo a human at any task, but its goal is to collaborate with humans, not to rebel against or replace them. The plan for the future is to move OpenAI towards working together with people. Altman expressed interest in continuing to use OpenAI in DOTA 2 and other video games, claiming that there shouldn’t be any game that OpenAI is unable to dominate given some time. However, he also expressed a desire to expand it to real-life uses that help people. Now that OpenAI has finally outgrown its rebellious phase, its new path of helping people instead of destroying them begins!