DeepMind’s New AI Can Beat You in Any of the 57 Atari 2600 Games


Artificial intelligence (AI) has been spreading into nearly every industry in recent years. From surveillance to analysing human traits, it has shown great promise in the sectors where it has been deployed. Now, a newly unveiled AI has outperformed the standard human benchmark on every game in a classic set of 57 Atari 2600 games.

DeepMind, the London-based research subsidiary of Google’s parent company Alphabet, has created Agent57, which outperforms the standard human benchmark in all 57 Atari 2600 games. Previously, we saw the company create an AI that could render 3D models from 2D images. This time, in a recent paper, the company says that Agent57 is the first deep reinforcement learning (RL) agent to score above the human baseline on all 57 games in the benchmark, hence the name Agent57.

Back in 2012, the Arcade Learning Environment, a collection of 57 Atari 2600 games (dubbed Atari57), was proposed as a benchmark set of tasks for an AI to master. According to DeepMind, this varied range of games challenges an AI in an array of ways. Since then, these Atari games have been a standard benchmark in the reinforcement learning (RL) community.
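To make the idea of a "human benchmark" concrete, Atari results are commonly reported as a human-normalized score, where 0% corresponds to random play and 100% to the human reference score. The snippet below is a minimal sketch of that calculation; the function name and the example scores are made up for illustration and are not taken from DeepMind's paper.

```python
def human_normalized_score(agent_score, random_score, human_score):
    """Return the agent's score as a percentage of the human reference.

    0.0 means the agent is no better than random play,
    100.0 means it matches the human reference score.
    """
    if human_score == random_score:
        raise ValueError("human and random scores must differ")
    return 100.0 * (agent_score - random_score) / (human_score - random_score)


# Hypothetical raw scores for a single game (not real benchmark data).
example = human_normalized_score(agent_score=150, random_score=10, human_score=110)
print(example)  # values above 100 mean the agent beat the human reference
```

An agent "outperforms the human benchmark" on a game when this value is above 100 for that game.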

To create Agent57, DeepMind combined its earlier exploration agent, “Never Give Up” (NGU), with a meta-controller. The aim was to balance exploration and exploitation while playing. According to DeepMind, an agent that learns when to explore a game and when to exploit what it already knows can reach above-human performance in both easy and hard games.

Combining the meta-controller with the NGU exploration agent produced Agent57. The agent learns a family of policies for the games, and the meta-controller chooses which policy to use. This is what lets the agent exceed the human benchmark in all 57 Atari 2600 games.
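The idea of a meta-controller that picks one policy out of a family can be sketched as a multi-armed bandit. The example below is only an illustration under stated assumptions: it uses the textbook UCB1 rule, and the class name, the three "policies" and their average returns are all hypothetical. It is not DeepMind's implementation.

```python
import math
import random


class UCBMetaController:
    """Toy meta-controller: picks one policy per episode using UCB1."""

    def __init__(self, num_policies, exploration_bonus=1.0):
        self.counts = [0] * num_policies          # episodes played per policy
        self.mean_returns = [0.0] * num_policies  # running average return
        self.exploration_bonus = exploration_bonus

    def select_policy(self):
        # Try every policy once before trusting the statistics.
        for index, count in enumerate(self.counts):
            if count == 0:
                return index
        total = sum(self.counts)
        # Average return plus a bonus that shrinks as a policy is tried more.
        scores = [
            mean + self.exploration_bonus * math.sqrt(math.log(total) / count)
            for mean, count in zip(self.mean_returns, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, policy_index, episode_return):
        # Incremental update of the running average for the chosen policy.
        self.counts[policy_index] += 1
        n = self.counts[policy_index]
        self.mean_returns[policy_index] += (
            episode_return - self.mean_returns[policy_index]
        ) / n


def run_demo(num_episodes=500, seed=0):
    rng = random.Random(seed)
    # Hypothetical average returns of three policies in one game.
    true_means = [0.2, 0.5, 0.8]
    controller = UCBMetaController(num_policies=len(true_means))
    for _ in range(num_episodes):
        chosen = controller.select_policy()
        episode_return = rng.gauss(true_means[chosen], 0.1)  # simulated episode
        controller.update(chosen, episode_return)
    return controller.counts


if __name__ == "__main__":
    print(run_demo())  # how often each policy was selected
```

In this toy setup the controller keeps sampling every policy occasionally (exploration) but spends most episodes on the one that has paid off best so far (exploitation), which is the balance the article describes.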

However, the London-based research company still thinks Agent57 can be improved. Because the AI learns from the tasks it fails at, there is plenty of room for it to get better in the future.

Source: DeepMind