Wednesday, December 6, 2017

Another milestone in computer chess

This just in from the DeepMind team: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

The AlphaZero algorithm is a more generic version of the AlphaGo Zero algorithm that was first introduced in the context of Go (29). It replaces the handcrafted knowledge and domain-specific augmentations used in traditional game-playing programs with deep neural networks and a tabula rasa reinforcement learning algorithm.


AlphaZero convincingly defeated all opponents, losing zero games to Stockfish and eight games to Elmo (see Supplementary Material for several example games), as well as defeating the previous version of AlphaGo Zero.


We analysed the chess knowledge discovered by AlphaZero. Table 2 analyses the most common human openings (those played more than 100,000 times in an online database of human chess games (1)). Each of these openings is independently discovered and played frequently by AlphaZero during self-play training. When starting from each human opening, AlphaZero convincingly defeated Stockfish, suggesting that it has indeed mastered a wide spectrum of chess play.

As for myself, I seem to hang pieces more frequently than I did a decade ago.

But I still love chess.

And, in that part of the world not (yet) inhabited solely by deep neural networks, That Norwegian Genius is going to play again, in London, next November: London Will Host FIDE World Chess Championship Match 2018.
