
Thread: AlphaZero

  1. #1
    CC Candidate Master
    Join Date
    Nov 2008
    Location
    Perth
    Posts
    461

    AlphaZero

The DeepMind team behind AlphaGo have now tried their hand at chess (and shogi), with a new paper out today. Learning entirely from self-play (i.e., starting with random moves and seeing which moves tend to win games), their program ends up stronger than Stockfish 8, winning a 100-game match 64-36 (28 wins, 3 of them with Black, and 72 draws; the time control was one minute per move). The paper says they go from no knowledge to better-than-Stockfish in less than a day, albeit using a lot of hardware.
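
    For a rough sense of scale, that 64% score works out to about a 100-point rating edge under the standard logistic Elo model. A back-of-the-envelope check (the formula is the usual one; the only input is the match score above):

    Code:
    import math

    def elo_diff(score):
        """Rating difference implied by an expected score (logistic Elo model)."""
        return 400 * math.log10(score / (1 - score))

    # AlphaZero scored 64/100 against Stockfish 8 (28 wins, 72 draws).
    print(round(elo_diff(0.64)))  # ~100 Elo points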

    Like AlphaGo, and unlike other top chess programs, AlphaZero uses a neural network and Monte Carlo Tree Search. Its search involves playing simulated games against itself through to a conclusion. My read (I'm not an expert!) is that these self-play simulations take a relatively long time to compute, and as a result AlphaZero evaluates only 80,000 positions per second, compared to Stockfish's 70,000,000. But the neural network guides the search so well that it more than offsets the slower speed.
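
    To make that concrete, here is a heavily simplified sketch of the network-guided move selection. This is purely my own illustration, not DeepMind's code; the priors, the Node bookkeeping, and the toy numbers are all hypothetical stand-ins:

    Code:
    import math

    class Node:
        """One candidate move in the search tree."""
        def __init__(self, prior):
            self.prior = prior       # P: how much the network likes this move
            self.visits = 0          # N: how often the search has tried it
            self.value_sum = 0.0     # W: sum of evaluations found below it

        def q(self):
            # Average result seen after this move so far.
            return self.value_sum / self.visits if self.visits else 0.0

    def select_move(children, c_puct=1.5):
        """PUCT-style rule: balance moves that have already scored well (Q)
        against moves the network likes but which the search has not
        explored much yet (the U term)."""
        total = sum(ch.visits for ch in children.values())
        def score(item):
            move, ch = item
            u = c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
            return ch.q() + u
        return max(children.items(), key=score)[0]

    # Toy illustration: the network strongly prefers e4, so the search
    # keeps choosing it until its results stop justifying the prior.
    children = {"e4": Node(0.60), "d4": Node(0.30), "a4": Node(0.10)}
    for _ in range(100):
        move = select_move(children)
        children[move].visits += 1
        children[move].value_sum += 0.5   # pretend every playout is a draw
    print({m: ch.visits for m, ch in children.items()})

    The point is that almost all the selectivity comes from the network's priors: moves the network dismisses barely get visited, which is how 80,000 evaluations per second can compete with 70,000,000.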

    The paper gives ten of AlphaZero's wins over Stockfish, which Chess24 has uploaded here (complete with JavaScript Stockfish evaluations!).

    One minor annoyance is that AlphaZero played on different hardware to Stockfish -- "a single machine with 4 TPUs", the tensor processing unit having been designed by Google for efficient machine learning. I don't know how comparable it is to Stockfish "using 64 threads and a hash size of 1GB". It looks to me like AlphaZero gets to use more processing power, having been written to take advantage of such hardware.

    Still, going from random moves to perhaps the strongest chess play ever is certainly something. Some mildly interesting graphs are on page 6 of the paper, showing how frequently AlphaZero played various openings over the first eight hours of its training. It always liked the English; it picked up the Caro-Kann after 2 hours but abandoned it after 6. Its principal variation for 1. e4 c5 2. Nf3 d6 is a Najdorf with 6. f3.

    AlphaZero-Stockfish [embedded PGN viewer]

  2. #2
    Batoutahelius
    road runner
    Join Date
    Apr 2006
    Location
    on the skin of the pale blue dot
    Posts
    11,267
    Quote Originally Posted by pappubahry View Post
    Learning entirely from self-play (i.e., starting with random moves and seeing which moves tend to win games), their program ends up stronger than Stockfish 8, winning a 100-game match 64-36 (28 wins, 3 of them with Black, and 72 draws; the time control was one minute per move). The paper says they go from no knowledge to better-than-Stockfish in less than a day, albeit using a lot of hardware.
    Wow, that is very impressive; they must have rewritten a lot of theory.
    meep meep

  3. #3
    CC Candidate Master
    Join Date
    Oct 2004
    Location
    Ottawa, Canada
    Posts
    65
    Quote Originally Posted by pappubahry View Post
    The DeepMind team behind AlphaGo have now tried their hand at chess (and shogi), with a new paper out today. Learning entirely from self-play (i.e., starting with random moves and seeing which moves tend to win games), their program ends up stronger than Stockfish 8, winning a 100-game match 64-36 (28 wins, 3 of them with Black, and 72 draws; the time control was one minute per move).
    28-0 is pretty impressive, even considering that AlphaZero had a hardware advantage. For comparison, in the games on chessgames.com, Deep Thought beat other computers 21-2 (plus some draws). However, the two losses were against Hans Berliner's HiTech, which also had specialised hardware. (Darryl Johansen had a win too, but his hardware is definitely too specialised to be considered in scope for this analysis.)

    In addition, Deep Blue won the 1994 ACM Computer Chess International with 4 wins and one game defaulted due to a power outage, but scored only 3.5/5 at the 1995 World Championship in Hong Kong (losing to Fritz). Combining all this gives a 28-3 score for Deep Thought/Deep Blue. It probably wouldn't be too hard to find a lot more games, though. (E.g., I remember that Deep Thought used to play on the American Internet Chess Server.)

    So AlphaZero has a better record against its peers (peer?) than Deep Thought/Deep Blue if draws are ignored, but not if they are included. It's hard to compare, since the level of play is much higher now; computer matches these days are mostly draws, with one side often winning most or all of the remainder (usually with White).

  4. #4
    CC Grandmaster
    Join Date
    Apr 2006
    Location
    Melbourne, Australia
    Posts
    9,392
    The most amazing part is that AlphaZero's play appears to be quite "human" in nature!
    Interested in Chess Lessons?
    Email webbaron!@gmail.com for more Info!

  5. #5
    CC Grandmaster
    Join Date
    Apr 2008
    Posts
    3,197
    An amazing breakthrough! It will be fascinating to see the technology applied to other areas, such as mathematics.

  6. #6
    CC Candidate Master
    Join Date
    Sep 2013
    Location
    Sydney
    Posts
    35
    It's called AlphaZero, not AlphaChess, because it can play any game of this type once you add a module telling it what the rules of that game are. So it also dusted off its own predecessor AlphaGo, which beat the world's best Go player, and rubbed it in by beating the champion shogi program (which, however, got in a few wins of its own).
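
    In software terms (my own sketch, not DeepMind's actual code), the "module telling it the rules" amounts to an interface like this, with one implementation per game:

    Code:
    from abc import ABC, abstractmethod

    class Game(ABC):
        """Hypothetical rules module: the learning side only ever talks to
        this interface, so swapping chess for shogi or Go means swapping
        one implementation."""

        @abstractmethod
        def legal_moves(self, state):
            ...  # every move allowed in this position

        @abstractmethod
        def next_state(self, state, move):
            ...  # the position after the move is played

        @abstractmethod
        def result(self, state):
            ...  # win/loss/draw from the mover's view, or None if ongoing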

  7. #7
    CC Candidate Master
    Join Date
    Aug 2017
    Posts
    52
    Quote Originally Posted by triplecheck View Post
    It's called AlphaZero, not AlphaChess, because it can play any game of this type once you add a module telling it what the rules of that game are. So it also dusted off its own predecessor AlphaGo, which beat the world's best Go player, and rubbed it in by beating the champion shogi program (which, however, got in a few wins of its own).
    Visit the Talkchess forum to learn more; that is where you will find some 3000 programmers.
    This was all a scam.
    Alpha played on hardware 30 times bigger than SF's: 4 TPUs vs 64 cores.
    4 TPUs is around 1000 cores, or even more.
    Alpha had a simulated opening book, trained on countless winning top-GM games.
    SF had very little hash.
    The TC was fixed at 1 minute per move, which is again detrimental to SF, which has advanced time management.
    TPUs lack the SMP inefficiencies that come with more cores, so the hardware advantage was even bigger.
    Etc., etc. So basically, this was just a huge publicity stunt on the part of Google.
    Currently, Alpha is around 2800 on a single core, so 400 Elo below SF, and it will not advance much in the future, as from now on it will need advanced evaluation that it will not be able to discover.
    Concerning the 4-hours issue, well, LOL, that was 48 hours ago, so now Alpha is at 5000 Elo?
    Come on.

  8. #8
    CC Candidate Master
    Join Date
    Dec 2014
    Posts
    284
    Quote Originally Posted by LyudmilTsvetkov View Post
    Visit the Talkchess forum to learn more; that is where you will find some 3000 programmers.
    This was all a scam.
    Alpha played on hardware 30 times bigger than SF's: 4 TPUs vs 64 cores.
    4 TPUs is around 1000 cores, or even more.
    Alpha had a simulated opening book, trained on countless winning top-GM games.
    SF had very little hash.
    The TC was fixed at 1 minute per move, which is again detrimental to SF, which has advanced time management.
    TPUs lack the SMP inefficiencies that come with more cores, so the hardware advantage was even bigger.
    Etc., etc. So basically, this was just a huge publicity stunt on the part of Google.
    Currently, Alpha is around 2800 on a single core, so 400 Elo below SF, and it will not advance much in the future, as from now on it will need advanced evaluation that it will not be able to discover.
    Concerning the 4-hours issue, well, LOL, that was 48 hours ago, so now Alpha is at 5000 Elo?
    Come on.
    Yes, I have similar misgivings. I would guess that the other factors contributed to the result, but that the hardware advantage was decisive.
    Southern Suburbs Chess Club (Perth)
    www.southernsuburbschessclub.org.au

  9. #9
    CC International Master
    Join Date
    Jul 2005
    Posts
    1,978
    Reminds me of Deep Blue beating Kasparov: similar story but different actors.

  10. #10
    CC Candidate Master
    Join Date
    Aug 2017
    Posts
    52
    Quote Originally Posted by Vlad View Post
    Reminds me of Deep Blue beating Kasparov: similar story but different actors.
    Precisely.
    Deep Blue would have been 2500 or so on a single core.

  11. #11
    Batoutahelius
    road runner
    Join Date
    Apr 2006
    Location
    on the skin of the pale blue dot
    Posts
    11,267
    Quote Originally Posted by LyudmilTsvetkov View Post
    Precisely.
    Deep Blue would have been 2500 or so on a single core.
    What is the relevance of a single-core comparison? It seems to just gimp applications that take advantage of multi-threading.
    meep meep

  12. #12
    CC International Master
    Join Date
    Jan 2004
    Location
    Wynyard,Tas
    Posts
    2,035
    It's impressive that AlphaZero can be taught the rules of anything. This at least raises the prospect of robot arbiters, which could uncontroversially enforce, say, rules about two-handed castling or dress codes.

    Less impressive is that, as far as I can tell, the developers simply put it back in the box after flogging a neutered Stockfish. It briefly sounds good, but Stockfish running on a bigger computer could also beat loser-Stockfish. So it's not really an advance.

    In the spirit of genuine scientific exploration I would want to

    (a) set it a harder task to see exactly how good it is, like beating a better Stockfish, or running both opponents on identical laptops from Harvey Norman, or evaluating the famous drawn minor-piece ending from Karjakin-Carlsen.

    (b) play it against strong human opponents; that sounds silly, but if the argument is that it is in some sense "thinking" rather than just exploiting its electronic advantage, that might be interesting. Unfortunately this would be of no marketing benefit if it won (computers already beat humans) but would be a bit of a downer if it lost, which perhaps explains why it wasn't tried.

    (c) let it teach itself for longer and try again (after first getting a benchmark against a stronger opponent) to see if it keeps getting better, rather than just making impressive strides for four hours and then hitting a wall.

    I haven't seen anything (as revealed by Google) about what they intend to do next with chess, if anything; does anybody know more? Let's hope the plan is not to simply announce that they've solved the problem and move on to teaching it soccer or line dancing.

  13. #13
    CC Candidate Master
    Join Date
    Sep 2013
    Location
    Sydney
    Posts
    35
    Quote Originally Posted by Ian Rout View Post
    Less impressive is that, as far as I can tell, the developers simply put it back in the box after flogging a neutered Stockfish. It briefly sounds good, but Stockfish running on a bigger computer could also beat loser-Stockfish.
    Well, AlphaZero also beat the best available Go and shogi programs, which it had more trouble against. By the way, how is Stockfish coming along with its Go game?

    like beating a better Stockfish, or running both opponents on identical laptops from Harvey Norman
    Stockfish, I read, would gain only 10 Elo from doubling its number of processors, and another 5 from doubling again. Rather more helpful would be its tablebases and opening book, but what's the interest in playing an opponent who after every move runs over to the bookshelf, comes back, and plays the recommended move? Is this AI? It's more like being a librarian. I doubt the Google team want to build their own libraries; it's not interesting to do.

    You understand that it would be running on the graphics processor of the Harvey Norman laptop?

    play against strong human opponents; that sounds silly
    I guess.

    let it teach itself for longer and try again (after first getting a benchmark against a stronger opponent) to see if it keeps getting better, rather than just making impressive strides for four hours and then hitting a wall.
    Actually it hit a wall after about 2 hours and gained only about another 40 Elo in the last two. And one estimate was that it was drawing 1.25 megawatts while doing this training. Do you want to pay the power bill?

    I haven't seen anything (as revealed by Google) about what they intend to do next with chess, if anything; does anybody know more?
    Probably not much. Go and Shogi seem to have prior claims to a title shot.

    move on to teaching it soccer or line dancing
    More like medical diagnostics (which they will talk about) and military drone swarm tactics (which they won't talk about).

    No, it isn't really clear how strong it is. I went looking for estimates, and they varied between 3400 and 3500. Say it's 3500; if it isn't, then it will be soon, since these large machine-learning chips are new and are being rapidly developed. The ultimate limit for a chess player (that is, one who can calculate any position right to the end) is conjectured to be between 3500 and 3600, with near certainty that it's less than 3600.

    Once you're 3500, you are as close as dammit to perfect. Nothing will ever be able to score more than 60% against you. You can draw 80% of your games against God (maybe much better if you decide to start playing for a draw!). There's not much room for further improvement, and you might as well just believe what it tells you the final result will be if you play this move here. When Stockfish dives into a position with 30-ply searches and tablebases, it is frequently bumping along the bottom. There's not enough water under it to dive deeper.
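
    For what it's worth, those percentages line up with the standard logistic Elo expectation. A quick check, assuming the usual formula (the rating gaps are my own illustrative numbers, not from the paper):

    Code:
    def expected_score(rating_gap):
        """Expected score of the stronger side, standard logistic Elo model."""
        return 1 / (1 + 10 ** (-rating_gap / 400))

    print(round(expected_score(100), 2))  # 0.64: a 3600 "God" vs a 3500 player
    print(round(expected_score(70), 2))   # 0.60: a gap of about 70 points
    print(round(expected_score(50), 2))   # 0.57: if the ceiling is only 3550

    So the 60% ceiling corresponds to a gap of roughly 70 Elo points; at a full 100-point gap, the stronger side's expectation is closer to 64%.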
