Page 1 of 2 12 LastLast
Results 1 to 15 of 27
  1. #1
    CC International Master
    Join Date
    Jul 2005
    Posts
    2,148

    performance ratings

    I was thinking another day how performance ratings are calculated and find the process less than satisfactory. The first well known issue is that average rating is used instead of doing it game by game. This one is easy to fix and this is what I believe Bill is doing in Australia.

    The second problem is that actual ratings are used in calculations. This might be not so good idea because the ratings are quite often lagging. Take example of Raymond in the last championship. His actual rating was 2050, while he performed about 2300. When the performance ratings of his opponents are calculated the above fact is not taken into account. This problem is easy to fix if approach that scientists use is applied. (Scientists have a very similar problem. They rank scientific journals. There are cross references from one journal to another. There is original ranking, which is used to calculate new ranking based on references a particular journal receives from other journals. The process is repeated until the ranking converges.)

    In performance rating terms the process can be amended in the following way. After old performance ratings are calculated, they are used as initial ratings in the process and using them new performance ratings are calculated. The process is repeated until performance ratings converge. The process will give true performance rating which does not depend on your initial rating.

  2. #2
    Reader in Slood Dynamics Rincewind's Avatar
    Join Date
    Jan 2004
    Location
    The multiverse
    Posts
    21,567
    Quote Originally Posted by drug
    I was thinking another day how performance ratings are calculated and find the process less than satisfactory. The first well known issue is that average rating is used instead of doing it game by game. This one is easy to fix and this is what I believe Bill is doing in Australia.
    Can you expand on how this works?
    So einfach wie möglich, aber nicht einfacher - Albert Einstein

  3. #3
    Illuminati Bill Gletsos's Avatar
    Join Date
    Jan 2004
    Location
    Sydney
    Posts
    16,544
    Quote Originally Posted by Rincewind
    Can you expand on how this works?
    It involves not only game by game but also iteration. I explained how it works in post http://www.chesschat.org/showpost.ph...1&postcount=39
    The Force can have a strong influence on the weak-minded.

  4. #4
    Reader in Slood Dynamics Rincewind's Avatar
    Join Date
    Jan 2004
    Location
    The multiverse
    Posts
    21,567
    Quote Originally Posted by Bill Gletsos
    It involves not only game by game but also iteration. I explained how it works in post http://www.chesschat.org/showpost.ph...1&postcount=39
    Thanks. I think game-by-game is a bit of a misnomer as taking an average is also game-by-game in the same sense. What you are doing is matching true expected total scores, rather than averaging the rating of opponents.

    From a purist point of view, it's a pity that iteration is required (although the function shoud be monotonic which greatly simplifies the task of root finding).

    Have you looked for an explicit form? Refresh my memory, is the expected score calculated using a normal or logistic distribution?
    So einfach wie möglich, aber nicht einfacher - Albert Einstein

  5. #5
    CC International Master
    Join Date
    Jul 2005
    Posts
    2,148
    Quote Originally Posted by Bill Gletsos
    It involves not only game by game but also iteration. I explained how it works in post http://www.chesschat.org/showpost.ph...1&postcount=39
    I hope it is clear that your iteration is defferent from my iteration. If I correctly understand your post, your iteration is needed because in the first place you take an average rating, which is an approximate for calculating the performance. My iteration comes because the true performance rating should not depend on the initial rating.

    p.S. A stupid question - I hope you do not iterate by your hands. It will take me about 15 minutes to write a computer program that can do that.

  6. #6
    Illuminati Bill Gletsos's Avatar
    Join Date
    Jan 2004
    Location
    Sydney
    Posts
    16,544
    Quote Originally Posted by drug
    I hope it is clear that your iteration is defferent from my iteration.
    It is to me. I cannot speak for others.
    Quote Originally Posted by drug
    If I correctly understand your post, your iteration is needed because in the first place you take an average rating, which is an approximate for calculating the performance. My iteration comes because the true performance rating should not depend on the initial rating.
    Of course there are situations where you may not get convergence.
    Quote Originally Posted by drug
    p.S. A stupid question - I hope you do not iterate by your hands.
    No. By program.
    Quote Originally Posted by drug
    It will take me about 15 minutes to write a computer program that can do that.
    You can also do it via Excel.
    The Force can have a strong influence on the weak-minded.

  7. #7
    Illuminati Bill Gletsos's Avatar
    Join Date
    Jan 2004
    Location
    Sydney
    Posts
    16,544
    Quote Originally Posted by Rincewind
    Thanks. I think game-by-game is a bit of a misnomer as taking an average is also game-by-game in the same sense. What you are doing is matching true expected total scores, rather than averaging the rating of opponents.

    From a purist point of view, it's a pity that iteration is required (although the function shoud be monotonic which greatly simplifies the task of root finding).

    Have you looked for an explicit form? Refresh my memory, is the expected score calculated using a normal or logistic distribution?
    Logistic, however in my example both logistic and normal round to 2626.
    Logistic is 2626.2 and normal 2625.8.
    Last edited by Bill Gletsos; 23-01-2006 at 02:11 PM.
    The Force can have a strong influence on the weak-minded.

  8. #8
    CC International Master
    Join Date
    Jul 2005
    Posts
    2,148
    Quote Originally Posted by Bill Gletsos
    Of course there are situations where you may not get convergence.
    Do not agree. The process of performance rating calculation is equivalent to multiplying a vector by a matrix. If you repeat it - it means you multiply by the same matrix again.

    So what you have is x=A^n*y, where y is a vector of initial ratings. A is a matrix that gives performance rating when you multiply initial ratings by it. It is to the power of n because we repeat the process potentially infinite number of times.

    There is a very well known theorem in mathematics (unfortunately I forgot its name), which is saying that in the limit when n goes to infinity x has to converge to a unique vector, which is the eigenvector for matrix A.

  9. #9
    Illuminati Bill Gletsos's Avatar
    Join Date
    Jan 2004
    Location
    Sydney
    Posts
    16,544
    Quote Originally Posted by drug
    Do not agree.
    Sorry, I should have been clearer.
    There are situations where I dont believe they meaningfully converge.
    Quote Originally Posted by drug
    The process of performance rating calculation is equivalent to multiplying a vector by a matrix. If you repeat it - it means you multiply by the same matrix again.

    So what you have is x=A^n*y, where y is a vector of initial ratings. A is a matrix that gives performance rating when you multiply initial ratings by it. It is to the power of n because we repeat the process potentially infinite number of times.

    There is a very well known theorem in mathematics (unfortunately I forgot its name), which is saying that in the limit when n goes to infinity x has to converge to a unique vector, which is the eigenvector for matrix A.
    I know the one you mean. Its name escapes me too.


    Now perhaps I am wrong but I seem to recall seeing an example where they do not meaningfully converge.
    If I remember it correctly it goes something like the following.

    Players A and B draw with each and both beat player C and player D.
    Player C in turn draws with Player D.

    All start out nominally rated at 1500.
    Their performance ratings wont meaningfully converge.

    The reason why is as follows:

    A=B+C+D
    B=A+C+D
    C-A-B=D
    D-A-B=C

    In the above scenario with A, B, C and D you effectively have two groups (A,B) , (C,D) and everyone in the first group beats everyone they play in the second group. So this effectively 100% / 0% and (A,B) keep rising and (C,D) keep falling.

    Eventually they will converge at the point where (A,B)'s rating is around 920 points higher than (C,D). However are they meaningful.

    It is essentially the problem of how much stronger is Player X than Player Y when X has 100% score against Y over a number of games. All you know from the logistic scale is that X is possibly 920 points higher than Y or if using the normal scale 735 higher, however how much higher is unknown.
    Last edited by Bill Gletsos; 23-01-2006 at 02:43 PM.
    The Force can have a strong influence on the weak-minded.

  10. #10
    CC International Master
    Join Date
    Jul 2005
    Posts
    2,148
    Quote Originally Posted by Bill Gletsos
    Sorry, I should have been clearer.
    There are situations where I dont believe they meaningfully converge.
    I know the one you mean. Its name escapes me too.


    Now perhaps I am wrong but I seem to recall seeing an example where they do not meaningfully converge.
    If I remember it correctly it goes something like the following.

    Players A and B draw with each and both beat player C and player D.
    Player C in turn draws with Player D.

    All start out nominally rated at 1500.
    Their performance ratings wont meaningfully converge.

    The reason why is as follows:

    A=B+C+D
    B=A+C+D
    C-A-B=D
    D-A-B=C

    In the above scenario with A, B, C and D you effectively have two groups (A,B) , (C,D) and everyone in the first group beats everyone they play in the second group. So this effectively 100% / 0% and (A,B) keep rising and (C,D) keep falling.

    Eventually they will converge at the point where (A,B)'s rating is around 920 points higher than (C,D). However are they meaningful.

    It is essentially the problem of how much stronger is Player X than Player Y when X has 100% score against Y over a number of games. All you know from the logistic scale is that X is possibly 920 points higher than Y or if using the normal scale 735 higher, however how much higher is unknown.
    Yep, that is right. The same is true for scietific journals however as i mentioned it is implemented. The probablity of this event in a reasonable 11-round competition is pretty small.

    Say, for the recent championship do you believe this problem is relevant?

  11. #11
    Illuminati Bill Gletsos's Avatar
    Join Date
    Jan 2004
    Location
    Sydney
    Posts
    16,544
    In fact try this scenario.

    Players A and B draw with each and both beat player C.
    Player C in turn draws with Player D.

    All start out nominally rated at 1500.
    Their performance ratings wont meaningfully converge.

    The reason why is as follows:

    A=B+C
    B=A+C
    C-A-B=D
    D=C

    These dont seem to converge at all.
    The ratings for all 4 players just continue to drop.
    Last edited by Bill Gletsos; 23-01-2006 at 04:04 PM.
    The Force can have a strong influence on the weak-minded.

  12. #12
    Illuminati Bill Gletsos's Avatar
    Join Date
    Jan 2004
    Location
    Sydney
    Posts
    16,544
    Quote Originally Posted by drug
    Yep, that is right.
    Thanks for confirming that.
    Quote Originally Posted by drug
    The same is true for scietific journals however as i mentioned it is implemented. The probablity of this event in a reasonable 11-round competition is pretty small.
    True, provided you dont have a number of withdrawls who only played a few games and also played each other.
    I've also seen it happen enough in swiss events with many new players who are mostly unrated and very weak.

    Quote Originally Posted by drug
    Say, for the recent championship do you believe this problem is relevant?
    No.
    Last edited by Bill Gletsos; 23-01-2006 at 04:06 PM.
    The Force can have a strong influence on the weak-minded.

  13. #13
    Reader in Slood Dynamics Rincewind's Avatar
    Join Date
    Jan 2004
    Location
    The multiverse
    Posts
    21,567
    Quote Originally Posted by Bill Gletsos
    Logistic, however in my example both logistic and normal round to 2626.
    Logistic is 2626.2 and normal 2625.8.
    Well logistic is probably easier to work with. In that case it looks like you are dealing with a polynomial of degree n where n is the number of games in the calculation. Once n is reasonably large (greater than 4 or 5) you are doing iterative calculations anyway. With just a little bit of effort you could probably express this polynomial for a general n and then it would be a piece of cake to write a root-finding routine that converges very quickly using something as simple as Newton's method.

    However an explicit form would seem to be unobtainable.

    Out of interest what is a typical number of iterations currently required?
    So einfach wie möglich, aber nicht einfacher - Albert Einstein

  14. #14
    Reader in Slood Dynamics Rincewind's Avatar
    Join Date
    Jan 2004
    Location
    The multiverse
    Posts
    21,567
    Actually, expressing the thing as a polynomial is not a requirement to use Newton's method. You just need something which is differentiable and the starting function is is that. Here is a brief description of the method please let me know if you want further details...

    First thing is to normalise the opponent coefficients. call these a[r]

    a[r] = 10 ^ (rating/400)

    for r = 1..n

    Then if E is the expected score you are matching to you define a function y such that

    y = sum(1 / (1 + x * a[r]), r=1..n) - E

    and

    y' = - sum(a[r] / (1 + x * a[r])^2)

    Then you set x[0] equal to a first guess which you calculate this way (this is the standard performance rating)...

    set a = 10 ^ (averagerating/400)
    the set x[0] = (numberofgames - E) / (E * a)

    Then proceed as normal for Newton method...

    x[i+1] = x[i] - y(x[i])/y'(x[i])

    to convert x[i] into a rating use the formula

    rating = -400 * log(x[i])

    Using this method and your example data I get the right rating to one decimal place in one iteration. In three iterations I get the exact answer to the accuracy of single floating point in Excel (2626.191843)

    (Edit: fixed a couple of errors in the above 28-Jan-2006)
    So einfach wie möglich, aber nicht einfacher - Albert Einstein

  15. #15
    Reader in Slood Dynamics Rincewind's Avatar
    Join Date
    Jan 2004
    Location
    The multiverse
    Posts
    21,567
    NB Convergence is not guaranteed by this method and the system I outline above using the example data seems to not converge if you use a very low value of initial rating guess. Best single value seems to be a starting value of x[0] = 0 (equivalent to a rating guess of +infinity). Using this you get absolute converge to the accuracy of Excel in 5 iterations. However using the initial guess as outlined above is probably still best.
    So einfach wie möglich, aber nicht einfacher - Albert Einstein

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. A Ratings Calculator with a Database??
    By jase in forum Ratings Arena
    Replies: 1
    Last Post: 22-01-2014, 05:17 PM
  2. Old ratings used for the next ratings
    By Candy-Cane in forum Ratings Arena
    Replies: 1
    Last Post: 22-06-2005, 08:56 PM
  3. Junior performance and training
    By auriga in forum General Chess Chat
    Replies: 58
    Last Post: 20-12-2004, 09:34 PM
  4. ACF September 2004 Ratings
    By Bill Gletsos in forum Ratings Arena
    Replies: 172
    Last Post: 09-11-2004, 12:45 PM
  5. ACF March 2004 Ratings
    By Bill Gletsos in forum Ratings Arena
    Replies: 310
    Last Post: 14-04-2004, 03:58 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •