
Posts 21 - 30 of 30   <<Prev   1  2  
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 17:51:38


Farah♦ 
Level 61
There are a few problems with the ELO calculations in this particular situation:

1) ELO assumes a logistic distribution of skill, which may or may not be the case on Warlight.

2) Warlight uses Bayesian ELO, which can highly overestimate ratings with a low sample of games. My highest rating is 2340; I definitely do not have a winrate of 94.68% against 1800-rated players. So when you try to compare a player with a rating of x and another player with a rating of y, you have to make sure they're at least within a certain confidence interval of actually having that rating, which is something Bayesian ELO often fails to ensure.

Other than that, really interesting data!
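
For reference, the winrate figures in this thread can be checked against the textbook logistic Elo curve. This is only a minimal sketch: the function name `elo_expected_score` and the 400-point scale are assumptions taken from the standard Elo formula, not Warlight's actual Bayesian ELO implementation, which may use a different curve.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B on the standard logistic Elo curve.

    Assumption: plain Elo with a 400-point scale, not the ladder's Bayesian ELO.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Rating pairs mentioned in the thread.
    for a, b in [(2200, 1700), (2340, 1800), (2200, 2000)]:
        print(f"{a} vs {b}: {elo_expected_score(a, b):.2%}")
```

Under this plain formula a 500-point gap (e.g. 2200 vs 1700) gives about 94.68%, and a 540-point gap (2340 vs 1800) gives about 95.7%.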
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 17:52:45


IRiseYouFall 
Level 61
farah was so excited about his contribution that he had to send it twice!

Edited 1/22/2019 17:52:52
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 18:16:02


Norman 
Level 58
@AI:
My prediction is that for the 2200 vs 1700 example, the actual score is not 5% (as predicted) but something like 10%. The same goes for the other cases.

5% also seems pretty low to me, but then again it's also kinda legit: you win 19 games and lose 1.

I'm not sure what statistical effect this has, but maybe the 2200 guy only got to his 2200 rating because the ladder protected him from losing to 1700 ELO guys by pairing him up against players in his range. I'm thinking it might be easier for the 2200 ELO guy to beat 2000 ELO guys 75% of the time than to beat 1700 ELO guys 95% of the time.
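
To put the 5% vs 10% question in rating terms, the logistic Elo curve can be inverted to ask what rating gap a given score would imply. This is a rough sketch under the same textbook assumption (400-point logistic scale); `implied_rating_gap` is a hypothetical helper name, not part of any ladder code.

```python
import math


def implied_rating_gap(score: float, scale: float = 400.0) -> float:
    """Rating gap (favourite minus underdog) implied by the favourite's score.

    Assumption: standard logistic Elo curve with a 400-point scale.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return scale * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Favourite scoring 95% (19 wins, 1 loss), 90%, and 75% of the points.
    for score in (0.95, 0.90, 0.75):
        print(f"{score:.0%} score -> gap of {implied_rating_gap(score):.1f} points")
```

On this curve, a favourite who only scores 90% is behaving like a player roughly 380 points higher rather than 500, and a 75% score corresponds to a gap of roughly 190 points.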

Edited 1/22/2019 18:19:56
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 18:35:58


MaikMcJuggle 
Level 62
I'm thinking it might be easier for the 2200 ELO guy to beat 2000 ELO guys 75% of the time than to beat 1700 ELO guys 95% of the time.


In chess I would disagree, but for Warlight this really is an issue.
But luck is a component of Warlight, so maybe it is okay.

Edited 1/22/2019 18:37:33
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 19:08:35

Nick 
Level 58
There are a few problems with the ELO calculations in this particular situation:

1) ELO assumes a logistic distribution of skill, which may or may not be the case on Warlight.

2) Warlight uses Bayesian ELO, which can highly overestimate ratings with a low sample of games. My highest rating is 2340; I definitely do not have a winrate of 94.68% against 1800-rated players. So when you try to compare a player with a rating of x and another player with a rating of y, you have to make sure they're at least within a certain confidence interval of actually having that rating, which is something Bayesian ELO often fails to ensure.

Other than that, really interesting data!


I could not agree with you (and several of the following comments from Maik, Norman, AI) more, Farah!

In fact, one of the projects I'm working on now is based on this exact claim (that ELO is in some ways not a great fit for Warzone, and what could easily be done within the confines of Bayesian ELO to at least improve its accuracy).
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 20:04:49


Beren Erchamion 
Level 64
My highest rating is 2340


This doesn't detract from your overall point, but using your highest rating as a reference point seems misleading. More accurate would be your average rating over the interval in question.
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 20:10:03


Farah♦ 
Level 61
In this case, it's indeed misleading, but that's a flaw of Bayesian ELO. Your rating over a certain interval is supposed to be dynamic. Your skill gets better or worse over time, so it fluctuates within an interval. Your peak rating is supposed to be your peak performance, i.e. you have performed at that level of play, according to the rating system. So what my peak rating is supposed to say is that at a certain time, I performed at that level. Which is not true, obviously.
1v1 Ladder Rating/Rank Manipulation Visual: 1/22/2019 20:34:24


ChrisCMU 
Level 61
I can't speak for other players, but I don't take time off to let my games clear and then make a run. Usually, I join a ladder when my game count is lower and I need something else to do. I am rarely on more than 1 ladder at a time. If I were trying to manipulate the ladder, I would join before my games all expire, to take advantage of a period of wins (right after some losses expire).

Right now I am doing 2v2 ladder with two accounts. I have no goal of making #1, it won't happen as when I play solo I tend to commit right away and don't put the needed effort into it. I just wanted to get some more experience on it and nobody in my clan was available to do it (note if someone good wants to play it with me, send me a msg on here).

I would not consider that a run, as I'm not aiming for top spot. Just upping my game count temporarily which I will end very soon with CL starting up.
1v1 Ladder Rating/Rank Manipulation Visual: 1/24/2019 05:03:09

Nauzhror 
Level 58
Elo winrate calculations are never very accurate. In some games the underdog is massively underrated; in others the reverse is true.

I might be able to beat a 1700 player close to 95% of the time at strat 1v1, but probably closer to 90%.

In chess on the other hand, a 2200 elo player would almost never lose to a 1700. They'd never even play vs each other in tournament play. And no, by almost never I don't mean they'd win 95%, I mean they'd likely win 99.95%.

In Warzone meanwhile, I probably couldn't even beat a 1000 elo player 99.95% of the time. 1000 elo players are terrible, I'm not going to beat around the bush and try to not offend people at that rating - they're bad, and they lack fundamental understanding of how to play the game. Despite that, they have been known to very rarely have upsets vs top players where they won by doing things that were super unpredictable and worked as a result.

Edited 1/24/2019 05:05:23
1v1 Ladder Rating/Rank Manipulation Visual: 1/24/2019 05:21:18


TBest 
Level 60
In chess on the other hand, a 2200 elo player would almost never lose to a 1700. They'd never even play vs each other in tournament play. And no, by almost never I don't mean they'd win 95%, I mean they'd likely win 99.95%.

Hm, well, in case you don't know, Elo is also very inaccurate at predicting win rates in chess. Your statement is very inaccurate in this regard, though I don't have the actual % here. Also, 1700s frequently play 2200s in tournaments. The typical example here is kids, but older 1700 players can do this as well.

This article caused some interest when it was written. https://en.chessbase.com/post/the-elo-rating-system-correcting-the-expectancy-tables [the graph is win% predicted by elo and win% actual. Actual is lower.]

It is a bit dated at this point, however I am not aware of a more recent in-depth look into chess ratings. Also, the above link doesn't contain the entire article :/ Jeff Sonas has also done stuff like this : http://blog.kaggle.com/2011/04/24/the-deloittefide-chess-competition-play-by-play/

{a competition to make the best elo system}

Basically, if you are interested in chess ratings, there are a fair number of articles on the subject out there. The main argument in favor of ELO is that it is easy and practical to work with.

Edited 1/24/2019 05:31:41