Discussion is locked - replying not allowed

Posts 11 - 30 of 50
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 03:22:14


Perrin3088 
Level 49
The problem in all likelihood will never be solved for middle-ground players. Any new players in the ladder would affect the current players, and be affected themselves, similarly to how NoZone and BP are currently being affected. Say you're an average player at a 1500 score with a history, then someone new joins who does sub-par and loses his first 7 games. All of a sudden, with an average score, playing people only as good as you (1500-ish), your rank drops drastically because the system is unable to correctly place the newcomer...

The retroactive ratings would probably work better for people that have history, i.e. for anyone who hasn't been in the ladder for at least a month/X games, the games with them are done without retroactive ratings enabled.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 03:48:52


crafty35a 
Level 3
Well I've been playing around with the rating tool for a while, and I think I've nailed down what the issue is. I believe this tool is tailored towards calculating ratings for a set of players that each have a constant, unchanging playing strength. Why do I think this? First of all, notice the names of the "players" listed in the provided examples (http://remi.coulom.free.fr/Bayesian-Elo): Comet B.68, Dragon 4.7.5, Gandalf 4.32h, etc. These are all fairly well known chess engines (essentially AI programs that play chess).

Logically, it would absolutely make sense to retroactively adjust ratings based on the future performance of opponents, if the "players" were actually specific versions of chess engines. Why? Because these engines have a constant, unchanging strength level. Say a chess engine plays one game today, and another 99 games over the next six months. If we want to calculate the strength of the chess engine at the time of the first game played, every single one of the 100 games should be considered with equal weighting, because *the strength of a chess engine does not change over time*!

But with a human player, that of course is not true. I think this is the fundamental flaw with using this method to rate human players.
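
To make the retroactive effect concrete, here is a minimal sketch of a whole-history fit. It is not the Bayesian Elo tool itself: it assumes a plain Bradley-Terry model with no draws, a minorization-maximization style update, and a made-up regularisation (`prior_games`, one virtual win and one virtual loss against a reference player fixed at rating 0). The function name and parameters are hypothetical. The point is only that every player gets one constant strength fitted from all games at once, so a result recorded later can move a rating earned earlier.

```python
import math


def fit_ratings(games, iterations=500, prior_games=1.0):
    """Fit one constant strength per player from the whole game history.

    games: list of (winner, loser) name pairs.
    prior_games: assumed regularisation, a virtual win and a virtual loss
    of this weight against a reference player fixed at rating 0.
    Returns {name: rating on an Elo-like scale, 0 being the reference}.
    """
    players = sorted({name for game in games for name in game})
    gamma = {p: 1.0 for p in players}  # strength on a linear scale
    wins = {p: prior_games for p in players}
    opponents = {p: [] for p in players}
    for winner, loser in games:
        wins[winner] += 1.0
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    for _ in range(iterations):
        for p in players:
            # Update: wins divided by the sum of 1 / (own + opponent strength),
            # including the two virtual games against the reference player.
            denom = 2.0 * prior_games / (gamma[p] + 1.0)
            denom += sum(1.0 / (gamma[p] + gamma[o]) for o in opponents[p])
            gamma[p] = wins[p] / denom

    return {p: 400.0 * math.log10(gamma[p]) for p in players}


# A beats B once. Later B wins five games against C; A plays nothing more.
before = fit_ratings([("A", "B")])
after = fit_ratings([("A", "B")] + [("B", "C")] * 5)
print(round(before["A"]), round(after["A"]))
```

In this sketch A's rating rises in the second fit even though A played no new games, because B now looks stronger. That is reasonable if B's strength never changed, and questionable if B simply improved after the game against A.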
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 04:51:07


NecessaryEagle 
Level 59
and why is NoZone's rank higher than FBG-Dragon's?
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 04:52:37


NecessaryEagle 
Level 59
The way it looks right now is that losing to a good opponent is better than winning against a bad opponent, so if your first couple of games dropped you in the ratings, then it's harder to rise because you don't get placed with higher players.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 05:33:51

The Impaller 
Level 9
It does seem to be that way, however that may be corrected as it rediscovers who is a good player and so forth.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 05:51:03


Perrin3088 
Level 49
It seems to me like the early games modify your rating too much, i.e. you shouldn't be able to drop to 1300 or rise to 1700 in just a couple of games. Limiting that would keep new players closer to average until their real potential is proven.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 07:15:12


Perrin3088 
Level 49
Fizzer, why didn't you just put
offset 1500
on the page you linked us so it would automatically show the warlight rating?
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 07:24:21

Fizzer 
Level 64

Warzone Creator
I never noticed the offset command. I'll add that in - thanks!
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 07:36:36


Perrin3088 
Level 49
I also think that when we get a more established ladder, to solve my earlier fear (3 posts up, I think), we could implement a removerare X command before the elo command. It would make it so that new players would have to get at least X games before they influence the ladder, which IMHO could help keep the average range of players more steady. But of course I don't know; it will always be partly unsteady as long as new people come in.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 07:43:52

Fizzer 
Level 64

Warzone Creator
Perrin: I was just thinking the same thing. I was thinking it would be good even now - if the rankings that are displayed now are meaningless, they shouldn't be displayed at all. It's only causing mass panic and confusion.

I think it would be good to hide ranks until you've completed a certain number of games.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 08:00:30


Perrin3088 
Level 49
I'd also like to point out that only 3 people have completed 5+ games according to that list, lol...


And I am currently running a test ladder using the same names and a random number generator to determine wins/losses, to see what comes up.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 08:06:14


Perrin3088 
Level 49
Hmm, also: perhaps instead of using removerare X from the resultset> prompt, you could change ratings to ratings X. It would allow the games to still have an effect on the players that actually have had enough games, but the new players themselves wouldn't show up until they have reached the threshold.
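
The difference between the two options can be sketched roughly as follows. This is only an illustration of the idea as described in this post, not the tool's actual behaviour: the helper names are hypothetical, games are assumed to be (winner, loser) pairs, and the filter runs in a single pass.

```python
from collections import Counter


def games_played(games):
    """Count how many games each player appears in."""
    counts = Counter()
    for winner, loser in games:
        counts[winner] += 1
        counts[loser] += 1
    return counts


def drop_rare_players(games, min_games):
    """Option 1: discard every game that involves a player with too few games.

    Established players lose those results from their own record too.
    """
    counts = games_played(games)
    return [g for g in games if all(counts[p] >= min_games for p in g)]


def visible_players(games, min_games):
    """Option 2: keep every game for the fit, but list only established players."""
    counts = games_played(games)
    return sorted(p for p, n in counts.items() if n >= min_games)


games = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "new")]
print(drop_rare_players(games, 3))
print(visible_players(games, 3))
```

With the first option, A's win over the newcomer disappears from the data entirely. With the second, it still counts toward A's rating and only the newcomer's row is hidden.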
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 08:12:26


Perrin3088 
Level 49
Hmm, and in extra testing, for some reason using removerare 5 on the current ladder is causing the program to hang up on me. I thought they fixed mm hanging up when players are unconnected back in '05?
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 08:37:33


Perrin3088 
Level 49
Hmm, somehow I never got my test to pass the mm stage after removerare 5, but one of the times I accidentally did ratings >removerare 5.txt and it actually seemed to show up properly, despite not having been mm'd properly.

variances.

ratings 5
Rank Name        Elo    +    - games score oppo. draws
   1 Perrin3088   46  258  258     5   60%   -17    0%
   2 Knoebber   -133  179  179     7   43%   -94    0%
   3 3A6L3BA5T  -337  227  227     5    0%  -108    0%

removerare 5
Rank Name        Elo    +    - games score oppo. draws
   1 Perrin3088   94  282  282     2  100%   -47    0%
   2 Knoebber      0  271  271     2   50%     0    0%
   3 3A6L3BA5T   -94  282  282     2    0%    47    0%


using the real data of course.. so not much data.. :/
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 08:39:10


Perrin3088 
Level 49
And I ran my test ladder through 15000 games, I think, and then ran a new player through 10 games against the middle of the ladder, and the variances seemed to die down a fair bit around 6-9 games or so...


and just for SnG's, here's the results, *based absolutely only on RNG's, lol*

Rank Name                    Elo    +    - games score oppo. draws
   1 Ruthless                 49   20   20  1103   55%     2    0%
   2 CuChulainn               39   20   20  1055   54%     3    0%
   3 NoZone                   36   20   20  1050   54%     2    0%
   4 Adam                     32   20   20  1080   53%     4    0%
   5 GuyMannington            31   19   19  1197   53%     2    0%
   6 KnA+v                    25   19   19  1149   52%     2    0%
   7 Grundie                  25   20   20  1042   52%     4    0%
   8 deweylikedonuts          24   20   20  1098   52%     4    0%
   9 TheImpaller              22   21   21  1003   52%     4    0%
  10 FBGMoDogg                15   20   20  1104   51%     3    0%
  11 PoopSandwich             15   19   19  1171   52%     2    0%
  12 Doushibag                12   20   20  1090   51%     4    0%
  13 Ragingpikey               0   20   20  1072   50%     5    0%
  14 Shiver                   -1   21   21   991   50%     3    0%
  15 Waya                     -5   21   21  1021   49%     5    0%
  16 FBGDragons               -5   20   20  1081   49%     1    0%
  17 chas                    -11   20   20  1045   49%     2    0%
  18 Soyrice                 -11   20   20  1117   49%     1    0%
  19 3A6L3BA5T               -11   20   20  1035   48%     4    0%
  20 Perrin3088              -12   19   19  1164   48%     4    0%
  21 sue                     -14   20   20  1107   48%     5    0%
  22 devilnis                -15   20   20  1059   48%     2    0%
  23 Alcarmacil              -19   21   21  1008   47%     3    0%
  24 crafty35a               -22   21   21  1029   47%     2    0%
  25 Fizzer                  -24   20   20  1083   47%     3    0%
  26 iI,IñsI,IælikIæ¥?Iñndy  -30   21   21   996   46%     3    0%
  27 Knoebber                -38   20   20  1035   46%     2    0%
  28 BluePrecision           -39   21   21  1029   46%     0    0%
  29 new                     -69  190  190    10   40%     4    0%
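
For anyone who wants to try a similar experiment, here is a rough sketch of a coin-flip ladder. It is not the script used for the table above; the pairing rule (two distinct players picked uniformly at random), the seed, and the function name are all assumptions.

```python
import random


def simulate_random_ladder(num_players=28, num_games=15000, seed=0):
    """Play num_games coin-flip games between randomly paired players.

    Returns a list of (score_fraction, games_played), one entry per player.
    """
    rng = random.Random(seed)
    wins = [0] * num_players
    played = [0] * num_players
    for _ in range(num_games):
        a, b = rng.sample(range(num_players), 2)  # two distinct players
        winner = a if rng.random() < 0.5 else b   # pure chance, no skill
        wins[winner] += 1
        played[a] += 1
        played[b] += 1
    return [
        (wins[i] / played[i] if played[i] else 0.0, played[i])
        for i in range(num_players)
    ]


results = simulate_random_ladder()
scores = [score for score, _ in results]
print(f"lowest score {min(scores):.1%}, highest score {max(scores):.1%}")
```

Even with no skill differences at all, the scores spread out a little around 50%, which is the kind of noise visible in the table.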
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 08:39:27


Perrin3088 
Level 49
and those turned out ugly :/
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 13:29:17


NoZone 
Level 6
One thing to remember is that it has only been 48 hours. Let it run for a few days and see how the rankings settle down. Once there is a decent number of players with a few games completed, there won't be such wild swings.

That said, it is interesting to see how the nuts and bolts of the ELO ranking are made. I hadn't really thought about the massive recalculation that would need to be performed if you do adjust for previous games. Seems like that would be prohibitive enough to only use very recent games. How soon till that eats up your processing power?

NoZone
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 14:29:53


crafty35a 
Level 3
While I of course agree that the ratings will settle down after a while and begin to stabilize, I think the more interesting discussion is whether the retroactive adjustments make sense at all. As I mentioned briefly in my last post, Bayesian Elo was designed to measure performance between computer chess programs, which have a constant strength level. While the current method will eventually get us close to reasonable ratings, I don't think it makes sense for human players, long-term.

Some benefits to a more standard Elo system, in my opinion:
- It is much more intuitive. You win, you gain rating points. You lose, you lose rating points. The amount depends on the rating spread between the players.
- Since the calculation is then only between the two people involved in the game, I would think it would be possible to immediately update ratings when games complete, which would be a nice touch.
- No retroactive adjustments. If I beat a new WL player, and a month down the line he becomes a top player, there's no reason I should be rewarded for beating him as if he was a pro when we played.
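
For reference, the per-game update in a standard Elo system can be sketched like this. The K-factor of 32 and the function names are assumptions for illustration, not anything the ladder uses.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B on the usual 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner_rating, loser_rating, k=32.0):
    """Return (new_winner_rating, new_loser_rating) after a single game.

    k is the assumed K-factor: the most points one game can move a rating.
    """
    gain = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + gain, loser_rating - gain


print(elo_update(1500, 1500))  # evenly matched players
print(elo_update(1300, 1700))  # upset: the lower-rated player wins
print(elo_update(1700, 1300))  # the favourite wins
```

Only the two players in the game are touched, the winner always gains and the loser always loses, and an upset moves more points than an expected result.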
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 15:31:36


NoZone 
Level 6
Crafty35a,
I agree that the standard ELO makes much more sense. Possibly this was mentioned earlier, but what was the rationale for selecting the one currently in use?
NoZone
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 17:13:38

Dragons 
Level 56
I don't have a problem with past results having an impact on future games and I understand that it will take a week or two for everything to sort itself out.

With that said, in no way should someone who wins a game lose points or someone who loses a game win points. Dropping 200 points by beating someone who has been beaten by others is wrong. If that is an actual possibility with this system (and not a bug), I don't care how quickly everything will sort itself out, the system needs to be tweaked or scrapped.