
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 20:58:45

The Impaller 
Level 9
Maybe this is self-correcting, but the system does seem to weight early wins a lot more than later wins. I didn't start playing in the ladder immediately upon inception, so it was later when I first finished some games. At this point I'm 6-0, but my 6 wins are only good for a 1558 rating. On the other hand, players with really early ladder results, like Waya at 1-0 or NoZone at 2-3, are rated a good deal higher.

Is there a log of all the commands that get entered into the system to generate the rankings? I am curious if switching the order on results affects the final result.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 21:12:43


Perrin3088 
Level 49
Imp, the reasoning afaik is that the early wins were all against everyone. Take NoZone's 2-3: the first loss was against Fizzer, who won 2-3 the first day, which jumped them both up. Then NoZone was playing against higher-rated players, which keeps him in that range with a more modest record.

Waya beat NoZone, who has that inflated record, which jumped his own rating up to match. The problem as I see it is that the original players moved around so fast, because one game was such a large share of their Elo, that even the losers ended up in a bracket that isn't justified by their W/L record.

once everyone gets a dozen games under their belts the issues should diminish.

Also, http://blog.warlight.net/index.php/2011/02/running-your-own-ladder-simulations/ has the logs as well as the program used to do it.


And I checked it recently, and there are 4 more wins with first pick than there are normally.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 21:15:07


crafty35a 
Level 3
The log used to generate the rankings can be found here: http://warlight.net/Data/BayeseloLog.txt

But no, the order doesn't matter with this system. The ratings change more at first because there is little data, so the program is less certain about players' true ratings and makes bigger adjustments to compensate.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 21:17:58

Fizzer 
Level 64

Warzone Creator
The order of the wins doesn't affect the outcomes, and neither does the order you join the ladder or anything like that. You can test this yourself by following the steps in the blog post - it's easy to re-arrange the games and see the effects.
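
A minimal sketch of why order cannot matter in a batch fit like this: the only inputs are win counts and games per pairing, and both are the same however the log is sorted. This is not Bayeselo's code. It is a plain Bradley-Terry fit with an assumed prior of virtual draws, and `fit_ratings`, `prior_draws` and the game list are made up for illustration.

```python
import math
from collections import defaultdict


def fit_ratings(games, prior_draws=1.0, iterations=500):
    """Batch-fit Elo-scale ratings from a list of (winner, loser) pairs.

    Hypothetical helper, not Bayeselo's code: a Bradley-Terry fit using the
    standard minorization-maximization update. `prior_draws` virtual draws
    per pairing (an assumed prior) keep unbeaten players at a finite rating.
    """
    wins = defaultdict(float)    # real wins plus half a win per virtual draw
    played = defaultdict(float)  # games per pairing, real plus virtual
    for winner, loser in games:
        wins[winner] += 1.0
        wins[loser] += 0.0       # registers players who never won
        played[tuple(sorted((winner, loser)))] += 1.0
    for a, b in list(played):
        wins[a] += 0.5 * prior_draws
        wins[b] += 0.5 * prior_draws
        played[(a, b)] += prior_draws

    players = sorted(wins)
    pairings = sorted(played.items())
    strength = {p: 1.0 for p in players}
    for _ in range(iterations):
        updated = {}
        for p in players:
            denom = sum(n / (strength[a] + strength[b])
                        for (a, b), n in pairings if p in (a, b))
            updated[p] = wins[p] / denom
        # Rescale so the geometric mean is 1, i.e. the average Elo is 0.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {p: s / math.exp(log_mean) for p, s in updated.items()}
    return {p: 400.0 * math.log10(s) for p, s in strength.items()}


# Made-up results, only for illustration.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "B"), ("A", "D")]

original = fit_ratings(games)
reordered = fit_ratings(list(reversed(games)))  # same games, reversed order
for player in sorted(original):
    print(player, round(original[player]), round(reordered[player]))
```

Both columns come out the same because the fit never looks at when a game was played. An incremental system that updates after every game would not have this property.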
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 21:39:10

The Impaller 
Level 9
It seems like this system may be designed under the idea that everyone is going to play everyone else at least once. I have a feeling that this system will perform very well if that does occur. I'm curious to see how it pans out if that doesn't occur.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 21:44:34


crafty35a 
Level 3
Impaller, funny that you should say that. Bayesian Elo was designed to rank computer chess AI engines, and the way that typically works is that they play each opponent a set number of times. I think that the system would work better if Fizzer makes ladder games essentially random, instead of trying to pair you up with people near your skill level (ignoring for now what I consider the essential flaw of this system, the retroactive part).
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 21:59:47

The Impaller 
Level 9
I got different results from running the http://warlight.net/Data/BayeseloLog.txt file with different configurations of match results. I ran it first just as it is, and it gave the following output:

Rank Name Elo + - games score oppo. draws
1 Fizzer 440 349 237 4 100% 191 0%
2 Waya 370 496 357 1 100% 235 0%
3 FBGDragons 319 270 230 4 75% 215 0%
4 Doushibag 253 284 221 4 100% 8 0%
5 NoZone 235 195 203 5 40% 272 0%
6 deweylikedonuts 135 279 306 3 33% 218 0%
7 ATrain 123 352 484 1 0% 253 0%
8 Soyrice 118 245 227 3 67% 84 0%
9 Grundie 109 353 356 2 50% 89 0%
10 Ruthless 108 465 298 2 100% -94 0%
11 sue 94 330 330 2 50% 94 0%
12 Perrin3088 75 241 244 6 50% 54 0%
13 PoopSandwich 68 303 236 3 100% -144 0%
14 TheImpaller 58 196 170 4 100% -70 0%
15 GuyMannington 0 324 324 2 50% -12 0%
16 FBGMoDogg -1 343 461 1 0% 118 0%
17 crafty35a -22 271 233 4 75% -147 0%
18 BluePrecision -39 310 279 3 67% -139 0%
19 chas -46 228 217 5 60% -125 0%
20 VampEZSTreet -51 343 461 1 0% 68 0%
21 devilnis -67 351 484 1 0% 58 0%
22 Eitz -67 351 484 1 0% 58 0%
23 BallLightning -73 352 484 1 0% 58 0%
24 Ragingpikey -73 352 484 1 0% 58 0%
25 Knoebber -142 165 168 9 44% -119 0%
26 iI,I±sI,IµlikIµÑ?I±ndy -178 335 335 2 50% -183 0%
27 CuChulainn -201 241 292 4 25% -84 0%
28 Adam -282 218 249 5 20% -101 0%
29 Alcarmacil -284 299 299 2 50% -284 0%
30 Shiver -287 300 300 2 50% -283 0%
31 3A6L3BA5T -343 182 252 8 0% -61 0%
32 KnA+v -348 301 487 2 0% -142 0%

I then ran it reversing the order of all match results. When I say order, I'm referring to when the game was played, the order in the text file. I didn't change, add or remove any actual match results. This was the result I got:

Rank Name Elo + - games score oppo. draws
1 Fizzer 650 345 217 8 100% 277 0%
2 Waya 555 486 303 2 100% 342 0%
3 FBGDragons 469 234 199 8 75% 315 0%
4 Doushibag 408 277 197 8 100% 18 0%
5 NoZone 342 168 176 10 40% 397 0%
6 ATrain 201 300 480 2 0% 408 0%
7 Soyrice 197 220 199 6 67% 140 0%
8 deweylikedonuts 184 250 265 6 33% 317 0%
9 Grundie 154 322 330 4 50% 127 0%
10 Ruthless 146 462 257 4 100% -152 0%
11 PoopSandwich 138 297 214 6 100% -199 0%
12 sue 126 281 281 4 50% 126 0%
13 Perrin3088 114 229 237 12 50% 86 0%
14 TheImpaller 94 190 152 8 100% -109 0%
15 FBGMoDogg 7 294 458 2 0% 197 0%
16 GuyMannington 5 274 275 4 50% -9 0%
17 crafty35a -41 240 204 8 75% -220 0%
18 BluePrecision -44 264 244 6 67% -192 0%
19 VampEZSTreet -52 294 458 2 0% 138 0%
20 chas -89 199 192 10 60% -193 0%
21 devilnis -104 299 480 2 0% 94 0%
22 Eitz -104 299 480 2 0% 94 0%
23 BallLightning -114 300 480 2 0% 94 0%
24 Ragingpikey -114 300 480 2 0% 94 0%
25 Knoebber -215 147 150 18 44% -177 0%
26 iI,I±sI,IµlikIµÑ?I±ndy -268 285 285 4 50% -271 0%
27 CuChulainn -296 205 248 8 25% -129 0%
28 Adam -442 189 216 10 20% -158 0%
29 Alcarmacil -446 249 249 4 50% -445 0%
30 Shiver -449 249 249 4 50% -444 0%
31 3A6L3BA5T -501 168 248 16 0% -88 0%
32 KnA+v -513 256 480 4 0% -215 0%

It's similar, but there are some definite differences. A number of people have changed positions on the ladder. This suggests that the order in which matches take place does have an effect on the ladder. Maybe this effect goes away later on, but there is one.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 22:07:41

The Impaller 
Level 9
Also, the 2nd result set has a much wider range of values. Fizzer is ranked 2150 in the 2nd one at first place and he's only 1940 in the first one.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 22:14:52


Perrin3088 
Level 49
make sure you run reset in the resultset> before you re-run a test
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 22:19:31


Perrin3088 
Level 49
i just ran it through twice, without reset or changing order, and received

Rank Name Elo + - games score oppo. draws
1 Fizzer 650 262 262 8 100% 277 0%
2 Waya 555 344 344 2 100% 342 0%
3 FBGDragons 469 215 215 8 75% 315 0%
4 Doushibag 408 226 226 8 100% 18 0%
5 NoZone 342 174 174 10 40% 397 0%
6 ATrain 201 338 338 2 0% 408 0%
7 Soyrice 197 208 208 6 67% 140 0%
8 deweylikedonuts 184 257 257 6 33% 317 0%
9 Grundie 154 341 341 4 50% 127 0%
10 Ruthless 146 315 315 4 100% -152 0%
11 PoopSandwich 138 242 242 6 100% -199 0%
12 sue 126 276 276 4 50% 126 0%
13 Perrin3088 114 247 247 12 50% 86 0%
14 TheImpaller 94 169 169 8 100% -109 0%
15 FBGMoDogg 7 329 329 2 0% 197 0%
16 GuyMannington 5 268 268 4 50% -9 0%
17 crafty35a -41 223 223 8 75% -220 0%
18 BluePrecision -44 252 252 6 67% -192 0%
19 VampEZSTreet -52 329 329 2 0% 138 0%
20 chas -89 199 199 10 60% -193 0%
21 devilnis -104 338 338 2 0% 94 0%
22 Eitz -104 338 338 2 0% 94 0%
23 BallLightning -114 338 338 2 0% 94 0%
24 Ragingpikey -114 338 338 2 0% 94 0%
25 Knoebber -215 153 153 18 44% -177 0%
26 iI,IñsI,IælikIæ¥?Iñndy -268 280 280 4 50% -271 0%
27 CuChulainn -296 225 225 8 25% -129 0%
28 Adam -442 201 201 10 20% -158 0%
29 Alcarmacil -446 240 240 4 50% -445 0%
30 Shiver -449 241 241 4 50% -444 0%
31 3A6L3BA5T -501 201 201 16 0% -88 0%
32 KnA+v -513 320 320 4 0% -215 0%

notice Fizzer
1 Fizzer 440 349 237 4 100% 191 0%
1 Fizzer 650 262 262 8 100% 277
....................^^^

4 to 8: the first example has 4 games and the second has 8, which indicates you doubled the games involved.
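
A rough sketch of why doubled games would stretch the ratings, assuming the fit uses a prior that behaves like a fixed number of virtual draws between opponents. That prior, its size, and the `elo_gap` helper are assumptions for illustration, not details taken from the program's output above. For two players, the fitted Elo gap then has a closed form.

```python
import math


def elo_gap(real_wins, real_losses, virtual_draws=2.0):
    """Fitted Elo gap between two players under a Bradley-Terry model.

    Hypothetical helper: each virtual draw is counted as half a win for
    each side, which is an assumed form of prior, not a confirmed one.
    """
    wins = real_wins + virtual_draws / 2.0
    losses = real_losses + virtual_draws / 2.0
    return 400.0 * math.log10(wins / losses)


once = elo_gap(4, 0)    # a 4-0 record, log loaded once
twice = elo_gap(8, 0)   # the same log loaded twice without a reset
print(f"loaded once: {once:.0f}, loaded twice: {twice:.0f}")
```

Under these assumptions the prior stays the same size while the real games double, so lopsided records pull further from the middle. That is the same direction as Fizzer's jump from 440 to 650.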
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 22:19:53


Perrin3088 
Level 49
and i missed with my arrows, lol
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 22:26:14

The Impaller 
Level 9
You're right. I didn't reset. I'm now getting the same result either way. Good catch.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 22:33:13


Perrin3088 
Level 49
I caught maybe a dozen problems with that program the other day, only to realize it was a fault in my coding. Thankfully I corrected it and erased the post before I posted most of them, lol
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 23:16:35

Basil 
Level 28
I'm personally just hoping I stay somewhere near the top 10 once rankings start to settle down ;). I'm definitely enjoying #2 while I'm up here, though... (1 game gave me almost 500 points! My second win ended up taking points away, though...)
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 23:19:01


Perrin3088 
Level 49
I am hoping I manage to edge my way into the top ten with my 3 more wins. And don't forget, Waya, if NoZone loses rank, you lose rank, since he is your big win :)
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/21/2011 23:29:22

The Impaller 
Level 9
I'm just hoping to get paired up against one of these titans of rating so I have a shot at stealing points from them :).
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/22/2011 00:09:27


Troll 
Level 19
I love how in this last update, that The Impaller brought 4 people into the top 10 solely based off playing him.
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/22/2011 00:51:57


NoZone 
Level 6
And I have started my slide....
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/22/2011 01:30:30


NecessaryEagle 
Level 59
at least you have somewhere to slide from :)
Discussion of the "retroactive rating updates based on opponents' future results" feature: 2/22/2011 20:50:17


Doushibag 
Level 17
The way it's working now is definitely wrong. People should not be getting a hefty boost just by losing to someone ranked higher or losing tons of points because they *beat* someone bad. I realize things would shake out somewhat eventually, but I still think the system as it is isn't a good system for Warlight. I don't think it should work like that, and if you're going to look back at previous games, when a game happened should matter a lot. A game that happened a day or two ago should matter much more than one 2 months ago. People's skill at the game will change over time and if the system ignores that then it isn't going to work very well.

I don't see much reason that we should give the current system more time. It will definitely look better after some weeks, but I still think it's an inappropriate system for a game like Warlight, where players' skill changes and where it isn't really appropriate for everyone to play everyone.
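
One way the recency idea above could be sketched is an exponential decay on each game's weight. This is only an illustration of the suggestion, not something the ladder does. The `recency_weight` helper, the 30-day half-life and the dates are all hypothetical.

```python
from datetime import datetime


def recency_weight(played_at, now, half_life_days=30.0):
    """Weight of a game that halves every `half_life_days` (hypothetical)."""
    age_days = (now - played_at).total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)


now = datetime(2011, 2, 22)
recent = recency_weight(datetime(2011, 2, 20), now)   # two days old
old = recency_weight(datetime(2010, 12, 24), now)     # sixty days old
print(f"recent game weight: {recent:.2f}, old game weight: {old:.2f}")
```

With these made-up numbers a two-day-old game keeps nearly full weight, while a two-month-old game counts for a quarter. A rating fit could multiply each game's contribution by such a weight, at the cost of having less effective data per player.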