Clan matchups on the 1v1 ladder: 12/5/2015 03:48:54


l4v.r0v 
Level 59
Just wanted to see what the data looked like. Now that I have it, might as well share it. :)

Basically, while things like Clan League judge a clan by their best players (on some level), I wanted to see what clans looked like in terms of an average encounter between another player and them (so not looking at them in terms of their average player, because this is going to also be weighted by player activity). That's kind of hard to do in general, but it's easy to do within the context of the 1v1 ladder.

There's a bit of a tradeoff between data recency and accuracy, so I just scraped the past 1000 1v1 ladder matches (825 of which were interclan matches) and used them to construct TrueSkill ratings for each clan (by treating each clan as a single player) as well as unclanned players in general. Then I used the TrueSkill ratings to create a 2-D matrix of clan matchups and win probabilities- the value in the matrix corresponds to the probability that the clan on the left beats the clan on the top (i.e., row clan vs. column clan).
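For anyone who wants to reproduce the matrix step, here is a minimal sketch of how a win probability can be derived from two TrueSkill-style ratings. It assumes the common no-draw formula P(A beats B) = Φ((μA - μB) / sqrt(2β² + σA² + σB²)) with β = 25/6; the function names and the example ratings are hypothetical, not taken from the spreadsheet or the original script.

```python
import math

BETA = 25.0 / 6.0  # assumed performance noise (a common TrueSkill default)


def normal_cdf(x):
    """Standard normal CDF, written with erfc so the tails stay accurate."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def win_probability(a, b, beta=BETA):
    """Probability that rating a beats rating b; each rating is (mu, sigma). Draws ignored."""
    mu_a, sigma_a = a
    mu_b, sigma_b = b
    denom = math.sqrt(2.0 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2)
    return normal_cdf((mu_a - mu_b) / denom)


def matchup_matrix(ratings):
    """Return (names, matrix) where matrix[i][j] = P(names[i] beats names[j])."""
    names = sorted(ratings)
    matrix = [[win_probability(ratings[row], ratings[col]) for col in names]
              for row in names]
    return names, matrix


# Hypothetical ratings, for illustration only.
ratings = {"Clan A": (30.0, 2.0), "Clan B": (25.0, 3.0), "(no clan)": (24.0, 1.0)}
names, matrix = matchup_matrix(ratings)
for name, row in zip(names, matrix):
    print(name, [round(p, 3) for p in row])
```

Each row is the clan on the left and each column is the clan on the top, so the diagonal is always 0.5 and mirrored cells sum to 1.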

You can check this stuff out here: https://docs.google.com/spreadsheets/d/1LwDMeuTgEWqqImFTuZ2KUl5AIJlOEJCf-o2rz602ie4/edit?usp=sharing

There are clearly some flaws in the data since it's going to underrate (or overrate) clans that don't engage very much in the 1v1 ladder- and clans whose 1v1 ladder activity isn't representative of their memberbase also benefit or suffer similarly (case in point: Apex being underrated).

As far as the color-coding goes: a green row/red column means a clan that the analysis thinks is pretty competitive; a red row/green column means a clan that the analysis thinks isn't very competitive.

(Before you ask why I used TrueSkill instead of just using the matches directly: most of the matchups this is assigning probabilities to haven't actually happened in the past 1000 1v1 ladder games, so I had to extrapolate somehow and TrueSkill seemed like the best option because of the way it models skill/uncertainty.) Of course, this also brings me to the biggest flaw in this method: it doesn't actually measure the average encounter as it doesn't take into account the pairing algorithm of the 1v1 ladder. While it says the average encounter between Optimum and ILLUMINATI is going to result in an Optimum win around 89 times out of 100, that's not going to bring me any comfort in a matchup against an ILLUMINATI player because they're probably going to have a rating around mine. But if I took that into account, then all the values would be hovering around .500 since the pairing algorithm favors even matchups- but where's the fun in that? Instead, by ignoring the pairing algorithm, I'm using the 1v1 ladder as a proxy for player skill weighted by activity (since activity on the 1v1 ladder is a lot easier to measure than in general).

Edited 12/5/2015 04:05:00
Clan matchups on the 1v1 ladder: 12/5/2015 11:44:52


Timinator • apex 
Level 67
i think a bunch of clans having most (or all) of their players around the 1500 area also leads to a big misrating. can't think of another reason why juggernauts ended up 2nd in your table atm.
Clan matchups on the 1v1 ladder: 12/5/2015 17:21:05


l4v.r0v 
Level 59
Their actual ratings weren't taken into consideration- instead they were just re-rated using TrueSkill (as a group). I think small sample sizes had a lot to do with it, especially since TrueSkill doesn't take future performance into account.

I iterated through 1000 games- suppose that the Juggernauts' games were concentrated heavily in the first 200. That makes it much more likely that their opponents' ratings weren't really that accurate when they faced them, so if they beat, say, an overrated clan or something like that a few times before that clan's rating was updated, they get overrated too.

I'm trying to redo this in the larger context of the game as a whole, but it's hard to measure a player's skill at that stage. Maybe I can just take the 1v1 ladder basket, average ratings, and then treat those as clan ratings, and then calculate matchup odds.
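A rough sketch of that averaging idea, assuming the ladder ratings behave like Elo ratings on the usual 400-point logistic scale (an assumption; the ladder's actual rating system may map to win odds differently). The function names and the player data are hypothetical.

```python
from collections import defaultdict


def clan_average_ratings(players):
    """players: iterable of (clan, ladder_rating) pairs. Returns {clan: mean rating}."""
    totals = defaultdict(lambda: [0.0, 0])  # clan -> [sum of ratings, player count]
    for clan, rating in players:
        totals[clan][0] += rating
        totals[clan][1] += 1
    return {clan: total / count for clan, (total, count) in totals.items()}


def elo_expected_score(r_a, r_b, scale=400.0):
    """Expected score of a against b under the standard Elo logistic curve (assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


# Hypothetical ladder ratings, for illustration only.
players = [("Clan A", 1600), ("Clan A", 1800), ("Clan B", 1500), ("Clan B", 1550)]
averages = clan_average_ratings(players)
print(averages)
print(round(elo_expected_score(averages["Clan A"], averages["Clan B"]), 3))
```

Note that a plain mean like this weights every ranked player equally, so it measures the average player rather than the average encounter.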

A lot of the data is pretty weird for sure- Lynx is super-underrated as well.

Edited 12/5/2015 17:29:33
Clan matchups on the 1v1 ladder: 12/5/2015 17:53:57

wct
Level 56
Very kewl, knyte.

especially since TrueSkill doesn't take future performance into account.
What does this mean?

[Edit: Wait, I think I know what you mean now.]

Edited 12/5/2015 17:56:24
Clan matchups on the 1v1 ladder: 12/5/2015 17:58:45

wct
Level 56
One way you could overcome that time-dependent effect would be to re-run the analysis (perhaps a few times) with the order of the games randomized. If you do it multiple times, you could perhaps take some form of average, or use whatever specific method TS has for averaging over randomized trials, if there is one.
Clan matchups on the 1v1 ladder: 12/5/2015 18:02:22


l4v.r0v 
Level 59
Ah- that's a good idea! I'm just going to have to wait a bit before doing that since I've figured out a way to make the analysis not quite as slow. Also, the servers are unfortunately under a bit of heavy load right now. :(

Instead of even averaging it, I could simply just run the analysis 20 or so times across different randomized shuffles of the last 1000 games.

Jk that's just going to exaggerate the effect of outlier wins against clans like MASTER/etc.; I'll have to go through and average.

Edited 12/5/2015 18:22:51
Clan matchups on the 1v1 ladder: 12/5/2015 18:50:55


Master Jz 
Level 62
Can you do more games (4000 to 5000)? With over 400 ranked, you are probably only looking at 4-5 games per player.
Clan matchups on the 1v1 ladder: 12/5/2015 18:59:26

wct
Level 56
Also, the servers are unfortunately under a bit of heavy load right now. :(

That's not from one of your scripts is it?! ;-) :-D :-D
Clan matchups on the 1v1 ladder: 12/5/2015 19:01:41

wct
Level 56
Instead of even averaging it, I could simply just run the analysis 20 or so times across different randomized shuffles of the last 1000 games.

That kinda makes sense, but will give you a much smaller sigma than is warranted.

[ETA: Now that I think about it, I'm pretty sure just averaging over multiple independent shuffles is actually the correct thing to do. You might have to use log-mean (or geometric mean; same thing basically, I think) on the sigma/variance parameter, but the mu params should just be arithmetic-mean I think.]
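A minimal sketch of the shuffle-and-average idea, using only the standard library. It assumes a simplified 1v1 TrueSkill update with no draws and no dynamics term (so the numbers will differ slightly from what a full TrueSkill library produces), treats each clan as one player, and averages mu arithmetically and sigma geometrically across shuffles. All names and the game list are hypothetical.

```python
import math
import random

MU0 = 25.0            # assumed starting mean
SIGMA0 = MU0 / 3.0    # assumed starting uncertainty
BETA = SIGMA0 / 2.0   # assumed performance noise


def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def rate_1v1(winner, loser, beta=BETA):
    """Updated (mu, sigma) for winner and loser; simplified no-draw update."""
    (mu_w, sig_w), (mu_l, sig_l) = winner, loser
    c = math.sqrt(2.0 * beta ** 2 + sig_w ** 2 + sig_l ** 2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # mean shift factor
    w = v * (v + t)         # variance shrink factor, between 0 and 1
    new_w = (mu_w + sig_w ** 2 / c * v,
             sig_w * math.sqrt(1.0 - sig_w ** 2 / c ** 2 * w))
    new_l = (mu_l - sig_l ** 2 / c * v,
             sig_l * math.sqrt(1.0 - sig_l ** 2 / c ** 2 * w))
    return new_w, new_l


def rate_games(games):
    """games: list of (winning_clan, losing_clan), processed in the given order."""
    ratings = {}
    for winner, loser in games:
        rw = ratings.get(winner, (MU0, SIGMA0))
        rl = ratings.get(loser, (MU0, SIGMA0))
        ratings[winner], ratings[loser] = rate_1v1(rw, rl)
    return ratings


def shuffled_average(games, runs=20, seed=0):
    """Average ratings over several random game orders."""
    rng = random.Random(seed)
    mu_sum, log_sigma_sum = {}, {}
    for _ in range(runs):
        order = list(games)
        rng.shuffle(order)
        for clan, (mu, sigma) in rate_games(order).items():
            mu_sum[clan] = mu_sum.get(clan, 0.0) + mu
            log_sigma_sum[clan] = log_sigma_sum.get(clan, 0.0) + math.log(sigma)
    # Arithmetic mean for mu, geometric mean for sigma.
    return {clan: (mu_sum[clan] / runs, math.exp(log_sigma_sum[clan] / runs))
            for clan in mu_sum}


# Hypothetical interclan results, for illustration only.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("A", "B")]
for clan, (mu, sigma) in sorted(shuffled_average(games).items()):
    print(clan, round(mu, 2), round(sigma, 2))
```

Because each run still sees every game exactly once, the averaged sigma stays in the range a single pass would give, rather than shrinking the way it would if the shuffled passes were chained together as 20,000 games.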

[ETA2: Why are the mu params and the 'rating' scores different in your first table? I thought mu represented the rating score? (I could easily be forgetting something from what I read about TS.) -- Oh, right. I did forget something. Thx.]

Edited 12/5/2015 19:14:39
Clan matchups on the 1v1 ladder: 12/5/2015 19:13:21


l4v.r0v 
Level 59
^ Rating is traditionally mu - 3*sigma but I used mu - 1.5*sigma because of the small (and highly variant) sample size
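To make the effect of that multiplier concrete, here is a small sketch of the conservative rating mu - k*sigma with k = 3 and k = 1.5. The two example clans and their numbers are hypothetical.

```python
def conservative_rating(mu, sigma, k=3.0):
    """Displayed rating: the mean skill estimate minus k standard deviations."""
    return mu - k * sigma


def rank_clans(ratings, k):
    """ratings: {clan: (mu, sigma)}. Returns clan names, best displayed rating first."""
    return sorted(ratings,
                  key=lambda clan: conservative_rating(*ratings[clan], k=k),
                  reverse=True)


# Hypothetical clans: one with few games (high sigma), one with many (low sigma).
ratings = {"Few games": (32.0, 4.0), "Many games": (28.0, 2.0)}
print(rank_clans(ratings, k=3.0))
print(rank_clans(ratings, k=1.5))
```

A smaller k penalizes uncertainty less, so a clan with a high mu but few games can move ahead of a well-sampled clan with a lower mu.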
Clan matchups on the 1v1 ladder: 12/5/2015 19:13:52

wct
Level 56
Can you do more games (4000 to 5000)? With over 400 ranked, you are probably only looking at 4-5 games per player.

More would be better of course, but just pointing out that he treats each clan as a single TS-player, so most TS-players are actually an aggregate of more than one WL-player. There are 70 TS-players listed (69 clans plus '(no clan)'), so that would be an avg of 14-ish games per clan/TS-player. But your point is still valid; more samples would be better for sure. In fact, it might make more sense from a real-world perspective to sample games by date, like all games in the past week or month or something.
Clan matchups on the 1v1 ladder: 12/5/2015 19:21:44


l4v.r0v 
Level 59
I can do that. I was initially doing the last 10000 games but the analysis was real slow because I had to send 10,001 queries to the Warlight API plus up to 20,000 requests to user pages (since clans aren't given by the query game API) but now I've fixed my code so that I can make just (# of games)/50 queries and be done with things pretty quickly.
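The request counts quoted above work out as follows. This is just the arithmetic, not the actual scraper: the "+ 1" reflects one reading of the 10,001 figure (one query per game plus one extra), the two profile lookups per game are the stated upper bound, and the function names are hypothetical.

```python
import math


def requests_slow_method(n_games):
    """One API query per game plus one extra, and up to two user-page lookups per game."""
    api_queries = n_games + 1
    user_page_requests = 2 * n_games  # upper bound: both players' pages
    return api_queries + user_page_requests


def requests_fast_method(n_games, games_per_query=50):
    """One query per batch of 50 games, rounded up."""
    return math.ceil(n_games / games_per_query)


for n in (1000, 10000):
    print(n, requests_slow_method(n), requests_fast_method(n))
```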

What timeframe do you want to work with here?

Edited 12/5/2015 19:22:13
Clan matchups on the 1v1 ladder: 12/5/2015 22:26:32


l4v.r0v 
Level 59
Here is one with the past 10,000 games (7,752 of which were interclan matchups):

https://docs.google.com/spreadsheets/d/1nIlqEUzpkx4vzJgsUGsAkj097vEGXoK6rt9YAYapIMQ/edit?usp=sharing

One caveat: clans that don't have clan icons weren't counted by the faster method I used this time. I can't really fix that without making the fast method slow.

As for Apex, turns out only 5 of those 7752 games involved Apex in some way (with the most recent one being from October 1) and 3 of them were losses. Small sample size- nothing I can really do about that.
Clan matchups on the 1v1 ladder: 12/5/2015 22:42:42


Onoma94
Level 61
How big a timeframe is 1000 1v1 ladder games?
Clan matchups on the 1v1 ladder: 12/5/2015 22:48:29


l4v.r0v 
Level 59
1000 just went as far back as 11/27- so a week.
Clan matchups on the 1v1 ladder: 12/6/2015 00:07:32


l4v.r0v 
Level 59
And now, all time: https://docs.google.com/spreadsheets/d/1sNuA6loNMIEwupjrNVLt3rq9uav-lm_1DjyWblNKtmE/edit?usp=sharing

16,495 of the 132,590 games that had occurred on the 1v1 ladder at the time of this analysis were clan vs. clan matchups. That's about 12.4%.

Edited 12/6/2015 00:08:58
Clan matchups on the 1v1 ladder: 12/6/2015 00:32:02

[wolf]japan77
Level 57
Somehow there are clans worse than TLW in every list, we're not last!!!
Clan matchups on the 1v1 ladder: 12/6/2015 01:05:17


l4v.r0v 
Level 59
Clan matchups on the 1v1 ladder: 12/6/2015 06:47:56


Master Ree 
Level 58

And now, all time: https://docs.google.com/spreadsheets/d/1sNuA6loNMIEwupjrNVLt3rq9uav-lm_1DjyWblNKtmE/edit?usp=sharing

16,495 of the 132,590 games that had occurred on the 1v1 ladder at the time of this analysis were clan vs. clan matchups. That's about 12.4%.


All I learned from this is WarLight is indeed rigged (see WarLight Staff).

Edited 12/6/2015 06:48:59
Clan matchups on the 1v1 ladder: 12/6/2015 06:58:33


l4v.r0v 
Level 59
lol they're ranked based on 2 games (both of which were Fizzer vs. REGL); I only kept them in because I found it funny

Edited 12/6/2015 07:03:28