
Posts 101 - 120 of 190
What happened to MH?: 10/26/2022 14:51:13


Norman 
Level 58
You also don't understand statistics nearly as much as you think you do. You think that the winrate of all clans in CW has to regress to 50%, because of the matchmaking system.


I have never said that but have always been precise without ever changing my statement.

-------------------
The win rate of CW is a function determined solely by the skill difference between your clan and your opponent clans. Both skills are measured by the Clan Wars rating, which gives a relatively adequate and stable representation of the skill of your clan's players in the templates they are playing. Merely increasing your own clan's skill (=rating) has no direct effect on your win rate, since the average opponent's skill will follow suit. The one and only way (true for each and every clan) to increase your win rate is to change the skill (=rating) gap between your clan and the average opponent clan. In practical terms, you need to force the matchmaking to pair you with weaker opponents.

The matchmaking algorithm is well understood by the interested WarLight community. It goes top down from the highest to the lowest rated clan, and when it tries to pair a player it looks for the next unpaired player from a lower ranked clan to pair him with. It is true that when a clan is extremely high skilled (= has a high rating), the matchmaking algorithm will be more prone to fail to give an adequately rated opponent. However, the abysmal win rate of the usually highest rated clan (=ONE!) should be a warning. An interesting effect of the matchmaking algorithm is that when you join with "n+1" players instead of "n" players, you are guaranteed to get a lower rated (or equally rated) opponent clan for each new player you join with. So you can easily force the matchmaking algorithm to give you opponents from easier clans by just joining with more players.
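As a rough illustration of the pairing order described above, here is a minimal Python sketch. It assumes a single template and timeslot, and it treats each clan as just a name, a rating, and a player count. The function name `pair_top_down` and the sample clans are made up for the example, and the real matchmaking certainly has details this leaves out.

```python
def pair_top_down(clans):
    """Pair players top down by clan rating (simplified sketch).

    clans: list of (name, rating, player_count) tuples.
    Returns a list of (higher_clan, lower_clan) pairings.
    """
    # Highest rated clan first.
    ordered = sorted(clans, key=lambda c: c[1], reverse=True)
    remaining = [c[2] for c in ordered]  # unpaired players per clan
    pairs = []
    for i in range(len(ordered)):
        while remaining[i] > 0:
            # Next lower ranked clan that still has an unpaired player.
            j = next(
                (k for k in range(i + 1, len(ordered)) if remaining[k] > 0),
                None,
            )
            if j is None:
                break  # nobody left below; this player stays unpaired
            remaining[i] -= 1
            remaining[j] -= 1
            pairs.append((ordered[i][0], ordered[j][0]))
    return pairs


# Hypothetical clans: A is the highest rated.
one_player = pair_top_down([("A", 400, 1), ("B", 300, 1), ("C", 200, 1), ("D", 100, 1)])
two_players = pair_top_down([("A", 400, 2), ("B", 300, 1), ("C", 200, 1), ("D", 100, 1)])
print(one_player)
print(two_players)
```

In this toy setup, A's extra player is paired with C rather than B, which is the "n+1" effect: every additional player reaches further down the list.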

If a clan truly is better than each and every opponent clan, this does not change the fact at all that they have a positive win rate solely because they play against weaker opponent clans. There are no edge cases which need a separate explanation. You tell me your rating + collect the ratings of all the opponent clans your clan has played against, and I use an online site to calculate your win rate exactly, up to +-2% random noise.
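The calculation can be sketched in a few lines of Python. This assumes the classical Elo expectation formula with a 400-point scale, which is only an approximation for CW since its rating system is not plain Elo. The function names and the sample ratings are hypothetical.

```python
def elo_expected(rating, opponent_rating, scale=400.0):
    """Classical Elo win probability of `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def predicted_win_rate(own_rating, opponent_ratings):
    """Average expected win probability over the opponents actually faced."""
    probs = [elo_expected(own_rating, r) for r in opponent_ratings]
    return sum(probs) / len(probs)


# Same rating gap, different absolute ratings: the predicted win rate is identical.
low = predicted_win_rate(300, [200, 250, 300])
high = predicted_win_rate(500, [400, 450, 500])
print(round(low, 3), round(high, 3))
```

Only the gaps enter the formula, so raising your own rating changes nothing if your opponents' ratings rise by the same amount.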

-------------------

I kept hearing stuff regarding my explanation not being valid due to different players contributing to the rating and whatnot. You can indeed create a very artificial case with an opponent clan tanking win rates in one template and gaining in the other while you are only facing them in one of the two templates. However, those are at least theoretically plausible arguments, and this is not what you and others were throwing at me. People were just getting mad at me because our clan had a 55%ish win rate, Exel had something relatively similar, and I had explained to my guys that they should not look at the win rate but at the rating gap to determine the skill. The mates who got mad at me did so because they were not capable of grasping basic maths. There was some talk from other clans about us inflating our rating due to playing lower rated clans and stuff. This all makes no sense at all. It has no effect on your rating whether you beat up noob clans or struggle playing against evenly matched opponents. I'm not a genius for saying that but just a random dude on the internet who knows how Elo works.

Edited 10/26/2022 15:49:23
What happened to MH?: 10/26/2022 16:21:05


l4v.r0v 
Level 59
Yeah I don't think Norman's ever implied universal regress to 50%, since he's been more aware than basically anyone else of all the ways to avoid this regress (e.g., he's the one who explained squad bonus to me and other people). Can't help but think some of the criticism of Norman specifically is misplaced.

He's generally a thoughtful person and has invested his free time in helping other people. I understand there's interpersonal conflicts with him at play here, but I don't think it's fair to essentialize all that by saying he's generally rude or dumb.

---

@(deleted): hi nonolet. I'm sorry I publicized your cheating, but it's been nearly a year since then. You're like 50-something, right? Spend your remaining time better than this. A whole year holding a grudge is like 3-5% of your remaining lifespan. Your daughter's going to ask why daddy didn't spend more time with her, and I hope you have a better answer then than "I had to obsessively compare this guy to Hitler after he caught me cheating for pennies in digital currency." Let it go. Move on. Find God.

---

Anyhow this thread has gotten rather toxic and doesn't present well for M'H or Optimum, especially as the squabbles get increasingly niche and the topic goes unanswered.

Edited 10/26/2022 16:33:30
What happened to MH?: 10/26/2022 17:29:20


Harmony 
Level 58
@Norman you seem to be very smart, so could you explain this:

According to https://wz-clanwars.netlify.app
Harmony: Win rate: 55% || CW Rating: 142.6
M'Hunters: Win rate: 56% || CW Rating: 439.0

Why is there such a rating gap between our 2 clans?
What happened to MH?: 10/26/2022 17:32:47


Cicero_ 
Level 63
because u are taking wr of this season and CW rating is coming from the past. your comparison is nonsense.

Edited 10/26/2022 17:33:05
What happened to MH?: 10/26/2022 17:34:21


pooposaur
Level 34
norman why does my breath smell when i eat my cat's litter
What happened to MH?: 10/26/2022 17:35:37


Cicero_ 
Level 63
@eternity. Optimum has a superb WR this season and they don't have the highest Rating because it started from very low this season. just another example.
What happened to MH?: 10/26/2022 17:40:18


Harmony 
Level 58
The Fancy Dot ● currently have ~59% win rate and 46.5 CW rating, do you think M'Hunters' rating is going to fall to that level?

I would greatly appreciate the explanation on this rating thing and how it works.
What happened to MH?: 10/26/2022 17:44:55


l4v.r0v 
Level 59
Rating is Elo-like, Eternity. It's not just a reflection of WR.

e.g., on the 1v1 Ladder, Grantelbart has won 7 of his last 10 games (https://www.warzone.com/LadderGames?ID=0&LadderTeamID=26117)
meanwhile, Carl Pickens has won 4 of his last 10 (https://www.warzone.com/LadderGames?ID=0&LadderTeamID=14450)

But Carl Pickens is rated 250 points higher than Grantelbart because ratings also reflect opponent quality. Harmony has generally faced easier opponents than M'Hunters. In an ideal ladder, everyone's win rates converge to 50% because they will face opponents around their own level.

If you want the long explanation:

Imagine everyone has a stick, and these sticks vary in length. People are shy about showing their sticks to strangers, but you really want to figure out who has the longest sticks. So you set up a stick-measuring contest.

Instead of showing their sticks to everyone, participants would just go, two-to-a-room, show their sticks to one another, and tell you who had the longer stick. But you're clever: you will update your estimate of their stick based on this.

See, you will start by assuming everyone has a 5" stick. Then you will adjust your estimate based on the results- with bigger adjustments for more surprising results (so if a guy you thought had a 2" stick beats a guy you thought had a 7" stick, you will update more because that's unexpected and probably reflects a bigger wrong estimate). But you want quality data- if you send a guy with an 8" stick into a room with a guy with a 1" stick, that won't tell you much when the 8" guy wins. So you instead want to pair people with about the same estimated stick size so you can get more useful results and increase the granularity of your model.

Now, sometimes the results will be noisy: maybe someone's stick grew or shrank recently, maybe they got lucky and it looked larger, maybe they weren't trying their hardest and only showed some of their stick. But basically Elo is this stick-measuring contest.

Except the stick is "skill." It's trying to measure a single metric of your entire clan's CW skill level. Harmony has probably a 3" stick and keeps getting paired with guys with 2.5"-3.5" sticks, against whom it wins about 55% of the time. M'Hunters has probably a 6" stick and keeps getting paired with guys with 4.5"-6.5" sticks, against whom it wins around 56% of the time. The win rates don't tell you much about the stick size- they also reflect who you get paired with (both of yours are above 50% because your volume means the matchmaking generally gives you opponents slightly weaker than you; this is because of what Norman calls the "squad bonus").

For a more detailed & mathy explanation of TrueSkill, the actual rating system CW uses: http://www.moserware.com/2010/03/computing-your-skill.html
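To make the "bigger adjustments for more surprising results" idea concrete, here is a minimal sketch of a classical Elo update. It is not TrueSkill and not what CW actually runs; the K-factor of 32, the 400-point scale, and the sample ratings are assumptions picked for illustration.

```python
def elo_update(winner, loser, k=32.0, scale=400.0):
    """Return the new (winner, loser) ratings after one game, classical Elo."""
    # Probability the eventual winner was expected to win beforehand.
    expected_win = 1.0 / (1.0 + 10 ** ((loser - winner) / scale))
    # The less expected the win, the larger the adjustment.
    delta = k * (1.0 - expected_win)
    return winner + delta, loser - delta


# Points gained by the winner in three situations.
upset_gain = elo_update(1300, 1700)[0] - 1300      # underdog wins
even_gain = elo_update(1500, 1500)[0] - 1500       # evenly matched
favorite_gain = elo_update(1700, 1300)[0] - 1700   # favorite wins
print(round(upset_gain, 1), round(even_gain, 1), round(favorite_gain, 1))
```

The underdog's win moves the estimates the most and the favorite's win barely moves them, which is why lopsided pairings tell the system so little.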

Edited 10/26/2022 17:52:54
What happened to MH?: 10/26/2022 18:47:22


Norman 
Level 58
@Eternity: I don't know the exact details of the TrueSkill derivative WarLight uses, but at least with classical Elo, the absolute number of the rating difference translates directly into a win probability when our 2 clans face each other.

According to classical Elo, our rating difference would translate to a 15.37% chance of you guys winning a random matchup against our guys (https://sandhoefner.github.io/chess.html). Here it is a bit more difficult (https://stackoverflow.com/questions/28031698/with-the-trueskill-algorithm-given-two-players-ratings-how-can-i-calculate-th), but the general fact still holds: the absolute value of our rating difference translates to a win probability when our 2 clans face each other (no matter what the exact probability is here).
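For reference, the 15.37% figure follows from the classical Elo expectation formula with the two ratings quoted earlier in the thread (Harmony 142.6, M'Hunters 439.0). The 400-point scale is the standard chess convention and is an assumption here, since CW does not use plain Elo.

```latex
E_{\text{Harmony}}
  = \frac{1}{1 + 10^{(R_{\text{MH}} - R_{\text{Harmony}})/400}}
  = \frac{1}{1 + 10^{(439.0 - 142.6)/400}}
  = \frac{1}{1 + 10^{0.741}}
  \approx \frac{1}{1 + 5.508}
  \approx 0.1537
```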

M'Hunters' rating being higher than Harmony's rating means that our clan is stronger than your clan. It has nothing to do with any win rates. The function that determines your clan's win rate does not only take your own clan's skill into consideration but also your opponent clans' skills. As I have pointed out, you can easily influence your opponent clans' skills by just joining with multiple players. The more players you have playing, the higher not only your overall wins become but also the higher your win rate becomes.

Edited 10/26/2022 18:49:29
What happened to MH?: 10/26/2022 18:53:17


Tac(ky)tical 
Level 63
leave it to kynte to turn warzone into a dick measuring contest
What happened to MH?: 10/26/2022 18:53:31


Tac(ky)tical 
Level 63
sorry I meant stick, typo ;)
What happened to MH?: 10/26/2022 18:54:12


Samek ●
Level 57
I can't wait for this thread to be closed. ●
What happened to MH?: 10/26/2022 19:52:53


Harmony 
Level 58
@l4v.r0v and Norman thanks for explaining how clan rating works! I learned some useful information from you.

@Samek I think this thread can only be closed if it violates Warzone ToS; based on what I've seen in this thread, it seems to be compliant with Warzone rules. If any particular player breaks the rules, I think mods can edit his message instead.
What happened to MH?: 10/26/2022 21:09:17


l4v.r0v 
Level 59
dick measuring contest
It's just the best Elo metaphor I've ever thought of. Skill-rating algorithms basically try to measure scalar properties of individuals (or groups) that can't be directly observed but show up comparatively.

It's basically two steps removed from height:
- you can directly observe height, which is a generally stable scalar in mature humans: just use a yardstick or measuring tape
- one step removed is intelligence (general g or IQ), an abstraction we impose on all sorts of mental processes to try and quantify a stable predictor of performance on some significant subset of cognitive tasks: it's not straightforward to measure, since we can only observe its effects on other metrics, but we can still measure at the individual level (e.g., working memory tests, Raven's progressive matrices)
- two steps removed is the skill that Elo & other models (see: Bradley-Terry models) attempt to capture. We once again posit that there's a property of individuals or groups that we can consider to be "skill" (within a domain)- i.e., a stable-ish predictor of performance against others in zero-sum competitions, like 1v1 Tetris or chess or Warlight or multiplayer Halo. It's a useful abstraction for some domains where individual performance is generally opponent sensitive (e.g., if you just counted kills in Halo, you wouldn't account for some opponents being far easier to kill than others). It's tricky to measure!: we estimate it indirectly through observations on the results of zero-sum contests, and we can estimate it more precisely & accurately if we have some control over the matchmaking.

Quantifying skill can get us fairly deep into statistics & machine learning, as TrueSkill shows. The actual mechanics of Clan Wars ratings take some mathematical background to understand, although Moserware's blog (http://www.moserware.com/2010/03/computing-your-skill.html) does a great job of explaining it all from first principles in a way normal people can understand.

I think a stick-measuring contest metaphor captures enough of the nuance, as an alternative. It's enough to reason about both the general principles and some challenges- limitations on matchmaking (if everyone else has at most a 10" stick, how do you tell if someone has a 15" stick or 11" or 20" stick?), rating manipulation (if you trick the system into thinking you have a 0.1" stick, you can sometimes exploit it to overcorrect & temporarily wind up with an overestimate of your stick size), the difficulty of trying to apply it to groups (what does it mean for Harmony to have a 3" stick? its members' sticks vary a lot, from 0.5" to a mighty 7"), consequences for matchmaking (if Harmony gets opponents based on a 3" stick size, what does that mean for opponents of players who have 7" sticks? 0.5" sticks? Some are destined to mostly win or mostly lose).
What happened to MH?: 10/27/2022 04:07:16


krinid 
Level 62
wooooow, this thread blew up

Still reading through ...

But I did see this:

@Norman you seem to be very smart, so could you explain this:

According to https://wz-clanwars.netlify.app
Harmony: Win rate: 55% || CW Rating: 142.6
M'Hunters: Win rate: 56% || CW Rating: 439.0

Why is there such a rating gap between our 2 clans?

l4v probably responded already with plenty of details, but here's my answer.

If your WR is 50%, then your Clan Rating is accurate.

If Harmony wins 50% @ 142.6, then you would (likely) not be able to sustain the same WR @ Clan Rating 439. And if MH's rating suddenly changed to 142.6, WR would logically be expected to rise from 50%.

On this idea, will be interesting to watch Optimum. They've had a high WR of 80%+ this season but they started with a very low Clan Rating, thus easier opponents. Now they're at 456.3, so logically would expect the WR to drop (on a day to day basis, not immediately since it's cumulative) but only time will tell.
What happened to MH?: 10/27/2022 04:54:09


l4v.r0v 
Level 59
nitpick: 50% doesn't imply accurate and accurate doesn't imply 50%, unless matchmaking is perfect. Accurate ratings for CW look more like 45-55% for average-skill clans but can look like 70-80% for high-skill clans, simply because they can't be matchmade with peers. To revisit my stick analogy, the guy with the 11" stick will have a lot more wins than losses simply because there's not a lot of other 11" guys for him to face.

If you want to validate a rating to see if it's stable yet, you can use the methodology of https://bit.ly/am-i-underrated - you need to also account for the ratings of your opponents, to correct for inaccuracy from matchmaking, and you may also need to estimate the inaccuracy of your opponent's ratings (which is hard).

Edited 10/27/2022 04:55:46
What happened to MH?: 10/31/2022 02:35:27


krinid 
Level 62
nitpick on nitpick: 45-55% essentially _is_ 50%, no one is actually claiming you need to be at exactly 50% for it to be "accurate", nor that 50% itself is a perfect assessment, but rather that it is the methodology of the rating system

Of course it's never going to be perfect, b/c:
(A) it takes players in different clans in the same queue during a timeslot to form a match, and if there's no one around your CW rating, you play the closest thing, which could well be (to show an exaggerated but completely possible example) a 500 CW rated clan vs a 100 (or even a -50) - essentially your case of the lack of 11" stick wielders

(B) CW rating != clan member skill, but rather an average across all CW wins/losses, eg: what you call the Ursus problem; a given clan could indeed have individuals with 100% WR and 0% WR and if they all played the same # of games equally, they'd come out with a middle of the road CW rating. Effectively some players in the clan have 11" sticks, others have 2" sticks, others may just still be throwing fists

(C) CW rating is carried over from previous seasons but WR isn't, which means one carries history (including member changes over time) while the other is a clean slate each season; WR is only an indication of performance within the current season, and the sample size may not be large enough. How many games make a big enough sample to accurately draw conclusions likely differs depending on how close to your actual CW rating you start off at the beginning of the season

I suppose we could reach the optimal case for accuracy if CW rosters only had members of roughly the same skill, and there were always enough players in each template in each timeslot to provide "close games" (close ratings) for all parties playing - then 50% WR would actually be accurate

Edited 10/31/2022 05:10:26
What happened to MH?: 10/31/2022 02:46:43


l4v.r0v 
Level 59
nitpick on nitpick on nitpick: it's not just 45-55%; a rating can be accurate at any win%, depending on the opponents you get matched with. E.g., on the 1v1 ladder, a rating of 2400 would be accurate with something like a 85-95% win rate and a rating of 600 is generally paired with a close to 0% win rate.

The nitpick was about matchmaking, not noise. Win rates for accurate ratings only converge to 50% when matchmaking can serve evenly-matched opponents. Barring that, as we see in CW, you will have accurate ratings with stable win rates far away from 50%, because of sandbagging, squad bonus (which you hinted at in (A)), and matchmaking generally struggling at the margins.

This is actually important because if you want to win CW, you want to have a stable win rate well above 50%.

Your CW score = (participation) * (win rate). Getting your win rate to converge to 60% rather than 50% has the same effect as increasing participation by 20%.

E.g., if we suppose that a 100 point TrueSkill difference corresponds (like Elo) to a 64/36 win rate, then using the squad bonus to drop your opponents' expected ratings by 100 points has the same impact as increasing your participation by 28%. Hence the nitpick: your model left out the most crucial fact for CW strategy. Clans are either participation-limited or win-rate-limited, and they need to decide how to invest each unit of effort to focus on the correct limiting factor.
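The arithmetic behind the 20% and 28% figures can be checked with a short sketch. It takes the score model above (participation times win rate) at face value and, like the paragraph, supposes an Elo-style 400-point scale for the 100-point gap. The function names are made up for the example.

```python
def elo_expected(rating_gap, scale=400.0):
    """Classical Elo win probability when you are `rating_gap` points ahead."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / scale))


def equivalent_participation_boost(old_win_rate, new_win_rate):
    """Extra participation needed to match a win-rate change,
    if score = participation * win_rate."""
    return new_win_rate / old_win_rate - 1.0


boost_60 = equivalent_participation_boost(0.50, 0.60)
boost_100pts = equivalent_participation_boost(0.50, elo_expected(100))
print(f"{boost_60:.0%}", f"{boost_100pts:.0%}")
```

Under those assumptions a 100-point gap gives roughly a 64% win rate, and 0.64 / 0.50 = 1.28, hence the 28%.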

The other implication here is that CW is skill-based in inverse proportion to the variance in participation and participation-based in inverse proportion to the variance in skill.

Edited 10/31/2022 02:59:51
What happened to MH?: 10/31/2022 02:59:32


krinid 
Level 62
nitpick on nitpick on nitpick on nitpick: your nitpick on nitpick on nitpick agreed with my nitpick on nitpick, just that it presented itself as being a departure from it, while in actuality it was a restatement of it with a bit of additional clarifying material

TLDR: we both said that 50% WR indicates right placement when facing even opponents
What happened to MH?: 10/31/2022 03:06:13


l4v.r0v 
Level 59
Anyhow, you can check whether your rating is accurate by sampling results, using the Normal approximation of the binomial distribution, and finding the probability that you would've had those results had your rating been accurate. This is partly a theoretical exercise for CW since historic ratings aren't easily available (unless 5s records them), but it's a better test than just eyeballing to see if your win rate is 50%. Ideally your stable win rate should not be 50%: if it's close to 50%, fire your clan manager.
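A minimal sketch of that check, using only the Python standard library. It assumes you already have a single expected win probability per game, for example an average derived from your rating and your opponents' ratings; collapsing everything into one probability is a simplification. The function name and the sample numbers are hypothetical.

```python
import math


def rating_consistency(wins, games, expected_win_prob):
    """Normal approximation to the binomial.

    Returns (z, p_value): how many standard deviations the observed wins sit
    from the expected count, and the two-sided probability of a deviation at
    least that large if the rating were accurate.
    """
    mean = games * expected_win_prob
    std = math.sqrt(games * expected_win_prob * (1.0 - expected_win_prob))
    z = (wins - mean) / std
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical sample: 60 wins in 100 games where 50% was expected.
z, p_value = rating_consistency(60, 100, 0.50)
print(round(z, 2), round(p_value, 4))
```

A small p-value suggests the rating (or the assumed win probability) does not match the results. The approximation is only reasonable when both the expected wins and the expected losses are not too small.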