COIL Explained

Adapted from this thread.

tl;dr: COIL is a system that was developed for setting suspect test requirements. Your COIL score is a function of your GXE and the number of battles you have had on the ladder. The higher your GXE, the fewer battles you'll need to achieve reqs.

If you're participating in a suspect test, you may have noticed that you achieve reqs based not on your usual ladder rating, but on a mysterious score called COIL. This "mystery rating" is a system I personally designed for determining requirements for suspect tests.

In the past, determining reqs has been a thorny issue, because conventional rating systems (Elo, Glicko) are designed to measure skill, not achievement. So oftentimes the top-rated player has been someone who's, say, gone 20-0, ahead of a player with a few losses but many more games. Does that 20-0 player deserve reqs? The consensus has been an emphatic "no" (whether that person deserves to top the ladder is a separate question). But if estimated skill alone shouldn't determine player rating, what should?

The simplest thing would be to determine reqs based on W-L ratio and number of battles, but there's a problem with W-L ratio: Pokemon Showdown does matchmaking that attempts to pair players with players of the same skill level as often as possible, so in a perfect world, everyone would have the same win-rate of 50%.

But while we can't use literal win-rates, we can use another measure that we already compute to get at the same idea. GXE is a measure that was developed by Smogon legend X-Act as an alternative ranking system to conservative rating estimates such as ACRE. It represents the percentage odds of you winning a battle against a randomly selected person on the ladder. In other words, GXE corresponds to what we would expect your win-rate to be if we weren't doing any matchmaking. Thus, we can use it as a substitute for W-L ratio.

And that leads me directly to the Converging Order-Invariant Ladder (COIL) score. It's calculated entirely based on GXE and number of battles, and it's set up so that the higher your skill level (your GXE), the fewer battles you'll need to achieve reqs. Note that with COIL, your rating eventually converges to a fixed value (40x your GXE). That means that if your GXE is under a certain value, you'll be unable to achieve reqs no matter how many battles you have. This is by design, and the COIL cutoff is chosen for each suspect test such that only players of "sufficient skill" can achieve reqs.
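To make the "certain value" concrete: since COIL converges to 40x your GXE, the GXE you must exceed is just the cutoff divided by 40. Here's a minimal sketch of that check. The function names are my own for illustration, and the cutoff of 2000 is just an example value.

```python
def gxe_floor(cutoff: float) -> float:
    """GXE (as a percentage) that a player must exceed to ever reach the cutoff."""
    return cutoff / 40.0


def can_ever_reach(gxe: float, cutoff: float) -> bool:
    """True if the long-run COIL limit (40 * GXE) is strictly above the cutoff."""
    return 40.0 * gxe > cutoff


print(gxe_floor(2000))           # 50.0
print(can_ever_reach(50, 2000))  # False: the limit only equals the cutoff
print(can_ever_reach(60, 2000))  # True
```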

The formula for COIL is the following:

C=40*GXE*2^(-B/N)

where B is a parameter unique to a given tier and suspect test, N is the number of battles you've had, and C is your COIL score. GXE is expressed as a percentage (0 to 100). In the long term, a player's rating converges to 40 times their GXE. So the very top players may end up with a COIL of around 4000, while good players will end up with COILs around 3000.
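If you'd rather see the formula as code, here's a small sketch. The function name is my own, GXE is taken as a percentage, and returning 0 for a player with no battles is an assumption on my part, not something the formula itself specifies.

```python
def coil_score(gxe: float, battles: int, b: float) -> float:
    """COIL = 40 * GXE * 2^(-B/N), with GXE as a percentage (0-100)."""
    if battles <= 0:
        return 0.0  # assumption: no battles means no score
    return 40.0 * gxe * 2.0 ** (-b / battles)


# A player with a GXE of 90 on a test using B = 40:
print(coil_score(90, 10, b=40))  # 225.0
print(coil_score(90, 40, b=40))  # 1800.0
# With many battles the score creeps up toward 40 * 90 = 3600 but never passes it.
print(coil_score(90, 100_000, b=40) < 3600)  # True
```

Note that when N equals B, the score is exactly half of its long-run limit.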

That's nice and all, but what most players want to know is, "how many battles will I need in order to get reqs?" To find that out, we solve the above equation for N:

N=B/log2(40*GXE/C)

where C in this context is the reqs cutoff. (This formula is ready for you to plug in your specific values. If you're going to do so, I recommend using Google Calculator).

Let's try an example. Let's say the reqs cutoff for a test is 2000, and it's using a B of 40. For that test, a player with a GXE of 90 will require 48 battles to meet reqs, and a player with a GXE of 75 will require 69. A player with a GXE of 60 will need 153 battles to reach reqs (the formula gives about 152.07, which rounds up), and a player with a GXE of 50 never will, since their COIL converges to exactly 2000 without ever reaching it.
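For anyone who'd rather not punch this into a calculator, here's a sketch of the solved formula in code. The function name is mine, and rounding up to a whole number of battles is my reading of "how many battles will I need", since you can't play a fraction of a battle.

```python
import math
from typing import Optional


def battles_needed(gxe: float, cutoff: float, b: float) -> Optional[int]:
    """Smallest whole N with 40 * GXE * 2^(-B/N) >= cutoff, or None if unreachable."""
    limit = 40.0 * gxe  # long-run COIL for this GXE
    if limit <= cutoff:
        return None  # COIL only approaches the limit, so the cutoff is never met
    # N = B / log2(40 * GXE / C), rounded up to a whole battle
    return math.ceil(b / math.log2(limit / cutoff))


print(battles_needed(90, 2000, 40))  # 48
print(battles_needed(75, 2000, 40))  # 69
print(battles_needed(50, 2000, 40))  # None
```

Keep in mind your GXE moves as you play, so treat the result as an estimate at your current GXE rather than a promise.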

As we move forward, when a suspect test starts, the tier leaders should be announcing not just the cutoff value but their choice of B as well (though note that these could change midway through a test, if necessary), meaning you'll be able to use that last formula to figure out how quickly you'll be able to get reqs. If B is not explicitly published, you can look it up for yourself by checking out the Pokemon Showdown source code (no guarantees that that link will stay pointing to the right lines).

Feel free to ask me any questions below regarding this system, though questions regarding specific suspect tests should probably go in their individual threads.
 

tehy

Banned deucer.
actually i heard it doesn't. supposedly the tier leaders will be checking your coil and i'm sure they can detect any serious anomalies and react accordingly. that said the possibility for shenanigans is still technically there but only for the first few games, and since you want to start those games strong i'd recommend hopping on an alt regardless

if i had one complaint about this tutorial it is that I feel it doesn't emphasize well enough that B is chosen by the tier leaders. Maybe that's just me and everyone else gets it, especially since it does eventually say so, and it sort of says so around the middle area once or twice, but there's my complaint.
 
Unless it's been changed, COIL is calculated directly from GXE and number of battles. If that's still true, there should be NO advantage to resetting your W-L, COIL-wise.
 
Besides what's already been pointed out, wouldn't a rank reset allow you to take advantage of a higher initial GXE while facing lower ladder (generally worse) opponents?
 
I don't think there's any advantage there, since your GXE will rise slower against weaker opponents, and what matters is overall number of battles, not W/L ratio.
 
BigGameHunter Originally I envisioned the cutoff always being 2000, so then B would have corresponded to the minimum number of battles needed to achieve reqs (GXE=100, so log2(4000/2000)=1).
 
If GXE is based on your chance of winning against a random opponent, and matchmaking groups players by skill, how is it calculated?
 
Ok good I have elo 1100 yay
Thanks for answering my question.

Also, I liked this system.
A player who finds a "hole" in the metagame can exploit it until they reach high ladder,
but they won't know a lot about the metagame.
On the other hand, a player who likes to experiment loses a lot (because they experiment with uncommon Pokemon) and may be on low ladder, but knows a lot about the metagame.
 