What did we learn from tiering in generation four?

cim · Sep 15, 2010

Blame Game said:
Really? You think if 4th gen's banlist stopped at, say, Deoxys-S, Kyogre and Arceus, that that metagame would have been just as popular with players (assuming it were equally balanced)? You think that many of the 3rd, 2nd, and 1st gen community mainstays would have been at all comfortable with this ruleset? Or that new members, whose experience with "competitive" Pokemon probably amounts to the Battle Tower ruleset or similar, would be unaffected by the fact that suddenly, Mewtwo is their favored sweeper, not Starmie? I don't think so.

I'd say so. I mean, people were pretty damn comfortable with Claydol becoming useless, Heracross becoming average, and Raikou becoming rather difficult to play. I think adjustments to "Mewtwo instead of Starmie" would be just as natural as the jarring shifts each generation already produces "naturally" when we try and keep a certain ban list.

Of course, assuming it were equally balanced. 90% of people that freak out would be freaking out because they assume unbalance.

skarm · Sep 15, 2010

I am disregarding some of the material on the second page here to reply to the material presented on the first page. What have I learned from Generation 4's tiering as compared to previous generations? To be honest, I have learned that the tier voting majority of Smogon is not something that I agree with anymore.

I felt "dirty", as I have mentioned to Jumpman, when Garchomp was banned. However I could see the reason for this, and it seemed to have a high majority supporting the ban of it. Then we unbanned Deoxys-S, which I was elated for.

You know what. I'm just going to sum this up instead of wasting my "breath" in this case: You need to start with a ban list from some point, but I would suggest against starting with Gen 4's.

If you want to start somewhere, start with the Battle Tower ban list. This is obviously what Game Freak's intentions were for the "tier levels" in their own game. We can always tweak from there with a more streamlined process as discussed above.

Alternately, as a second option, I would then support 680+ Basestat ban and working swiftly from there.

What I do not want to see is another Garchomp or Salamence ban. I would additionally love to see Latios and Latias back in OU, but I somehow think consensus won't be with me on that one.

lilyhollow · Sep 15, 2010

Chris is me said:
I'd say so. I mean, people were pretty damn comfortable with Claydol becoming useless, Heracross becoming average, and Raikou becoming rather difficult to play. I think adjustments to "Mewtwo instead of Starmie" would be just as natural as the jarring shifts each generation already produces "naturally" when we try and keep a certain ban list.

I think there are 14 Pokemon that were OU in Advance, but are below that in 4th gen: Heracross, Dugtrio, Weezing, Regice, Medicham, Milotic, Donphan, Raikou, Blaziken, Sceptile, Claydol, Slaking, Magneton. I mean, yeah, they were "lost in the shuffle," ultimately nobody cared much, whatever. That's totally different than an all-out redefinition of what OU is. I can literally just spout off random non-steel DP OUs, and any given one of them has like a minimal chance of being relevant in a metagame where Mewtwo/Dialga/Groudon run rampant. Celebi, Snorlax, Gyarados, Starmie, Vaporeon, all are more likely than not to be just completely irrelevant in an Ubers-lite metagame. And besides Blissey, I would be hesitant to call out any one 4th Gen OU as a likely top contender in an Ubers-lite 5th gen metagame. The perception won't be one of "some of the old Pokemon being lost in the shuffle," but of a complete overhaul of the metagame (which obviously is the exact intent of your suggestion).

capefeather · Sep 15, 2010

In other words, Logic and Common Sense do not always see eye-to-eye, and frankly, most of the time, Common Sense is the better advisor.

This very thread shows that people have drastically different ideas of "common sense". I would consider my opinion "common sense" just as much as you consider yours as such.

cim · Sep 16, 2010

Blame Game said:
I think there are 14 Pokemon that were OU in Advance, but are below that in 4th gen: Heracross, Dugtrio, Weezing, Regice, Medicham, Milotic, Donphan, Raikou, Blaziken, Sceptile, Claydol, Slaking, Magneton. I mean, yeah, they were "lost in the shuffle," ultimately nobody cared much, whatever. That's totally different than an all-out redefinition of what OU is.

I wouldn't say so. Consider that we'd have at most about 15 or so new Pokemon in OU when all's said and done (I'm giving everyone arguing that this is pointless the benefit of the doubt and saying that they're probably right about Arceus / Kyogre / Groudon / Rayquaza / Palkia / Dialga / Mewtwo). I would be surprised to see a metagame "just as balanced" that didn't knock out more than 15 other OU Pokemon.

Basically you're saying "oh no one cared about all those Pokemon, but a few of these Pokemon, EVERYONE will care about!" I just honestly don't buy that. Consider that Gyarados wasn't that good until the 4th gen, Snorlax is already basically dead this generation, et cetera. I see no reason to believe some major major shifts will just be ignorable, but these particular shifts are critical so we need to do everything we can to fight for some arbitrary Pokemon.

lilyhollow · Sep 16, 2010

15? You mean 15, plus all of the new B/W Pokemon, right? And like you said, 15 might be pretty "generous" anyway. I've always thought that even 4th gen Ubers could have been turned into a relatively balanced, playable metagame with just a few choice bans. It will only be easier to make a legitimately "balanced Ubers" next gen.

Basically you're saying "oh no one cared about all those Pokemon, but a few of these Pokemon, EVERYONE will care about!"

This isn't about specific Pokemon, or "particular shifts." This is a subjective, common sense "when I transition from gen 3 to gen 4, I can see an obvious resemblance between the two, and also take note of the numerous obvious differences. If I'm transitioning from gen 4 OU to 'gen 5 Ubers-lite OU,' I'm in an alien environment."

Alien environments can be fun, and they might not even have a negative impact on the popularity of the game for whatever reason. Okay. I haven't seen anything terribly convincing of that being the case, but it's not a crazy proposition by any stretch. "Huh? It's just, like, 15 new Pokemon or something. The transition will be just about the same as it was from 3rd to 4th gen," on the other hand, is not something I'm willing to swallow.

Also, Gyarados was good in 3rd gen and Snorlax is still viable. shrug

Jumpman16 · Sep 17, 2010

I feel I should weigh in here even though I should be working. I agree pretty much completely with Aeolus's first post so I'm not going to repeat anything he said, and I will completely defend his proposed banlist thread even if we didn't discuss the minute details (because why should we have to, it's a proposal and we talked about starting with a banlist months ago).

The main reason I defend the posting of Aeolus's "Proposed Starting Banlist" thread is because, lest we forget, this is a discussion forum of potential pokemon policies. If there is one thing that we both, as the obvious and long-term leaders of the Gen IV Tiering efforts, have striven for throughout, it is collaborative engagement. Collaboration with each other, collaboration with Tiering Contributors both would-be and appointed, collaboration with our entire community. A proposition is a fine springboard for discussion, and an especially welcome gesture from a relatively "all-powerful" facilitator of competitive pokémon.

Cathy said:
The power level defining ubers was never seriously reconsidered. We have a duty as competitive players to explore that power level properly, especially in the face of a new game. I have seen a lot of posts by people so confidently stating that not much will change. This couldn't be more wrong. The truth is you have no idea what subtle changes to move pools, move stats (e.g. power, PP), and new Pokémon will have on the relative quality of Pokémon. It doesn't take much to shift the game significantly, and deciding a ban list in advance will effectively blind you to it.

To be clear, I am first and foremost pleased you made this thread. Its spirit is the one thing I value above all else with regard to competitive pokemon—actual proactivity towards and passion for the game. This is a quality that has been sorely lacking in our community for years and is still a very, very big problem that sadly can only be addressed by people who care.

This mistake, made early in the history of DP, laid the foundation for all of the tiering debates to come. It is a mistake that should have been avoided. Only banning broken Pokémon, after plenty of play experience, would have been years shorter than the process that actually ensued, and not tainted by doubts of illegitimacy.

By November 2007, Shoddy Battle 1 had ladder functionality. The Smogon arbitrary ban list had not changed in that time. Unfortunately, that said arbitrary ban list was already well ingrained, and any major change to it was impossible. Independent of Smogon, we (me, AA, obi, tenchi, and others) adopted a very minor testing scheme, involving a tournament to test Deoxys-S. One thing we learned from the tournament is that Swiss tournaments are too complex for most players in this community, at least without software support. More significantly, not a single person of the hundreds of people who had played in the tournament voiced a problem with Deoxys-S being unbanned.

Cathy, if you felt so strongly about starting Gen IV play with no bans at all, there was very literally nothing stopping you from adopting a not-so-minor testing scheme with no bans on your Official Server. I genuinely wish you had done this and ask you completely without rhetoric why you did not, especially given your willingness to include the controversial Wobbuffet and Deoxys-S in a metagame you had literal control over.

Two weeks after the conclusion of the tournament, some notable Smogon members who were up to that point uninvolved with official server, were so excited by our unbanning of Deoxys-S that they asked if I could unban Wobbuffet immediately, without the benefit of another tournament. It turned out that Wobbuffet was the next item on our list anyway, but we mulled over whether another tournament was worth it. Ultimately, in light of the fact that the previous tournament had failed to convince anybody of anything, we decided to forge ahead and unban Wobbuffet. The backlash was intense. No one wanted to test Wobbuffet. In public, I defended our move, but in private, I was quite upset with AA. I had put in hundreds of hours of work writing a Pokémon simulator, which was extremely popular, and was the basis of competitive Pokémon on the internet at that point, and everybody hated me for some minor tier experimentation that wasn't even my idea. This was extremely grating.

I was so upset by the backlash that I attempted to devise a statistical argument for banning Wobbuffet. Unfortunately, it couldn't be done. Barely anybody even used Wobbuffet on the ladder. You could play the game as though Wobbuffet did not exist, and you would only lose the occasional match. In effect, this was not a broken Pokémon, because it didn't affect how you constructed your team at all, as far as ladder play was concerned. This never changed for the entirety of Official Server.

The lesson learned here is that popular opinion cannot be ignored in tiering decisions. Strong feelings that a Pokémon is broken prevent it from being tested. In fact, the hatred for this Pokémon was so intense that any vote to ban it would have easily been by a 2/3 supermajority, and probably much more.

This reflection seems to forget two key things (insofar as the reflection itself is rather pointless if it was actually made with cognizance of the keys I am about to recite). First, I hope you realize that appealing to emotions when recounting how "grating" your experience was or how "upset" you were will not go very far here and has no relevance as far as what "we" learned. The Suspect Test Process was not a walk in the park for either Aeolus or myself, but you will hardly read public accounts of my disappointment and likely not find any of Aeolus.

Second, you touch upon an important lesson we all hopefully did learn through Wobbuffet. But, again, what was stopping you from posting your proposal of a supermajority vote then, in March 2008? It likely wouldn't have been opposed according to your suspicions. Why did you stay silent? Why did most everyone else involved in competitive pokemon? Why did I have to be the one to make the effort to actually get Wobbuffet banned from your server when it became obvious that statistics weren't going to cut it? As I said after two months of effort in that thread:

Jumpman16 said:
In fact, the only concrete things we have to go on, again, are the aforementioned #39 -> #43 -> #43 -> #46 in weighted usage on the ladder and the fact that the argument that Shed Shell and U-turn usage has not gone up as was expected by many of the people against Wobby. Again, I'm not speaking for Colin, but even if he doesn't seem interested in analyzing something other than numbers, "that's why I made this thread".

Collaborative engagement was the only thing that was going to work to get Wobbuffet banned. And collaborative engagement is one of the only ideals that actually cannot be faulted about the entire Suspect Test Process of Gen IV.

Cathy said:
Unfortunately, things went very far downhill shortly after this. The next year was spent on entirely pointless "tests" because by its very design, so-called "Stage 2" was 100% pointless. Eventually, when Stage 3 rolled around, the results of Stage 2 were irrelevant.

Jabba already defended the process, which, again, was his and agreed upon by a..."supermajority" of other PR posters as a result of the collaborative engagement created when I made my Order of Operations thread. But for the record, my intention was for the Stage 2 "tag" to have a quantitative weight on whatever the Stage 3 results would be. I've already made clear elsewhere that the point of the Suspect Test was this, as voiced in the beginning of this year in my Smog article:

My goal has always been to include the community in the process of making and maintaining our competitive tiers, even though it would have been much faster to simply poll the opinions of a few of our tenured, well-respected and battle-tested members instead.

I don't care about your assessment of the merits of this goal any more than you care to ask (this goes for everyone, not just Cathy). The aim was clear, and, most importantly, underlines the collaborative engagement I've referenced quite a few times already.

Let's take a step back and think about the previous paragraph. A whole year was wasted by a process that was designed out of the box to be pointless. I want to make sure that is very clearly understood. Stage 2 was pointless. This is so important to understand because it is often bandied about that proper tiering processes take too long. In reality, poor decisions regarding the tiering process are what make it take too long. Unfortunately, DP was a case of the latter. A sane process would have been similar to stage 3 from the start. Also important is that a sane process would have stopped at the design of the first test, and considered only a simple rating and deviation check.

Another way in which things went far downhill was the introduction of two extra metrics to filter the voter pool. First, voters had to submit "paragraphs" which were never published for public inspection, and which were arbitrarily used to decide who would vote. This measure alone ruined the system. Particularly ironic is the fact that in ruining the system, it was also made slower, and one big complaint is always how slow things are; this was the fault of the people making this complaint.

The second big mistake that was made around this time was the introduction of "suspect experience". This is a secret measure that no one except for three people know the definition of. We were told repeatedly that it was good, and useful, but of course, since we couldn't see it, we had no idea. At this point, the process was devastated. Voters were excluded on completely mysterious grounds, both through paragraph submissions and a top secret formula that was a terrible idea, and remains a terrible idea.

All three of these ideas (Stage 2, voting paragraphs, SEXP) were proposed and discussed both openly and repeatedly in the spirit of collaborative engagement, as per the norm. Only regarding SEXP would you have any argument, which you have already posted (and has already been addressed by me, Doug and X-Act countless times). The three of us, including Aeolus, do not feel that it is a "terrible idea", and since we are the only ones who know what it is, this is very simply a matter of whether you trust us, as Doug posted last year. Just because you or others may not trust us doesn't mean that SEXP or the "mysterious grounds" upon which Aeolus and I rejected paragraphs (or accepted them, which you should have taken equal issue with if you wanted your concerns to seem impartial) were a bad idea, and you are simply going to have to deal with that.

The next substantive thing to happen wasn't until August 2009, when so-called "Stage 3" started. This represented a process similar to what the process should have been from the start. Particularly jarring was the way it had been designed to make the previous year's work useless. The flaw here was wasting the previous year; Stage 3 should have been the entire process. Stage 3 was still a mess though. My attempts to improve it slightly ended up wasting many dozens of hours of my time, and ultimately led to nothing, despite the large number of people who supported something along the lines I was proposing.

Again, I hope you don't think that referencing the waste of your singular time is going to move many people. Unless you literally don't believe what you are telling us "we learned" from this process, I am pretty sure there are others in this community who have, by your own pessimistic and results-oriented calculations, wasted hundreds more hours than you on it, so a further appeal to emotion is either contradictory to your actual thoughts or dwarfed by the largely unvoiced disappointments of more involved parties.

Besides, I'm not even sure what you mean by stating your attempts "ultimately led to nothing". Didn't you get the supermajority observation that you wanted when you finally decided to speak up?

After Stage 3, things became even worse. After messing up immensely over the last year and a half, the wasted time was used as a reason to introduce another bad process. First of all, after messing up so badly, there should have been a major leadership change in tiering policy. How does it make any sense that after messing up badly you get a second chance? We have plenty of people far more capable of handling tiering than the people who handled it this generation. We need people with special skills. People who not only enter tournaments, but place well in them. People who engage with strategy and the community in Stark Mountain. People who have contributed to site content more recently than two or three years ago. People who are capable of putting in the technical work required to make processes a reality. It's time for other capable members of the community to set the direction for tiering policy.

Okay, let's step back for a second and look at this with the objectivity you called for in your opening. Why *were* Aeolus and I allowed to stay in our leadership position of Tiering Contributors? Please ask yourself this honestly and, again, without any rhetoric, because I intend none whatsoever. Why do you think that we were "permitted" to continue leading the Tiering directive?

A good guess would be that we had perhaps built up enough trust in the community after our repeated efforts at collaborative engagement that the very community we explicitly sought to respect with the spirit of the Suspect Test Process still trusted us to see our process through. I wouldn't make this guess, though...instead, I would simply repeat the not-so-rhetorical questions I posed to 5KR when he asked the very same questions you are now:

Jumpman16 said:
what if gouki, an upper requirement voter for this test and generally respected member of our community, tallied the votes on latios? after all, gouki had no problem reaching the upper limit, and, more importantly, had an extremely high Suspect EXP ranking.

he voted uber. under your assumption that experience in a given suspect test metagame is required to be able to determine "adequate suspect usage and sound voting reasoning", gouki is much, much more qualified than i to tally submissions.

if he had, how do you think he would have perceived the 50 or so submissions? do you think he would be more inclined to find issue with the submissions that state why the would-be voters feel that latios is OU, or less inclined? no inclination? why?

and of those submissions that leaned uber...would he be more inclined to agree with these than those that did not? less? no inclination? why?

do you know what i think about latios?

do you know why you don't know what I think about latios?

Now this thread of yours may very well succeed in "ousting" Aeolus and me as Tiering Contributors. I don't really care if it does or doesn't. But you must realize that if anyone besides us actually wanted to step in and lead this process, they wouldn't need the urging of an admin dissatisfied with the process to do so years after the fact (as if there's only one such admin to begin with).

I honestly and respectfully think that your rallying cry is extremely pathetic. Not because of its merit (and not "you", don't read this comment as a personal attack, if I wanted to call you pathetic I would), but because I know and Aeolus knows that the mere fact that you even had to *post* it should underline what we personally are disappointed in most about the community: people here rarely step up to lead even if they are "obviously" more qualified for the position. It underlines my meaning in the beginning of this post where I said I was pleased that you had posted—I am glad it has now been made apparent by one of the "few people"—by the faulty perception of the rest of the community—worth listening to. You have highlighted the source of my outright discouragement with the would-be all star battlers who fit the description of "leader" you so plainly outline in your missive better than I could have, in the sense that such a lament wouldn't do much good coming from me myself (which is why I've never posted it).

I am actually now compelled to break down your job description to see how many people here are qualified to lead us into Gen V.

We need people with special skills.

Great! We have a lot of users like this, even if the description is kind of vague for a leadership position.

People who not only enter tournaments, but place well in them.

We have a ton of these people too, and easy to point out as well. A promising start! Let's keep going.

People who engage with strategy and the community in Stark Mountain.

Uh oh. That number has greatly diminished already! It is a shame that not too many people are willing to post in Stark Mountain or Policy Review (posting in the latter being a function of excelling in the former forum). We could name names of people who have passed on the first three of your requirements, but I fear that this list is already rather short.

People who have contributed to site content more recently than two or three years ago.

"Contribution" to the site is as vague as the several threads in Inside Scoop pertaining to the word's meaning with respect to badges would suggest, but this doesn't necessarily weed out anyone who has passed your first four requirements.

People who are capable of putting in the technical work required to make processes a reality.

Hmmm, "capable". Another eager quality, no doubt, but does this necessarily capture who we're looking for? And to what do you refer when you say "technical work"?

It's time for other capable members of the community to set the direction for tiering policy.

Maybe it is, maybe it isn't. I fear that your aims have been a little optimistic though, because I'm not sure anyone here fits this description.

Ultimately, this all goes back to when I "had to" post that Wobbuffet thread and before. You left off "willing" in your list of ideal requirements, and it is the single most important characteristic anyone can have in this volunteer vocation that is Smogon. We already know who the willing people are, regardless of what other qualities they may have, because they post here, and post here consistently. Lack of willingness is a problem at Smogon. It's not changing, no matter how long your post is or how right you may be. If you haven't realized that by now, Cathy, you are going to remain frustrated for as long as you are here.

Cathy said:
The Smogon Council was a very bad idea. When it was first mentioned in #stark, I said in a private message that it was not even worth the time to argue with it, because no one would swallow it. Obviously, I was wrong. Smogon's culture of respect (people with status must be respected unconditionally) has prevented people from pointing out the obvious: that the Smogon Council was the worst idea since suspect experience. The Council was not even faster than a simple vote based on a simple rating/deviation metric. The Council consists of people handpicked by two people in a process based on nothing tangible and with no oversight. It's effectively no different from those two people banning pokemon by fiat. It may be better than the previous process, but that's a low bar.

Your underestimation of the totalitarian sway of the mighty Jumpman16 and Aeolus does not preclude your own absolute responsibility to have spoken your mind. As a user who has just successfully effected a change in the tiering process with the suggestion of the utilization of a supermajority. As an administrator of this site and this community. As an intelligent, well-spoken user whose input is very widely respected here. As a user of tenure regardless, whose input would be much more likely to have an impact regardless of anything else. And, most importantly, as a user who genuinely professes to care or have cared about the tiering process.

In light of all this, I would argue that it is more disappointing that even a user of your influence could be likened to the rest of a community that remains silent when needed most. It is disappointing at its core, and regardless of any personal affront you may feel this post of mine is, I want you to read that again. It is utterly disappointing that even you, Cathy, were silent when the community needed "proper" input the most. There is no getting around that. Are Aeolus and I honestly that respected or feared that we hardly have any opposition even from those literally most likely to topple our misguided regime? If this is true, then there actually may be some merit to appointing some new but not-too-respected leaders for Gen V's tiering, someone whose "stature" will not get in the way of the necessary voicing of concerns and opinions regarding the process itself. I'm being completely serious, by the way. If there is going to be a serious appeal to the "problem" of how respected leaders are as it relates to voicing necessary concerns, then we have a very, very big problem on our hands.

That brings us to today. Everybody knows the first process was a disaster. After all, the flaws with that first process are continually cited as the reason to introduce the council. This alone should raise eyebrows about the same people who designed that previous process having continuing influence on Pokémon policy. Although they don't realise it yet, they also messed up a second time with the "Smogon Council". Twice is more than enough chances. You may not agree with my personal position of not banning Pokémon before the game is released, but if there is one thing you should take away from the history of tiering in DP, it's that some new qualified people need to step up to the plate to spearhead tiering in the next generation. We should avoid banning things hastily. We have plenty of time to do it right. So long as we avoid developing a process as bad as paragraph submissions, top secret formulas, and other arbitrary delays and exclusions, we don't run the risk of wasting years this time. Such a working process is a simple vote with the only filter being a ladder statistic check.

I thought that Aeolus and I had already agreed months ago that the Smogon Council was largely a failure, but maybe somehow you would know better than we. I could make the argument that we have devised most of the Gen V tiering process after having learned from Gen IV better than anyone else and more comprehensively than anything your post may have said, but you probably wouldn't want to hear it.

The bottom line is that there is no justification for starting off the next generation with arbitrary bans. The DP ban list is already very long, and the next generation is only going to introduce more pokemon of a similar level of power, or revise older pokemon up to that level. Even the argument about saving time doesn't hold water, because, using a good process, we can balance the game far faster than was done this generation. The best process is a simple vote based on a completely open metric. This is efficient, fair, representative, and completely peer reviewable. Most importantly, we should not ban any pokemon without having played the game for a while.

I think you could have kept your post to this paragraph and had it be just as effective if not more. As you can see, there is discussion now about whether or not to begin Gen V with a banlist, which is what we should be talking about. Your "completely open metric" can and should be posted assuming you haven't already (we wouldn't want echoes of "mysterious" SEXP in Gen V).

In conclusion, I was only as compelled to respond to your long list of grievances as you were to post them in the first place. You can respond if you want to, I don't care too much either way. If you honestly and truly feel that Aeolus and I are not fit to lead the tiering directive in Gen V, that's fine. In this event, it would make sense for both of us to share what we learned during this process that we haven't shared (and that your post still has not addressed, though how could it coming from a third party), because we both necessarily know what we could have done differently better than anyone else. I'm not sure you are interested in this feedback though, and if I'm right, then as Aeolus said, the same things will happen in Gen V that happened in Gen IV. Hindsight is rather easy to have on pilot projects, but if you didn't do the steering yourself, then your next flight may be every bit as turbulent.

Cathy · Sep 17, 2010

Your post is long, but it can be summarised as "Why didn't you do X, Y, and Z previously?" It's a classic technique that you post in basically every thread. Obviously, the fact that X, Y, and Z weren't done previously is why we are discussing now. Although I must note that every time I tried to do X, Y, and Z I ended up wasting my time because typically your arguments amount to "You should have said this earlier. Why didn't you say this earlier? There was plenty of time to discuss this earlier." This is also the reason why I will not be engaging in detail the above extremely repetitive post.

Secondly, I will offer a tip to people writing posts in debates. Please write coherent essays rather than piecemeal responses to quotes, so your posts are actually worth reading.

That said, Jumpman, your insights into tiering policy are welcome on this forum, just like anybody else's, but without carrying any special weight either.

Firestorm · Sep 17, 2010

Blame Game said:
But why have they decided this? Because it is very important to the actual, practical integrity of their game, and not just the philosophical idea of it. If anything, the "just play the game" philosophy that is so pervasive throughout most competitive communities is nothing more than a byproduct of practical issues that those communities would have had had they adopted another philosophy.

If I went up to a Street Fighter arcade cabinet in the 90s, an attitude of "X character is cheap," or "Y tactic shouldn't be used in situation Z," would not be feasible, because chances are that my opponent couldn't care less. Move up to local tournaments, and you see people formulating rulesets based on whatever the "generally accepted rules" are--or risk poor community support. The same applies to larger events, with the added caveat that there are other countries on the planet that we would like to be able to compete with fairly. So the fundamental difference between Pokemon and "traditional games" is not that bans are less impactful (though that does help), but that it merely doesn't suffer from any of these potential issues. We ban Blissey, and that's it: suddenly, everyone is playing Pokemon without Blissey. Do we have to worry about the potential backlash of such a decision? Yes. Likewise, it is perfectly reasonable to suggest that the potential backlash of an "Ubers-lite" metagame is necessary to consider as well.

No. At the highest level of play, SmashBoards Back Room sets the rules for Smash Bros. Evo sets the rules for Street Fighter. Mario Kart 64 (the site) sets the rules for Mario Kart time trials. Smogon sets the rules for Pokemon.

I agree with what capefeather is saying and what Cathy is saying about how we should go into Generation V, but that's what I've been saying when I get the chance for the past 3 years. However, I'm not involved enough with Pokemon policy to really be able to make a dent in how this goes.

lilyhollow · Sep 17, 2010

Firestorm said:
No. At the highest level of play, SmashBoards Back Room sets the rules for Smash Bros. Evo sets the rules for Street Fighter. Mario Kart 64 (the site) sets the rules for Mario Kart time trials. Smogon sets the rules for Pokemon.

"No" what? "No, 'ban only when necessary' is not very important to the actual, practical integrity of the Smash Bros., Street Fighter, and Mario Kart communities"?

First of all, SBR is not even relevant to this discussion because of how limited their "authority" is. They wouldn't have succeeded in banning Metaknight even if they tried, for one thing.

I don't know anything about Mario Kart, but the fact that you're talking about Time Trials is already pretty dubious as far as its relevance to a discussion of an actual 1v1 competitive game is concerned.

As for Evo, give me one example of a decision they made that flew in the face of practicality in favor of "philosophical purity" and only "philosophical purity." There are none. They have never done anything that they did not think would increase turnout, or community loyalty, or player morale, or some other tangible thing, ever. Everything they have done, or have not done, can be perfectly justified--and not just with the baseless mantra of "Ban Only When Necessary." This philosophy is ultimately derived from their community's environment, which is one where banning something accomplishes little more than to disrupt and confuse people who are trying to fairly compete. It is a heuristic. "Ban Only When Necessary (to avoid community divisiveness and other such messy issues)" is how this "philosophy" should really be interpreted, and largely, it does not apply to Smogon.

Erazor · Sep 17, 2010

User Destiny Warrior has some input on the matter of our initial banlist.

Destiny Warrior said:
Considering the arrival of the base stats of all the fully evolved Pokemon (which has been confirmed by Serebii), this is when we need to realize that we cannot use the same power standards as we did for DP. There are a lot of Pokemon that have at least one base stat above 105 (just an arbitrary large number), meaning that they (at least from a theorymon aspect) are going to cause large ripples in the metagame. If we expect to start with some predefined bans, we are at once beginning with a mindset that Generation 5’s OU will have the same approximate power level as Generation 4’s OU. This is a fallacy, as while we may achieve a balanced metagame, we are then trying to achieve a V2 DP metagame, which while keeping us in our comfort zone, is likely to cause an extremely large ban list of new Pokemon, due to them being above our perceived “required power level”.

There is an extremely good chance that some of our current Ubers may not be so in Generation 5, due to the addition of several Pokemon to the metagame. Hypothetically, if say, Deoxys-A was balanced in this metagame due to a lot of priority users being available, pre-defining Deoxys-A to be Uber would be a mistake, since it does not overcentralize the new metagame. This would lead to arguments and debates later, when it could have been easily solved earlier. Note, this is just an example.

While this may be argued to be an “Ubers Lite”, let us examine what we mean by Ubers Lite. We mean that the power level will be lesser than that of Ubers, but larger than that of OU. However, another fundamental concept which we need to understand is that the so called “Uber Lite” power of one generation may be the “OU power level” of the next.

OU is decided by usage, but also indirectly by power, due to anything being perceived to be overly powerful and centralizing being banned to Ubers. Thus, OU has a certain power level, and it has to be accepted that the level changes between generations. Trying to set pre-defined limits on OU’s power range based on a previous generation would be an appeal to tradition, trying to make ourselves comfortable in a metagame we are used to, rather than trying to experiment with a new one.

So basically, starting with no bans for a 2-week testing period would solve a lot of these problems. We can run such a testing period by opening a thread where people can report what the new metagame is like, what is powerful, what was hyped incorrectly etc. If there is too much outcry about particular Pokemon being broken, we can test those specific Pokemon with whatever testing process we intend to use then.
If anybody wants to say that people would be discouraged from entering an “unbalanced” metagame, I see no need to advertise a “balanced” metagame while we are in the process of testing at the beginning.

tl;dr: Trying to have a pre-defined banlist for Gen 5 is like trying to preserve the DP metagame in a slightly modified form, which is not the aim of our tiers system. The real aim is to provide a competitive metagame, which does not have to be very similar to the previous generation’s metagame.

Thanks for hearing me out.

For what it's worth, I support banning only 670+ BST mons, and unbanning the rest. This is because there is no point trying to theorymon something as broken. Hell, UU is the perfect example of this. Even Heracross is not as broken as everyone thought it was (not yet anyway).

Jumpman16 · Sep 17, 2010

Cathy said:
Your post is long, but it can be summarised as "Why didn't you do X, Y, and Z previously?" It's a classic technique that you post in basically every thread. Obviously, the fact that X, Y, and Z weren't done previously is why we are discussing now. Although I must note that every time I tried to do X, Y, and Z I ended up wasting my time because typically your arguments amount to "You should have said this earlier. Why didn't you say this earlier? There was plenty of time to discuss this earlier." This is also the reason why I will not be engaging in detail the above extremely repetitive post.

Secondly, I will offer a tip to people writing posts in debates. Please write coherent essays rather than piecemeal responses to quotes, so your posts are actually worth reading.

That said, Jumpman, your insights into tiering policy are welcome on this forum, just like anybody else's, but without carrying any special weight either.

This is exactly the same attitude that you cite as the reason you did not bother speaking up about the Smogon Council. "It was not worth the time to argue because x". The very next thing you say is: "Obviously, I was wrong. Smogon's culture of respect (people with status must be respected unconditionally) has prevented people from pointing out the obvious: that the smogon Council was the worst idea since suspect experience."

But even though you were sure that our idea (Council) wasn't any more worth reading than was my post here and would never be accepted, you were wrong, as you admitted. What is going to change this time around, Cathy? What reason do you have to believe that Aeolus and I are "going anywhere" or at least going to heed any of the advice you posted (assuming we weren't already)?

If it is easier for you, you can consider just this part of my post. Or not. Again, I don't care much one way or the other. But the last time I checked, Aeolus and I hadn't lost much of this "unconditional respect" regardless of the merit of any one idea we had. So it is kind of puzzling when you merely *say* that my insight doesn't carry "any special weight", because it is pretty obvious that it does due to "unconditional respect" no matter how much you wish this weren't true. You owe it to everyone else in our community to respond to this despite your personal confidence that things may just "happen", Cathy, not to me.

Anyway, there are some good ideas and thoughts posted in this thread, and I am glad to see that Doug is going to resurrect his ideas on how we should be shaping our metagame. I hope to respond to specific posts later this weekend.

cim · Sep 17, 2010

Jumpman16 said:
But even though you were sure that our idea (Council) wasn't any more worth reading than was my post here and would never be accepted, you were wrong, as you admitted. What is going to change this time around, Cathy? What reason do you have to believe that Aeolus and I are "going anywhere" or at least going to heed any of the advice you posted (assuming we weren't already)?

I guess I'll be the one to say it. This is a bit of a tangent discussion, but whatever.

Why are you two in charge of tiering competitive Pokemon?

Seriously.

The reason I ask isn't rhetorical. I'm not asking "why are you two in charge" because I disagree with decisions you've made, though I obviously do disagree with some of them per Cathy's post. But especially with a new generation and new changes to the power structure, is there any reason you guys have to be the people in charge of tiering?

The reason I ask is because I'm honestly curious why you two feel you are the most qualified people for the job. On paper, it doesn't make much sense. You guys haven't exactly been consistently active ladderers, or consistently active tournament players. Maybe you've played a lot on alts, but never enough to be notable on the ladder, at least consistently enough for anyone to notice. Isn't the most intuitive thing to put a great, active competitive Pokemon player in charge of the decisions that affect Pokemon players?

Especially with the new power structure being slid into place over time, there's no reason we as a community need to have any particular people in any particular job other than who we feel is most qualified, no?

Lest you say I have a self serving ulterior motive, I would never take any job that put me in charge of Pokemon tiering. I also am not "good enough" at Pokemon to meet the standards I asked about above.

----

Again, sorry for the tangent, but the post struck me as "We're in charge so why do you think we give a fuck about what you think?", which begs the question of why you guys are in charge in the first place.

capefeather · Sep 17, 2010

Blame Game said:
As for Evo, give me one example of a decision they made that flew in the face of practicality in favor of "philosophical purity" and only "philosophical purity." There are none. They have never done anything that they did not think would increase turnout, or community loyalty, or player morale, or some other tangible thing, ever.

That depends a lot on what you mean by practicality. When a fighting game community picks up a game for which there's already a central competitive community, on whom are we to base our measure of practicality, turnout, loyalty, morale, etc.? I view Evo 2008's handling of Smash as extremely impractical because it didn't keep Melee on the game list, it used a drastically different ruleset from Smashboards's recommendations, and the post showing the rules gave only a vague explanation of the motivations and precedent behind them. They practically asked for the backlash, so I would consider this the epitome of philosophy over practicality. However, if you're looking out for a player trying to get into Brawl and potentially being bogged down by possibly unnecessary rules, then yes, you'll see the actions taken as the most practical. It would also be a fallacy to suggest that their attempts at middle ground were practical. So this really depends on who gets to decide what practicality is.

---
With Pokémon, the notions of "default settings" (usually *the* benchmark) and "practicality" are extremely vague. I would view the "default" in this game as a link battle with unhacked teams, no other frills attached. However, someone who holds Stadium/Coliseum/BR/BT rules in high regard, for whatever reason, may add these rules into his/her perception of the "default" benchmark.

This brings up my next point that I just realized. The former attitude is much more likely to be associated with intelligent competitive players who are willing to play with aspects of the game that they don't like. On the other side, the latter attitude is usually associated with a non-competitive mindset that makes up rules according to personal desire or some other desire imposed on them, and thus is not nearly as willing to concede to a collective ruleset. Therefore, the "louder" non-competitive attitude gets an inflated say as far as "practicality" is concerned.

I'm finding the result of this thought deeply disturbing, even nonsensical. Before, it was more about having statements of fact for which we cannot say that we unnecessarily forced a certain aspect of a metagame. However, now it seems that this is basically getting down to a louder non-competitive mindset overriding a more patient, quieter competitive mindset, for the sake of "practicality". This is, IMO, simply going against Smogon's goal of being a competitive Pokémon community.

Aldaron · Sep 17, 2010

With the recent reveal of Fifth Generation "Dream World" abilities, and with the confirmation that they are actually playable in link battles (Jibaku mentioned Vaporeon / Glaceon were seen in link), I think it is even more important to note that pre-playing bans are hogwash at the moment.

For starters, with two "normal" (non "uber") Pokemon in Ninetales and Politoed getting Drought and Drizzle, a large portion of what we "assume" about singles is totally moot.

As it stands, for singles at least, we have no choice BUT to start with as open a slate as possible, and see how a minimal-ban ruleset metagame plays out, because this is honestly unlike anything we've had before.

By minimal bans I mean only perhaps species clause (and arguably sleep clause though at the moment that is very arguable).

lilyhollow · Sep 17, 2010

capefeather said:
That depends a lot on what you mean by practicality. When a fighting game community picks up a game for which there's already a central competitive community, on whom are we to base our measure of practicality, turnout, loyalty, morale, etc.? I view Evo 2008's handling of Smash as extremely impractical because it didn't keep Melee on the game list, it used a drastically different ruleset from Smashboards's recommendations, and the post showing the rules gave only a vague explanation of the motivations and precedent behind them. They practically asked for the backlash, so I would consider this the epitome of philosophy over practicality. However, if you're looking out for a player trying to get into Brawl and potentially being bogged down by possibly unnecessary rules, then yes, you'll see the actions taken as the most practical. It would also be a fallacy to suggest that their attempts at middle ground were practical. So this really depends on who gets to decide what practicality is.

I was referring to their handling of Street Fighter. Evo does not have, nor has it ever really purported to have any serious "authority" over Smash. That being said, amongst the fighting game community--their main audience--"All Brawl" was probably the most popular Brawl ruleset (certainly that was at least the perception of things). Evo would not have organized the tournament under those rules if members of the fighting game community did not actively show support of the ruleset by hosting All Brawl tournaments on a regular basis. Some of those members did act purely on the notion of "philosophical purity," and some of them just legitimately thought it was a better version of the game. Either way, it flew in their faces because the "ban only when necessary (to avoid community divisiveness and other such messy issues)" heuristic does not apply to Smash as well as Street Fighter (but still much better than it does to Pokemon).

This is like, completely alien to me. Ultimately, you are arguing that we should form our metagame on the sheer "basis" of philosophical purity, because "that's what competitive communities are supposed to do." Except no competitive communities actually do do that; the philosophy just happens to go hand-in-hand with practicality for them, because big surprise, they're the ones for whom it was largely created. Except even if that is how they operated, why would we ever follow suit? If a decision makes for a stronger, better community, that is the decision we should make, not some other dubious one that follows an arbitrary standard.

Why do you insist on arguing that "No Bans Unless Necessary = Automatically a Competitive Mindset?" Even if you honestly feel that that seriously just speaks for itself, why can't you use terminology that your opposition will actually recognize and consider? The last time you gave any practical justification to the "no bans" mindset was when you haphazardly suggested that an increase in overall power would counteract 4th gen's "diversity problem." I didn't even dismiss it either, because really, in the end I'm just looking for "some" reason to think that starting a metagame with no bans is a good idea that could like, theoretically help progress this community. As I've said, I think that Tangerine's old "we'll be less popular if we don't ban stuff right off the bat" argument is little more than conjecture. Somehow, though, it is still easily more convincing than anything stated to the contrary.

You also should understand that I am in no way strictly "against" taking on an as-few-bans-as-possible philosophy. I follow and participate in several other competitive communities. I probably pay closer attention to Sirlin and his website than anyone else on this entire forum. A 5th gen OU metagame with absolutely no bans would be very aesthetically preferable to me. I am quite literally hoping that someone will prove that it is indeed the correct philosophy to apply to this community. The fact that I keep reading through "well look at what this unrelated community did!" and "no-bans is competitive! It speaks for itself, guy!" is deeply disappointing to me. I am not here to serve my own selfish desires, or push some agenda; I am here to support what I believe to be the correct decision for this community. I say this because it may seem like I'm stubbornly digging in my heels, or "arguing for the sake of arguing" (sometimes I feel that way myself). Really I'm just asking for someone to argue on my terms, address my concerns, and convince me.

Aldaron said:
With the recent reveal of Fifth Generation "Dream World" abilities, and with the confirmation that they are actually playable in link battles (Jibaku mentioned Vaporeon / Glaceon were seen in link), I think it is even more important to note that pre-playing bans are hogwash at the moment.

For starters, with two "normal" (non "uber") Pokemon in Ninetales and Politoed getting Drought and Drizzle, a large portion of what we "assume" about singles is totally moot.

As it stands, for singles at least, we have no choice BUT to start with as open a slate as possible, and see how a minimal-ban ruleset metagame plays out, because this is honestly unlike anything we've had before.

By minimal bans I mean only perhaps species clause (and arguably sleep clause though at the moment that is very arguable).

Could you elaborate on why these new changes justify a "no-bans" initial ruleset? Right now, I envision your argument as "well, a lot of stuff is definitely going to change. It just sort of 'feels right' to reset things here, for better or for worse. You know, because of all the change."

I agree that it does render a pre-test banlist useless, though.

Cathy · Sep 17, 2010

Chris is me said:
But especially with a new generation and new changes to the power structure, is there any reason you guys have to be the people in charge of tiering?

Don't worry, they aren't. What Jumpman seems to have missed from my post is that I was declaring that he will no longer carry any special weight, because any attempts he makes to set policy will merely be considered like anybody else's proposals, rather than actually implemented.

From now on, we can focus on the actual policy, rather than Jumpman's idiosyncrasies. I have some more ideas about a process which I will probably post later.

capefeather · Sep 17, 2010

I wouldn't say that "ban as little as possible" necessarily applies less to Smash than it does to Street Fighter. The former simply has more ban rules, for whatever reasons that have been forged throughout its history. Maybe Brawl auto-banned more than Melee did, but that's about it.

Onto the topic at hand:

Blame Game said:
Why do you insist on arguing that "No Bans Unless Necessary = Automatically a Competitive Mindset?"

Maybe that's not the right way to put it. I was thinking more about "no bans without due consideration". It is the concept of using experiment to prove a conjecture that I wanted to emphasize, which is why I was linking it (by part of a mindset, not a person) to intelligence and competitive-mindedness. The whole point of pursuing fewer (if any) auto-bans comes from a willingness to accept statements that happen to become fact, whether or not we thought that they were fact prior to putting it to the test. It may even make for a "stronger, better community", what with adopting more and more an attitude of not taking things for granted. I'm sorry if I'm sounding more and more like I'm pursuing some kind of elitist standard just for the hell of it.

I'd also like to say that I wasn't even the one to bring up other communities. I wanted to avoid bringing up Street Fighter, etc. because I knew that it was an ineffective avenue through which to argue and that it may even get me subconsciously typecast. I was obviously willing to talk about it if it came up, but I didn't want to be the one to bring it up, so I didn't.

EDIT: I don't know whether I could come up with a "concrete reason" to believe that a community that tests more is better than a community that tests less, any more than I could come up with a "concrete reason" offhand to believe that a society that believes in a heliocentric solar system is necessarily better than a society that believes in a geocentric universe, or that a society that believes in fair trials is necessarily better than a society that doesn't. It is just a simple belief that I have that recognizing that we might not know is better than pretending that we do. I can't really argue anything to people who don't strongly share that belief. If we take this belief and look at history, it's clear that trying to appease "the community" will not necessarily make it better.

lilyhollow · Sep 17, 2010

"No bans unless necessary" can be substituted with "no bans without due consideration" or any other articulation of the "anti-ban" philosophy that you happen to prefer, and my point will remain the same: it is not an elitist standard per se, but one that is in no way tailored towards increasing the competitiveness of our community. Would it create a more "scientific" environment? Maybe, and if there is serious reason to believe that a more scientific environment will actually create a better/stronger community, great, show me (not sure how an "attitude of not taking things for granted" is either a concrete benefit to the community or achievable by starting out with no banlist, though).

Philip7086 · Sep 18, 2010

Though I initially would have liked to see something like >600 BST bans only, now with a lot of the new information we have about BW, I would have to agree with Aldaron. There is absolutely no way we can assume what is Uber and what isn't based on our previous experience from past generations -- things are just far too different now. If we find something to be blatantly Uber, it shouldn't take us too long to determine, and we should have a system in place to ensure that we can react to these consensuses swiftly. I can't see it taking us more than a month to determine if Pokemon like Mewtwo, Kyogre, Groudon, Rayquaza, etc. are broken or not, and it's far better to know for sure that they're Uber than to risk making early incorrect assumptions.

Elevator Music · Sep 21, 2010

I feel like I'm in the minority here for some reason but....

I don't like the idea of not banning Kyogre/Mewtwo/Rayquaza/etc, aka the 'obviously broken' Pokemon. I understand the importance of not making incorrect assumptions about tiering, and I know a lot of new stuff has come up with Black and White. However, I think we are wasting a lot of time in UT right now pretending we won't end up banning these Pokemon.

I know some of you guys really want to test everything, but we will likely not have a working simulator for a very long time. Consider a scenario, just for a second, that Kyogre is still incredibly fucking broken in a Gen 5 metagame. Since it's currently not banned, all discussions that we have in Uncharted Territory about the OU metagame are of a Kyogre-centralized metagame (which is likely very different from a Kyogre-less metagame). If we wait out the period until we have a working simulator or a big enough WiFi base to warrant a test, all the discussion we had in UT that pertained to Kyogre, which would be mostly everything, would be useless.

I understand that the definition of "obviously broken" is varied from user to user, but I'm pretty sure the users of this forum can come to a consensus about what is and isn't removable (if it's controversial then we may as well include it after all!). My point is that Kyogre/etc aren't controversial for the most part. And, at the very least, even if they aren't "obviously broken", it is "obvious" that they aren't part of a desirable baseline (I'm basically quoting Rising_Dusk from earlier in the thread here).

I don't know.... "lets wait it out and test it and if its broken ok" seems like a really silly thing to do for Pokemon like that when we're talking about wasting months of time. Isn't that what we're trying to avoid?

As a side note, I also agree with Chou and others who have echoed the sentiment that we make this metagame for players and that we should be giving them a metagame they want. But that is an entirely different issue and not my reason for posting.

Don't kill me!!

Chou Toshio · Sep 21, 2010

No reason to kill the voice of reason. I definitely don't think we should feel like the "Start from Scratch" party is the dominant one, nor that their bullish attitude towards the issue is the true voice of reason here.

Elevator Music said:
As a side note, I also agree with Chou and others who have echoed the sentiment that we make this metagame for players and that we should be giving them a metagame they want. But that is an entirely different issue and not my reason for posting.

Speaks for the vast majority of players who participate in the competitive community, in or out of smogon. A majority that frankly, is extremely underrepresented in PR, with no ability to speak on the issues. If anything, that was another real issue I had with 4th Gen-- the development of something of a "silent majority." It probably wasn't great when Smogon just opted to say "fuck you" to the many, many players that were adamant against Salamence or Garchomp's ban, but basically had no real way to influence the issue.

Our policies will affect the game for all of them, whether they even pay attention to our PR directly or not. Those (including me) who want to take the ideals of the players themselves seriously should not feel intimidated by (frankly rather pointless/empty) lofty philosophy that so far has only led tiering into a ridiculous mess.

gengod · Sep 21, 2010

I agree with EM 100%. Sure there will be a few people that want to test everything but it will inevitably end up being a complete waste of time. Sure we can add those legendaries like Kyogre and Groudon to OU but they will simply shape the metagame around themselves and when they are finally removed we will be back at square one.

We should start with some initial bans, and we should definitely have some kind of major discussion as to which of the current Ubers will be allowed and which won't but to blindly throw every Pokemon back into a metagame and hope for the best isn't a good way to go about things.

RBG · Sep 21, 2010

Wouldn't starting our tier lists with the ones that Nintendo uses for the Random Wi-fi battles be a good place to start when it comes to making the site accessible to new users? I think that might be a valuable place to start.

lilyhollow · Sep 21, 2010

I agree, RBG.
