In my last post, I described the current competition format and argued that while it has the advantages of being basically fair and computationally simple, it has the decided drawback of generating a lot of lopsided matches. The format is designed to avoid two kinds of unfairness: some fencers getting an artificially inflated ranking after winning a lot of bouts in a poule where random luck placed several weak fencers, and other fencers fencing well but not getting a commensurate ranking because they were placed in a strong poule. It avoids both by making all the poules equally strong, as far as is possible. The drawback is that if the poules are to be equally strong, one has to separate the top fencers from each other and put them in different poules (and likewise with the weakest fencers), which leads to a situation where each poule has one really strong fencer, one really weak fencer, and the intermediate fencers likewise stratified. That means that the bouts fenced in that poule will mostly be matchups of fencers who are quite disparate in ability. So, the current competition format naturally leads to a lot of lopsided bouts.
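To make the stratification concrete, here is a minimal sketch of one common way to build equally strong poules: dealing the ranked fencers out in a serpentine ("snake") order. The serpentine rule and the function name `snake_poules` are my assumptions for illustration; the exact rule used at a given competition may differ.

```python
def snake_poules(ranked_fencers, n_poules):
    """Deal fencers (best first) into poules in serpentine order.

    Hypothetical helper: the direction of dealing reverses on every pass,
    so each poule ends up with one fencer from each strength band.
    """
    poules = [[] for _ in range(n_poules)]
    for i, fencer in enumerate(ranked_fencers):
        row, col = divmod(i, n_poules)
        # Even passes go left to right, odd passes go right to left.
        idx = col if row % 2 == 0 else n_poules - 1 - col
        poules[idx].append(fencer)
    return poules


# Twelve fencers identified by their initial rank, split into three poules.
poules = snake_poules(list(range(1, 13)), 3)
```

With twelve fencers and three poules, this gives the poules [1, 6, 7, 12], [2, 5, 8, 11] and [3, 4, 9, 10]. The rank sums are identical, which is the fairness goal, but every poule spans the whole range from top to bottom, which is exactly where the lopsided bouts come from.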
The situation is not alleviated by the fact that the ranking after the first round of poules is used to create the poules for the second round, which are stratified in the same way, and that the DE tree is built so that the first match pairs the best fencer after the two rounds of poules with the worst fencer, the second match pairs the second best fencer with the second worst, and so on.
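The best-against-worst pairing in the first DE round can be sketched in a few lines. This is a simplified illustration that assumes an even number of fencers and ignores byes and the layout of the rest of the tree; the name `first_round_pairs` is hypothetical.

```python
def first_round_pairs(seeds):
    """Pair the i-th best fencer with the i-th worst for the first DE round.

    Assumes `seeds` is ordered best first and has an even length;
    byes for incomplete tableaux are deliberately left out.
    """
    n = len(seeds)
    return [(seeds[i], seeds[n - 1 - i]) for i in range(n // 2)]


# Eight fencers identified by their seed after the poules.
pairs = first_round_pairs(list(range(1, 9)))
```

For eight fencers this yields the pairs (1, 8), (2, 7), (3, 6) and (4, 5). Only the last bout is between fencers of similar standing.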
Soapbox mode on
Let us see that from the perspective of a young rookie fencer arriving at one of his first competitions, in which he is ranked last before the first round of poules starts. The poules in the first round are created such that all the other weak fencers, whom our rookie would have had a prayer against, are placed in the other poules. Instead, he gets overwhelmed in all poule matches except maybe one or two. Based on those poule results, our rookie is ranked near the bottom in the ranking that follows the first round of poules, and that ranking is used to create the poules for the second round. Our rookie gets another bunch of opponents who are mostly completely out of his league for the second round, and the poule results are another string of lopsided losses for our rookie. Based on those poule results, our rookie is ranked dead last coming into the DE stage, which means that he has to fence the very best fencer in the competition in his first DE. By now the grandparents of our rookie have gotten around to coming to the sports hall, and he has the dubious pleasure of getting a 3-15 clobbering in front of his grandparents.
Then, we in fencing debate why the retention rate of beginners is so low.
Meanwhile, the best fencer in the competition gets maybe two matches in each poule that really test his ability, and he starts off his climb through the DE tree by winning a bunch of easy bouts, only really being tested from the quarterfinal onwards. A full day's worth of fencing, with maybe seven bouts that really force him to fight at anything near the limits of his ability, and thus improve him.
Soapbox mode off
Instead of considering all those lopsided matches as an unavoidable annoyance, I have set out to create a new competition format that combines fairness and a high proportion of hard-fought matches. As outlined in the previous post, it is possible to do so and retain the computational simplicity of the current competition format if one has very good ranking information prior to the first round of poules, and if no fencer significantly overperforms, or underperforms, during the first round of poules. I consider those two limitations far too restrictive, and the following is my format that is designed to be both fair and produce a lot of hard-fought matches, no matter what quality of prior ranking information is available, and how well the match results adhere to those rankings.
This is done by realizing that just about everything involves tradeoffs and prioritizations, including a clear idea about what should be deprioritized. In the case of this competition format, we have three items that have been optimized, or will be so: fairness, computational simplicity, and a high proportion of hard-fought matches. Of those, the first and last directly affect the experience of the fencers, while computational simplicity is only a constraint imposed by the abilities of the competition leadership. Now, it is important to note that the current competition format was designed at a time when computers were much slower than today, and it was considered normal to run all the calculations involved in a competition by hand. Moore's law has worked its magic, and that is not the case anymore.
Let us revisit the concept of fairness in a competition format context. Under the current competition format, fairness is seen as the concept of all fencers getting a roughly equally strong set of opponents, so that no one is unfairly helped or penalized by getting too many or too few weak opponents. A central, but unstated, part of this is that the rough equivalence of aggregate opponent strength is necessary, since the calculation system does not take into account the abilities that the opponents have.
But what if the calculations that transform match results into competitor rankings also could take into account the abilities of the opponents that a given fencer is up against? If that were the case, the competition format would not be constrained by a need to make the aggregate opponent strength roughly equal for all fencers. Instead, one could let all fencers fence mostly against other fencers who have similar abilities, thus creating a whole lot of hard-fought matches.
I started looking around on the web for a calculation model that does just that, and after some searching, I found what I was looking for: Colley’s bias-free matrix rankings.
This ranking method contains iterative steps, matrix inversions, matrix algebra, and a bit more math that is not covered in math lessons before college level, so describing it will require a blog post of its own.
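As a small preview ahead of that post, here is a minimal sketch of the core of Colley's method as it is usually stated: solve the linear system C r = b, where C has 2 plus the number of bouts fenced by fencer i on the diagonal, minus the number of bouts between i and j off the diagonal, and b_i = 1 + (wins_i - losses_i) / 2. The function names, the input layout (a list of winner/loser pairs) and the plain Gaussian elimination are my assumptions for illustration, not the final format described in this series.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def colley_ratings(fencers, bouts):
    """Rate fencers from (winner, loser) pairs using the Colley system.

    Hypothetical helper: only wins and losses are used, not touch scores.
    """
    index = {name: i for i, name in enumerate(fencers)}
    n = len(fencers)
    c = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for winner, loser in bouts:
        w, l = index[winner], index[loser]
        c[w][w] += 1.0  # one more bout fenced by each of the two
        c[l][l] += 1.0
        c[w][l] -= 1.0  # one more bout between this pair
        c[l][w] -= 1.0
        b[w] += 0.5     # b_i = 1 + (wins - losses) / 2
        b[l] -= 0.5
    ratings = solve_linear(c, b)
    return dict(zip(fencers, ratings))


# Three fencers: A beats B and C, B beats C.
ratings = colley_ratings(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")])
```

In this tiny example the ratings come out as 0.7 for A, 0.5 for B and 0.3 for C. The point to notice is that each fencer's row in the system refers to the ratings of the opponents actually faced, so a win against a strong opponent counts for more than a win against a weak one, and the poules no longer have to be equally strong for the ranking to be fair.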