Balanced Incomplete Block Designs for FairScore

March 14, 2018 / Edward Cheng

The FairScore score normalization program tries to address the problem of judge variability in competition scoring. Some judges may grade harshly, whereas others may grade generously. Without normalization, participants can unfairly face an especially harsh or generous group of graders as a matter of chance.

FairScore works best when there is a significant amount of "mixing" between judges and participants. In other words, we want Judge A to judge a different group of participants from Judges B, C, and D. Obviously there will be some overlap (and in fact the overlap is critical to the normalization), but we want different kinds of overlap. We don't want Judges A and B to judge participants 1, 2 and 3, and Judges C and D to judge participants 4, 5 and 6, because then no judge links the two groups and their scores can't be compared against each other.
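One rough way to check for the worst kind of non-mixing is to ask whether the judge-participant assignment splits into disconnected groups. The sketch below is only an illustration and is not part of FairScore itself; the function name `count_components` and the two example assignments are hypothetical.

```python
def count_components(assignments):
    """Count disconnected groups in a judge -> participants assignment.

    A judge and a participant are linked when the judge scores that
    participant. More than one group means some scores can never be
    compared through a shared judge.
    """
    parent = {}  # union-find forest over judge and participant nodes

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for judge, participants in assignments.items():
        judge_root = find(("judge", judge))
        for p in participants:
            # Merge the participant's group into the judge's group.
            parent[find(("participant", p))] = judge_root

    return len({find(node) for node in list(parent)})


# The assignment described above: two groups that never meet.
segregated = {"A": [1, 2, 3], "B": [1, 2, 3], "C": [4, 5, 6], "D": [4, 5, 6]}
# A hypothetical alternative in which every judge overlaps with the others.
mixed = {"A": [1, 2, 3], "B": [3, 4, 5], "C": [5, 6, 1], "D": [2, 4, 6]}

print(count_components(segregated))
print(count_components(mixed))
```

A single connected group is only a minimum requirement. It says nothing about how evenly the overlap is spread, which is the question the rest of this post takes up.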

Sometimes, some of the overlap is structural. In a moot court competition, judges sit in panels, and participants present to the panels in teams. The judges on a panel will necessarily all judge the same teams. In other contexts, however, we may have complete control over the matching of judges with participants. In these more flexible cases, what's the optimal way to match judges with participants? It turns out that this problem is solved by something called a Balanced Incomplete Block Design (BIBD).

A BIBD is defined by five parameters. The standard description is that there are "v treatments repeated r times in b blocks of k observations." (Lambda, the fifth parameter, is the number of blocks in which any given pair of treatments appears together. So a BIBD is often referred to by its parameters, as a (v, b, r, k, lambda) design.) Translated into our scoring context: v is the number of participants or entries to be judged; b is the number of judges, who judge k participants each, resulting in each participant being judged r times; and lambda is the number of judges that any two participants have in common. Note that a BIBD does not exist for every choice of v, r, b and k. Just as you can't divide 8 evenly by 3, sometimes there will be extras left over.
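To make the "clean solution" point concrete: two standard counting identities must hold for any BIBD, namely bk = vr (both sides count the total number of scores given) and lambda(v - 1) = r(k - 1). They are necessary but not sufficient. The sketch below checks them and also verifies a concrete assignment; the function names are made up for illustration, and participants are assumed to be numbered 0 to v - 1.

```python
from itertools import combinations


def bibd_parameters_feasible(v, b, r, k, lam):
    """Check the two necessary (but not sufficient) counting conditions."""
    return b * k == v * r and lam * (v - 1) == r * (k - 1)


def is_bibd(blocks, v, k, lam):
    """Verify a concrete design: one block of participants per judge."""
    for block in blocks:
        # Each judge must see exactly k distinct, valid participants.
        if len(block) != k or len(set(block)) != k:
            return False
        if any(not 0 <= p < v for p in block):
            return False
    # Every pair of participants must share exactly lam judges.
    pair_counts = {pair: 0 for pair in combinations(range(v), 2)}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            pair_counts[pair] += 1
    return all(count == lam for count in pair_counts.values())


# A well-known small design: 7 participants, 7 judges, 3 participants each.
fano = [
    [0, 1, 3], [1, 2, 4], [2, 3, 5], [3, 4, 6],
    [4, 5, 0], [5, 6, 1], [6, 0, 2],
]

print(bibd_parameters_feasible(7, 7, 3, 3, 1))
print(is_bibd(fano, v=7, k=3, lam=1))
# 8 participants and 8 judges of 3 each pass bk = vr but fail the pair condition.
print(bibd_parameters_feasible(8, 8, 3, 3, 1))
```

In the 7-participant example, every pair of participants shares exactly one judge, which is the evenly spread overlap we are after.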

There's some beautiful mathematics behind the scenes, but for our purposes, all that we need to know is that procedures exist for generating these BIBDs. So you can maximize the power of FairScore with minimal hassle. See, for example, https://www.r-bloggers.com/generating-balanced-incomplete-block-designs-bibd/.
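As a taste of what such a procedure can look like, here is a minimal sketch of one classical construction: take a single starting block and shift it cyclically modulo v. This is not necessarily the method used in the linked post, and it only works when a suitable starting block (a "difference set") exists for the chosen v. The starting blocks (0, 1, 3) for v = 7 and (0, 1, 3, 9) for v = 13 are standard textbook examples; the function names are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def develop_cyclic(base_block, v):
    """Shift base_block by 0..v-1 modulo v, giving v blocks (one per judge)."""
    return [sorted((x + shift) % v for x in base_block) for shift in range(v)]


def pair_counts(blocks):
    """Count how many blocks contain each unordered pair of participants."""
    counts = Counter()
    for block in blocks:
        counts.update(combinations(sorted(block), 2))
    return counts


# 7 participants, 7 judges, each judge scoring 3 participants.
blocks = develop_cyclic((0, 1, 3), 7)
for judge, block in enumerate(blocks):
    print(f"judge {judge}: participants {block}")

# Every pair of participants should share the same number of judges.
print(set(pair_counts(blocks).values()))
```

Swapping in `develop_cyclic((0, 1, 3, 9), 13)` gives a 13-judge, 13-participant schedule with 4 participants per judge. For parameter sets without a convenient starting block, a dedicated design-generation package is the safer route.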
