How to Fix Track and Field's False Start Rule with Randomness

Current rules risk unfairly disqualifying athletes. Can we do better?

By: Michael Bereket

Summary

  1. Track and field’s governing body defines a false start as a start initiated by an athlete before the athlete hears the starting gun
  2. Current rules attempt to eliminate false starts by disqualifying athletes who start less than 0.1 seconds after the gun, despite evidence that faster reactions are possible
  3. The false start rule could be improved by allowing all starts after the starting gun while carefully randomizing its timing to make starts difficult to anticipate
  4. Given accurate measurements, the proposed rule would eliminate the risk of incorrect false start disqualifications while rigorously controlling the rate of missed calls

Introduction

The men’s 110m hurdles final at the 2022 Track and Field World Championships had all the makings of a great race. It featured Hansle Parchment of Jamaica, the reigning Olympic champion; Grant Holloway of the United States, the 2019 world champion and second fastest 110m hurdler in history; and Devon Allen of the United States, the third fastest 110m hurdler in history (not to mention the rest of the outstanding field). Following strong qualifying performances by the three favorites, anticipation was high for a tightly contested battle in the final.

Unfortunately, the race did not live up to these expectations. During warmups, Parchment struck a hurdle and injured his hamstring, forcing him to withdraw. One favorite out. Soon after, the race began: the athletes lined up, assumed their starting positions, and, after a moment, the starting gun fired. They were off! But just a moment later, the starting gun fired again.

Devon Allen had been disqualified for a false start.

But there was something unusual about this disqualification. As the start times popped up on the screen, Allen was shown to have started 0.099s after the starting gun. On the broadcast, the commentators explained that a false start is defined as any start before 0.100s after the gun.

If you are anything like me, the 0.100s threshold sounds crazy. How can we be so sure that a start at 0.100s is legitimate, but one at 0.099s is invalid? More to the point, why does the false start rule need to define a precise minimum reaction time at all? As I watched the race, it was clear to me that we can do better. Now, with no rule changes in the year since the 2022 World Championships (and a new blog in need of a first post), it seems like a good time for me to share my proposal for an improved false start rule.

The current false start rule

The World Athletics rule book provides the following definition for a false start in section C2.1:

“16.7: An athlete, after assuming a full and final starting position, shall not commence their start until after receiving the report of the gun. If, in the judgement of the Starter (including under Rule 22.6 of the Competition Rules), they do so any earlier, it shall be a false start.”

I find this definition interesting for a couple reasons. First, it reflects a desire to include an athlete’s reaction time as part of what is measured in a track race; if World Athletics wanted to only measure running speed, all starts after the gun could be allowed and rules could be designed to minimize the impact of reaction times (for example, by starting races with a countdown). Second, because we do not measure when an athlete actually hears the starting gun, under the current rules we do not directly observe false starts. As a result, the false start rule must define a predictive model to make false start calls based on quantities that we do measure, such as when the gun was fired and when the athlete started.

So what predictive model is defined by the current false start rule? In races that use an automated system to detect the precise time an athlete starts, a false start is defined as any start that takes place less than 0.100s after the gun (C2.1, rule 16.6). World Athletics justifies the 0.100s threshold by arguing that athletes cannot react in less time. If this is true, then the threshold can be used to reduce the rate of missed calls without creating false accusations. However, the 0.100s threshold, which originated from a patent application for an automated timing system in 1966 and received some early support before being codified in 1989, has been widely criticized in recent years. For example, a 2009 study commissioned by World Athletics with just 7 sprinters identified athletes with faster reaction times and proposed lowering the threshold to 0.08s. Perplexingly, when questioned about this study in a recent Vox article, a World Athletics spokesperson said that this study was deemed too small to be robust, as though a single legitimate counterexample would not be sufficient to show that the threshold is invalid.

While these criticisms of the current threshold are important, I believe this line of questioning should be taken a step further. Is it needed, or even useful, to define a precise minimum reaction time in the false start rule?

A probabilistic model of race starts

In order to evaluate the current false start rule and identify potential improvements, let’s define a simplified probabilistic model of race starts and assess how different choices affect the probability of errors.

Consider the timeline of a race with a single runner. We assume that we have access to accurate measurements, and define the time $t = 0$ to be the time that the runner is set in the starting position.

Let $G$ be the time that the starting gun fires. $G$ is a random variable whose distribution we get to define in the rules: for example, we may say that $G$ is sampled uniformly between 0 and 5 seconds, or that $G$ is always 2 seconds.

Let $R$ be the amount of time it takes the athlete to register the sound of the gun. Importantly, $R$ is a latent (unobserved) random variable; we do not directly observe when an athlete hears and processes the sound of the gun. We assume that $R$ is independent of $G$, and we observe that $R$ must be positive.

Let $S$ be the time that the runner starts running. The athlete gets to choose the distribution of $S$ with knowledge of the distribution of $G$. We would like to incentivize the athlete to start as soon as they hear the gun so that $S = G + R$.

Example timeline of a race start under our model

Let $Y$ be an indicator variable representing whether or not a false start occurred. From the definition of a false start, we have that $Y = \mathbb{1}[S < G + R]$ ($Y$ is 1 if $S < G + R$ and 0 otherwise). Because $R$ is a latent variable, $Y$ is also a latent variable.

Let $\hat{Y}$ be our prediction of whether or not a false start occurred. We will consider false start rules of the form $\hat{Y} = \mathbb{1}[S < G + c]$ for some chosen minimum reaction time threshold $c$. The current rules define $c=0.1\text{s}$, though I will argue that we should set $c=0\text{s}$.

Diagram of false starts and false start calls under our model. A false start occurs if an athlete starts before they register the gun, and a false start is called if an athlete starts before the minimum legal reaction time.

Given this model, we are interested in how our choices for the distribution of $G$ and the threshold $c$ affect the probability of errors in our false start calls. We will separately consider two kinds of errors:

  1. False positives: a false start is predicted when the runner did not actually false start ($Y=0$ and $\hat{Y} = 1$)
  2. False negatives: no false start is predicted when the runner actually did false start ($Y=1$ and $\hat{Y} = 0$)
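The definitions of $Y$, $\hat{Y}$, and the two error types can be sketched in a few lines of code. This is only an illustration of the model: the function name `classify_start` and its return labels are hypothetical, all times are assumed to be in seconds measured from $t = 0$, and the value of $R$ is passed in directly even though it is latent in a real race.

```python
def classify_start(g: float, r: float, s: float, c: float) -> str:
    """Classify one race start under the simplified model.

    g: time the gun fires
    r: time the athlete needs to register the gun (latent in practice)
    s: time the athlete starts
    c: minimum legal reaction time threshold
    """
    y = s < g + r        # Y: a false start actually occurred
    y_hat = s < g + c    # Y-hat: a false start is called by the rule
    if y_hat and not y:
        return "false positive"
    if y and not y_hat:
        return "false negative"
    return "correct call"


# An honest start with a 0.09s reaction is flagged when c = 0.1 but not when c = 0.
print(classify_start(g=2.0, r=0.09, s=2.095, c=0.1))
print(classify_start(g=2.0, r=0.09, s=2.095, c=0.0))
```

With these illustrative numbers, the first call lands in the false positive case and the second is a correct call, since the athlete started after both the gun and their own registration of it.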

Both kinds of errors can be harmful, though they are not equally severe. A false positive unfairly disqualifies an athlete from a race, which can have substantial negative consequences. On the other hand, the harm from a false negative depends on the magnitude of the advantage gained and whether the race outcome is affected. Note that the advantage gained by a false negative can be at most $R$, which is generally fairly small. Based on this balance, I submit that the false start rule should prioritize eliminating false positives while controlling the risk of false negatives.

Probability of false positives

Diagram of values of $S$ that result in false positives under our model. False positives occur if the athlete starts after they register the gun but before the minimum legal reaction time.

Let’s consider the probability of false positives assuming that the athlete starts as soon as they hear the gun ($S = G + R$). Applying the definitions from the model (or just looking at the diagram), we see that

$$ \begin{aligned} P(Y = 0 \land \hat{Y} = 1 | S=G+R) &= P(S \geq G + R \land S < G + c | S = G + R) \cr &= P(G + R \geq G + R \land G + R < G + c) \cr &= P(G + R < G + c) \cr &= P(R < c) \end{aligned} $$ This is the issue that motivated this analysis in the first place: the probability of a false positive is the probability that an athlete reacts faster than the minimum legal reaction time set in the rules. It follows that setting $c = 0\text{s}$ will remove the possibility of false positives, because reaction times must be positive.
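To make $P(R < c)$ concrete, here is a small sketch that computes the share of honest starts ($S = G + R$) that a threshold $c$ would flag. The reaction times in `sample_reaction_times` are invented for illustration and are not measurements; `false_positive_rate` is a hypothetical helper name.

```python
def false_positive_rate(reaction_times, c):
    """Fraction of honest starts (S = G + R) that would be called false starts.

    For an honest start, a false start is called exactly when R < c.
    """
    return sum(r < c for r in reaction_times) / len(reaction_times)


# Hypothetical reaction times in seconds, invented for illustration only.
sample_reaction_times = [0.095, 0.099, 0.110, 0.125, 0.140, 0.160, 0.180, 0.200]

print(false_positive_rate(sample_reaction_times, c=0.1))  # two of eight are flagged
print(false_positive_rate(sample_reaction_times, c=0.0))  # none are flagged
```

Because every reaction time is positive, the second call returns zero regardless of which sample is used.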

So why set $c$ to be nonzero at all? The motivation to do so is to reduce the rate of false negatives. However, we will see that there are better ways to control the false negative rate.

Probability of false negatives

In this section, we will characterize how our choices for the minimum legal reaction time $c$ and the distribution of starting gun firing times $G$ affect the probability of false negatives. In our analysis, we will assume that $c$ is less than or equal to the true minimum reaction time (even though we do not know exactly what this value is). This will allow us to assess how effectively a non-zero $c$ controls the false negative rate in the ideal scenario that $c$ does not create false positives.

Applying the definitions from our model, we see that the probability of a false negative is the probability that an athlete starts before they register the sound of the gun but after the minimum legal reaction time: $$ \begin{aligned} P(Y = 1 \land \hat{Y} = 0) &= P(G + c \leq S < G + R) \end{aligned} $$

Diagram of values of $S$ that result in false negatives under our model. False negatives occur if the athlete starts before they hear the gun but after the minimum legal reaction time.

What can we say about this quantity? Because the probability depends on the distributions of $S$ and $G$, we will need to make a couple more assumptions.

First, let’s consider the distribution of $G$. For now, I will simply propose that we set $G \sim \text{Uniform}(0, k)$ for some chosen maximum delay $k$. This means that the starting gun will be fired with a random delay between 0 and $k$ seconds after the runner is set, where each value has equal probability.

Next, let’s consider the distribution of $S$. What strategy will the athlete use to choose when to start? We will consider the case that the athlete chooses $S$ with the goal of maximizing the false negative rate. This will give us the worst-case probability of false negatives for different choices of $k$ and $c$.

Now that we have set the distribution for $G$, it is useful to reframe the false negative rate in terms of the timing of the starting gun. We see that $$ \begin{aligned} P(Y = 1 \land \hat{Y} = 0) &= P(G + c \leq S < G + R) \cr &= P(c \leq S - G < R) \cr &= P(-R < G - S \leq -c) \cr &= P(S - R < G \leq S - c) \end{aligned} $$ Thus, the probability of a false negative can be expressed as the probability that the starting gun fires between $c$ and $R$ seconds before the athlete starts.

Diagram of values of $G$ that result in false negatives.

Next, let’s consider what strategy for choosing $S$ would maximize the false negative rate. Importantly, the maximum false negative rate can be achieved by a strategy that chooses $S$ independently of the value of $G$ (i.e. there is no benefit to a strategy that waits to learn something about the value of $G$). Intuitively, this occurs because an athlete must wait to hear the starting gun to learn its value, at which point they can no longer false start and increase the false negative rate. Thus, it is sufficient to identify what value of $S$ chosen independently of $G$ would maximize the false negative rate.

Consider setting $S = k+c$. As a reminder, $k$ is the maximum delay of the starting gun and $c$ is the minimum legal reaction time. Using our expression from above, the probability of a false negative is the probability that $G$ falls in the range $(S - R, S - c] = (k + c - R, k]$. We will start by assessing the probability of false negatives conditioned on different values of $R$.

If $R > k + c$ (which occurs if $k$ and $c$ are both very small), then $k + c - R < 0$ and all possible values of $G$, from $0$ to $k$, will result in a false negative. In other words, an athlete can always anticipate the gun in this situation without being caught. This corresponds to the scenario that the gun always fires in a very small time window, so the athlete does not need to wait to hear the gun to know when they can start legally.

Plot of the probability of a false negative if an athlete starts at $S = k + c$ and $R > k + c$

If $R \leq k + c$, then $k + c - R \geq 0$. Recall that we assumed that $c$ is a lower bound on reaction times, so we also know that $k + c - R \leq k$. This means that the range of values for $G$ that will result in a false negative, $(k + c - R, k]$, is a subset of $[0, k]$. We then see that the probability of a false negative is $\int_{g=k + c - R}^{g=k} f_G(g) dg = \int_{g=k + c - R}^{g=k} \frac{1}{k} dg = \frac{R - c}{k}$.

Plot of the probability of a false negative if an athlete starts at $S = k + c$ and $R \leq k + c$

We note that the choice of $S = k + c$ achieves the largest possible false negative rate for any given $R$ and $c$. If $R > k + c$, the probability of a false negative is 1. If $R \leq k + c$, then the probability density of $G$ is at its maximum of $\frac{1}{k}$ for the entire range $(S - R, S - c]$, so no other choice of $S$ can achieve a larger value.
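The claim that no fixed start time beats $S = k + c$ can be checked numerically. The sketch below assumes $G \sim \text{Uniform}(0, k)$, a start time chosen independently of $G$, and $c \leq R$; the function names are hypothetical and the grid search is a sanity check rather than a proof.

```python
def false_negative_probability(s, r, c, k):
    """P(s - r < G <= s - c) for G ~ Uniform(0, k) and a fixed start time s."""
    low = max(s - r, 0.0)   # G cannot be below 0
    high = min(s - c, k)    # G cannot be above k
    return max(high - low, 0.0) / k


def max_false_negative_rate(r, c, k):
    """Closed form from the text: 1 if r > k + c, otherwise (r - c) / k."""
    return 1.0 if r > k + c else (r - c) / k


def grid_search_worst_case(r, c, k, steps=1000):
    """Largest false negative probability over a grid of start times in [0, k + r]."""
    candidates = (i * (k + r) / steps for i in range(steps + 1))
    return max(false_negative_probability(s, r, c, k) for s in candidates)


# Illustrative values: r = 0.15s, c = 0.1s, k = 5s.
print(false_negative_probability(5.0 + 0.1, 0.15, 0.1, 5.0))
print(grid_search_worst_case(0.15, 0.1, 5.0))
print(max_false_negative_rate(0.15, 0.1, 5.0))
```

All three values agree up to floating point error for these inputs. Under the uniform distribution, any start time in $[R, k + c]$ reaches the same maximum, so $S = k + c$ is one maximizer among several.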

Let’s reflect on what we have shown so far. We found that the maximum false negative rate when $G$ is sampled from a uniform distribution with maximum delay $k$ is $\frac{R - c}{k}$ if $R \leq k + c$ or $1$ if $R > k + c$. These expressions have interesting implications:

  1. Starting gun randomization is the most important factor for controlling the false negative rate. We observe that the maximum probability of false negatives can be made as small as desired by increasing $k$, the maximum delay of the starting gun. Intuitively, this occurs because the chance of the gun firing at just the right time for the athlete to start after the gun but before they could react gets smaller and smaller as the range of possible starting times increases.
  2. Non-zero minimum legal reaction times have a relatively small impact on the worst case false negative rates. If $k$ is large, then $\frac{R-c}{k}$ will be small even if $R - c$ is non-zero. Additionally, if $k$ is small, then $c$ needs to be close to $R$ to control the false negative rate. However, only athletes with the fastest reaction times will have reactions close to the lower limit of human reaction times.

Thus, we see that a randomized starting gun is a much better tool than a minimum reaction time threshold to control the false negative rate. The current false start rules do not utilize starting gun randomization, instead stating that the starter “shall let the athletes go once they are all motionless in the correct starting position” (C2.1 16.3 comments). While the current rules would result in consistent false negatives in our simplified model, in practice the starting gun cannot be perfectly anticipated due to unknown variation in the time for all athletes to get set and for the starter to fire the gun. As an alternative to relying on this uncontrolled randomness, we have shown that a carefully specified random delay in the starting process can be used to rigorously bound the maximum probability of false negatives.

Finally, we conditioned on specific values of $R$ in the expressions above. If we assume that $R \leq k + c$ always holds (which is reasonable to expect even for fairly small $k$) and integrate over different values of $R$, we find that the maximum false negative rate for a given athlete is $\frac{\bar{R} - c}{k}$, where $\bar{R}$ is the average time it takes the athlete to register the sound of the starting gun.
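A quick Monte Carlo simulation can illustrate the averaged bound $\frac{\bar{R} - c}{k}$. This is a sketch under stated assumptions: the athlete always starts at $S = k + c$, and $R$ is drawn from a uniform distribution on `[r_low, r_high]`, which is an arbitrary stand-in chosen only for illustration (nothing in the analysis requires it). The function names and the fixed seed are likewise hypothetical.

```python
import random


def simulate_false_negative_rate(k, c, r_low, r_high, trials=100_000, seed=0):
    """Estimate the false negative rate when the athlete always starts at S = k + c.

    G ~ Uniform(0, k) as in the text. R ~ Uniform(r_low, r_high) is an assumed,
    purely illustrative distribution (requires c <= r_low and r_high <= k + c).
    """
    rng = random.Random(seed)
    s = k + c  # worst-case start time identified in the text
    false_negatives = 0
    for _ in range(trials):
        g = rng.uniform(0.0, k)         # time the gun fires
        r = rng.uniform(r_low, r_high)  # latent time to register the gun
        if g + c <= s < g + r:          # not called, yet started before registering
            false_negatives += 1
    return false_negatives / trials


def predicted_rate(k, c, r_low, r_high):
    """(R-bar - c) / k, where R-bar is the mean of Uniform(r_low, r_high)."""
    r_bar = (r_low + r_high) / 2
    return (r_bar - c) / k


print(simulate_false_negative_rate(k=5.0, c=0.0, r_low=0.12, r_high=0.20))
print(predicted_rate(k=5.0, c=0.0, r_low=0.12, r_high=0.20))
```

With these inputs the predicted rate is $0.16 / 5 = 0.032$, and the simulated estimate should land close to it, with some sampling noise.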

An improved false start rule

Let’s recap what we have learned so far in the context of our model. First, we saw that if an athlete starts as soon as they hear the gun then the probability of a false positive is $P(R < c)$, the probability that the athlete registers the starting gun faster than the minimum legal reaction time. This means that setting $c = 0\text{s}$, which allows all starts after the gun, would result in no false positives. However, there is a concern that setting $c = 0\text{s}$ could permit false negatives by missing instances of athletes starting before they register the gun but after it has fired. We showed that if $c$ is in fact a lower bound on reaction times and we choose to fire the starting gun with a uniform random delay between 0 and $k$ seconds, then the false negative rate is at most $\frac{\bar{R} - c}{k}$, where $\bar{R}$ is the average time to register the gun. This indicates that 1) $k$ can be used to make the false negative rate as small as desired and 2) $c$ will not effectively control the false negative rate across a range of reaction times.

These findings reveal a strategy for an improved false start rule:

Proposed false start rule
Let $R^*$ be an estimate of the slowest relevant average reaction time for competitive track races in seconds (ok to overestimate), and let $d$ be the highest acceptable probability, expressed as a fraction (so 5% corresponds to $d = 0.05$), with which an athlete can successfully anticipate the gun without being caught. Let $k = \frac{R^*}{d}$. I propose that the starting gun should be fired with a uniform random delay between 0 and $k$ seconds and that all starts after the gun is fired should be allowed.

Assuming accurate measurements, such a rule would guarantee no false positives and control the false negative rate to be at most $d$. For example, World Athletics may believe that the largest acceptable false negative rate is 5% and that the slowest relevant average reaction time for competitive athletes is 0.25s (which is quite slow). Then by randomizing the timing of the starting gun to fire with a delay between 0 and $\frac{0.25}{0.05}=5$ seconds, a worst case false negative rate of 5% can be achieved with no risk of false positives.
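The choice of $k$ is a one-line calculation. The helper below is a minimal sketch; `max_gun_delay` is a hypothetical name, and the acceptable false negative probability is assumed to be passed as a fraction (0.05 for 5%).

```python
def max_gun_delay(r_star: float, d: float) -> float:
    """Return k = R* / d, the maximum gun delay in seconds.

    r_star: estimate of the slowest relevant average reaction time (seconds)
    d: highest acceptable false negative probability, as a fraction in (0, 1]
    """
    if r_star <= 0:
        raise ValueError("r_star must be positive")
    if not 0 < d <= 1:
        raise ValueError("d must be a fraction in (0, 1]")
    return r_star / d


# The example from the text: R* = 0.25s and a 5% acceptable rate.
print(max_gun_delay(0.25, 0.05))
```

Halving the acceptable rate doubles the required maximum delay, so the tolerance for missed calls trades off directly against how long athletes may have to hold the set position.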

One undesirable aspect of this rule is that if an athlete knows exactly when $t=0\text{s}$ is and has a perfect sense of timing, the athlete can achieve the maximum false negative rate with no risk of being caught by starting when they hear the starting gun or at $k$ seconds, whichever comes first. Though this is unlikely to be an issue in practice due to the uncertainty of when all runners are considered set and the precision required to gain this small advantage, this issue could be further mitigated by adding a small random delay before the starting gun timer begins.

Practical considerations

We have defined an improved false start rule in the context of our model. What would it take to make it work in practice?

I propose using an automated system that generates a randomized starting signal. This system could be triggered by the race starter once all athletes are observed to be in the set position. Such a system would enable us to accurately sample from the desired distribution for $G$ and minimize the presence of other cues for when the race will start (e.g., seeing the starter prepare to fire the starting gun).
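As a rough sketch of the sampling step only, the delay could be drawn as follows. The function name `sample_gun_delay` is hypothetical, and the use of Python's `random.SystemRandom` (which draws from the operating system's entropy source) is my assumption about what "difficult to anticipate" might mean in software. A real starting system would also need precise hardware timing and signal generation, which are not modeled here.

```python
import random


def sample_gun_delay(k: float, rng=None) -> float:
    """Draw the delay in seconds between 'set' and the starting signal.

    The delay is sampled from Uniform(0, k). By default an OS-backed random
    source is used so the delay does not depend on a guessable seed.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    rng = rng or random.SystemRandom()
    return rng.uniform(0.0, k)


# Called once the starter confirms that all athletes are set.
delay = sample_gun_delay(5.0)
print(f"fire the starting signal after {delay:.3f}s")
```

Passing a seeded `random.Random` as `rng` makes the function reproducible for testing, while the default keeps the delay unpredictable in use.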

Revisiting assumptions

In our model, we assumed that we have access to a perfectly accurate timing system. While inaccurate timing could pose an issue, note that the proposed false start rule will be less sensitive to errors than the current rule: the timing system would have to make an error on the order of an athlete’s reaction time, rather than the difference between the athlete’s reaction time and 0.1s, in order to have a false positive.

We made a number of assumptions about $R$ in the model. First, we assumed that $R$ was independent of $G$. This is likely not true: I would expect reactions to differ following long and short waits. However, for a reasonable range of maximum delays (e.g., 5-10s) I expect that this variability will not play a big role. Second, we assumed that $R$ is independent of the strategy by which $S$ is set. This is also likely not true, as an athlete who does not plan on reacting to the starting gun may pay less attention and react more slowly. Perhaps $R$ could be better defined as a counterfactual time to register the sound of the starting gun assuming that the athlete is trying to react as quickly as possible, though this does not change the basic reasoning.

Finally, I proposed using a uniform distribution for $G$ without justification. In fact, we can show that this choice is optimal or close to optimal for minimizing the maximum false negative rate across values of $R$, $c$, and $k$ (see the appendix for details).

When I shared this proposal with one of my friends, he informed me that F1 races already use a randomized delay in their starting procedure (presumably for reaction-related reasons, though I am not sure if the effect on false negative rates has been carefully considered). Thus, this post can be considered part of the grand academic tradition of “discovering” ideas that are already in use in adjacent fields. That said, I still think sharing this proposal is worthwhile because 1) I find the reasoning behind a randomized start interesting and 2) the proposed rule can address a serious ongoing problem in track and field.

Conclusion

We have shown that a false start rule that allows all starts after the starting gun is fired and randomly samples the gun start time uniformly from between 0 and $k$ seconds will:

  1. have no false positives
  2. bound the worst case false negative rate at $\frac{\bar{R}}{k}$, where $\bar{R}$ is an athlete’s mean reaction time

Thus, given an appropriately selected $k$, this rule would do away with the risk of unfairly disqualifying athletes for false starts while keeping the probability of missed false start calls as small as desired.


Acknowledgements

Thanks to Zack McCaw for comments on an early draft, Herve Marie-Nelly for the initial lunchtime discussion, Nathan Orttung for his F1 knowledge, and Winston Liao for help making the website look nice. And thanks to everyone who encouraged me to make this post!

Appendix

Selection of uniform distribution for $G$

In the main text, we showed that a uniform distribution for $G$ bounds the worst case probability of a false negative to be at most $\frac{\bar{R} - c}{k}$. In this section, we will show that this matches or is very close to the optimal choice of distribution for any $R$, $c$, and $k$.

To do so, we will find a lower bound for the worst case false negative probability $$\min_{f_G} \max_{f_S} P(S - R < G \leq S - c) $$ where $f_G$ and $f_S$ are the probability density functions of $G$ and $S$. For simplicity, we will condition on a specific value of $R=r$, rather than integrating over possible values.

As discussed in the main text, we know that the worst case false negative rate can be achieved when $S$ is independent of $G$. Thus, we have

$$ \begin{aligned} \min_{f_G} \max_{f_S} P(S - R < G \leq S - c | R = r) &= \min_{f_G} \max_{f_S} \int_{s=0}^{s=\infty}f_S(s) \int_{g=0}^{g=\infty}\mathbb{1}[s - r < g \leq s - c] \, f_G(g) \,dg \,ds \cr &= \min_{f_G} \max_{f_S} \int_{s=0}^{s=\infty} f_S(s) \int_{g=s-r}^{g=s-c} f_G(g) \,dg \,ds \end{aligned} $$

Let’s focus on the inner integral: $\int_{g=s-r}^{g=s-c} f_G(g) \,dg$. Because $f_G$ is a probability density function that is zero outside $[0, k]$, we know that $\int_{g=0}^{g=k} f_G(g) \,dg = 1$. We can divide the interval $[0, k]$ into $\lceil \frac{k}{r-c} \rceil$ disjoint regions of length $r - c$ and write this integral as a sum of integrals over these smaller regions (if it does not divide cleanly, the last small region will extend past $k$, which by our assumptions does not add any probability mass): $$ \int_{g=0}^{g=k} f_G(g) \,dg = \sum_{i=1}^{\lceil \frac{k}{r-c} \rceil} \int_{g=(i-1)(r-c)}^{g=i(r-c)} f_G(g) \,dg = 1$$

This implies that $\max_{1 \leq i \leq \lceil \frac{k}{r - c} \rceil} \int_{g=(i-1)(r-c)}^{g=i(r-c)} f_G(g) \,dg \geq \frac{1}{\lceil \frac{k}{r - c} \rceil}$: if it were not, then the $\lceil \frac{k}{r-c} \rceil$ terms would add up to be less than one. Taking $s^* = i^*(r-c) + c$, where $i^*$ is the index of the largest term, the limits $s^* - r = (i^*-1)(r-c)$ and $s^* - c = i^*(r-c)$ are exactly the endpoints of that region. Thus, we have shown that for any choice of $f_{G}$ and $c$ and any $R=r$, there must exist a value $s=s^*$ such that $\int_{g=s^* - r}^{g=s^* - c}f_G(g) \,dg \geq \frac{1}{\lceil \frac{k}{r - c} \rceil}$. Thus,

$$\min_{f_G} \max_{f_S} P(S - R < G \leq S - c | R = r) \geq \frac{1}{\lceil \frac{k}{r - c} \rceil} $$

If $f_G$ is a uniform distribution, then $\int_{g=s^* - r}^{g=s^* - c}f_G(g) \,dg = \int_{g=s^* - r}^{g=s^* - c} \frac{1}{k} \,dg = \frac{r-c}{k} = \frac{1}{\frac{k}{r-c}}$. This exactly matches the lower bound we have found if $k$ is a whole number multiple of $r-c$, and is very close otherwise: the gap is $\frac{1}{\frac{k}{r - c}} - \frac{1}{\lceil \frac{k}{r - c} \rceil} < \frac{1}{\frac{k}{r - c}} - \frac{1}{\frac{k}{r - c} + 1} = \frac{1}{(\frac{k}{r-c} + 1)(\frac{k}{r-c})} = \frac{(r-c)^2}{k(k+r-c)} \leq \left(\frac{r-c}{k}\right)^2$, which will be small when $k$ is larger than $r-c$ (the setting we want in order to reduce false negatives). Thus, sampling $G$ from a uniform distribution will achieve or come close to achieving the minimal worst case false negative rate for any $k$, $R$, and $c$.
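The gap between the uniform distribution's worst case rate and the lower bound can be evaluated directly. The sketch below assumes $c < r \leq k + c$; the function names are hypothetical and the inputs are illustrative values rather than measurements.

```python
import math


def lower_bound(r, c, k):
    """Lower bound on the worst case false negative rate: 1 / ceil(k / (r - c))."""
    return 1.0 / math.ceil(k / (r - c))


def uniform_rate(r, c, k):
    """Worst case false negative rate when G ~ Uniform(0, k): (r - c) / k."""
    return (r - c) / k


def gap(r, c, k):
    """How far the uniform distribution is from the lower bound."""
    return uniform_rate(r, c, k) - lower_bound(r, c, k)


# k is a whole number multiple of r - c: the uniform rate matches the bound.
print(gap(r=0.25, c=0.0, k=5.0))

# k is not a multiple of r - c: the gap stays below ((r - c) / k) ** 2.
print(gap(r=0.15, c=0.0, k=5.0), (0.15 / 5.0) ** 2)
```

In the second case the gap is a small fraction of a percentage point, which is negligible next to the false negative rate itself.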