#Nerdpost: Length-biased sampling

This post was inspired by a tweet. @AnnieLowrey praised a beautiful New York Times story by Pam Belluck about dementia among geriatric prison inmates. Lowrey expressed surprise that “21 percent of America’s prisoners are serving 20 years to life.” That is indeed a surprising number. It reflects a statistical and operations research principle you may not have considered, but which turns up all over the place in health care and public policy. It’s called length-biased sampling, and it’s worth some thought.

The easiest way to understand the basic issue may come from a simple example. Suppose you went over to the University of Chicago Medical Center last Thursday and surveyed every hospitalized patient. Many of the people you met would have been in the middle of a long-term stay. Why? Because people who only needed to stay one or two nights had walked out before you could meet them. The stock, or cross-section, of hospitalized patients you met that day was a very different, much sicker sample than you would have found had you instead surveyed every patient who began a hospital stay on that same day.
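A back-of-the-envelope version of the hospital example, as a minimal Python sketch. The numbers are made up for illustration (they are not from the University of Chicago Medical Center or any real hospital): each day 9 patients are admitted for a 1-day stay and 1 patient is admitted for a 30-day stay.

```python
from fractions import Fraction

# Hypothetical daily admissions: group -> (patients admitted per day, days stayed).
# These figures are illustrative assumptions, not real hospital data.
admissions = {"short stay": (9, 1), "long stay": (1, 30)}

def admission_share(group):
    """Share of a day's new admissions who belong to `group`."""
    total = sum(n for n, _ in admissions.values())
    return Fraction(admissions[group][0], total)

def census_share(group):
    """Share of occupied beds held by `group` on a typical day.

    In steady state each group fills (admissions per day) x (days stayed) beds.
    """
    beds = {g: n * stay for g, (n, stay) in admissions.items()}
    return Fraction(beds[group], sum(beds.values()))

print(f"long-stay share of admissions: {float(admission_share('long stay')):.0%}")
print(f"long-stay share of the census: {float(census_share('long stay')):.0%}")
```

Under these assumed numbers, long-stay patients are 10 percent of the people who walk in the door but about 77 percent of the people you would find in a bed on any given day.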

In similar fashion, murder is a rare crime. Yet because each murderer contributes many person-days to the prison population, murderers are grossly over-represented in the cross-sectional population of prisoners incarcerated on any given day.

In various disguised forms, this same problem appears in many other contexts. Suppose your morning bus doesn’t really follow a schedule. It is equally likely to arrive in any given minute, and on average one arrives every 15 minutes. You walk outside at a random moment. You might think that you will wait, on average, only 7.5 minutes. You will actually wait, on average, the full 15 minutes, because more minutes of the day fall within the long intervals in which the bus was delayed.
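If the bus claim seems too strange to believe, it is easy to check by simulation. The sketch below is one way to do it in Python, under the assumption that "equally likely to arrive in any given minute" means the gaps between buses are exponentially distributed with a 15-minute mean. The function name and the sample sizes are my own choices for illustration.

```python
import bisect
import random

def simulate_bus_waits(mean_gap=15.0, n_buses=100_000, n_riders=50_000, seed=0):
    """Return (average wait, average gap a rider lands in, average gap overall)."""
    rng = random.Random(seed)

    # Memoryless arrivals: gaps between buses are exponential with mean `mean_gap`.
    arrivals = []
    clock = 0.0
    for _ in range(n_buses):
        clock += rng.expovariate(1.0 / mean_gap)
        arrivals.append(clock)

    waits, gaps_seen = [], []
    for _ in range(n_riders):
        # The rider shows up at a uniformly random moment between the first and last bus.
        now = rng.uniform(arrivals[0], arrivals[-1])
        # Index of the next bus after `now` (clamped to guard against a float edge case).
        nxt = min(bisect.bisect_right(arrivals, now), n_buses - 1)
        waits.append(arrivals[nxt] - now)
        gaps_seen.append(arrivals[nxt] - arrivals[nxt - 1])

    average_gap = (arrivals[-1] - arrivals[0]) / (n_buses - 1)
    return sum(waits) / n_riders, sum(gaps_seen) / n_riders, average_gap

wait, gap_seen, gap_all = simulate_bus_waits()
print(f"average gap between buses:     {gap_all:.1f} minutes")
print(f"average gap a rider lands in:  {gap_seen:.1f} minutes")
print(f"average wait for the next bus: {wait:.1f} minutes")
```

The average gap between buses should come out near 15 minutes, but the gap a randomly arriving rider lands in should be close to 30, and the wait close to 15. The rider is sampling gaps in proportion to their length.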

Statisticians have noted this issue for decades. When you are surveying people from an underlying population, someone’s individual characteristics may influence whether or not she will appear in your sample. So the characteristics of your study sample may systematically differ from those of the underlying population you really wish to understand. Suppose you survey lung cancer patients about their satisfaction with care. Your results may be misleading because you can only ask the question of patients well enough to take your survey. The really sick patients and those who have died aren’t available to you.

An interesting aspect of the situation is easily missed. This isn’t just an issue for statisticians. It is often important in thinking about the substance of public policy. The group of people we actually intervene with on any given day (welfare recipients, prisoners and arrestees, patients, and others) is often a length-biased sample of an underlying population we have opinions about or want to understand. And the two populations may look quite different over time. The population of active criminals is different from the population behind bars, for example. And these differences matter.

I learned this subject from a fantastic paper on labor force dynamics and unemployment by Larry Summers and Kim Clark, written thirty years ago. Summers and Clark noted that the chronically jobless are grossly over-represented in the population of currently-unemployed people—a pattern sadly resonant with our situation today. If we want to increase employment, we ultimately must address the problems facing the long-term jobless.

With my colleagues Peter Reuter and Eric Sevigny, I’ve applied similar methods to some questions of criminal justice policy, in particular the potential of drug courts and other diversion efforts to reduce the prison population.

To get a little math-scary, suppose that prisoners are incarcerated at some constant rate of λ per unit time. Moreover, suppose that the distribution of time actually served (T) by any newly-arriving cohort of prisoners is well-described by some probability density function f(T). Suppose prison terms have some mean μ and some variance σ². This oversimplifies things in various ways, but the essentials turn out to be clarifying.

The population characteristics of prisoners who remain incarcerated on a given day (say March 17, 2012) will not follow the same distribution f(T). Someone convicted of armed robbery in 2002 is much more likely than a shoplifter to remain incarcerated ten years later. In fact, if g(T) is the sentencing distribution among prisoners actually incarcerated today, it turns out that g(T) = T f(T)/μ. Not surprisingly, the average prison term of currently incarcerated prisoners (call it M) is larger, too. It turns out that this average is given by the relatively simple formula M = μ[1 + (σ²/μ²)].*
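For readers who want the missing step, the formula for M follows directly from the length-biased density. This is a sketch that uses only the definitions above plus the standard identity E[T²] = σ² + μ²:

```latex
\begin{aligned}
M &= \int_0^\infty T\, g(T)\, dT
   = \frac{1}{\mu}\int_0^\infty T^2 f(T)\, dT
   = \frac{E[T^2]}{\mu} \\
  &= \frac{\sigma^2 + \mu^2}{\mu}
   = \mu\left[1 + \frac{\sigma^2}{\mu^2}\right].
\end{aligned}
```

If every prisoner served exactly the same term (σ² = 0), then M would equal μ. The more varied the terms, the wider the gap. When terms are exponentially distributed, σ = μ, so M = 2μ.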

There’s some relatively simple intuition here. The probability of observing prisoners with any given sentence length is proportional to the original probability that prisoners will be sentenced to that term, multiplied by the length of the sentence. If one wants to know how a given crime (say, armed robbery) directly influences the current prison population, what really matters is the number of person-sentencing-years associated with that crime.
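One way to see the person-years logic at work is to simulate a prison and take a snapshot. The Python sketch below assumes a made-up mix of sentences (half of admissions serve 1 year, 40 percent serve 5, and 10 percent serve 25) and a constant admission rate. None of these numbers come from real data; the names `SENTENCE_MIX` and `simulate_snapshot` are hypothetical.

```python
import random
from collections import Counter

# Hypothetical sentence mix: years served -> share of admissions. Illustrative only.
SENTENCE_MIX = {1: 0.5, 5: 0.4, 25: 0.1}

def simulate_snapshot(n_admissions=100_000, window_years=50.0, seed=1):
    """Admit prisoners at a constant rate, then record who is inside on the last day.

    `window_years` must exceed the longest sentence so the prison reaches steady state.
    """
    rng = random.Random(seed)
    terms = rng.choices(list(SENTENCE_MIX), weights=list(SENTENCE_MIX.values()),
                        k=n_admissions)
    inside = []
    for term in terms:
        admitted = rng.uniform(0.0, window_years)  # constant admission rate
        if admitted + term > window_years:          # still serving at the snapshot
            inside.append(term)
    return terms, inside

def mean(values):
    return sum(values) / len(values)

terms, inside = simulate_snapshot()
mu = mean(terms)
variance = mean([(t - mu) ** 2 for t in terms])
predicted_M = mu * (1 + variance / mu ** 2)  # the formula from the text
observed_M = mean(inside)
share_25_inside = Counter(inside)[25] / len(inside)

print(f"mean term among admissions (mu): {mu:.1f} years")
print(f"predicted mean term inside (M):  {predicted_M:.1f} years")
print(f"observed mean term inside:       {observed_M:.1f} years")
print(f"25-year terms: 10% of admissions, {share_25_inside:.0%} of the snapshot")
```

With this assumed mix, the mean term among admissions is about 5 years, while the mean term among people inside on the snapshot day should be close to the formula's value of about 14.6 years. The 25-year group, a tenth of admissions, should make up roughly half of the snapshot.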

Many people, on both sides of the political aisle, hope that we can reduce the prison population by finding better alternatives for low-level nonviolent drug offenders. This is a good idea, and we could notably reduce the inflow of people into prison if we followed these policies.

More sensible policies might indeed reduce mean sentences among all entering prisoners. Many practical obstacles impede these policies, including the limited capacity of drug courts and related interventions to reach all of the nonviolent offenders who don’t need to be in prison. It turns out that the great majority of drug-involved offenders aren’t eligible for the most touted current programs.

Unfortunately, the problems go deeper, too, and arise from the above formula for M. When we simulate the potential impact of such policies on samples of real prisoners, we find that these same policies applied to every eligible offender would still have surprisingly limited impact on the prison population. Prisoners helped by this policy would experience pretty short sentences anyway. So keeping them outside prison accomplishes less than you might think.
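Some stylized arithmetic shows why. The Python sketch below is not the simulation we ran on real prisoner samples. It uses an invented three-group admissions mix and assumes a constant inflow, so that each group's share of the population is proportional to its share of admissions times years served. Every number and name in it is hypothetical.

```python
# Hypothetical admissions mix: offense group -> (share of admissions, years served).
# All numbers are illustrative assumptions, not estimates from real prison data.
ADMISSIONS = {
    "low-level drug": (0.30, 1.0),
    "other nonviolent": (0.40, 3.0),
    "violent": (0.30, 12.0),
}

def population_shares(admissions):
    """Steady-state share of the prison population in each group.

    With a constant inflow, each group's population is proportional to
    (share of admissions) x (years served).
    """
    person_years = {g: share * years for g, (share, years) in admissions.items()}
    total = sum(person_years.values())
    return {g: py / total for g, py in person_years.items()}

def population_drop(admissions, group, fraction_diverted):
    """Fractional fall in the steady-state population if `fraction_diverted`
    of `group` never enters prison and nothing else changes."""
    return fraction_diverted * population_shares(admissions)[group]

inflow_cut = ADMISSIONS["low-level drug"][0]
drop = population_drop(ADMISSIONS, "low-level drug", 1.0)
print(f"divert every low-level drug offender: inflow falls {inflow_cut:.0%}")
print(f"steady-state prison population falls: {drop:.1%}")
```

In this made-up example, diverting every low-level drug offender cuts admissions by 30 percent but shrinks the prison population by only about 6 percent, because that group accounts for so few person-years.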

The current population is a length-biased sample of a highly varied population of newly-sentenced prisoners. So the parameters μ and M (the mean prison terms among all entering prisoners and among the current population) look quite different. Reducing μ just doesn’t reduce M by very much in a varied population. We found that it is surprisingly hard to reduce prison populations by more than five or ten percent if one focuses on the most obvious populations of nonviolent drug-using offenders. One really needs to do other things, including addressing excessively long sentences imposed on older prisoners, and doing a better job with individuals who are now supervised on probation or parole.

For more on these policy issues, see here.

(My next post will apply the same principles to challenges President Clinton faced in welfare reform. It includes some cool graphs produced by my master’s degree students. If you happen to hold stereotypes about quantitatively-challenged social worker-types, you’re welcome to perform these Monte Carlo simulations yourself….)

*If you are comfortable with calculus and probability, see Karlin and Taylor’s beautifully executed textbook: A First Course in Stochastic Processes, p. 195.

The Reality-Based Community

Everyone is entitled to his own opinion, but not his own facts.
