There’s an interesting post over at Statistical Modeling, Causal Inference, and Social Science on calculating probabilities. Traditionally, if you observe a certain number of events (*y*) in some number of trials (*n*), you would estimate the probability (*p*) of the event as *y*/*n*. To calculate the variance around this estimate, you would use this equation: *p*(1-*p*)/*n*.

This leads to two problems. First, if you never observe the event, your estimate of the probability of the event is zero; if you observe the event in every trial, your estimate is one. Either estimate implies a deterministic model, even when the outcome you never observed is still possible. Second, if *p* is estimated as zero or one, then the estimated variance *p*(1-*p*)/*n* is also zero (once again, suggesting a deterministic model).
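
To make the problem concrete, here is a minimal sketch of the traditional estimate and its variance. The function name `naive_estimate` and the trial counts are hypothetical, chosen only for illustration.

```python
def naive_estimate(y, n):
    """Return (p, variance) using p = y/n and variance = p(1-p)/n."""
    if n <= 0:
        raise ValueError("n must be a positive number of trials")
    if not 0 <= y <= n:
        raise ValueError("y must be between 0 and n")
    p = y / n
    variance = p * (1 - p) / n
    return p, variance


# Event never observed in 10 trials: both p and the variance are zero.
print(naive_estimate(0, 10))

# Event observed in all 10 trials: p is one and the variance is zero.
print(naive_estimate(10, 10))

# An intermediate case, where the variance is positive.
print(naive_estimate(3, 10))
```

In both extreme cases the estimated variance vanishes, which is the sense in which the model looks deterministic.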

To get around these problems, the post proposes calculating *p* as (*y*+1)/(*n*+2). With this formula, the estimate can never be exactly zero or one, so the variance will always be greater than zero. There is further discussion of the implications of this calculation at SMCISS.
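
A minimal sketch of the adjusted estimate follows. The function name `adjusted_estimate` is hypothetical, and the variance line is an assumption: it plugs the adjusted *p* into the same *p*(1-*p*)/*n* formula given earlier, which may not be the exact form used in the original post.

```python
def adjusted_estimate(y, n):
    """Return (p, variance) using the adjusted estimate p = (y+1)/(n+2)."""
    if n <= 0:
        raise ValueError("n must be a positive number of trials")
    if not 0 <= y <= n:
        raise ValueError("y must be between 0 and n")
    p = (y + 1) / (n + 2)
    # Assumption: reuse p(1-p)/n with the adjusted p plugged in.
    variance = p * (1 - p) / n
    return p, variance


# Event never observed in 10 trials: p is small but not zero.
print(adjusted_estimate(0, 10))

# Event observed in all 10 trials: p is close to, but below, one.
print(adjusted_estimate(10, 10))
```

Because *y*+1 is always at least 1 and at most *n*+1, the adjusted *p* stays strictly between zero and one, so *p*(1-*p*) cannot be zero.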