This section includes an explanation of advanced statistical concepts. We provide them for informational purposes, but you do not need to understand these concepts to use Experimentation.
## Overview
This guide explains the statistical methodology LaunchDarkly uses to calculate Bayesian experiment variation means, and how these analytics formulas are useful for validating your results.
For a high-level overview of Bayesian and frequentist statistics, read Bayesian versus frequentist statistics.
The core formulas include the posterior mean, the data mean, and the data weight. We describe these in detail below.
## Posterior mean
In the Bayesian approach, the main quantity we report is the mean of the posterior distribution calculated by updating the prior distribution with data observed in your experiment.
At a high level, the posterior means for all experiment variations and for any metric type, including conversion metrics and numeric metrics, can be represented by a convenient formula:

$$
\begin{aligned}
PosteriorMean = Weight \cdot DataMean + \left(1 - Weight \right) \cdot PriorMean
\end{aligned}
$$
- **Data mean**: The mean estimated from the data
- **Prior mean**: The mean of the Bayesian prior distribution assumed for the experiment variation mean
- **Weight**: A number between 0 and 1 that broadly reflects the amount of precision in our data mean

In other words, the posterior mean is a weighted sum of the mean of the prior distribution and the mean calculated from the data. As more data arrives in the experiment, the weight increases, so the posterior mean is influenced relatively more by the observed data and relatively less by the prior distribution. The specific behavior differs slightly between the control variation and the treatment variations, but this general principle holds for both.

When you hover over the "Conversion rate" or "Posterior mean" heading in an experiment's results table, you can view the formula for the conversion rate or posterior mean. When you hover over an actual conversion rate or posterior mean value, you can view the actual numbers in the formulas instead of descriptions.
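As a minimal sketch in Python, with illustrative numbers (the function name is our own, not LaunchDarkly's), the weighted-sum formula reads:

```python
def posterior_mean(weight: float, data_mean: float, prior_mean: float) -> float:
    """Posterior mean as a weighted sum of the data mean and the prior mean."""
    assert 0.0 <= weight <= 1.0
    return weight * data_mean + (1.0 - weight) * prior_mean

# Early in an experiment the weight is small, so the prior dominates:
early = posterior_mean(weight=0.1, data_mean=0.30, prior_mean=0.20)    # ≈ 0.21
# With more data the weight grows and the observed data dominates:
late = posterior_mean(weight=0.95, data_mean=0.30, prior_mean=0.20)    # ≈ 0.295
```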
### Data mean
The formula for the data mean differs between conversion metrics and numeric metrics:
- **Conversion metrics**, including custom conversion binary, custom conversion count, page viewed, and clicked or tapped metrics, use the total number of conversions divided by the total number of exposures: $DataMean = SampleMean = Conversions / Exposures$
- **Numeric metrics** use the total value divided by the total number of exposures: $DataMean = SampleMean = TotalValue / Exposures$

CUPED may affect the exact computation of these results. To learn more, read [Covariate adjustment and CUPED methodology](/guides/statistical-methodology/cuped).
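The two data-mean formulas can be sketched directly (function names are ours, for illustration only):

```python
def conversion_data_mean(conversions: int, exposures: int) -> float:
    """Sample mean for conversion metrics: conversions / exposures."""
    return conversions / exposures

def numeric_data_mean(total_value: float, exposures: int) -> float:
    """Sample mean for numeric metrics: total value / exposures."""
    return total_value / exposures

rate = conversion_data_mean(conversions=120, exposures=1000)   # 0.12
avg = numeric_data_mean(total_value=5400.0, exposures=1000)    # 5.4
```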
### Data weight
The precision weight is given by:

$$
\begin{aligned}
Weight = \frac{DataMeanPrecision}{DataMeanPrecision + PriorPrecision}
\end{aligned}
$$
This represents the proportion of the total precision due to the data mean. However, the precision is defined differently depending on the statistical model used.
There are two statistical models for estimating the posterior mean of experiment metrics:
- **Normal-normal model**: This model has a normal prior and a normal likelihood, and is used for numeric metrics.
- **Beta-binomial model**: This model has a beta prior distribution and a binomial likelihood, and is used for binary metrics when [CUPED](/guides/statistical-methodology/cuped) is not applied.

For the normal-normal model, precision is defined as the inverse of the variance, so the precision weight is:

$$
\begin{aligned}
Weight = \frac{1 / DataMeanVariance}{1 / DataMeanVariance + 1 / PriorVariance}
\end{aligned}
$$

For the beta-binomial model, precision is defined as the number of units in the data sample and the number of pseudo-units in the beta prior distribution. You can think of the $\alpha_{prior}$ and $\beta_{prior}$ parameters of the beta prior distribution as, respectively, the number of converted pseudo-units and the number of non-converted pseudo-units, so that the number of pseudo-units in the prior distribution is $\alpha_{prior} + \beta_{prior}$. If we denote by $n$ the number of units in the data sample, then the precision weight is given by:

$$
\begin{aligned}
Weight = \frac{n}{n + \alpha_{prior} + \beta_{prior}}
\end{aligned}
$$
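Both weight definitions can be sketched in a few lines of Python (function names are ours, for illustration):

```python
def normal_normal_weight(data_mean_variance: float, prior_variance: float) -> float:
    """Precision weight for the normal-normal model, where precision = 1 / variance."""
    data_precision = 1.0 / data_mean_variance
    prior_precision = 1.0 / prior_variance
    return data_precision / (data_precision + prior_precision)

def beta_binomial_weight(n: int, alpha_prior: float, beta_prior: float) -> float:
    """Precision weight for the beta-binomial model: n data units vs. prior pseudo-units."""
    return n / (n + alpha_prior + beta_prior)

# The weight approaches 1 as data accumulates:
beta_binomial_weight(100, 1.0, 1.0)     # ≈ 0.980
beta_binomial_weight(10_000, 1.0, 1.0)  # ≈ 0.9998
```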
## Details of our Bayesian approach
The Bayesian approach to analysis involves two steps:
- Combining a subjective prior belief about the parameters of interest, usually means, with the objective data collected during the experiment to create a posterior distribution for each variation, representing our current knowledge about what values those parameters are likely to take.
- Using that posterior distribution to compute helpful statistical measures that aid in deciding what action to take: for example, ship the treatment, or don't ship the treatment.
The most complicated part of the setup is creating the posterior distribution, because it involves fine parameter tuning and different treatments for different types of metrics. After we compute these distributions, we summarize them for you on the results page using:
- Credible intervals that convey the spread of the posterior distribution, which represents the range of likely values for the true mean of the variation
- Posterior means that convey the center of the posterior distribution, which represents our current best estimate of the true mean

After the posterior distribution is created, it is a relatively simple procedure to compute the statistics we display on the results page to help you make a decision. To learn more about these results, read Results table data. Below, we describe how we accomplish these two steps in detail.
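For a normal posterior, both summaries can be computed directly from the posterior parameters. A sketch using Python's standard library; the 90% level and the example numbers are illustrative assumptions, not LaunchDarkly's settings:

```python
from statistics import NormalDist

def normal_credible_interval(post_mean: float, post_variance: float,
                             level: float = 0.90) -> tuple[float, float]:
    """Equal-tailed credible interval from a normal posterior distribution."""
    dist = NormalDist(mu=post_mean, sigma=post_variance ** 0.5)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)

# The posterior mean is the center of the interval, our best single estimate:
lo, hi = normal_credible_interval(post_mean=0.12, post_variance=0.0001)
```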
### Calculating posterior distributions
At LaunchDarkly, we use different statistical models for binary data and numeric data. In both cases, we use conjugate distributions, meaning that the family of the prior distribution is the same as the family of the posterior distribution:
- For binary metrics, we start with a Beta distribution for the prior and update it into another Beta distribution for the posterior
- For numeric metrics, we start with a Normal distribution for the prior and update it into another Normal distribution for the posterior

We give some technical details on the exact specification of the priors below, as well as some closed-form expressions for the posterior distributions once data is incorporated.
#### Binary data
Binary metrics are also called "occurrence" metrics in LaunchDarkly: they record either a 0 or a 1 for each context in the experiment. For more information, read Custom conversion binary metrics.
The natural approach for binary data is to use a Binomial likelihood function with a Beta prior, which results in another Beta distribution for the posterior.
Suppose that $\bar{y}_v$ is the proportion of the $N_v$ units in variation $v$ that converted. Then a total of $N_v \bar{y}_v$ units converted, and $N_v (1 - \bar{y}_v)$ units did not convert.
#### Numeric data
Although numeric data can take a variety of forms and be modeled by many different kinds of probability distributions, we can use a simplified approach that leverages the central limit theorem. Because the quantity of interest is usually some unknown population mean which is estimated by the sample mean, we can have reasonably high confidence that the normal distribution will be a good fit for the likelihood of the sample mean as we collect more and more data:
$$
f_{\mathrm{like}}(\bar{y}_v \mid \mu_v) = \mathsf{Normal}(\mu_v, \sigma_v^2 / N_v)
$$
To further simplify the model, we treat the variance parameter as known and simply use the natural plug-in estimate, the sample variance computed from the data. As sample sizes increase, this plug-in estimate is guaranteed to converge to the true variance. To complete the model, we need to specify a prior distribution for μv.
For the control variation, we use an improper non-informative prior $f_{\mathrm{prior}}(\mu_0) \propto 1$. For the other variations, we use priors that shrink the results towards the control variation's mean. We generate this prior from the empirical distribution of relative differences between variations in all experiments on our platform that use metrics of the same type (numeric or conversion) and aggregation function (average or sum). The equation for this prior is:

$$
\begin{aligned}
f_{\mathrm{prior}}(\mu_v) &= \mathsf{Normal}(a_v, w_v^2), \\
a_v &= \bar{y}_0, \\
w_v^2 &= \bar{y}_0^2 \hat{\gamma}^2 + \hat{\sigma}_0^2 / N_0
\end{aligned}
$$

where $\hat{\gamma}^2$ is the variance of the distribution of observed relative differences ($(\bar{y}_v - \bar{y}_0) / \bar{y}_0$) across all experiments with numeric metrics on the platform. The first term, $\bar{y}_0^2 \hat{\gamma}^2$, scales the expected relative difference by the observed control mean. The second term, $\hat{\sigma}_0^2 / N_0$, accounts for the uncertainty in our estimate of the control mean. The value of $\hat{\gamma}^2$ is between 0.13 and 0.19, conditional on the type of the metric.
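A sketch of the treatment-variation prior in Python. The function name is ours, and the default `gamma_sq` is a stand-in value inside the 0.13-0.19 range stated above, not an exact platform constant:

```python
def treatment_prior(control_mean: float, control_variance: float, n_control: int,
                    gamma_sq: float = 0.16) -> tuple[float, float]:
    """Normal prior (a_v, w_v^2) for a treatment variation's mean.

    gamma_sq approximates the platform-wide variance of relative differences.
    """
    a_v = control_mean                                                 # prior mean = control mean
    w_sq = control_mean ** 2 * gamma_sq + control_variance / n_control  # prior variance
    return a_v, w_sq

a_v, w_sq = treatment_prior(control_mean=5.4, control_variance=4.0, n_control=1000)
```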
Combining the likelihood and prior provides the posterior distribution of $\mu_v$, which represents our beliefs about $\mu_v$ *after* observing the data from the experiment.
Given the normal likelihood and prior, the posterior distribution is also a normal distribution with the following parameters:

$$
\begin{aligned}
f_{\mathrm{post}}(\mu_v) &= \mathsf{Normal}(\alpha_v, \omega_v^2) , \\
\alpha_v &= \omega_v^2 \left(\frac{N_v}{\hat{\sigma}_v^2} \bar{y}_v + \frac{1}{w_v^2} a_v \right) , \\
\omega_v^2 &= \left(\frac{1}{w_v^2} + \frac{N_v}{\hat{\sigma}^2_v} \right)^{-1}
\end{aligned}
$$

The experiment results page displays the posterior distributions of each variation's mean ($f_{\mathrm{post}}(\mu_v)$) in the [probability charts](/home/experimentation/analyze).
We use the expected value of the posterior distribution as a point estimate for $\mu_v$:

$$
\hat{\mu}_v = \mathbb{E}[f_{\mathrm{post}}(\mu_v)] = \alpha_v
$$
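The conjugate normal update above translates directly into code. A sketch with a hypothetical function name and illustrative inputs:

```python
def normal_posterior(y_bar: float, sigma_sq_hat: float, n: int,
                     prior_mean: float, prior_var: float) -> tuple[float, float]:
    """Conjugate normal update: returns (alpha_v, omega_v^2).

    Combines data precision n / sigma_sq_hat with prior precision 1 / prior_var.
    """
    omega_sq = 1.0 / (1.0 / prior_var + n / sigma_sq_hat)
    alpha = omega_sq * ((n / sigma_sq_hat) * y_bar + prior_mean / prior_var)
    return alpha, omega_sq

# With an extremely diffuse prior, the posterior mean is close to the sample mean:
alpha, omega_sq = normal_posterior(y_bar=5.6, sigma_sq_hat=4.0, n=1000,
                                   prior_mean=5.4, prior_var=1e6)
```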
#### Conversion metrics
Conversion metrics use binary data. We use a Binomial likelihood function with a Beta prior, which results in another Beta distribution for the posterior. Suppose that $\bar{y}_v$ is the proportion of the $N_v$ units in variation $v$ that converted. Then a total of $N_v \bar{y}_v$ units converted, and $N_v (1 - \bar{y}_v)$ units did not convert. To model the total number of conversions ($N_v \bar{y}_v$), we use a binomial distribution with proportion parameter $\mu_v$ and size $N_v$ as the likelihood function:

$$
f_{\mathrm{like}}(N_v \bar{y}_v) = \mathsf{Binomial}(N_v, \mu_v)
$$

We use a Beta distribution as the prior for $\mu_v$:

$$
f_{\mathrm{prior}}(\mu_v) = \mathsf{Beta}(a_v, b_v)
$$
The values of the prior hyperparameters $a_v$ and $b_v$ differ between the control variation ($v = 0$) and the treatment variations ($v \neq 0$). For the control variation, we use a uniform distribution with $a_0 = 1$ and $b_0 = 1$. For the treatment variations, we use a prior similar to the one used for numeric metrics: a Beta distribution with hyperparameters $a_v$ and $b_v$ chosen so that its expected value and variance are:

$$
\begin{aligned}
\mathbb{E}[f_{\mathrm{prior}}(\mu_v)] &= \bar{y}_0 , \\
\mathrm{Var}(f_{\mathrm{prior}}(\mu_v)) &= \bar{y}_0^2 \hat{\gamma}^2 + \frac{\bar{y}_0 (1 - \bar{y}_0)}{N_0}
\end{aligned}
$$

The value of $\hat{\gamma}^2$ is the variance of the empirical distribution of relative differences across experiments using a binary metric, and is currently set to $\hat{\gamma}^2 \approx 0.04$. The posterior distribution of $\mu_v$ is also a Beta distribution:
$$
f_{\mathrm{post}}(\mu_v) = \mathsf{Beta}\left(a_v + N_v \bar{y}_v,\; b_v + N_v (1 - \bar{y}_v)\right)
$$
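As a sketch, the treatment prior's hyperparameters can be recovered from a target mean and variance by the method of moments, and the conjugate Beta update then just adds counts; the function names and numbers below are illustrative, not LaunchDarkly's implementation:

```python
def beta_from_moments(mean: float, variance: float) -> tuple[float, float]:
    """Solve Beta(a, b) hyperparameters matching a target mean and variance."""
    pseudo_n = mean * (1.0 - mean) / variance - 1.0   # a + b, the prior pseudo-units
    return mean * pseudo_n, (1.0 - mean) * pseudo_n

def beta_posterior(a_prior: float, b_prior: float,
                   conversions: int, exposures: int) -> tuple[float, float]:
    """Conjugate update: add conversions to a, non-conversions to b."""
    return a_prior + conversions, b_prior + (exposures - conversions)

a0, b0 = 1.0, 1.0                     # uniform Beta(1, 1) prior for the control
a_post, b_post = beta_posterior(a0, b0, conversions=120, exposures=1000)
control_estimate = a_post / (a_post + b_post)   # posterior mean, ≈ 0.1208
```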