To calculate a Bayes factor you need to specify three things:
A likelihood that provides a description of the data
A prior that specifies the predictions of the first model to be compared
And a prior that specifies the predictions of the second model to be compared.
By convention, the two models to be compared are usually called the null and the alternative models.
The Bayes factor is defined as the ratio of two (weighted) average likelihoods where the prior provides the weights. Mathematically, the weighted average likelihood is given by the following integral
\[\mathcal{M}_H = \int_{\theta\in\Theta_H}\mathcal{l}_H(\theta|\mathbf{y})p(\theta)d\theta\]
Where \(\mathcal{l}_H(\theta|\mathbf{y})\) represents the likelihood function, \(p(\theta)\) represents the prior on the parameter, with the integral defined over the parameter space of the hypothesis (\(\Theta_H\)).
To demonstrate how to calculate Bayes factors using bayesplay, we can reproduce examples from Dienes & Mclatchie (2018), Dienes (2014), and from Rouder, Speckman, Sun, & Morey (2009).
library(bayesplay)The first example from Dienes & Mclatchie (2018) that we’ll reproduce describes a study from Brandt, IJzerman, & Blaken (2014). In the study by Brandt et al. (2014), they obtained a mean difference for 5.5 Watts (t statistic = 0.17, SE = 32.35).
We can describe this observation using a normal likelihood using the likelihood() function. We first specify the family, and then the mean and se parameters.
data_mod <- likelihood(family = "normal", mean = 5.5, sd = 32.35)Following this, Dienes & Mclatchie (2018) describe the two models they intend to compare. First, the null model is described as a point prior centred at 0. We can specify this with the prior() function, setting family to point and setting point as 0 (the default value).
h0_mod <- prior(family = "point", point = 0)Next, Dienes & Mclatchie (2018) describe the alternative model. For this they use a half-normal distribution with a mean of 0 and a standard deviation of 13.3. This can again be specified using the prior() function setting family to normal and setting the mean and sd parameters as required. Additionally, because they specify a half-normal distribution, an additional range value is needed to restrict the parameter range to 0 (the mean) to positive infinity.
h1_mod <- prior(family = "normal", mean = 0, sd = 13.3, range = c(0, Inf))With the three parts specified we can compute the Bayes factor. Following the equation above, the first step is to calculate \(\mathcal{M}_H\) for each model. To do this, we multiply the likelihood by the prior and integrate.
m1 <- integral(data_mod * h1_mod)
m0 <- integral(data_mod * h0_mod)With \(\mathcal{M}_{H_1}\) and \(\mathcal{M}_{H_0}\) calculated, we now simply divide the two values to obtain the Bayes factor.
bf <- m1 / m0
bf
#> 0.9745934This gives a Bayes factor of ~0.97, the same value reported by Dienes & Mclatchie (2018).
The second example, from Dienes (2014), we’ll reproduce relates to an experiment where a mean difference of 5% was observed with a standard error of 10%. We can describe this observation using a normal likelihood.
data_mod <- likelihood(family = "normal", mean = 5, sd = 10)Next, we specify a prior which described the alternative hypothesis. In this case, Dienes (2014) uses a uniform prior that ranges from 0 to 20.
h1_mod <- prior(family = "uniform", min = 0, max = 20)Following this, we specify a prior that describes the null hypothesis. Here, Dienes (2014) again uses a point null centred at 0.
h0_mod <- prior(family = "point", point = 0)This only thing left is to calculate the Bayes factor.
bf <- integral(data_mod * h1_mod) / integral(data_mod * h0_mod)
bf
#> 0.8871298This gives a Bayes factor of ~0.89, the same value reported by Dienes (2014).
In Example three we’ll reproduce an example from Rouder et al. (2009). Rouder et al. (2009) specify their models in terms of effect size units (d) rather than raw units as in the example above. In this example by Rouder et al. (2009), they report a finding of a t value of 2.24, with n of 80. To compute the Bayes factor, we first convert this t value to a standardized effect size d. This t value equates to a d of 0.25044. To model the data we use the noncentral_d likelihood function, which is a rescaled noncentral t distribution, with is parametrised in terms of d and n. We specify a null model using a point prior at 0, and we specify the alternative model using a Cauchy distribution centred at 0 (location parameter) with a scale parameter of 1.
d <- 2.03 / sqrt(80) # convert t to d
data_model <- likelihood(family = "noncentral_d", d, 80)
h0_mod <- prior(family = "point", point = 0)
h1_mod <- prior(family = "cauchy",  scale = 1)
bf <- integral(data_model * h0_mod) / integral(data_model * h1_mod)
bf
#> 1.557447Performing the calculation as a above yields Bayes factor of ~1.56, the same value reported by Rouder et al. (2009).
To demonstrate the sensitivity of Bayes factor to prior specification, Rouder et al. (2009) recompute the Bayes factor for this example using a unit-information (a standard normal) prior for the alternative model.
d <- 2.03 / sqrt(80) # convert t to d
data_model <- likelihood(family = "noncentral_d", d, 80)
h0_mod <- prior(family = "point", point = 0)
h1_mod <- prior(family = "normal", mean = 0, sd = 1)
bf <- integral(data_model * h0_mod) / integral(data_model * h1_mod)
bf
#> 1.208093Similarly recomputing our Bayes factor yields a value of ~1.21, the same value reported Rouder et al. (2009).
Although the Bayes factor outlined above is parametrised in terms of the effect size d, it’s also possible to performed the calculation directly on the t statistic. To do this, however, we can’t use the same Cauchy prior as above. Instead, the Cauchy prior needs to be rescaled according to the same size. This is because t values scale with sample size in a way that d values do not. That is, for a given d the corresponding t value will be different depending on the sample size. We can employ this alternative parametrisation in the Bayesplay package by using the noncentral_t likelihood distribution. The scale value for the Cauchy prior is now just multiplied by \(\sqrt{n}\)
data_model <- likelihood(family = "noncentral_t", 2.03, 79)
h0_mod <- prior(family = "point", point = 0)
h1_mod <- prior(family = "cauchy", location = 0, scale = 1 * sqrt(80))
bf <- integral(data_model * h0_mod) / integral(data_model * h1_mod)
bf
#> 1.557447Recomputing our Bayes factor now yields a value of ~1.56, the same value reported Rouder et al. (2009), and the same value reported above.
Brandt, M. J., IJzerman, H., & Blaken, I. (2014). Does recalling moral behavior change the perception of brightness? Social Psychology, 45, 246–252. https://doi.org/10.1027/1864-9335/a000191
Dienes, Z. (2014). Using Bayes to get the most out of non-significant results. Frontiers in Psychology, 5. https://doi.org/10.3389/fpsyg.2014.00781
Dienes, Z., & Mclatchie, N. (2018). Four reasons to prefer Bayesian analyses over significance testing. Psychonomic Bulletin & Review, 25, 207–218. https://doi.org/10.3758/s13423-017-1266-z
Rouder, J. N., Speckman, P. L., Sun, D., & Morey, R. D. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 20, 225–237. https://doi.org/10.3758/PBR.16.2.225