\[\newcommand{\E}{\mathrm{E}} \newcommand{\Var}{\mathrm{Var}} \newcommand{\Cov}{\mathrm{Cov}} \newcommand{\se}{\text{se}} \newcommand{\Lagr}{\mathcal{L}} \newcommand{\lagr}{\mathcal{l}}\]
For two events \(A\) and \(B\), Bayes’ theorem states that:
\[\Pr(B|A) = \frac{\Pr(A|B) \times \Pr(B)}{\Pr(A)}\]
Find \(\Pr(B|A)\) from \(\Pr(A|B)\)
Example: find \(\Pr(H_1 | H_A)\)
\[\Pr(H_1 | H_A) = \frac{\Pr(H_A | H_1) \times \Pr(H_1)}{\Pr(H_A)} = \frac{\frac{1}{16} \times \frac{1}{2}}{\frac{1}{32}} = 1\]
Example: given \(\Pr(A) = 0.001\), \(\Pr(B | A) = 0.95\), and \(\Pr(B | A = 0) = 0.05\), find \(\Pr(A | B)\)
\[ \begin{align} \Pr(A|B) & = \frac{\Pr(A) \times \Pr(B|A)}{\Pr(B)} \\ & = \frac{\Pr(A) \times \Pr(B|A)}{\Pr(A) \times \Pr(B|A) + \Pr(A = 0) \times \Pr(B | A = 0)} \\ & = \frac{0.001 \times 0.95}{0.001 \times 0.95 + 0.999 \times 0.05} \\ & = 0.0187 \end{align} \]
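A quick numeric check of this calculation (a sketch in Python; numbers as given above):

```python
# Diagnostic-test example: Pr(A) = 0.001, Pr(B | A) = 0.95, Pr(B | A = 0) = 0.05.
prior = 0.001                                  # Pr(A)
sensitivity = 0.95                             # Pr(B | A)
false_positive = 0.05                          # Pr(B | A = 0)

evidence = prior * sensitivity + (1 - prior) * false_positive  # Pr(B), total probability
posterior = prior * sensitivity / evidence                     # Pr(A | B), Bayes' theorem
print(round(posterior, 4))                                     # 0.0187
```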
Discrete random variable
\[ \begin{align} \Pr(\Theta = \theta | X = x) &= \frac{\Pr(X = x, \Theta = \theta)}{\Pr(X = x)} \\ &= \frac{\Pr(X = x | \Theta = \theta) \Pr(\Theta = \theta)}{\sum_\theta \Pr (X = x| \Theta = \theta) \Pr (\Theta = \theta)} \end{align} \]
Continuous random variable
\[f(\theta | x) = \frac{f(x | \theta) f(\theta)}{\int f(x | \theta) f(\theta) d\theta}\]
For \(n\) iid observations, replace \(f(x | \theta)\) with the likelihood
\[f(x_1, \ldots, x_n | \theta) = \prod_{i = 1}^n f(x_i | \theta) = \Lagr_n(\theta)\]
Write \(x^n\) to mean \((x_1, \ldots, x_n)\)
\[ \begin{align} f(\theta | x^n) &= \frac{f(x^n | \theta) f(\theta)}{\int f(x^n | \theta) f(\theta) d\theta} \\ &= \frac{\Lagr_n(\theta) f(\theta)}{c_n} \\ &\propto \Lagr_n(\theta) f(\theta) \end{align} \]
Posterior is proportional to Likelihood times Prior
\[f(\theta | x^n) \propto \Lagr_n(\theta) f(\theta)\]
Point estimate: the posterior mean
\[\bar{\theta}_n = \int \theta f(\theta | x^n) d\theta = \frac{\int \theta \Lagr_n(\theta) f(\theta) d\theta}{\int \Lagr_n(\theta) f(\theta) d\theta}\]
Find \(a\) and \(b\) such that
\[\int_{-\infty}^a f(\theta | x^n) d\theta = \int_b^\infty f(\theta | x^n) d\theta = \frac{\alpha}{2}\]
Let \(C = (a,b)\). Then
\[\Pr (\theta \in C | x^n) = \int_a^b f(\theta | x^n) d\theta = 1 - \alpha\]
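Both the point estimate and the interval can be approximated numerically once the posterior is known up to \(c_n\). A minimal numpy sketch, assuming Bernoulli data with illustrative counts \(s = 7\) successes in \(n = 10\) trials and a uniform prior (values assumed, not from the text):

```python
import numpy as np

# Grid approximation of the posterior for Bernoulli data with a uniform prior.
# s = 7 successes in n = 10 trials are illustrative values.
n, s, alpha = 10, 7, 0.05
grid = np.linspace(0.0005, 0.9995, 1000)       # grid over the parameter space
dx = grid[1] - grid[0]

post = grid**s * (1 - grid)**(n - s) * 1.0     # L_n(p) * f(p), with f(p) = 1
post /= post.sum() * dx                        # normalize: divide by c_n

post_mean = (grid * post).sum() * dx           # point estimate: posterior mean

cdf = np.cumsum(post) * dx                     # approximate posterior CDF
a = grid[np.searchsorted(cdf, alpha / 2)]      # F(a) = alpha / 2
b = grid[np.searchsorted(cdf, 1 - alpha / 2)]  # F(b) = 1 - alpha / 2
print(post_mean, (a, b))                       # ~0.667 and roughly (0.39, 0.89)
```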
Posterior/credible interval
Posterior distribution for Bernoulli data \(x_1, \ldots, x_n\) with a uniform prior \(f(p) = 1\), where \(s = \sum_{i=1}^n x_i\) is the number of successes:
\[ \begin{align} f(p | x^n) &\propto f(p) \Lagr_n(p) \\ &= p^s (1 - p)^{n - s} \\ &= p^{s + 1 - 1} (1 - p)^{n - s + 1 - 1} \end{align} \]
Beta distribution
\[f(p; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}p^{\alpha - 1} (1 - p)^{\beta - 1}\]
Posterior for \(p\) is a Beta distribution with parameters \(s + 1\) and \(n - s + 1\)
\[f(p | x^n) = \frac{\Gamma(n + 2)}{\Gamma(s + 1) \Gamma(n - s + 1)}p^{(s + 1) - 1} (1 - p)^{(n - s + 1) - 1}\]
\[p | x^n \sim \text{Beta} (s + 1, n - s + 1)\]
Mean of a \(\text{Beta}(\alpha, \beta)\) distribution is \(\frac{\alpha}{\alpha + \beta}\)
\[\bar{p} = \frac{s + 1}{n + 2}\]
\[\bar{p} = \lambda_n \hat{p} + (1 - \lambda_n) \tilde{p}\]
a weighted average of the MLE \(\hat{p} = s / n\) and the prior mean \(\tilde{p} = 1/2\), with weight \(\lambda_n = n / (n + 2)\)
Use the prior \(p \sim \text{Beta} (\alpha, \beta)\)
\[p | x^n \sim \text{Beta} (\alpha + s, \beta + n - s)\]
Posterior mean is
\[\bar{p} = \frac{\alpha + s}{\alpha + \beta + n} = \left( \frac{n}{\alpha + \beta + n} \right) \hat{p} + \left( \frac{\alpha + \beta}{\alpha + \beta + n} \right) p_0\]
where \(\hat{p} = s / n\) is the MLE and \(p_0 = \alpha / (\alpha + \beta)\) is the prior mean
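A sketch of this conjugate update with scipy, using an assumed Beta(2, 2) prior and \(s = 7\) successes in \(n = 10\) trials; it verifies that the posterior mean equals the shrinkage form above:

```python
from scipy import stats

# Conjugate Beta-Binomial update with an assumed Beta(2, 2) prior and
# s = 7 successes in n = 10 trials (illustrative numbers).
a0, b0 = 2.0, 2.0                    # prior: p ~ Beta(alpha, beta)
n, s = 10, 7

posterior = stats.beta(a0 + s, b0 + n - s)   # p | x^n ~ Beta(alpha + s, beta + n - s)

p_hat = s / n                        # MLE
p0 = a0 / (a0 + b0)                  # prior mean
w = n / (a0 + b0 + n)                # weight on the MLE
print(posterior.mean())              # (alpha + s) / (alpha + beta + n) = 0.6429
print(w * p_hat + (1 - w) * p0)      # same value via the shrinkage formula
```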
Need to determine
\[\Pr(A|D), \Pr(B|D), \Pr(C|D)\]
Prior probability
\[\Pr(A) = 0.4, \Pr(B) = 0.4, \Pr(C) = 0.2\]
Likelihood
\[\Pr(D|A) = 0.5, \Pr(D|B) = 0.6, \Pr(D|C) = 0.9\]
Posterior probability
\[\Pr(A|D), \Pr(B|D), \Pr(C|D)\]
\[ \begin{align} \Pr(A|D) &= \frac{\Pr(D|A) \times \Pr(A)}{\Pr(D)} \\ \Pr(B|D) &= \frac{\Pr(D|B) \times \Pr(B)}{\Pr(D)} \\ \Pr(C|D) &= \frac{\Pr(D|C) \times \Pr(C)}{\Pr(D)} \end{align} \]
\[ \begin{align} \Pr(D) & = \Pr(D|A) \times \Pr(A) + \Pr(D|B) \times \Pr(B) + \Pr(D|C) \times \Pr(C) \\ & = 0.5 \times 0.4 + 0.6 \times 0.4 + 0.9 \times 0.2 = 0.62 \end{align} \]
\[ \begin{align} \Pr(A|D) &= \frac{\Pr(D|A) \times \Pr(A)}{\Pr(D)} = \frac{0.5 \times 0.4}{0.62} = \frac{0.2}{0.62} \\ \Pr(B|D) &= \frac{\Pr(D|B) \times \Pr(B)}{\Pr(D)} = \frac{0.6 \times 0.4}{0.62} = \frac{0.24}{0.62} \\ \Pr(C|D) &= \frac{\Pr(D|C) \times \Pr(C)}{\Pr(D)} = \frac{0.9 \times 0.2}{0.62} = \frac{0.18}{0.62} \end{align} \]
hypothesis | prior | likelihood | Bayes numerator | posterior |
---|---|---|---|---|
\(H\) | \(\Pr(H)\) | \(\Pr(D\mid H)\) | \(\Pr(D \mid H) \times \Pr(H)\) | \(\Pr(H \mid D)\) |
A | 0.4 | 0.5 | 0.2 | 0.3226 |
B | 0.4 | 0.6 | 0.24 | 0.3871 |
C | 0.2 | 0.9 | 0.18 | 0.2903 |
total | 1 | | 0.62 | 1 |
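The table reduces to a few lines of Python (numbers from the example):

```python
# Discrete Bayes with three hypotheses; numbers from the table above.
priors = {"A": 0.4, "B": 0.4, "C": 0.2}        # Pr(H)
likelihoods = {"A": 0.5, "B": 0.6, "C": 0.9}   # Pr(D | H)

numerators = {h: likelihoods[h] * priors[h] for h in priors}  # Pr(D | H) Pr(H)
pr_d = sum(numerators.values())                               # Pr(D) = 0.62
posteriors = {h: num / pr_d for h, num in numerators.items()}
print(posteriors)   # {'A': 0.3226, 'B': 0.3871, 'C': 0.2903} (rounded)
```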
Credible \(1 - \alpha\) interval
\[\approx (\theta_{\alpha / 2}, \theta_{1 - \alpha /2})\]
where \(\theta_\gamma\) denotes the \(\gamma\) quantile of the posterior (e.g., a sample quantile of simulated draws).

With a flat, possibly improper, prior \(f(\theta) = 1\) we can still compute the posterior density by multiplying the prior and the likelihood:
\[f(\theta | x^n) \propto \Lagr_n(\theta) f(\theta) = \Lagr_n(\theta)\]
Improper priors are not a problem as long as the resulting posterior is a well-defined probability distribution
Let \(\psi = \log(p / (1 - p))\). Under a flat prior \(f(p) = 1\), the induced density of \(\psi\) is
\[f_\Psi (\psi) = \frac{e^\psi}{(1 + e^\psi)^2}\]
which is not flat: flat priors are not transformation invariant.
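A simulation sketch of this fact: draws of \(p\) from the flat prior, pushed through the logit, follow the logistic density \(f_\Psi\) above rather than a flat one.

```python
import numpy as np

# Draw p ~ Uniform(0, 1) (the flat prior) and transform to psi = logit(p).
rng = np.random.default_rng(0)
p = rng.uniform(0, 1, size=100_000)
psi = np.log(p / (1 - p))

# Compare the empirical density of psi with e^psi / (1 + e^psi)^2.
hist, edges = np.histogram(psi, bins=60, density=True)
mid = (edges[:-1] + edges[1:]) / 2
analytic = np.exp(mid) / (1 + np.exp(mid)) ** 2
print(np.max(np.abs(hist - analytic)))   # ~0.01: logistic, clearly not flat
```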
Posterior density
\[f(\theta | x^n) \propto \Lagr_n(\theta) f(\theta)\]
Marginal posterior density
\[f(\theta_1 | x^n) = \int \cdots \int f(\theta_1, \ldots, \theta_p | x^n) d\theta_2 \cdots d\theta_p\]
Via simulation: draw
\[\theta^1, \ldots, \theta^B \sim f(\theta | x^n)\]
and approximate the marginal posterior of \(\theta_1\) from the first coordinates \(\theta_1^1, \ldots, \theta_1^B\).
Estimate \(\tau = g(p_1, p_2) = p_2 - p_1\)
\[X_1 \sim \text{Binomial} (n_1, p_1) \, \text{and} \, X_2 \sim \text{Binomial} (n_2, p_2)\]
If \(f(p_1, p_2) = 1\), the posterior is
\[f(p_1, p_2 | x_1, x_2) \propto p_1^{x_1} (1 - p_1)^{n_1 - x_1} p_2^{x_2} (1 - p_2)^{n_2 - x_2}\]
\[f(p_1, p_2 | x_1, x_2) = f(p_1 | x_1) f(p_2 | x_2)\]
\[f(p_1 | x_1) \propto p_1^{x_1} (1 - p_1)^{n_1 - x_1} \, \text{and} \, f(p_2 | x_2) \propto p_2^{x_2} (1 - p_2)^{n_2 - x_2}\]
Independent posteriors
\[ \begin{align} p_1 | x_1 &\sim \text{Beta} (x_1 + 1, n_1 - x_1 + 1) \\ p_2 | x_2 &\sim \text{Beta} (x_2 + 1, n_2 - x_2 + 1) \end{align} \]
Simulate draws from the posteriors
\[ \begin{align} P_{1,1}, \ldots, P_{1,B} &\sim \text{Beta} (x_1 + 1, n_1 - x_1 + 1) \\ P_{2,1}, \ldots, P_{2,B} &\sim \text{Beta} (x_2 + 1, n_2 - x_2 + 1) \end{align} \]
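The differences \(\tau_b = P_{2,b} - P_{1,b}\) are then draws from the posterior of \(\tau\), so its posterior mean and credible interval come from sample averages and quantiles. A minimal scipy sketch with assumed counts (\(x_1 = 30\) of \(n_1 = 100\) and \(x_2 = 45\) of \(n_2 = 100\), not from the text):

```python
import numpy as np
from scipy import stats

# Simulate the posterior of tau = p2 - p1 under flat priors.
# x1 = 30 of n1 = 100 and x2 = 45 of n2 = 100 are illustrative counts.
rng = np.random.default_rng(0)
n1, x1, n2, x2, B = 100, 30, 100, 45, 10_000

p1 = stats.beta(x1 + 1, n1 - x1 + 1).rvs(B, random_state=rng)  # draws of p1 | x1
p2 = stats.beta(x2 + 1, n2 - x2 + 1).rvs(B, random_state=rng)  # draws of p2 | x2
tau = p2 - p1                                # draws from the posterior of tau

print(tau.mean())                            # posterior mean of tau (~0.15 here)
print(np.quantile(tau, [0.025, 0.975]))      # 95% equal-tail credible interval
```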