Discrete random variables in R

MACS 33001 University of Chicago

\[\newcommand{\E}{\mathrm{E}} \newcommand{\Var}{\mathrm{Var}} \newcommand{\Cov}{\mathrm{Cov}}\]

Functions for working with probabilities

  • Common core functions for probability distributions
  • d*() - density function
  • p*() - distribution function
  • r*() - random data generation

Binomial

To find this binomial probability Use this R command
\(\Pr(x = a)\) dbinom(x = a, size = n, prob = p)
\(\Pr(x \leq a)\) pbinom(q = a, size = n, prob = p)
\(\Pr(x < a)\) pbinom(q = a - 1, size = n, prob = p)
\(\Pr(x \geq a) = 1 - \Pr(x \leq a) = 1 - \Pr (x \leq a-1)\) 1 - pbinom(q = a - 1, size = n, prob = p)
\(\Pr(x > a) = 1 - \Pr(x \leq a)\) 1 - pbinom(q = a, size = n, prob = p)

Calculating probabilities

\[ \begin{eqnarray} p_X(k) & = & {{N}\choose{k}}\pi^{k} (1- \pi)^{n-k} \end{eqnarray} \]

Calculating probabilities

\[N = 10, k = 5, \pi = .5\]

N <- 10
k <- 5
prob <- .5
To find this binomial probability Use this R command Result
\(\Pr(x = 5)\) dbinom(x = k, size = N, prob = prob) 0.246
\(\Pr(x \leq 5)\) pbinom(q = k, size = N, prob = prob) 0.623
\(\Pr(x < 5)\) pbinom(q = k - 1, size = N, prob = prob) 0.377
\(\Pr(x \geq 5) = 1 - \Pr(x \leq 5) = 1 - \Pr (x \leq 5-1)\) 1 - pbinom(q = k - 1, size = N, prob = prob) 0.623
\(\Pr(x > 5) = 1 - \Pr(x \leq 5)\) 1 - pbinom(q = k - 1, size = N, prob = prob) 0.623

Calculating probabilities

\[N = 20, k = 5, \pi = .4\]

N <- 20
k <- 5
prob <- .4
To find this binomial probability Use this R command Result
\(\Pr(x = 5)\) dbinom(x = k, size = N, prob = prob) 0.075
\(\Pr(x \leq 5)\) pbinom(q = k, size = N, prob = prob) 0.126
\(\Pr(x < 5)\) pbinom(q = k - 1, size = N, prob = prob) 0.051
\(\Pr(x \geq 5) = 1 - \Pr(x \leq 5) = 1 - \Pr (x \leq 5-1)\) 1 - pbinom(q = k - 1, size = N, prob = prob) 0.949
\(\Pr(x > 5) = 1 - \Pr(x \leq 5)\) 1 - pbinom(q = k - 1, size = N, prob = prob) 0.949

Calculating probabilities

\[N = 1, k = 1, \pi = .4\]

N <- 1
k <- 1
prob <- .4
To find this Bernoulli probability Use this R command Result
\(\Pr(x = 1)\) dbinom(x = k, size = N, prob = prob) 0.4
\(\Pr(x \leq 1)\) pbinom(q = k, size = N, prob = prob) 1
\(\Pr(x < 1)\) pbinom(q = k - 1, size = N, prob = prob) 0.6
\(\Pr(x \geq 1) = 1 - \Pr(x \leq 1) = 1 - \Pr (x \leq 1-1)\) 1 - pbinom(q = k - 1, size = N, prob = prob) 0.4
\(\Pr(x > 1) = 1 - \Pr(x \leq 1)\) 1 - pbinom(q = k - 1, size = N, prob = prob) 0.4

Simulating random variables

  • \(N = 10, \pi = .5\)
  • Simulate 1000 observations

    set.seed(1234)
    
    # store in a vector
    rbinom(1000, size = 10, prob = .5)
    ##    [1]  3  5  5  6  7  6  1  4  6  5  6  5  4  7  4  7  4  4  4  4  4  4  3
    ##   [24]  2  4  6  5  7  7  2  5  4  4  5  4  6  4  4  9  6  5  6  4  5  4  5
    ##   [47]  6  5  4  6  3  4  6  5  3  5  5  6  4  7  7  2  4  2  4  6  4  5  2
    ##   [70]  5  3  7  2  6  3  5  5  3  4  6  7  5  3  5  4  7  5  4  3  7  3  7
    ##   [93]  3  3  3  5  4  2  4  6  2  5  4  4  3  4  3  3  5  2  6  3  8  3  4
    ##  [116]  7  8  4  3  6  6  7  9  7  5  4  4  5  5  4  8  6  3  5  7  5  7  5
    ##  [139]  6  7  5  8  4  5  4  6  6  5  8  5  5  4  3  7  4  8  5  9  4  5  5
    ##  [162]  5  5  4  3  6  5  3  6  4  6  5  6  5  4  6  5  5  3  4  5  4  5  3
    ##  [185]  8  2  7  6  4  6  6  9  3  7  6  6  7  6  8  6  6  5  4  6  5  6  4
    ##  [208]  5  4  8  5  4  4  6  5  7  6  6  5  7  5  2  4  4  3  5  6  4  5  5
    ##  [231]  6  5  3  6  5  4  6  4  5  9  5  4  4  6  8  5  6  5  8  6  5  5  4
    ##  [254]  4  3  5  4  1  5  5  7  2  5  4  3  3  6  2  2  6  4  6  4  6  2  5
    ##  [277]  6  5  4  4  4  5  7  6  6  4  5  4  6  6  6  5  8  4  5  6  6  3  6
    ##  [300]  4  4  8  5  5  8  5  4  9  5  6  7  6  5  5  3  4  4  5  4  7  4  7
    ##  [323]  5  3  4  4  5  4  6  7  6  7  8  4  5  5  4  2  7  5  8  4  5  8  4
    ##  [346]  4  2  6  7  2  4  6  3  6  9  7  5  6  5  6  1  5  5  4  7  8  6  4
    ##  [369]  6  6  8  4  7  5  4  5  6  3  5  6  4  4  1  3  4  3  6  4  8  8  6
    ##  [392]  6  6  6  3  7  5  5  2  4  6  5  6  6  5  4  6  3  5  5  6  6  8  4
    ##  [415]  7  5  5  6  3  5  6  7  4  3  7  4  2  5  4  4  6  6  5  7  4  6  5
    ##  [438]  7  6  5  6  7  6  5  8  5  7  6  5  2  4  6  9  8  6  6  4  5  5  2
    ##  [461]  4  7  5  5  4  4  5  5  6  5  3  4  0  8  5  8  5  5  6  3  5  5  6
    ##  [484]  4  3  3  6  6  6  3  5  6  7  4  6  4  5  3  5  5  6  6  7  2  6  7
    ##  [507]  6  7  3  6  4  8  7  5  3  6  5  3  6  4  7  4  4  7  5  6  2  2  6
    ##  [530]  4  4  6  5  6  3  9  6  6  6  5  5  5  8  4  6  5  7  7  8  7  4  4
    ##  [553]  6  3  7  8  5  2  6  7  3  9  6  7  4  7  6  5  4  3  6  6  4  6  5
    ##  [576]  5  3  5  5  6  7  2  5  4  2  4  4  8  8  5  8  7  3  5  4  6  6  6
    ##  [599]  6  3  4  4  3  5  5  5  7  4  5  5  4  6  4  3  8  5  4  6  6  5  7
    ##  [622]  2  6  6  4  7  5  8  4  5  7  5  5  5  7  7  4  7  4  4  5  4  5  7
    ##  [645]  4  4  3  7  5  6  4  6  7  5  3  5  7  2  4  7  5  8  6  2  4  5  6
    ##  [668]  4  5  6  3  7  5  4  8  7  5  7  4  8  5  3  6  4  7  4  7  2  5  5
    ##  [691]  7  5  4  5  1  3  4  5  4  6  1  5  6  5  4  3  5  6  3  5  6  8  8
    ##  [714]  3  4  4  5  4  8  3  6  3  8  1  5  8  5  4  4  8  7  2  3  5  5  6
    ##  [737]  5  6  8  7  4  3  3  4  5  0  5  6  4  6  5  6  5  7  4  7  6  7  4
    ##  [760]  5  2  6  1  5  4  5  7  5  6  4  7  7  7  3  6  5  3  4  5  4  7  3
    ##  [783]  0  4  4  6  7  2  6  3  6  3  6  5  6  4  5  6  4  7  3  7  5  4  4
    ##  [806]  5  4  5  6  5  5  4  7  6  3  4  4  6  5  5  3  4  5  5  4  6  2  5
    ##  [829]  5  5  4  6  7  7  4  6  6  4  5  6  3  5  6  4  5  7  5  7  6  6  5
    ##  [852]  8  7  4  5  6  3  2  6  8  3  5  2  5  4  7  4  8  4  5  8  7  8  4
    ##  [875]  6  4  3  3  5  5  6  4  5  5  5  7  4  2  3  2  7  6  5  5  4  4  6
    ##  [898]  6  8  7  6  5  7  5  7  2  4  4  6  4  4  5  3  4  3  3  5  8  8  6
    ##  [921]  5  5  4  4  5  4  6  5  3  3  3  9  2  5  3  5  4  5  5  5  7  5  6
    ##  [944]  6  5  5  4  3  5  4  6  8  4  6  7  2  7  5  3  6  4  5  7  6  2  3
    ##  [967]  6  4  5  4 10  6  1  6  4  6  6  4  8  5  6  7  3  6  8  4  2  8  3
    ##  [990]  4  5  6  8  3  6  1  6  4  8  4
    # store in a data frame
    data_frame(x = rbinom(1000, size = 10, prob = .5))
    ## # A tibble: 1,000 x 1
    ##        x
    ##    <int>
    ##  1     7
    ##  2     5
    ##  3     3
    ##  4     4
    ##  5     6
    ##  6     5
    ##  7     5
    ##  8     3
    ##  9     8
    ## 10     8
    ## # ... with 990 more rows

Simulating multiple draws

rerun(.n = 10, rbinom(10, size = 10, prob = .5)) %>%
  bind_cols()
## # A tibble: 10 x 10
##       V1    V2    V3    V4    V5    V6    V7    V8    V9   V10
##    <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
##  1     3     2     7     5     5     4     7     4     5     7
##  2     4     7     7     5     5     5     5     3     2     6
##  3     5     6     7     4     4     7     3     7     3     6
##  4     5     4     2     5     8     5     4     4     5     3
##  5     3     5     3     1     5     6     3     5     4     5
##  6     6     4     7     5     8     7     8     6     5     4
##  7     6     4     4     5     4     5     4     4     6     2
##  8     2     6     4     4     4     5     4     6     1     7
##  9     6     6     5     3     4     3     3     6     8     4
## 10     4     4     3     5     5     5     2     4     4     5

Simulating multiple draws

rerun(.n = 9, rbinom(1000, size = 10, prob = .5)) %>%
  bind_cols() %>%
  gather(var, value) %>%
  ggplot(aes(value)) +
  geom_bar() +
  facet_wrap(~var) +
    labs(title = "Simulated draws from a Binomial PMF",
         subtitle = expression(n == 10 ~ pi == .5),
         x = expression(x),
         y = expression(p[X] (k)))

Setting the seed

# draw #1
set.seed(1234)
rbinom(10, size = 10, prob = .5)
##  [1] 3 5 5 6 7 6 1 4 6 5
# draw #2
set.seed(1234)
rbinom(10, size = 10, prob = .5)
##  [1] 3 5 5 6 7 6 1 4 6 5

Poisson

To find this Poisson probability Use this R command
\(\Pr(x = a)\) dpois(x = a, lambda = p)
\(\Pr(x \leq a)\) ppois(q = a, lambda = p)
\(\Pr(x < a)\) ppois(q = a - 1, lambda = p)
\(\Pr(x \geq a) = 1 - \Pr(x \leq a) = 1 - \Pr (x \leq a-1)\) 1 - ppois(q = a - 1, lambda = p)
\(\Pr(x > a) = 1 - \Pr(x \leq a)\) 1 - ppois(q = a, lambda = p)

Calculating probabilities

\[N = 3, \lambda = 5\]

N <- 3
lambda <- 5
To find this Poisson probability Use this R command Results
\(\Pr(x = 3)\) dpois(x = N, lambda = lambda) 0.14
\(\Pr(x \leq 3)\) ppois(q = N, lambda = lambda) 0.265
\(\Pr(x < 3)\) ppois(q = N - 1, lambda = lambda) 0.125
\(\Pr(x \geq 3) = 1 - \Pr(x \leq 3) = 1 - \Pr (x \leq 3-1)\) 1 - ppois(q = N - 1, lambda = lambda) 0.875
\(\Pr(x > 3) = 1 - \Pr(x \leq 3)\) 1 - ppois(q = N, lambda = lambda) 0.735

Calculating probabilities

\[N = 12, \lambda = 17.3\]

N <- 12
lambda <- 17.3
To find this Poisson probability Use this R command Results
\(\Pr(x = 3)\) dpois(x = N, lambda = lambda) 0.046
\(\Pr(x \leq 3)\) ppois(q = N, lambda = lambda) 0.121
\(\Pr(x < 3)\) ppois(q = N - 1, lambda = lambda) 0.075
\(\Pr(x \geq 3) = 1 - \Pr(x \leq 3) = 1 - \Pr (x \leq 3-1)\) 1 - ppois(q = N - 1, lambda = lambda) 0.925
\(\Pr(x > 3) = 1 - \Pr(x \leq 3)\) 1 - ppois(q = N, lambda = lambda) 0.879

Simulating random variables

  • \(N = 12, \lambda = 17.3\)
  • Simulate 1000 observations

    set.seed(1234)
    
    # store in a vector
    (X <- rpois(1000, lambda = 17.3))
    ##    [1] 12 18 18 14 19 14 21 14 15 10 17 21 16 15 20 14 17 15 17 17 20 15 13
    ##   [24] 20 21 31 24 11 23 12 22 15 13 12 15 20 17 13 13 25 20 27 17 14 15 12
    ##   [47] 16 18 21 26 17 18 17 18 14 14 29 17 18 14 16 20 20 19 15 17 15 18 24
    ##   [70] 21 15 27 22 21 19 18 17 20 19 16 26 14 17 18 17 16 14 17 15 20 12 13
    ##   [93] 17 16 14 17 18 20 18 14 14 17 29 15 17 17 15 23 19 16 19 18 24 18 19
    ##  [116] 18 15 16 16 27 20 19 16 13 15 21 13 16 18 19 24 17 15 16 14 15 19 10
    ##  [139] 12 23 19 20 17 15 19 21 25 22 15 11 15 12 19 26 19 19 13 17 15 20 18
    ##  [162] 20 18 20 24 21 17 12 22 12 10 14 16 14 23 18 22 17 17 18  9 29 20 14
    ##  [185] 10 16 13 16 18 14 16 17 19 17 19 12 19 13 22 18 18 18 19 21 19 20 12
    ##  [208] 24 18 20 11 21 15 20 10 14 20 27 18 17 18 14 21 25 14 13 17 18 13 21
    ##  [231] 21 18 10 13 16 16 21 27 24 24 12 19 19 12 13 16 13 14 11 15 17 10 15
    ##  [254] 23 18 18 17 21 21 15 17 15 22 20 19 16 17  9 18 21 15 14 12 14 16 13
    ##  [277] 12 21 22 17 22 14 11 15 18 16 12 10 25 12 16 25 21 25 17 16 15  7 18
    ##  [300] 17 26 14 28 20 21 22 23 18 18 18 18 18 15 21 18 11 15 26 22 19 19 17
    ##  [323] 13 22 14 13 16 13 12 20 16 15 14 16 16 20 23 19 13 12 13 17 20 17 23
    ##  [346] 17 12 24 16 16 23 26 17 22 15 12 18 14 18 13  8 18 15 18 21 16 16 22
    ##  [369] 14 14 16 24 17 13 14 11 27 18 17 17 18 18 20 17 12 19 15 10 12 16 20
    ##  [392] 11 18 30 26 14 20 11 14 12 18 22 24 21 12 16 12 16 10 15 17 12 25 15
    ##  [415] 15 20 23 14 22 23 16 22 17 14 22 14 17 18 18 20 16 24 19 12 14 15 16
    ##  [438] 28 15 17 14 12 18 19 19 11  9 14 16 15  5 15 14 21 16 11 23 13 13 13
    ##  [461] 18 21 16 14 14 11 15 12 19 18 16 19 14 14 16 15 15 19 21 17 15 18 15
    ##  [484] 13 15 10 21 20 13 23 12 13 17 19 17 11 13 18 13 25 13 15 24 13 16 17
    ##  [507] 22 16 22 17 18 18 18 18 19 19 21 21 17 12 22 23 19 18 21 12 19 16 18
    ##  [530] 21 16 16 19 12 19 16 26 13 15 18 15 19 18 16 19 12 14 19 24 22 27 18
    ##  [553] 18 21 23 24 14 13 20 21 19 23 16 12 25 13 26 15 16 17 20 13 17 24 25
    ##  [576] 15 14 19 16 12 22 21 22 15 14 23 16 17 18 20 16 27 13 20 12 17 17 20
    ##  [599] 16 17 22 17 14 19 19 14 12 14 14 16 20 18 21 28 12 14 18 16 11 18 28
    ##  [622] 19 23 21 17 15 22 15 19 24 16 18 20 12 13 15 19 22 16 12 16 17 17 14
    ##  [645] 21 12 13 23 13 16 15 24 23 14 14 13 21 26 18 14 12 17 15 14 11 19 19
    ##  [668] 14 16 19 10 18 13 17 15 14 15 17 13 16 20 18 17  9 20 15 16 18 10 18
    ##  [691] 20 19 14 15 15 21 24 21  9 17 17 21 20 16 16 11 20 17 17 23 20 31 11
    ##  [714] 21 19 17 16 15 18 14 15 14 25  9 15 15 14 18 14 17 12 11 26 23 23 20
    ##  [737] 15 19 12 19 17 18 20 13 12 13 18 23 21 17 20 16 21 22 12 16 18 16 24
    ##  [760]  9 27 16 23 20 24 27 18 13 16 21 20 18 17 19 21 22 14 19 18 17 21 16
    ##  [783] 18 13 17 18 26 19 20 15 13 10  9  9 13 14 15 23 12 15 17 15 16 15 17
    ##  [806] 13 25 15 15 16 21 18 18 17 14 14 25 14 14 12 16 22 17 10 16 23 21 18
    ##  [829] 19 17 17  9 17 13 16 20 17 19 15 15 19 18 18 19 16 13 22 18 17 13 16
    ##  [852] 14 20 18 14 17 19 33 18 23 17 19 16 26 22 16 16 17 17 14 20 11 20 20
    ##  [875] 14 14 16 13 16 23 21 18 19 25 17 17 23 19  7 13 17 16 20 13 18 19 23
    ##  [898] 18 20 22 15 20 18 24 17 29 18 16 17 29 13 13 17 13 18 17 20 24 21 14
    ##  [921] 18 19 17 14 15 17 26 24 12 17 16 16 21 12 17 13 19 12 19 25 12 22  7
    ##  [944] 14 14 14 17 13 18 13 16 14 22 18 11 27 20 10 23 20 14 17 19 13 17 22
    ##  [967] 17 23 13 23 19 22 21 14 13 17 13 21 17 14 21 20 25 22 19 26 19 16 15
    ##  [990] 16 18 11 10 19 15 17 28 14 21 23
  • What is the expected value of \(X\)? Its variance?

    mean(X)
    ## [1] 17.4
    var(X)
    ## [1] 17

Lincoln Tunnel

If 85% of vehicles arriving at the Lincoln Tunnel (connecting New Jersey and New York City) have either New York or New Jersey license plates, what is the probability that, of the next 20 vehicles, 2 or fewer (that is, 0, 1, or 2) will bear license plates from states other than New Jersey or New York?

Lincoln Tunnel

\[ \begin{eqnarray} p_X(k) & = & {{N}\choose{k}}\pi^{k} (1- \pi)^{n-k} \end{eqnarray} \]

  • \(N = 20\) and \(p=0.15\)
  • Calculate the probability that 0, 1, or 2 of the next cars will bear license plates from states other than New Jersey or New York

Lincoln Tunnel

One way to state the problem is:

\[ \sum_{k = 0}^2 {{N}\choose{k}}\pi^{k} (1- \pi)^{n-k} = \sum_{k = 0}^2 {{20}\choose{k}} 0.15^{k} (1- 0.15)^{20-k} \]

dbinom(0, 20, 0.15) + dbinom(1, 20, 0.15) + dbinom(2, 20, 0.15)
## [1] 0.405

Lincoln Tunnel

Alternatively, we can frame this in terms of the CDF:

\[F_X(x) = \Pr (X \leq 2) = \sum_{k \leq 2} p_X(k)\]

pbinom(2, 20, 0.15)
## [1] 0.405

Booking travel

Book4Less.com is an online travel website that offers competitive prices on airline and hotel bookings. During a typical weekday, the website averages 10 visits per minute.

  1. What is the probability that there will be at least 7 but no more than 12 visits in the next minute?
  2. What is the probability there will be more than 12 visits in the next minute?

Probability that there will be at least 7 but no more than 12 visits in the next minute

\[F_X(x) = \Pr (7 \leq x \leq 12) = \sum_{k = 7}^{12} e^{-10} \frac{10^{k}}{k!}\]

dpois(7:12, 10)
## [1] 0.0901 0.1126 0.1251 0.1251 0.1137 0.0948
sum(dpois(7:12, 10))
## [1] 0.661
ppois(12, 10) - ppois(6, 10)
## [1] 0.661

Probability there will be more than 12 visits in the next minute

\[F_X(x) = \Pr (x > 12) = \sum_{k = 13}^{\infty} e^{-10} \frac{10^{k}}{k!}\]

sum(dpois(13:50, 10))
## [1] 0.208
1 - ppois(12, 10)
## [1] 0.208