# swirl Lesson 13: Simulation

| Please choose a course, or type 0 to exit swirl.
1: R Programming
2: Take me to the swirl course repository!
Selection: 1
1: Basic Building Blocks      2: Workspace and Files
3: Sequences of Numbers       4: Vectors
5: Missing Values             6: Subsetting Vectors
7: Matrices and Data Frames   8: Logic
9: Functions                 10: lapply and sapply
11: vapply and tapply         12: Looking at Data
13: Simulation                14: Dates and Times
15: Base Graphics
Selection: 13
|                                                          |   0%
| One of the great advantages of using a statistical programming
| language like R is its vast collection of tools for simulating
| random numbers.
...
|==                                                        |   3%
| This lesson assumes familiarity with a few common probability
| distributions, but these topics will only be discussed with
| respect to random number generation. Even if you have no prior
| experience with these concepts, you should be able to complete
| the lesson and understand the main ideas.
...
|====                                                      |   6%
| The first function we'll use to generate random numbers is
| sample(). Use ?sample to pull up the documentation.
> ?sample
|=====                                                     |   9%
| Let's simulate rolling four six-sided dice: sample(1:6, 4,
| replace = TRUE).
> sample(1:6, 4, replace = TRUE)
[1] 5 5 2 6
|=======                                                   |  12%
| Now repeat the command to see how your result differs. (The
| probability of rolling the exact same result is (1/6)^4, or about
| 0.00077, which is pretty small!)
> sample(1:6, 4, replace = TRUE)
[1] 3 4 4 3
| Nice work!
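The arithmetic behind that probability can be checked directly in R. This is a small sketch; the variable name `p_same` is illustrative and not part of the lesson.

```r
# Each die matches its earlier value with probability 1/6, and the four
# dice are independent, so the probabilities multiply.
p_same <- (1 / 6)^4
p_same

# Equivalently, there are 6^4 equally likely ordered outcomes.
n_outcomes <- 6^4
n_outcomes
```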
|=========                                                 |  15%
| sample(1:6, 4, replace = TRUE) instructs R to randomly select
| four numbers between 1 and 6, WITH replacement. Sampling with
| replacement simply means that each number is "replaced" after it
| is selected, so that the same number can show up more than once.
| This is what we want here, since what you roll on one die
| shouldn't affect what you roll on any of the others.
...
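Because the draws are random, the output above changes on every call. A minimal sketch of making a dice roll reproducible, assuming an arbitrary seed value (42) and illustrative variable names; set.seed() is not used in the lesson itself.

```r
# Fix the random number generator state so the draws can be reproduced.
# The seed value 42 is arbitrary and is not part of the swirl lesson.
set.seed(42)

# Roll four six-sided dice: draw 4 values from 1:6 with replacement.
rolls <- sample(1:6, size = 4, replace = TRUE)
print(rolls)

# Resetting the seed and repeating the call yields the same four values.
set.seed(42)
rolls_again <- sample(1:6, size = 4, replace = TRUE)
identical(rolls, rolls_again)  # TRUE
```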
|===========                                               |  18%
| Now sample 10 numbers between 1 and 20, WITHOUT replacement. To
| sample without replacement, simply leave off the 'replace'
| argument.
> sample(1:20, 10)
[1]  7  1  6 16 10 18 12  5 11  2
|============                                              |  21%
| Since the last command sampled without replacement, no number
| appears more than once in the output.
...
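A short sketch of what sampling without replacement implies, using illustrative variable names. The 'replace' argument defaults to FALSE, so omitting it and writing replace = FALSE request the same kind of sample.

```r
# replace defaults to FALSE, so both calls sample without replacement.
draw <- sample(1:20, 10)
draw_explicit <- sample(1:20, 10, replace = FALSE)

# anyDuplicated() returns 0 when a vector has no repeated values.
anyDuplicated(draw)

# Asking for more values than the population holds is an error when
# sampling without replacement; try() lets the script continue past it.
too_many <- try(sample(1:20, 21), silent = TRUE)
inherits(too_many, "try-error")
```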
|==============                                            |  24%
| LETTERS is a predefined variable in R containing a vector of all
| 26 letters of the English alphabet. Take a look at it now.
> LETTERS
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P"
[17] "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
| Excellent job!
|================                                          |  27%
| The sample() function can also be used to permute, or rearrange,
| the elements of a vector. For example, try sample(LETTERS) to
| permute all 26 letters of the English alphabet.
> sample(LETTERS)
 [1] "Q" "T" "J" "O" "W" "D" "Y" "U" "E" "A" "G" "H" "L" "Z" "P" "S"
[17] "K" "C" "I" "R" "B" "F" "X" "V" "M" "N"
| You got it right!
|==================                                        |  30%
| This is identical to taking a sample of size 26 from LETTERS,
| without replacement. When the 'size' argument to sample() is not
| specified, R takes a sample equal in size to the vector from
| which you are sampling.
...
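The permutation behaviour can be verified with a few checks. This is a sketch with illustrative variable names (`shuffled`, `perm5`); the last call shows a documented special case of sample() that the lesson does not cover.

```r
# With no size argument, sample() returns a permutation of its input.
shuffled <- sample(LETTERS)
length(shuffled)

# Every letter appears exactly once: same set of values, no duplicates.
setequal(shuffled, LETTERS)
anyDuplicated(shuffled)

# Special case: a single number n >= 1 is treated as the vector 1:n,
# so this permutes the integers 1 to 5.
perm5 <- sample(5)
perm5
```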
|===================                                       |  33%
| Now, suppose we want to simulate 100 flips of an unfair two-sided
| coin. This particular coin has a 0.3 probability of landing
| 'tails' and a 0.7 probability of landing 'heads'.
...
|=====================                                     |  36%
| Let the value 0 represent tails and the value 1 represent heads.
| Use sample() to draw a sample of size 100 from the vector c(0,1),
| with replacement. Since the coin is unfair, we must attach
| specific probabilities to the values 0 (tails) and 1 (heads) with
| a fourth argument, prob = c(0.3, 0.7). Assign the result to a new
| variable called flips.
> flips <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))
| You are really on a roll!
|=======================                                   |  39%
| View the contents of the flips variable.
> flips
 [1] 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 0 0 1 0 1 0 1 1 1 0 1 1 1 1 1 1
[32] 0 1 1 1 1 1 0 0 1 1 1 0 1 1 1 0 1 1 1 1 0 1 0 1 0 1 1 1 1 1 1
[63] 1 1 1 1 0 1 1 1 1 1 0 1 0 1 0 1 0 1 0 0 1 1 1 0 0 0 0 0 1 1 1
[94] 1 0 1 1 1 1 0
| You're the best!
|=========================                                 |  42%
| Since we set the probability of landing heads on any given flip
| to be 0.7, we'd expect approximately 70 of our coin flips to have
| the value 1. Count the actual number of 1s contained in flips
| using the sum() function.
> sum(flips)
[1] 71
| You got it right!
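A sketch of how far from 70 a count like this is likely to fall. The names `n`, `p`, `expected_heads`, and `sd_heads` are illustrative; the expected value n * p and standard deviation sqrt(n * p * (1 - p)) are the standard results for a count of independent successes.

```r
# Simulate the unfair coin as in the lesson: 0 = tails, 1 = heads.
flips <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))

sum(flips)    # number of heads
mean(flips)   # proportion of heads, typically near 0.7
table(flips)  # counts of 0s and 1s side by side

# Expected number of heads and its standard deviation.
n <- 100
p <- 0.7
expected_heads <- n * p
sd_heads <- sqrt(n * p * (1 - p))
expected_heads
sd_heads
```

A simulated count a few heads away from 70 in either direction is therefore unremarkable.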
|==========================                                |  45%
| A coin flip is a binary outcome (0 or 1) and we are performing
| 100 independent trials (coin flips), so we can use rbinom() to
| simulate a binomial random variable. Pull up the documentation
| for rbinom() using ?rbinom.
> ?rbinom
| All that hard work is paying off!
|============================                              |  48%
| Each probability distribution in R has an r*** function (for
| "random"), a d*** function (for "density"), a p*** function (for
| "probability", the cumulative distribution function), and a q***
| function (for "quantile"). We are most interested in the r***
| functions in this lesson, but I encourage you to explore the
| others on your own.
...
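A brief sketch of the four function families for the binomial distribution, using the lesson's coin (100 flips, heads probability 0.7). The variable names are illustrative.

```r
n <- 100
p <- 0.7

# d: probability of exactly 70 heads.
prob_exactly_70 <- dbinom(70, size = n, prob = p)

# p: probability of 70 heads or fewer (cumulative probability).
prob_at_most_70 <- pbinom(70, size = n, prob = p)

# q: smallest head count whose cumulative probability reaches 0.5.
median_heads <- qbinom(0.5, size = n, prob = p)

# r: one random head count.
one_draw <- rbinom(1, size = n, prob = p)

prob_exactly_70
prob_at_most_70
median_heads
one_draw
```

The p and d functions are linked: summing dbinom() over 0 to 70 gives the same value as pbinom(70, ...).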
|==============================                            |  52%
| A binomial random variable represents the number of 'successes'
| (heads) in a given number of independent 'trials' (coin flips).
| Therefore, we can generate a single random variable that
| represents the number of heads in 100 flips of our unfair coin
| using rbinom(1, size = 100, prob = 0.7). Note that you only
| specify the probability of 'success' (heads) and NOT the
| probability of 'failure' (tails). Try it now.
> rbinom(1, size = 100, prob = 0.7)
[1] 76
| You are really on a roll!
|================================                          |  55%
| Equivalently, if we want to see all of the 0s and 1s, we can
| request 100 observations, each of size 1, with success
| probability of 0.7. Give it a try, assigning the result to a new
| variable called flips2.
> flips2 <- rbinom(n = 100, size = 1, prob = 0.7)
| That's correct!
|=================================                         |  58%
| View the contents of flips2.
> flips2
 [1] 0 0 1 0 0 0 0 1 0 1 1 1 1 1 0 1 0 1 0 1 1 1 1 0 1 1 1 1 1 1 1
[32] 0 1 1 1 1 1 1 1 0 0 1 1 0 1 0 1 1 1 1 0 1 0 0 1 1 1 1 1 0 1 0
[63] 1 1 1 0 1 1 0 0 1 1 1 1 0 0 1 1 1 1 1 1 0 0 1 1 1 0 0 1 1 1 0
[94] 1 0 0 1 1 1 1
| Keep up the great work!
|===================================                       |  61%
| Now use sum() to count the number of 1s (heads) in flips2. It
| should be close to 70!
> sum(flips2)
[1] 67
| That's correct!
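The two rbinom() approaches can be compared over many repetitions. This is a rough sketch, assuming an arbitrary seed (1) and 2000 repetitions; neither value comes from the lesson, and the variable names are illustrative.

```r
# Seed chosen arbitrarily so the comparison is reproducible.
set.seed(1)

# Approach 1: simulate 100 individual flips and count the heads,
# repeated 2000 times to collect many head counts.
counts_from_flips <- replicate(2000, sum(rbinom(n = 100, size = 1, prob = 0.7)))

# Approach 2: draw 2000 head counts directly.
counts_direct <- rbinom(n = 2000, size = 100, prob = 0.7)

# Both sets of counts should average close to 100 * 0.7 = 70.
mean(counts_from_flips)
mean(counts_direct)
```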
|=====================================                     |  64%
| Similar to rbinom(), we can use R to simulate random numbers from
| many other probability distributions. Pull up the documentation
| for rnorm() now.
> ?rnorm
| Keep working like that and you'll get there!
|=======================================                   |  67%
| The standard normal distribution has mean 0 and standard
| deviation 1. As you can see under the 'Usage' section in the
| documentation, the default values for the 'mean' and 'sd'
| arguments to rnorm() are 0 and 1, respectively. Thus, rnorm(10)
| will generate 10 random numbers from a standard normal
| distribution. Give it a try.
> rnorm(10)
[1]  1.6794203  0.2139809  2.5742928 -0.6305014 -0.7384053
[6] -0.1994382  0.9871799 -1.4090833  0.8196085  1.2761266
|========================================                  |  70%
| Now do the same, except with a mean of 100 and a standard
| deviation of 25.
> rnorm(10, mean = 100, sd = 25)
[1]  75.14305  91.40011  88.22923  76.51028  88.24616  84.71632
[7] 128.47592  56.19216 133.73401  87.42630
| Great job!
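With only 10 draws, the sample mean and standard deviation can sit some way from 100 and 25. A sketch with a larger sample, assuming an arbitrary seed (123) and a sample size of 10000, neither of which comes from the lesson:

```r
# Seed chosen arbitrarily for reproducibility.
set.seed(123)

# A large sample makes the sample statistics settle near the parameters.
x <- rnorm(10000, mean = 100, sd = 25)
mean(x)  # close to 100
sd(x)    # close to 25

# The same distribution can be reached by shifting and scaling
# standard normal draws: mean + sd * z.
z <- rnorm(10000)
y <- 100 + 25 * z
mean(y)
sd(y)
```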
|==========================================                |  73%
| Finally, what if we want to simulate 100 *groups* of random
| numbers, each containing 5 values generated from a Poisson
| distribution with mean 10? Let's start with one group of 5
| numbers, then I'll show you how to repeat the operation 100 times
| in a convenient and compact way.
...
|============================================              |  76%
| Generate 5 random values from a Poisson distribution with mean
| 10. Check out the documentation for rpois() if you need help.
> ?rpois
> rpois(5, 10)
[1] 17  8 12  2 12
| Excellent job!
|==============================================            |  79%
| Now use replicate(100, rpois(5, 10)) to perform this operation
| 100 times. Store the result in a new variable called my_pois.
> my_pois <- replicate(100, rpois(5, 10))
| You are really on a roll!
|===============================================           |  82%
| Take a look at the contents of my_pois.
> my_pois
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]    8    8   10   13   11    5    7   10   15    15     9     9
[2,]    8   11    7   10    5    6    9    9    9     3    11     7
[3,]    5    9   13    7   10   13   10    9   12    12    15    12
[4,]    9    7   10   10   15   10   16    8   12     9     7     8
[5,]   17   10    9   10    2   10   12   11    7     5     9     9
     [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22]
[1,]    15    10     9     9    10    13    11     9     4    10
[2,]    10     9    14     8     8    15    12    12    11     9
[3,]    14    14    11     7    10    13     9    11     9     7
[4,]    11    11    13     4    12     7    14     6     9     9
[5,]    12     6    14    11    14     3     6     7    16     9
     [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32]
[1,]    16     8    10    12     4    12    11     4    13    17
[2,]     8    11    13    13    11     4    10    13    12    12
[3,]     6    14    14     8    12     7    10    12    13     7
[4,]    15     6    11     6    11    10     8    13    10     6
[5,]    14     6     8     7     9    15     9     6    12    10
     [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42]
[1,]     9     9    15    10    11     4     6    14    11     9
[2,]     8     7     7    10    12    11     6     8    11    12
[3,]     5     9    13     7     8    13    15     5     8     6
[4,]     8    16     7     6    15    14    13    11    11    12
[5,]    14    12    13    13     6    11    10    12    12    17
     [,43] [,44] [,45] [,46] [,47] [,48] [,49] [,50] [,51] [,52]
[1,]    12    16    19    14     9    13     8     7     6     8
[2,]     8     8     6    12    11     8     7     9     8    15
[3,]     8     8    12    11     9     6    12    10     8    15
[4,]    14     6     7    11     9    11    10    12     8     7
[5,]    10     5    13    11    11     6     6    10    19     9
     [,53] [,54] [,55] [,56] [,57] [,58] [,59] [,60] [,61] [,62]
[1,]     9    13     8    12     9     7    10     9     7    10
[2,]    15    12    11    11    12     5     8    11     9     7
[3,]    13     9    13     9    21     8     9     9    13     9
[4,]    13     7    12    11    10    14    11    17     8    15
[5,]    20    11    13    12    12     5     5     9    12     9
     [,63] [,64] [,65] [,66] [,67] [,68] [,69] [,70] [,71] [,72]
[1,]    11     9     7     9     9     5     9    11    11     9
[2,]    13     7     8    10    11    11     8     9    10     7
[3,]     7     7     5    16     9     8     7    11     9    13
[4,]     5     7    14     9     8    11    10    13     8    10
[5,]    12    11    10    12     9    10     5     8    12     7
     [,73] [,74] [,75] [,76] [,77] [,78] [,79] [,80] [,81] [,82]
[1,]    14     7     9    11     8     5    17    10     7     8
[2,]     9    10     8    11     9    10     8    10    12    10
[3,]    10     8     9     5     8    14    11    11    13     6
[4,]    15     8    14    10     2    14    11     7    12    11
[5,]     7     9     8     9    11     5    14     9     8     8
     [,83] [,84] [,85] [,86] [,87] [,88] [,89] [,90] [,91] [,92]
[1,]     2    12    11     6    10     8     5     8     9    11
[2,]    13    14    12    13     8    10     8     8     8     9
[3,]    12     8     9    10    10    19     8    13     9     2
[4,]    12     8    16     7    12    14    10    12     6    14
[5,]     7     5     7     8     4     7    13     9    11     9
     [,93] [,94] [,95] [,96] [,97] [,98] [,99] [,100]
[1,]    20     9     8    14     8    13    13     12
[2,]    12     6    10    14    16     8    10      7
[3,]     3    15    13     9    10    11    17     12
[4,]    11     9     8     8     8     8    13      9
[5,]     7    11    13     9    13    11    15      8
| Keep working like that and you'll get there!
|=================================================         |  85%
| replicate() created a matrix, each column of which contains 5
| random numbers generated from a Poisson distribution with mean
| 10. Now we can find the mean of each column in my_pois using the
| colMeans() function. Store the result in a variable called cm.
> cm <- colMeans(my_pois)
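A sketch of the shape of the result and of what colMeans() computes. The comparison with apply() is an illustration and is not part of the lesson; `cm_apply` is an illustrative name.

```r
# Each call to rpois(5, 10) returns 5 values; replicate() binds the
# 100 results together as the columns of a matrix.
my_pois <- replicate(100, rpois(5, 10))
dim(my_pois)  # 5 rows, 100 columns

# One mean per column, so cm has 100 entries.
cm <- colMeans(my_pois)
length(cm)

# The same means computed column by column with apply().
cm_apply <- apply(my_pois, 2, mean)
all.equal(cm, cm_apply)
```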
|===================================================       |  88%
| And let's take a look at the distribution of our column means by
| plotting a histogram with hist(cm).
> hist(cm)
| You are doing so well!
|=====================================================     |  91%
| Looks like our column means are almost normally distributed,
| right? That's the Central Limit Theorem at work, but that's a
| lesson for another day!
...
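A rough numerical check of the pattern in the histogram, assuming an arbitrary seed (2024) and 10000 groups instead of 100 so that the summary statistics are stable. A Poisson distribution with mean 10 also has variance 10, so the mean of 5 independent values has mean 10 and standard deviation sqrt(10 / 5).

```r
# Seed chosen arbitrarily for reproducibility.
set.seed(2024)

# Means of 10000 groups, each holding 5 Poisson(10) values.
group_means <- colMeans(replicate(10000, rpois(5, 10)))

mean(group_means)  # close to 10
sd(group_means)    # close to the theoretical value below
sqrt(10 / 5)
```

With only 5 values per group the normal approximation is loose; larger groups would bring the distribution of means closer to a normal shape.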
|======================================================    |  94%
| All of the standard probability distributions are built into R,
| including exponential (rexp()), chi-squared (rchisq()), gamma
| (rgamma()), .... Well, you see the pattern.
...
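A few of those generators in use. The parameter values here are arbitrary choices for illustration, not values from the lesson.

```r
# Five draws from an exponential distribution with rate 2 (mean 1/2).
exp_draws <- rexp(5, rate = 2)

# Five draws from a chi-squared distribution with 3 degrees of freedom.
chisq_draws <- rchisq(5, df = 3)

# Five draws from a gamma distribution with shape 2 and rate 1.
gamma_draws <- rgamma(5, shape = 2, rate = 1)

# Five draws from a uniform distribution on the interval from 0 to 1.
unif_draws <- runif(5, min = 0, max = 1)

exp_draws
chisq_draws
gamma_draws
unif_draws
```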
|========================================================  |  97%
| Simulation is practically a field of its own and we've only
| skimmed the surface of what's possible. I encourage you to
| explore these and other functions further on your own.
...
|==========================================================| 100%