[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
SlideShare a Scribd company logo
Stat405                Simulation


                              Hadley Wickham
Thursday, 23 September 2010
1. Homework comments
               2. Mathematical approach
               3. More randomness
               4. Random number generators




Thursday, 23 September 2010
Homework
                   Just graded your organisation and code, and
                   focused my comments there.
                   Biggest overall tip: use floating figures (with figure
                   {...}) with captions. Use ref{} to refer to the figure in
                   the text.
                   Captions should start with brief description of plot
                   (including bin width if applicable) and finish with
                   brief description of what the plot reveals.
                   Will grade captions more aggressively in the future.


Thursday, 23 September 2010
Code
                   Gives explicit technical details.
                   Your comments should remind you why
                   you did what you did.
                   Most readers will not look at it, but it’s
                   very important to include it, because it
                   means that others can check your work.



Thursday, 23 September 2010
Mathematical
                               approach

                   Why are we doing this simulation? Could
                   work out the expected value and variance
                   mathematically. So let’s do it!
                   Simplifying assumption: slots are iid.




Thursday, 23 September 2010
calculate_prize <- function(windows) {
       payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
         "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

          same <- length(unique(windows)) == 1
          allbars <- all(windows %in% c("B", "BB", "BBB"))

          if (same) {
            prize <- payoffs[windows[1]]
          } else if (allbars) {
            prize <- 5
          } else {
            cherries <- sum(windows == "C")
            diamonds <- sum(windows == "DD")

              prize <- c(0, 2, 5)[cherries + 1] *
                c(1, 2, 4)[diamonds + 1]
          }
          prize
     }

Thursday, 23 September 2010
slots <- read.csv("slots.csv", stringsAsFactors = F)

     # Calculate empirical distribution
     dist <- table(c(slots$w1, slots$w2, slots$w3))
     dist <- dist / sum(dist)

     slots <- names(dist)




Thursday, 23 September 2010
poss <- expand.grid(
       w1 = slots, w2 = slots, w3 = slots,
       stringsAsFactors = FALSE
     )

     poss$prize <- NA
     for(i in seq_len(nrow(poss))) {
       window <- as.character(poss[i, 1:3])
       poss$prize[i] <- calculate_prize(window)
     }




Thursday, 23 September 2010
Your turn
                   How can you calculate the probability of each
                   combination?
                   (Hint: think about subsetting. Another hint:
                   think about the table and character
                   subsetting. Final hint: you can do this in one
                   line of code)
                   Then work out the expected value (the payoff).



Thursday, 23 September 2010
poss$prob <- with(poss,
       dist[w1] * dist[w2] * dist[w3])

     (poss_mean <- with(poss, sum(prob * prize)))

     # How do we determine the variance of this
     # estimator?




Thursday, 23 September 2010
More
                randomness

Thursday, 23 September 2010
Sample

                   Very useful for selecting from a discrete
                   set (vector) of possibilities.
                   Four arguments: x, size, replace, prob




Thursday, 23 September 2010
How can you?
                   Choose 1 from vector
                   Choose n from vector, with replacement
                   Choose n from vector, without replacement
                   Perform a weighted sample
                   Put a vector in random order
                   Put a data frame in random order


Thursday, 23 September 2010
# Choose 1 from vector
     sample(letters, 1)

     # Choose n from vector, without replacement
     sample(letters, 10)
     sample(letters, 40)

     # Choose n from vector, with replacement
     sample(letters, 40, replace = T)

     # Perform a weighted sample
     sample(names(dist), prob = dist)


Thursday, 23 September 2010
# Put a vector in random order
     sample(letters)

     # Put a data frame in random order
     slots[sample(1:nrow(slots)), ]




Thursday, 23 September 2010
Your turn
                   Source of randomness in random_prize is
                   sample. Other options are:
                   runif, rbinom, rnbinom, rpois, rnorm,
                   rt, rcauchy
                   What sort of random variables do they
                   generate and what are their parameters?
                   Practice generating numbers from them.


Thursday, 23 September 2010
Function              Distribution       Parameters
                 runif            Uniform            min, max
               rbinom             Binomial         size, prob
             rnbinom          Negative binomial    size, prob
                 rpois            Poisson             lambda
                 rnorm             Normal            mean, sd
                      rt              t                 df
             rcauchy              Cauchy          location, scale

Thursday, 23 September 2010
Distributions
                   Other functions
                    •         r to generate random numbers
                    •         d to compute density f(x)
                    •         p to compute distribution F(x)
                    •         q to compute inverse distribution F-1(x)



Thursday, 23 September 2010
# Easy to combine random variables

     n <- rpois(10000, lambda = 10)
     x <- rbinom(10000, size = n, prob = 0.3)
     qplot(x, binwidth = 1)

     p <- runif(10000)
     x <- rbinom(10000, size = 10, prob = p)
     qplot(x, binwidth = 0.1)

     # cf.
     qplot(runif(10000), binwidth = 0.1)


Thursday, 23 September 2010
# Simulation is a powerful tool for exploring
     # distributions. Easy to do computationally; hard
     # to do analytically

     qplot(1 / rpois(10000, lambda = 20))
     qplot(1 / runif(10000, min = 0.5, max = 2))

     qplot(rnorm(10000) ^ 2)
     qplot(rnorm(10000) / rnorm(10000))

     # http://www.johndcook.com/distribution_chart.html



Thursday, 23 September 2010
Your turn




Thursday, 23 September 2010
RNG
                              Computers are deterministic, so how
                                do they produce randomness?




Thursday, 23 September 2010
Thursday, 23 September 2010
How do computers
                generate random numbers?

                   They don’t! Actually produce pseudo-
                   random sequences.
                   Common approach: Xn+1 = (aXn + c) mod m
                   (http://en.wikipedia.org/wiki/
                   Linear_congruential_generator)




Thursday, 23 September 2010
next_val <- function(x, a, c, m) {
       (a * x + c) %% m
     }

     x <- 1001
     (x <- next_val(x, 1664525, 1013904223, 2^32))

     # http://en.wikipedia.org/wiki/
     List_of_pseudorandom_number_generators

     # R uses
     # http://en.wikipedia.org/wiki/Mersenne_twister


Thursday, 23 September 2010
# Random numbers are reproducible!

     set.seed(1)
     runif(10)

     set.seed(1)
     runif(10)

     # Very useful when required to make a reproducible
     # example that involves randomness




Thursday, 23 September 2010
True randomness
                   Atmospheric radio noise: http://
                   www.random.org. Use from R with
                   random package.
                   Not really important unless you’re running
                   a lottery. (Otherwise by observing a long
                   enough sequence you can predict the
                   next value)


Thursday, 23 September 2010

More Related Content

What's hot (18)

PDF
MLIP - Chapter 4 - Image classification and CNNs
Charles Deledalle
 
PDF
알고리즘 중심의 머신러닝 가이드 Ch04
HyeonSeok Choi
 
PDF
Rexx summary
Sylvie Migneault
 
PPTX
Trident International Graphics Workshop 2014 4/5
Takao Wada
 
PDF
TensorFlow 深度學習快速上手班--自然語言處理應用
Mark Chang
 
PDF
Lec2
Amba Research
 
PDF
Iclr2016 vaeまとめ
Deep Learning JP
 
PDF
Lesson19 Maximum And Minimum Values 034 Slides
Matthew Leingang
 
PDF
Stage3D and AGAL
Daniel Freeman
 
ZIP
Adobe AIR: Stage3D and AGAL
Daniel Freeman
 
PPTX
Pointers lesson 4 (malloc and its use)
SetuMaheshwari1
 
PDF
SA09 Realtime education
fumoto kazuhiro
 
PDF
Discrete Models in Computer Vision
Yap Wooi Hen
 
PDF
Lesson 18: Maximum and Minimum Vaues
Matthew Leingang
 
PDF
Lesson 18: Maximum and Minimum Vaues
Matthew Leingang
 
PDF
A Generative Model for Joint Natural Language Understanding and Generation
taeseon ryu
 
PDF
Computational Linguistics week 10
Mark Chang
 
PPT
MIDP: Form Custom and Image Items
Jussi Pohjolainen
 
MLIP - Chapter 4 - Image classification and CNNs
Charles Deledalle
 
알고리즘 중심의 머신러닝 가이드 Ch04
HyeonSeok Choi
 
Rexx summary
Sylvie Migneault
 
Trident International Graphics Workshop 2014 4/5
Takao Wada
 
TensorFlow 深度學習快速上手班--自然語言處理應用
Mark Chang
 
Iclr2016 vaeまとめ
Deep Learning JP
 
Lesson19 Maximum And Minimum Values 034 Slides
Matthew Leingang
 
Stage3D and AGAL
Daniel Freeman
 
Adobe AIR: Stage3D and AGAL
Daniel Freeman
 
Pointers lesson 4 (malloc and its use)
SetuMaheshwari1
 
SA09 Realtime education
fumoto kazuhiro
 
Discrete Models in Computer Vision
Yap Wooi Hen
 
Lesson 18: Maximum and Minimum Vaues
Matthew Leingang
 
Lesson 18: Maximum and Minimum Vaues
Matthew Leingang
 
A Generative Model for Joint Natural Language Understanding and Generation
taeseon ryu
 
Computational Linguistics week 10
Mark Chang
 
MIDP: Form Custom and Image Items
Jussi Pohjolainen
 

Similar to 10 simulation (20)

PDF
08 functions
Hadley Wickham
 
PDF
09 bootstrapping
Hadley Wickham
 
PDF
04 reports
Hadley Wickham
 
PDF
06 data
Hadley Wickham
 
PDF
07 problem-solving
Hadley Wickham
 
PDF
04 Reports
Hadley Wickham
 
PDF
24 modelling
Hadley Wickham
 
PDF
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
The Statistical and Applied Mathematical Sciences Institute
 
PDF
11 Simulation
Hadley Wickham
 
PDF
Project Fortress
Alex Miller
 
PDF
Clojure night
Aria Haghighi
 
PDF
Peyton jones-2009-fun with-type_functions-slide
Takayuki Muranushi
 
PDF
Poetry with R -- Dissecting the code
Peter Solymos
 
PDF
how to rate a Rails application
ehuard
 
PDF
Applied machine learning for search engine relevance 3
Charles Martin
 
PDF
Computational Complexity: Introduction-Turing Machines-Undecidability
Antonis Antonopoulos
 
PPTX
R Language Introduction
Khaled Al-Shamaa
 
PDF
Rubinius - What Have You Done For Me Lately?
evanphx
 
PPTX
2. R-basics, Vectors, Arrays, Matrices, Factors
krishna singh
 
PDF
Outliers
Xuan Le
 
08 functions
Hadley Wickham
 
09 bootstrapping
Hadley Wickham
 
04 reports
Hadley Wickham
 
07 problem-solving
Hadley Wickham
 
04 Reports
Hadley Wickham
 
24 modelling
Hadley Wickham
 
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
The Statistical and Applied Mathematical Sciences Institute
 
11 Simulation
Hadley Wickham
 
Project Fortress
Alex Miller
 
Clojure night
Aria Haghighi
 
Peyton jones-2009-fun with-type_functions-slide
Takayuki Muranushi
 
Poetry with R -- Dissecting the code
Peter Solymos
 
how to rate a Rails application
ehuard
 
Applied machine learning for search engine relevance 3
Charles Martin
 
Computational Complexity: Introduction-Turing Machines-Undecidability
Antonis Antonopoulos
 
R Language Introduction
Khaled Al-Shamaa
 
Rubinius - What Have You Done For Me Lately?
evanphx
 
2. R-basics, Vectors, Arrays, Matrices, Factors
krishna singh
 
Outliers
Xuan Le
 
Ad

More from Hadley Wickham (20)

PDF
27 development
Hadley Wickham
 
PDF
27 development
Hadley Wickham
 
PDF
23 data-structures
Hadley Wickham
 
PDF
Graphical inference
Hadley Wickham
 
PDF
R packages
Hadley Wickham
 
PDF
22 spam
Hadley Wickham
 
PDF
21 spam
Hadley Wickham
 
PDF
20 date-times
Hadley Wickham
 
PDF
19 tables
Hadley Wickham
 
PDF
18 cleaning
Hadley Wickham
 
PDF
17 polishing
Hadley Wickham
 
PDF
16 critique
Hadley Wickham
 
PDF
15 time-space
Hadley Wickham
 
PDF
14 case-study
Hadley Wickham
 
PDF
13 case-study
Hadley Wickham
 
PDF
12 adv-manip
Hadley Wickham
 
PDF
11 adv-manip
Hadley Wickham
 
PDF
11 adv-manip
Hadley Wickham
 
PDF
10 simulation
Hadley Wickham
 
PDF
05 subsetting
Hadley Wickham
 
27 development
Hadley Wickham
 
27 development
Hadley Wickham
 
23 data-structures
Hadley Wickham
 
Graphical inference
Hadley Wickham
 
R packages
Hadley Wickham
 
20 date-times
Hadley Wickham
 
19 tables
Hadley Wickham
 
18 cleaning
Hadley Wickham
 
17 polishing
Hadley Wickham
 
16 critique
Hadley Wickham
 
15 time-space
Hadley Wickham
 
14 case-study
Hadley Wickham
 
13 case-study
Hadley Wickham
 
12 adv-manip
Hadley Wickham
 
11 adv-manip
Hadley Wickham
 
11 adv-manip
Hadley Wickham
 
10 simulation
Hadley Wickham
 
05 subsetting
Hadley Wickham
 
Ad

10 simulation

  • 1. Stat405 Simulation Hadley Wickham Thursday, 23 September 2010
  • 2. 1. Homework comments 2. Mathematical approach 3. More randomness 4. Random number generators Thursday, 23 September 2010
  • 3. Homework Just graded your organisation and code, and focused my comments there. Biggest overall tip: use floating figures (with figure {...}) with captions. Use ref{} to refer to the figure in the text. Captions should start with brief description of plot (including bin width if applicable) and finish with brief description of what the plot reveals. Will grade captions more aggressively in the future. Thursday, 23 September 2010
  • 4. Code Gives explicit technical details. Your comments should remind you why you did what you did. Most readers will not look at it, but it’s very important to include it, because it means that others can check your work. Thursday, 23 September 2010
  • 5. Mathematical approach Why are we doing this simulation? Could work out the expected value and variance mathematically. So let’s do it! Simplifying assumption: slots are iid. Thursday, 23 September 2010
  • 6. calculate_prize <- function(windows) { payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40, "BB" = 25, "B" = 10, "C" = 10, "0" = 0) same <- length(unique(windows)) == 1 allbars <- all(windows %in% c("B", "BB", "BBB")) if (same) { prize <- payoffs[windows[1]] } else if (allbars) { prize <- 5 } else { cherries <- sum(windows == "C") diamonds <- sum(windows == "DD") prize <- c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1] } prize } Thursday, 23 September 2010
  • 7. slots <- read.csv("slots.csv", stringsAsFactors = F) # Calculate empirical distribution dist <- table(c(slots$w1, slots$w2, slots$w3)) dist <- dist / sum(dist) slots <- names(dist) Thursday, 23 September 2010
  • 8. poss <- expand.grid( w1 = slots, w2 = slots, w3 = slots, stringsAsFactors = FALSE ) poss$prize <- NA for(i in seq_len(nrow(poss))) { window <- as.character(poss[i, 1:3]) poss$prize[i] <- calculate_prize(window) } Thursday, 23 September 2010
  • 9. Your turn How can you calculate the probability of each combination? (Hint: think about subsetting. Another hint: think about the table and character subsetting. Final hint: you can do this in one line of code) Then work out the expected value (the payoff). Thursday, 23 September 2010
  • 10. poss$prob <- with(poss, dist[w1] * dist[w2] * dist[w3]) (poss_mean <- with(poss, sum(prob * prize))) # How do we determine the variance of this # estimator? Thursday, 23 September 2010
  • 11. More randomness Thursday, 23 September 2010
  • 12. Sample Very useful for selecting from a discrete set (vector) of possibilities. Four arguments: x, size, replace, prob Thursday, 23 September 2010
  • 13. How can you? Choose 1 from vector Choose n from vector, with replacement Choose n from vector, without replacement Perform a weighted sample Put a vector in random order Put a data frame in random order Thursday, 23 September 2010
  • 14. # Choose 1 from vector sample(letters, 1) # Choose n from vector, without replacement sample(letters, 10) sample(letters, 40) # Choose n from vector, with replacement sample(letters, 40, replace = T) # Perform a weighted sample sample(names(dist), prob = dist) Thursday, 23 September 2010
  • 15. # Put a vector in random order sample(letters) # Put a data frame in random order slots[sample(1:nrow(slots)), ] Thursday, 23 September 2010
  • 16. Your turn Source of randomness in random_prize is sample. Other options are: runif, rbinom, rnbinom, rpois, rnorm, rt, rcauchy What sort of random variables do they generate and what are their parameters? Practice generating numbers from them. Thursday, 23 September 2010
  • 17. Function Distribution Parameters runif Uniform min, max rbinom Binomial size, prob rnbinom Negative binomial size, prob rpois Poisson lambda rnorm Normal mean, sd rt t df rcauchy Cauchy location, scale Thursday, 23 September 2010
  • 18. Distributions Other functions • r to generate random numbers • d to compute density f(x) • p to compute distribution F(x) • q to compute inverse distribution F-1(x) Thursday, 23 September 2010
  • 19. # Easy to combine random variables n <- rpois(10000, lambda = 10) x <- rbinom(10000, size = n, prob = 0.3) qplot(x, binwidth = 1) p <- runif(10000) x <- rbinom(10000, size = 10, prob = p) qplot(x, binwidth = 0.1) # cf. qplot(runif(10000), binwidth = 0.1) Thursday, 23 September 2010
  • 20. # Simulation is a powerful tool for exploring # distributions. Easy to do computationally; hard # to do analytically qplot(1 / rpois(10000, lambda = 20)) qplot(1 / runif(10000, min = 0.5, max = 2)) qplot(rnorm(10000) ^ 2) qplot(rnorm(10000) / rnorm(10000)) # http://www.johndcook.com/distribution_chart.html Thursday, 23 September 2010
  • 21. Your turn Thursday, 23 September 2010
  • 22. RNG Computers are deterministic, so how do they produce randomness? Thursday, 23 September 2010
  • 24. How do computers generate random numbers? They don’t! Actually produce pseudo- random sequences. Common approach: Xn+1 = (aXn + c) mod m (http://en.wikipedia.org/wiki/ Linear_congruential_generator) Thursday, 23 September 2010
  • 25. next_val <- function(x, a, c, m) { (a * x + c) %% m } x <- 1001 (x <- next_val(x, 1664525, 1013904223, 2^32)) # http://en.wikipedia.org/wiki/ List_of_pseudorandom_number_generators # R uses # http://en.wikipedia.org/wiki/Mersenne_twister Thursday, 23 September 2010
  • 26. # Random numbers are reproducible! set.seed(1) runif(10) set.seed(1) runif(10) # Very useful when required to make a reproducible # example that involves randomness Thursday, 23 September 2010
  • 27. True randomness Atmospheric radio noise: http:// www.random.org. Use from R with random package. Not really important unless you’re running a lottery. (Otherwise by observing a long enough sequence you can predict the next value) Thursday, 23 September 2010