[Mitarbeiter.zoologie] Code 2.8 from Statsitical rethinking

Fr Feb 11 17:19:00 CET 2022

Dear Statistical rethinkers,
I've thought a bit more on code 2.8 and I want to share it with you.
Correct me please if I am wrong.

the aim of the code: reconstruction of posterior probability distribution
of proportion of the Earth surface covered with water given 6 waters out of
9 observations. This will be done by repeatedly generating random numbers
under the light of observation (6 W out of 9). Each random number depends
on the previous random number hence it is a Markov chain.

# You decide on the size of the Markov chain.
n_samples <- 1000

# Memory allocation. Array p contains our Markov chain.
p <- rep( NA , n_samples )

# Setting the first number. It could be any valid value as long as you have
enough random numbers. But let's be unbiased and start with 0.5.
p[1] <- 0.5

# These are our observations.
W <- 6
L <- 3

# Through this loop, in each step we generate a random number and if it is
verified it will be stored in array p. You start with 2 because you already
set the first one.
for ( i in 2:n_samples ) {

# You generate a random number from normal dist with the mean equals to
your previous (old) random number and call it p_new. You can use any other
distribution (I think) for generating random numbers as long as it is
dependent on the previous random number.
p_new <- rnorm( 1 , p[i-1] , 0.1 )

# Since the nature of p_new is probability, it must be between 0 and 1.
Here you check for this and if it is outside this interval you bring it
back inside. This code however is not perfect. Suppose you got 8, then
based on the second checkpoint, you subtract it from 2 which you get -6
which is still outside the valid interval. You could use a command line
like this instead:
while(p_new <0 | p_new >1)  p_new <- rnorm( 1 , p[i-1] , 0.1 ).
if ( p_new < 0 ) p_new <- abs( p_new )
if ( p_new > 1 ) p_new <- 2 - p_new

# Now you have a valid new random number and you need to decide whether to
accept it or stay with the old random number. This will be done based on
likelihoods of the old random number (q0) and new one (q1). The likelihoods
are calculated under the light of observations. Here is the meaning of q0
for example (suppose the old value is still 0.5): If 50% of the Earth
surface is covered by water, what is the probability that out of 9 sampling
you get 6 waters.
q0 <- dbinom( W , W+L , p[i-1] )
q1 <- dbinom( W , W+L , p_new )

# Once you have q0 and q1, you accept or reject the new random number
proportional to q1/q0. If for example q1/q0 happens to be 3, it means that
the new random number is 3 times more plausible than the old one but this
doesn't mean that you definitely accept the new one. You must always make
some room for less plausible value as well, otherwise you don't get a
distribution.
p[i] <- ifelse( runif(1) < q1/q0 , p_new , p[i-1] )
}

PS1: Antonella noticed that upon increasing the size of the Markov chain,
you get a better approximation to the analytical solution.

PS2: If you are not sure that you started the Markov chain with an unbiased
value, it's better to discard some of the first random numbers and then
convert it to the plot. BEAST discards the first 10000 numbers by default I
think. Belinda! Is that correct?

This is my understanding. Hope it helps a bit.

All the best
Hassan
-------------- nächster Teil --------------
Ein Dateianhang mit HTML-Daten wurde abgetrennt...
URL: <http://lists.uni-halle.de/pipermail/mitarbeiter.zoologie/attachments/20220211/85325381/attachment.html>