[Mitarbeiter.zoologie] Code 2.8 from Statsitical rethinking
Antonella Soro
antonella.soro at zoologie.uni-halle.de
Sa Feb 12 09:02:43 CET 2022
dear hassan,
thank you very much for this detailed description of the code. for each line of code, i will copy and paste your corresponding description in my exercise script.
i also get your point about the perfect not being perfect in keeping the probability within the (0,1) boundaries, but i have not exactly understood your solution. Maybe I will briefly bother you on Monady about that. But my more general question about this point would be, how likely it is to sample a number greater than 2. maybe the reason why McElreath used ‘2' as a threshold in the code is because ‘2' is a safe one (that is big enough)?
if you think something might not be perfect with the code, you can also trying writing directly to the author (richard_mcelreath at eva.mpg.de <mailto:richard_mcelreath at eva.mpg.de>) . I had a small complain about something he has written in chapeter 3 and sent him an email, and he responded. This does not mean that he will always respond, but if your solution might improve his code, it would be in his interest to do so.
thanks again and a good week-end to all
antonella
> On 11. Feb 2022, at 17:19, Hassan Shafiey <h.shafiey at gmail.com> wrote:
>
> Dear Statistical rethinkers,
> I've thought a bit more on code 2.8 and I want to share it with you. Correct me please if I am wrong.
>
> the aim of the code: reconstruction of posterior probability distribution of proportion of the Earth surface covered with water given 6 waters out of 9 observations. This will be done by repeatedly generating random numbers under the light of observation (6 W out of 9). Each random number depends on the previous random number hence it is a Markov chain.
>
> # You decide on the size of the Markov chain.
> n_samples <- 1000
>
> # Memory allocation. Array p contains our Markov chain.
> p <- rep( NA , n_samples )
>
> # Setting the first number. It could be any valid value as long as you have enough random numbers. But let's be unbiased and start with 0.5.
> p[1] <- 0.5
>
> # These are our observations.
> W <- 6
> L <- 3
>
> # Through this loop, in each step we generate a random number and if it is verified it will be stored in array p. You start with 2 because you already set the first one.
> for ( i in 2:n_samples ) {
>
> # You generate a random number from normal dist with the mean equals to your previous (old) random number and call it p_new. You can use any other distribution (I think) for generating random numbers as long as it is dependent on the previous random number.
> p_new <- rnorm( 1 , p[i-1] , 0.1 )
>
> # Since the nature of p_new is probability, it must be between 0 and 1. Here you check for this and if it is outside this interval you bring it back inside. This code however is not perfect. Suppose you got 8, then based on the second checkpoint, you subtract it from 2 which you get -6 which is still outside the valid interval. You could use a command line like this instead:
> while(p_new <0 | p_new >1) p_new <- rnorm( 1 , p[i-1] , 0.1 ).
> if ( p_new < 0 ) p_new <- abs( p_new )
> if ( p_new > 1 ) p_new <- 2 - p_new
>
> # Now you have a valid new random number and you need to decide whether to accept it or stay with the old random number. This will be done based on likelihoods of the old random number (q0) and new one (q1). The likelihoods are calculated under the light of observations. Here is the meaning of q0 for example (suppose the old value is still 0.5): If 50% of the Earth surface is covered by water, what is the probability that out of 9 sampling you get 6 waters.
> q0 <- dbinom( W , W+L , p[i-1] )
> q1 <- dbinom( W , W+L , p_new )
>
> # Once you have q0 and q1, you accept or reject the new random number proportional to q1/q0. If for example q1/q0 happens to be 3, it means that the new random number is 3 times more plausible than the old one but this doesn't mean that you definitely accept the new one. You must always make some room for less plausible value as well, otherwise you don't get a distribution.
> p[i] <- ifelse( runif(1) < q1/q0 , p_new , p[i-1] )
> }
>
> PS1: Antonella noticed that upon increasing the size of the Markov chain, you get a better approximation to the analytical solution.
>
> PS2: If you are not sure that you started the Markov chain with an unbiased value, it's better to discard some of the first random numbers and then convert it to the plot. BEAST discards the first 10000 numbers by default I think. Belinda! Is that correct?
>
> This is my understanding. Hope it helps a bit.
>
> All the best
> Hassan
> _______________________________________________
> Mitarbeiter.zoologie mailing list
> Mitarbeiter.zoologie at lists.uni-halle.de
> https://lists.uni-halle.de/cgi-bin/mailman/listinfo/mitarbeiter.zoologie
-------------- nächster Teil --------------
Ein Dateianhang mit HTML-Daten wurde abgetrennt...
URL: <http://lists.uni-halle.de/pipermail/mitarbeiter.zoologie/attachments/20220212/fafc1171/attachment.html>
Mehr Informationen über die Mailingliste Mitarbeiter.zoologie