Loop Example: Monte Carlo Experiments

Home, Up: Loop and Script Examples

 

In this example:

Random Numbers, BUFFER, Single Line Loops.

 

A Monte Carlo experiment is a mathematical method that uses a series of random numbers to solve a problem for which no direct algebraic solution is known; the name alludes to playing roulette at the Monte Carlo casino.

With its three types of random numbers, Hypatia is well suited for performing Monte Carlo experiments. Some may involve nested loops, but by using your own index variables, as demonstrated on the previous page, this problem can be solved.

As a reminder, these are Hypatia's three types of random numbers:

RAND is a pseudo constant that returns a random number in the range of 0.0 to 1.0 (inclusive).

a b RANDINT is a 2-argumant operator that returns a random integer in the range of a to b (inclusive).

a b RANDND is a 2-argument operator that returns a normal distributed random number with mean a and standard deviation b.

(0 1 RANDND means standard normal distribution, for which you can also write 0 0 RANDND.)

 

Here is a rather simple Monte Carlo experiment, for which we do not even need to use a script -- only if we wanted to do it repeatedly, with different parameters, a script would make sense.

(As always, the question mark at the start of the line is Hypatia's input prompt, you do not type it.)

The problem to solve: When observing heterosexual couples, in the large majority of cases the man is taller than the woman. Is this due to a difference in average body heights between the sexes, or is there a cultural factor involved?

I cannot actually answer the question because I do not have the data about real life couples, but I can use Hypatia to calculate what percentage of couples with men larger than women, women larger than men, or equal heights (with an arbitrary measurement tolerance) would be statistically expected if the pairings were random regarding height.

There's probably a formula to calculate this but I do not know it, so, let's use a Monte Carlo experiment.

 

Human body heights are approximately normal distributed. I cannot vouch for the correctness of these numbers, but allegedly in the U.S. the mean height of men is 178 cm with a standard deviation of 7.6, while the mean height of women is 164 cm with a standard deviation of 6.35.

Let's create a hundred thousand random couples (this will take a few seconds), and write their differences in height to the buffer.

(While we usually use * in a loop command for infinite loops which run until an exit condition is met, and where the maximum number of passes that * represents only serves as a safeguard, here we use * to actually stand for this number -- you remember, one hundred thousand by default, but with the MAXLOOP command you can set any value from ten to ten million.)

 

? BUFFER START

? DO * :: 178 7.6 RANDND 164 6.35 RANDND - &

= 3.93700787401575

 

The result that is shown has no meaning, it is just the last random difference (and thus the value of $), the 100000 values are stored in the buffer. We'll assign them to a user defined element, and turn buffer mode off.

 

Now we copy the content of the buffer to a user defined element, discard the buffer which also turns off buffer mode, and then we can use the n-argument operators N+, N0 and N- to count the pairings with positive, zero or negative values (a positive value here defined as the man being taller than the woman).

To these operators "zero" means not exceeding the zero threshold (see page "The Zero Threshold" in chapter "Numbers").

By setting the zero threshold (by default the very small value of 1e-16) we can define which difference in height we still consider equal -- let's say, 1 cm.

 

? @pairs = (buffer)

? BUFFER DISCARD

? $zero = 1

? @pairs N+

= 90'500

? @pairs N0

= 3'036

? @pairs N-

= 6'464

 

(This was the result I got. If you repeat the test you will get slightly different results, but this lies in the nature of Monte Carlo experiments.)

So, if height differences in US heterosexual couples were random, we'd expect to see some 90.5% of couples where the man is taller than the woman, some 3% with differences below 1 cm, and some 6.5% where the woman is taller than the man. Unfortunatey I do not have the empirical data that we'd have to compare with these values to decide the original question, but be that as it may, this example was meant to demonstrate Hypatia's Monte Carlo abilities.

 

Monte Carlo experiments can be used for a wide variety of problems, and while some of them may require more powerful tools than Hypatia, Hypatia can do a lot in this direction.

 

To conclude, let's do a quick test to verify that we can trust Hypatia's random generator -- let's roll a die 100000 times and add up the eyes (the expected result would be 350000).

Using silent mode in the calculation is important -- otherwise each one of the one hundred thousand loop passes would write the current value of $ to hy, which would make this loop very slow. (Using buffer mode would speed it up, too, but, since in this example we do not need the loop pass results, silent mode is easier to use and even faster.)

Note that silent mode still applies when the loop ends -- the final value of $ gets displayed, but not written to hy.

? 0

? DO * :: $ 1 6 RANDINT + #

= 349'162

 

Looks good enough, but, while we're at it, let's also see if we come close enough to rolling the expected number of sixes (which would be 16667), or, if we divide 100000 (or 1e5) by that number, how many times on average we have to roll the die to get a six (which should be 6 times):

? 0

? DO * :: IF 1 6 RANDINT 6 == THEN 1 + #

= 16'708

? 1e5 16708 /

= 5.98515681110845

 

(We can write THEN 1 + because the + operator misses an argument, and $ is automatically inserted. This works only with the first operator in a calculation -- in the previous example RANDINT was the first operator, therefore we had to include $ and write $ 1 6 RANDINT + #, see page "Chain Calculations" in chapter "First Steps".)

If you repeat these tests you'll get different results, which should be scattered around the expected value -- could we observe this by repeating the test ten times in a loop?

We cannot put the line DO * :: IF 1 6 RANDINT 6 == THEN 1 + # in a script and call that in a loop -- but we can have a loop with one million passes, and at every one hundred thousand passes save our result, and reset the counter.

This requires a script, let's call it test6. Because we only have 10 results, we do not need to use the buffer -- we do have to clear hy at the start, though.

At each pass of the loop we roll the die and increase the counter by 1 if we roll a six. Every 100000 rolls of the die we calculate the average number of tries we needed to roll a six, write this result to hy, and set the counter back to 0.

Because to keep it simple we use $ instead of a variable as our counter, so we have to set $ to 0 at the beginning of the loop. Since this line is a calculation, it would write the result 0 to hy at the end of the first loop pass (after the commmand & clears hy!) -- to avoid this we have to use silent mode. We also need silent mode when we add 1 to the value of $ after we've rolled a 6, and when we set $ back to 0 after the end of each cycle of 100000 rolls of the die. (Maybe it would be easier to use a variable, after all ...)

 

I1: 0 #

I1: &

IF 1 6 RANDINT 6 == THEN 1 + #

IF I 100000 MULTIPLE THEN 1e5 $ / &&

ALSO: 0 #

I*: HY

I*: (hy) MEAN

 

To let this script perform ten tests with one hundred thousand passes each, we need to set the maximum number of loops to 1 million, and then call our script test6 with that maximum.

At the end of the loop the script displays the ten results, and their mean.

 

? MAXLOOP 1 MILLION

? _*test6

  6.01865783930184

  5.97514340344168

  5.91191250369495

  5.99916011758354

  6.01793344165614

  5.99880023995201

  6.0146758089739

  6.03864734299517

  6.0146758089739

  5.95486214494134

= 5.99444686515145

 

Even with the total of 1 million random numbers you still get different results each time you repeat this experiment. This may be because Hypatia's (that is, Phix's) random numbers are software-generated and not as strictly random as those that a hardware random number generator (or maybe a more sophisticated software random number generator) would provide.

I've tested all three types of Hypatia's random numbers, found no repeating patterns and no systematic biases, and for most practical purposes except cryptography and advanced scientific research they seem to be random enough. Be aware that they are not perfect, though.

 

Home, Up: Loop and Script Examples, Prev: Loop Example: Nested Loops, Next: Loop Example: Fibonacci Sequences