
Visualisation of the Box–Muller transform — the coloured points in the unit square (u1, u2), drawn as circles, are mapped to a 2D Gaussian (z0, z1), drawn as crosses. The plots at the margins are the probability density functions of z0 and z1. Note that z0 and z1 are unbounded; they appear to be in [−2.5, 2.5] only due to the choice of the illustrated points.

The Box–Muller transform, by George Edward Pelham Box and Mervin Edgar Muller,[1] is a pseudo-random number sampling method for generating pairs of independent, standard, normally distributed (zero expectation, unit variance) random numbers, given a source of uniformly distributed random numbers. The method was in fact first mentioned explicitly by Raymond E. A. C. Paley and Norbert Wiener in 1934.[2]

The Box–Muller transform is commonly expressed in two forms. The basic form as given by Box and Muller takes two samples from the uniform distribution on the interval [0, 1] and maps them to two standard, normally distributed samples. The polar form takes two samples from a different interval, [−1, +1], and maps them to two normally distributed samples without the use of sine or cosine functions.

The Box–Muller transform was developed as a more computationally efficient alternative to the inverse transform sampling method.[3] The ziggurat algorithm gives a more efficient method on CPUs, while the Box–Muller transform is superior on GPUs.[4] Furthermore, the Box–Muller transform can be employed for drawing from truncated bivariate Gaussian densities.[5]

Basic form[edit]

Suppose U1 and U2 are independent samples chosen from the uniform distribution on the unit interval (0, 1). Let

Z0 = R cos(Θ) = √(−2 ln U1) · cos(2π U2)

and

Z1 = R sin(Θ) = √(−2 ln U1) · sin(2π U2).

Then Z0 and Z1 are independent random variables with a standard normal distribution.
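The basic form can be sketched directly in standard C++. This is a minimal illustration, not code from the original article; the function name box_muller and the use of the <random> library are this sketch's own choices.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <random>
#include <utility>

// Basic Box–Muller transform: maps two independent uniform samples
// U1, U2 from (0, 1) to two independent standard normal samples Z0, Z1.
// U1 is drawn from a range that excludes 0 so that log(u1) stays finite.
std::pair<double, double> box_muller(std::mt19937& gen) {
    const double two_pi = 2.0 * std::acos(-1.0);
    std::uniform_real_distribution<double> dist(
        std::numeric_limits<double>::min(), 1.0);
    double u1 = dist(gen);
    double u2 = dist(gen);
    double r = std::sqrt(-2.0 * std::log(u1));  // R = sqrt(-2 ln U1)
    double theta = two_pi * u2;                 // Theta = 2*pi*U2
    return {r * std::cos(theta), r * std::sin(theta)};
}
```

Each call consumes two uniform samples and yields a pair of independent standard normal deviates.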

The derivation[6] is based on a property of a two-dimensional Cartesian system: if the X and Y coordinates are described by two independent and normally distributed random variables, then the random variables for R² and Θ (shown above) in the corresponding polar coordinates are also independent and can be expressed as


R² = −2 ln U1

and

Θ = 2π U2.

Because R² is the square of the norm of the standard bivariate normal variable (X, Y), it has the chi-squared distribution with two degrees of freedom. In the special case of two degrees of freedom, the chi-squared distribution coincides with the exponential distribution, and the equation for R² above is a simple way of generating the required exponential variate.

Polar form[edit]

Two uniformly distributed values, u and v, are used to produce the value s = R², which is likewise uniformly distributed. The definitions of the sine and cosine are then applied to the basic form of the Box–Muller transform to avoid using trigonometric functions.

The polar form was first proposed by J. Bell[7] and then modified by R. Knop.[8] While several different versions of the polar method have been described, the version of R. Knop will be described here because it is the most widely used, in part due to its inclusion in Numerical Recipes.

Given u and v, independent and uniformly distributed in the closed interval [−1, +1], set s = R² = u² + v². If s = 0 or s ≥ 1, discard u and v, and try another pair (u, v). Because u and v are uniformly distributed and because only points within the unit circle have been admitted, the values of s will be uniformly distributed in the open interval (0, 1), too. The latter can be seen by calculating the cumulative distribution function for s in the interval (0, 1): this is the area of a circle with radius √s, divided by π. From this we find the probability density function to have the constant value 1 on the interval (0, 1). Equally so, the angle θ divided by 2π is uniformly distributed in the interval [0, 1) and independent of s.

We now identify the value of s with that of U1 and θ/(2π) with that of U2 in the basic form. As shown in the figure, the values of cos θ = cos 2πU2 and sin θ = sin 2πU2 in the basic form can be replaced with the ratios cos θ = u/R = u/√s and sin θ = v/R = v/√s, respectively. The advantage is that calculating the trigonometric functions directly can be avoided. This is helpful when trigonometric functions are more expensive to compute than the single division that replaces each one.

Just as the basic form produces two standard normal deviates, so does this alternate calculation.

z0 = √(−2 ln U1) cos(2π U2) = √(−2 ln s) · (u/√s) = u · √(−2 ln s / s)


and

z1 = √(−2 ln U1) sin(2π U2) = √(−2 ln s) · (v/√s) = v · √(−2 ln s / s).
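The rejection loop and the substitutions above can be sketched as follows. Again this is a hedged illustration rather than the article's own listing; the name box_muller_polar and the use of <random> are this sketch's choices.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <utility>

// Polar (Knop) form: draw (u, v) uniformly from the square [-1, 1]^2,
// reject pairs outside the unit circle (or at the origin), then reuse
// s = u^2 + v^2 both as the uniform sample U1 and in place of the
// trigonometric terms, so no sine or cosine is evaluated.
std::pair<double, double> box_muller_polar(std::mt19937& gen) {
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    double u, v, s;
    do {
        u = dist(gen);
        v = dist(gen);
        s = u * u + v * v;
    } while (s >= 1.0 || s == 0.0);
    double f = std::sqrt(-2.0 * std::log(s) / s);  // sqrt(-2 ln s / s)
    return {u * f, v * f};                         // z0 = u*f, z1 = v*f
}
```

The single factor f replaces both the square root of −2 ln U1 and the two trigonometric ratios, which is the whole point of the polar form.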

Contrasting the two forms[edit]

The polar method differs from the basic method in that it is a type of rejection sampling. It discards some generated random numbers, but can be faster than the basic method because it is simpler to compute (provided that the random number generator is relatively fast) and is more numerically robust.[9] It avoids the use of trigonometric functions, which can be expensive in some computing environments[citation needed]. It discards 1 − π/4 ≈ 21.46% of the total input uniformly distributed random number pairs generated, i.e. discards 4/π − 1 ≈ 27.32% uniformly distributed random number pairs per Gaussian random number pair generated, requiring 4/π ≈ 1.2732 input random numbers per output random number.

The basic form requires two multiplications, 1/2 logarithm, 1/2 square root, and one trigonometric function for each normal variate.[10] On some processors, the cosine and sine of the same argument can be calculated in parallel using a single instruction. Notably for Intel-based machines, one can use the fsincos assembler instruction or the expi instruction (usually available from C as an intrinsic function), to calculate complex

exp(iz) = e^(iz) = cos z + i sin z,

and just separate the real and imaginary parts.

Note: To explicitly calculate the complex-polar form use the following substitutions in the general form,

Let r = √(−2 ln(u1)) and z = 2π u2. Then

r e^(iz) = √(−2 ln(u1)) e^(i 2π u2) = √(−2 ln(u1)) [cos(2π u2) + i sin(2π u2)].
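Standard C++ has no portable expi intrinsic, but std::polar expresses the same idea of computing r·e^(iz) once and reading off both deviates. A small sketch (this function name is not from the article, and it assumes u1 and u2 are already valid uniform samples in (0, 1)):

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <utility>

// Complex-exponential variant of the basic form: build r * e^{i 2 pi u2}
// as a complex number and take its real and imaginary parts as z0 and z1.
std::pair<double, double> box_muller_complex(double u1, double u2) {
    const double two_pi = 2.0 * std::acos(-1.0);
    double r = std::sqrt(-2.0 * std::log(u1));
    std::complex<double> w = std::polar(r, two_pi * u2);  // r * e^{i 2 pi u2}
    return {w.real(), w.imag()};
}
```

Whether this is faster than two separate trigonometric calls depends on how the platform's math library implements std::polar.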


The polar form requires 3/2 multiplications, 1/2 logarithm, 1/2 square root, and 1/2 division for each normal variate. The effect is to replace one multiplication and one trigonometric function with a single division and a conditional loop.

Tails truncation[edit]

When a computer is used to produce a uniform random variable it will inevitably have some inaccuracies because there is a lower bound on how close numbers can be to 0. If the generator uses 32 bits per output value, the smallest non-zero number that can be generated is 2⁻³². When U1 and U2 are equal to this, the Box–Muller transform produces a normal random variable equal to √(−2 ln(2⁻³²)) cos(2π · 2⁻³²) ≈ 6.66. This means that the algorithm will not produce random variables more than 6.66 standard deviations from the mean. This corresponds to a proportion of 2.74 × 10⁻¹¹ lost due to the truncation.
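The 6.66 bound can be checked numerically with a few lines of C++ (a small verification sketch, not part of the article):

```cpp
#include <cassert>
#include <cmath>

// Largest deviate a Box–Muller transform fed by a 32-bit generator can
// produce: plug the smallest non-zero uniform value 2^-32 into the
// basic form, z = sqrt(-2 ln U1) * cos(2 pi U2).
double max_deviate_32bit() {
    const double two_pi = 2.0 * std::acos(-1.0);
    double u_min = std::ldexp(1.0, -32);  // 2^-32
    return std::sqrt(-2.0 * std::log(u_min)) * std::cos(two_pi * u_min);
}
```

The cosine factor is essentially 1 here, so the bound is dominated by √(64 ln 2) ≈ 6.66.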

Implementation[edit]

The standard Box–Muller transform generates values from the standard normal distribution (i.e. standard normal deviates) with mean 0 and standard deviation 1. An implementation in standard C++ can generate values from any normal distribution with mean μ and variance σ²: if Z is a standard normal deviate, then X = Zσ + μ will have a normal distribution with mean μ and standard deviation σ. Note that if the underlying random number generator (such as rand) is never seeded, the same series of values will be returned from such a generateGaussianNoise function on every run.
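The article's original listing is not reproduced here; the following is a hedged reconstruction of a generateGaussianNoise function that scales and shifts the standard deviates, using <random> in place of rand so the generator can be seeded explicitly:

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <random>
#include <utility>

// Returns a pair of independent normal deviates with mean mu and
// standard deviation sigma, via the basic Box–Muller transform:
// X = Z * sigma + mu for each standard deviate Z.
std::pair<double, double> generateGaussianNoise(double mu, double sigma,
                                                std::mt19937& gen) {
    const double two_pi = 2.0 * std::acos(-1.0);
    std::uniform_real_distribution<double> runif(
        std::numeric_limits<double>::min(), 1.0);  // (0, 1], avoids log(0)
    double u1 = runif(gen);
    double u2 = runif(gen);
    double mag = sigma * std::sqrt(-2.0 * std::log(u1));
    return {mag * std::cos(two_pi * u2) + mu,
            mag * std::sin(two_pi * u2) + mu};
}
```

Seeding the std::mt19937 differently on each run avoids the repeated-sequence problem the note above describes for an unseeded rand.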

See also[edit]

  • Marsaglia polar method, a similar transform to Box–Muller which uses Cartesian coordinates instead of polar coordinates

References[edit]

  • Howes, Lee; Thomas, David (2008). GPU Gems 3 – Efficient Random Number Generation and Application Using CUDA. Pearson Education, Inc. ISBN 978-0-321-51526-1.
  1. ^ Box, G. E. P.; Muller, Mervin E. (1958). 'A Note on the Generation of Random Normal Deviates'. The Annals of Mathematical Statistics. 29 (2): 610–611. doi:10.1214/aoms/1177706645. JSTOR 2237361.
  2. ^ Raymond E. A. C. Paley and Norbert Wiener, Fourier Transforms in the Complex Domain, New York: American Mathematical Society (1934), §37.
  3. ^ Kloeden and Platen, Numerical Solutions of Stochastic Differential Equations, pp. 11–12.
  4. ^ Howes & Thomas 2008.
  5. ^ Martino, L.; Luengo, D.; Míguez, J. (2012). 'Efficient sampling from truncated bivariate Gaussians via Box–Muller transformation'. Electronics Letters. 48 (24): 1533–1534. CiteSeerX 10.1.1.716.8683. doi:10.1049/el.2012.2816.
  6. ^ Sheldon Ross, A First Course in Probability (2002), pp. 279–281.
  7. ^ Bell, James R. (1968). 'Algorithm 334: Normal random deviates'. Communications of the ACM. 11 (7): 498. doi:10.1145/363397.363547.
  8. ^ Knop, R. (1969). 'Remark on algorithm 334 [G5]: Normal random deviates'. Communications of the ACM. 12 (5): 281. doi:10.1145/362946.362996.
  9. ^ Everett F. Carter, Jr., The Generation and Application of Random Numbers, Forth Dimensions (1994), Vol. 16, No. 1 & 2.
  10. ^ Note that the evaluation of 2πU2 is counted as one multiplication because the value of 2π can be computed in advance and used repeatedly.

External links[edit]

  • Weisstein, Eric W. 'Box–Muller Transformation'. MathWorld.