[Numpy-discussion] Random int64 and float64 numbers
josef.pktd@gmai...
Thu Nov 5 22:17:33 CST 2009
On Thu, Nov 5, 2009 at 10:28 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Thu, Nov 5, 2009 at 7:53 PM, <josef.pktd@gmail.com> wrote:
>>
>> On Thu, Nov 5, 2009 at 9:23 PM, Charles R Harris
>> <charlesr.harris@gmail.com> wrote:
>> >
>> >
>> > On Thu, Nov 5, 2009 at 7:04 PM, <josef.pktd@gmail.com> wrote:
>> >>
>> >> On Thu, Nov 5, 2009 at 6:36 PM, Charles R Harris
>> >> <charlesr.harris@gmail.com> wrote:
>> >> >
>> >> >
>> >> > On Thu, Nov 5, 2009 at 4:26 PM, David Warde-Farley
>> >> > <dwf@cs.toronto.edu>
>> >> > wrote:
>> >> >>
>> >> >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote:
>> >> >>
>> >> >> > Interesting thread, which leaves me wondering two things: is it
>> >> >> > documented
>> >> >> > somewhere (e.g., at the IEEE site) precisely how many *decimal*
>> >> >> > mantissae
>> >> >> > are representable using the 64-bit IEEE standard for float
>> >> >> > representation
>> >> >> > (if that makes sense);
>> >> >>
>> >> >> IEEE-754 says nothing about decimal representations aside from how
>> >> >> to
>> >> >> round when converting to and from strings. You have to
>> >> >> provide/accept
>> >> >> *at least* 9 decimal digits in the significand for single-precision
>> >> >> and 17 for double-precision (section 5.6). AFAIK implementations
>> >> >> will
>> >> >> vary in how they handle cases where a binary significand would yield
>> >> >> more digits than that.
>> >> >>
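
A minimal sketch of the 17-digit round trip for doubles, assuming CPython's correctly rounded float/string conversions. The test value 0.1 + 0.2 and the seed are arbitrary choices for illustration, not anything from the thread.

```python
import numpy as np

x = 0.1 + 0.2                 # a double that is not the nearest double to 0.3

s15 = "%.15g" % x             # 15 significant digits: too few to identify x
s17 = "%.17g" % x             # 17 significant digits: enough for any double

print(s15, float(s15) == x)
print(s17, float(s17) == x)

# The same round trip on a batch of random doubles (seed chosen arbitrarily).
rng = np.random.default_rng(0)
samples = rng.random(1000)
all_round_trip = all(float("%.17g" % v) == v for v in samples)
print(all_round_trip)
```

The 15-digit string collapses to "0.3", which parses to a different double, while the 17-digit string recovers x exactly.
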
>> >> >
>> >> > I believe that was the argument for the extended precision formats.
>> >> > The
>> >> > given number of decimal digits is sufficient to recover the same
>> >> > float
>> >> > that
>> >> > produced them if a slightly higher precision is used in the
>> >> > conversion.
>> >> >
>> >> > Chuck
>> >>
>> >> From the discussion for the floating point representation, it seems
>> >> that
>> >> a uniform random number generator would have a very coarse grid
>> >> in the range for example -1e30 to +1e30 compared to interval -0.5,0.5.
>> >>
>> >> How many points can be represented by a float in [-0.5,0.5] compared
>> >> to [1e30, 1e30+1.]?
>> >> If I interpret this correctly, then there are as many floating point
>> >> numbers
>> >> in [0,1] as in [1,inf), or am I misinterpreting this?
>> >>
>> >> So how does a PRNG handle a huge interval of uniform numbers?
>> >>
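
One way to check the counts asked about above: non-negative doubles sort in the same order as their 64-bit patterns read as unsigned integers, so the number of floats in an interval is a difference of bit patterns. This is only a sketch; the helper name bits is made up for the example.

```python
import math
import struct

import numpy as np


def bits(x):
    """Bit pattern of a double, read as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


n_zero_to_one = bits(1.0) - bits(0.0)        # doubles in [0, 1)
n_one_to_inf = bits(math.inf) - bits(1.0)    # doubles in [1, inf)
n_half = 2 * bits(0.5) + 1                   # doubles in [-0.5, 0.5], -0.0 and +0.0 counted once
n_big = bits(1e30 + 1.0) - bits(1e30) + 1    # doubles in [1e30, 1e30 + 1.0]
gap = np.spacing(1e30)                       # distance from 1e30 to the next double

print(n_zero_to_one, n_one_to_inf)
print(n_half, n_big, gap)
```

The two counts on the first line are nearly equal (1023 * 2**52 against 1024 * 2**52), so the reading in the question is essentially right. Near 1e30 the gap between neighbouring doubles is 2**47, so 1e30 + 1.0 rounds back to 1e30 and the interval [1e30, 1e30 + 1.0] holds a single double.
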
>> >
>> > There are several implementations, but the ones I'm familiar with reduce
>> > to
>> > scaling. If the rng produces random unsigned integers, then the range of
>> > integers is scaled to the interval [0,1). The variations involve
>> > explicit
>> > scaling (portable) or bit twiddling of the IEEE formats. In
>> > straightforward
>> > scaling some ranges of the random integers may not map 1-1, so the
>> > unused
>> > bits are masked off first; if you want doubles you only need 52 bits,
>> > etc.
>> > For bit twiddling there is an implicit 1 in the mantissa, so the basic
>> > range
>> > works out to [1,2), but that can be fixed by subtracting 1 from the
>> > result.
>> > Handling larger ranges than [0,1) just involves another scaling.
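
The two variants described above can be sketched as follows. This is an illustration only, not NumPy's actual implementation; the seed, the array size and the variable names are arbitrary, and the scaling variant here keeps 53 bits rather than 52, which is one common choice.

```python
import numpy as np

rng = np.random.default_rng(2009)  # seed chosen arbitrarily for the example
raw = rng.integers(0, np.iinfo(np.uint64).max, size=8,
                   dtype=np.uint64, endpoint=True)

# (a) Explicit scaling: mask off the low bits by shifting, keep the top
#     53 bits, and scale the integer into [0, 1) with a grid of 2**-53.
top53 = raw >> np.uint64(11)
u_scaled = top53.astype(np.float64) * 2.0**-53

# (b) Bit twiddling: drop 52 random bits into the mantissa field under the
#     exponent of 1.0, which gives a double in [1, 2), then subtract 1.
mant52 = raw >> np.uint64(12)
one_bits = np.uint64(0x3FF0000000000000)   # bit pattern of 1.0
u_twiddled = (mant52 | one_bits).view(np.float64) - 1.0

# A larger range is just another scaling of the [0, 1) result.
low, high = -1e307, 1e307
x = low + (high - low) * u_scaled

print(u_scaled)
print(u_twiddled)
print(x)
```

Variant (b) can only land on multiples of 2**-52, because the implicit leading 1 uses up one bit of the 53-bit significand.
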
>>
>> So, since this is then a discrete distribution, what is the number of
>> points
>> in the support? (My guess would be 2**52, but I don't know much about
>> numerical representations.)
>>
>> This would be the largest set of integers that could be generated
>> without gaps in the distribution, and would determine the grid size
>> for floating point random variables. (?)
>>
>
> Yes.
>
>>
>> For David's example:
>> low, high = -1e307, 1e307
>> np.random.uniform(low, high, 100) # much more reasonable
>>
>> this would imply a grid size of
>> >>> 2*1e307/2.**52
>> 4.4408920985006261e+291
>>
>> or something similar. (floating points are not very dense in the real
>> line.)
>>
>
> Yes, or rather, they are more or less logarithmically distributed. Floats
> are basically logarithms base 2 with a mantissa of fixed precision. Actual
> logarithm code uses the exponent and does a correction to the mantissa,
> i.e., takes the logarithm base two of numbers in the range [1,2).
>
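
The "floats are logarithms base 2 plus a mantissa" picture can be made concrete with math.frexp, which splits a double into mantissa and exponent. A small sketch; the value 1e30 is just an example, and frexp's [0.5, 1) mantissa is shifted to the [1, 2) convention used above.

```python
import math

x = 1e30
m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= m < 1

# Shift to the [1, 2) convention for the mantissa.
m2, e2 = 2.0 * m, e - 1     # x == m2 * 2**e2 with 1 <= m2 < 2

# log2(x) is the stored exponent plus a correction from the mantissa.
log2_x = e2 + math.log2(m2)

print(m2, e2)
print(log2_x, math.log2(x))
```
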
> Random integers are treated differently. To get random integers in a given
> range, say [0,100], the bitstream produced by the rng would be broken up
> into 7 bit chunks [0, 127] that are used in a sampling algorithm that loops
> until a value in the range is produced. So any range of integers can be
> produced. Code for that comes with the Mersenne Twister, but I don't know if
> numpy uses it.
>
> Chuck
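
The sampling loop described above for integers in [0, 100] could look roughly like this. It is a sketch of the general technique (rejection sampling on fixed-width bit chunks), not the code shipped with the Mersenne Twister or with numpy; the function name randint_rejection and the seed are made up for the example.

```python
import random


def randint_rejection(high, rng):
    """Uniform integer in [0, high], drawn by rejection on bit chunks."""
    if high == 0:
        return 0
    nbits = high.bit_length()          # high = 100 needs 7 bits, chunks in [0, 127]
    while True:
        chunk = rng.getrandbits(nbits)
        if chunk <= high:              # accept; otherwise loop and draw again
            return chunk


rng = random.Random(2009)              # seed chosen arbitrarily
draws = [randint_rejection(100, rng) for _ in range(20000)]
print(min(draws), max(draws), len(set(draws)))
```

Every accepted chunk is equally likely, so the result is uniform on [0, 100]. For high = 100 a chunk is accepted with probability 101/128, so the loop needs fewer than two draws on average.
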
Thanks for the explanation.
It's good to understand some of the limitations, but I don't think that in
any Monte Carlo I will have a distribution with tails fat enough for the
discretization to become a problem.
Josef
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>