[SciPy-User] rvs and broadcasting

Nathaniel Smith njs@pobox....
Fri Nov 25 16:07:04 CST 2011

On Wed, Nov 23, 2011 at 6:47 PM,  <josef.pktd@gmail.com> wrote:
> rvs in scipy.stats distributions has nasty broadcasting behavior if
> location or scale are arrays and size does not specify the same shape
> http://projects.scipy.org/scipy/ticket/1544
> (also https://groups.google.com/group/pystatsmodels/browse_thread/thread/e757d73b2a06b962?hl=en
> )
> I was playing with two solutions while I was writing a rvs for the
> truncated normal.
> 1) broadcast the shape parameters, loc and scale; if they are arrays,
> produce rvs in that shape, and if size is then neither that same shape
> nor 1, raise a ValueError
> essentially
>    lower, upper, loc, scale = np.broadcast_arrays(lower, upper, loc, scale)
>    if (np.size(lower) > 1) and (size != (1,)) and (lower.shape != size):
>        raise ValueError('Do you really want this? Then do it yourself.')
> 2) broadcast the shape parameters, loc and scale; for each of these,
> create random variables given by size. The return shape is essentially
> the broadcast shape concatenated with size, for example
> assert_equal(truncnorm_rvs(lower*np.arange(4)[:,None], upper,
>                                loc=np.arange(5), scale=1, size=(2,3)).shape,
>                     (4, 5, 2, 3))
> this version is attached.
> Any opinions about which version should be preferred?

I'm strongly in favor of option 2. The additional functionality is a
little bit tricky to understand, but not much, and I can easily
imagine cases where it'd be both useful and natural. And option 2 is
a strict superset of option 1: in option 1, the size= parameter is
useless when passing in parameter vectors, so one should just leave it
off in all cases. In option 2, you can still leave off the size=
parameter and get the same functionality; plus, you have the option of
getting additional useful functionality by specifying it.
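
To make the option 2 shape rule concrete, here is a rough sketch using a plain normal distribution instead of the truncated normal. The name normal_rvs_option2 and its rng argument are made up for illustration; this is not the attached truncnorm_rvs and not anything in scipy.stats, just one way the "broadcast shape concatenated with size" convention could be written.

```python
import numpy as np


def normal_rvs_option2(loc=0.0, scale=1.0, size=None, rng=None):
    """Hypothetical helper illustrating the 'option 2' convention.

    The result has shape broadcast(loc, scale).shape + size.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Broadcast the parameters against each other first.
    loc, scale = np.broadcast_arrays(loc, scale)

    # Normalize size to a tuple; leaving it off means "no extra axes".
    if size is None:
        size = ()
    elif np.isscalar(size):
        size = (size,)
    else:
        size = tuple(size)

    out_shape = loc.shape + size

    # Append one length-1 axis per entry of size so the parameters
    # broadcast against the trailing "size" axes of the draws.
    idx = (...,) + (np.newaxis,) * len(size)
    z = rng.standard_normal(out_shape)
    return loc[idx] + scale[idx] * z


loc = np.arange(4)[:, None]   # shape (4, 1)
scale = np.ones(5)            # shape (5,)

# size left off: one draw per broadcast parameter combination.
print(normal_rvs_option2(loc, scale).shape)

# size given: size draws for each parameter combination.
print(normal_rvs_option2(loc, scale, size=(2, 3)).shape)
```

With size left off, the result has the broadcast parameter shape, which is all that option 1 allows for array parameters; passing size only appends extra axes on the end.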

So that's my 2 cents...

-- Nathaniel
