[Numpy-discussion] np.bincount raises MemoryError when given an empty array

Charles R Harris charlesr.harris@gmail....
Mon Feb 1 23:31:40 CST 2010


On Mon, Feb 1, 2010 at 10:02 PM, <josef.pktd@gmail.com> wrote:

> On Mon, Feb 1, 2010 at 11:45 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> >
> >
> > On Mon, Feb 1, 2010 at 9:36 PM, David Cournapeau <cournape@gmail.com> wrote:
> >>
> >> On Tue, Feb 2, 2010 at 1:05 PM,  <josef.pktd@gmail.com> wrote:
> >>
> >> > I think this could be considered as a correct answer, the count of any
> >> > integer is zero.
> >>
> >> Maybe, but this shape is random - it would be different in different
> >> conditions, as the length of the returned array is just some random
> >> memory location.
> >>
> >> >
> >> > Returning an array with one zero, or the empty array or raising an
> >> > exception? I don't see much of a pattern
> >>
> >> Since there is no obvious solution, the only rationale for not raising
> >> an exception I could see is to accommodate often-encountered special
> >> cases. I find returning [0] more confusing than returning empty
> >> arrays, though - maybe there is a use case I don't know about.
> >>
> >
> > In this case I would expect an empty input to be a programming error and
> > raising an error to be the right thing.
>
> Not necessarily, if you run the bincount over groups in a dataset and
> you're not sure if every group is actually observed. The main question
> is whether the user needs or wants to check for empty groups before or
> after the loop over bincount.
>
>
How would they know which bin to check? This seems like an unlikely way to
check for an empty input.
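
To make the two options concrete, here is a minimal sketch of checking for empty groups before versus after the loop. The group names, the data, and `n_bins` are made up for illustration. It also assumes a NumPy release that has the `minlength` argument to `np.bincount`, which (to my knowledge) was added after this thread, so it only shows how the "all counts are zero" convention would behave.

```python
import numpy as np

# Hypothetical grouped data; group "b" has no observations.
groups = {
    "a": np.array([0, 2, 2], dtype=np.intp),
    "b": np.array([], dtype=np.intp),
    "c": np.array([1], dtype=np.intp),
}
n_bins = 3  # assumed to be known in advance

counts = {}
empty_before = []
for name, values in groups.items():
    # Option 1: detect the empty group before calling bincount.
    if values.size == 0:
        empty_before.append(name)
    # With minlength, an empty integer input yields n_bins zeros.
    counts[name] = np.bincount(values, minlength=n_bins)

# Option 2: detect it afterwards. No single bin needs to be inspected,
# because the counts sum to the number of inputs.
empty_after = [name for name, c in counts.items() if c.sum() == 0]
```

Both checks flag the same group here. The after-the-loop check only works if an empty input yields zeros rather than an error.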


> Like
> >>> np.sum([])
> 0.0
> >>> sum([])
> 0
> the empty array or the array([0]) can be considered as the default
> argument. In this case it is not really a programming error.
>
>
I like that better than an empty array.
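
One way to read the analogy with `np.sum([])`: zero counts behave like an additive identity for counting. The sketch below is only an illustration under the same assumption as before, namely a NumPy release with the `minlength` argument; the arrays and `n_bins` are made up.

```python
import numpy as np

n_bins = 4  # hypothetical fixed number of bins
a = np.array([0, 1, 1, 3], dtype=np.intp)
empty = np.array([], dtype=np.intp)

counts_a = np.bincount(a, minlength=n_bins)
counts_empty = np.bincount(empty, minlength=n_bins)

# Counting the concatenation should equal adding the separate counts.
# This holds only if the empty input contributes all zeros.
counts_joined = np.bincount(np.concatenate([a, empty]), minlength=n_bins)
identity_holds = np.array_equal(counts_joined, counts_a + counts_empty)

# The same pattern as the sums quoted above: an empty sum is zero.
empty_sum = np.sum([])
```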


> Since bincount usually returns redundant zero counts unless
> np.unique(data) == np.arange(data.max()+1),
> array([0]) would also make sense as a minimum answer
> >>> np.bincount([7,8,9])
> array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
>
> I use bincount quite a lot but only with fixed sized arrays, so I
> never actually used it in this way (yet).
>
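
For reference, a small sketch of the redundant leading zeros and two ways around them. The data is made up, and `return_counts` for `np.unique` is assumed to be available, which requires a NumPy release newer than the one discussed in this thread.

```python
import numpy as np

data = np.array([7, 8, 9, 9])  # hypothetical non-empty input

# Dense counts: bins 0..6 are never observed, so they come back as zeros.
dense = np.bincount(data)

# Alternative 1: count only the values that actually occur.
values, counts = np.unique(data, return_counts=True)

# Alternative 2: shift so that the smallest value lands in bin 0.
# Note that data.min() itself raises on an empty array, so this
# variant needs its own guard for the empty case.
shifted = np.bincount(data - data.min())
```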
>
Chuck