[Numpy-discussion] np.bincount raises MemoryError when given an empty array

Ernest Adrogué eadrogue@gmx....
Tue Feb 2 05:22:21 CST 2010


 2/02/10 @ 00:01 (-0700), thus spake Charles R Harris:
> On Mon, Feb 1, 2010 at 10:57 PM, <josef.pktd@gmail.com> wrote:
> 
> > On Tue, Feb 2, 2010 at 12:31 AM, Charles R Harris
> > <charlesr.harris@gmail.com> wrote:
> > >
> > >
> > > On Mon, Feb 1, 2010 at 10:02 PM, <josef.pktd@gmail.com> wrote:
> > >>
> > >> On Mon, Feb 1, 2010 at 11:45 PM, Charles R Harris
> > >> <charlesr.harris@gmail.com> wrote:
> > >> >
> > >> >
> > >> > On Mon, Feb 1, 2010 at 9:36 PM, David Cournapeau <cournape@gmail.com>
> > >> > wrote:
> > >> >>
> > >> >> On Tue, Feb 2, 2010 at 1:05 PM,  <josef.pktd@gmail.com> wrote:
> > >> >>
> > >> >> > I think this could be considered as a correct answer, the count of
> > >> >> > any integer is zero.
> > >> >>
> > >> >> Maybe, but this shape is random - it would be different in different
> > >> >> conditions, as the length of the returned array is just some random
> > >> >> memory location.
> > >> >>
> > >> >> >
> > >> >> > Returning an array with one zero, or the empty array or raising an
> > >> >> > exception? I don't see much of a pattern
> > >> >>
> > >> >> Since there is no obvious solution, the only rationale for not raising
> > >> >> an exception I could see is to accommodate often-encountered special
> > >> >> cases. I find returning [0] more confusing than returning empty
> > >> >> arrays, though - maybe there is a usecase I don't know about.
> > >> >>
> > >> >
> > >> > In this case I would expect an empty input to be a programming error and
> > >> > raising an error to be the right thing.
> > >>
> > >> Not necessarily, if you run the bincount over groups in a dataset and
> > >> you're not sure if every group is actually observed. The main question
> > >> is whether the user needs or wants to check for empty groups before or
> > >> after the loop over bincount.
> > >>
> > >
> > > How would they know which bin to check? This seems like an unlikely way to
> > > check for an empty input.
> >
> > # grade (e.g. SAT) distribution by school and race
> > for s in schools:
> >     for r in race:
> >         # select the grades of students in school s with race r
> >         print(s, r, np.bincount(allstudentgrades[(sch == s) & (ra == r)]))
> >
> > allwhite schools and allblack schools raise an exception.
> >
> > I just made up the story, my first attempt was: all sectors, all
> > firmsize groups, bincount something, will have empty cells for some
> > size groups, e.g. nuclear power in family business.
> >
> >
> OK, point taken. What do you think would be the best thing to do?

In my opinion, returning an empty array makes more sense than
array([0]). An empty array means "there are no bins", whereas
an array of length 1 implies that there is one.
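
A minimal sketch of that behaviour, applied to the kind of grouped loop
discussed above. The helper name safe_bincount and the toy arrays are
made up for illustration; they are not part of NumPy. The empty case is
handled explicitly, so the sketch does not depend on what np.bincount
itself does with empty input in any given release.

```python
import numpy as np

def safe_bincount(values):
    """Hypothetical wrapper: return an empty array for empty input."""
    values = np.asarray(values, dtype=np.intp)
    if values.size == 0:
        # "there are no bins": length 0, not array([0])
        return np.zeros(0, dtype=np.intp)
    return np.bincount(values)

# Toy data (assumed): one grade, school id and race id per student.
grades = np.array([2, 3, 3, 1, 2])
school = np.array([0, 0, 1, 1, 1])
race = np.array([0, 0, 0, 1, 1])

for s in np.unique(school):
    for r in np.unique(race):
        mask = (school == s) & (race == r)
        # The group (school 0, race 1) is empty and yields an empty array.
        print(s, r, safe_bincount(grades[mask]))
```

With this convention the caller can test result.size == 0 after the
loop instead of guarding every call against empty groups.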

Cheers.

Ernest


