[Numpy-discussion] bincount limitations

Alan G Isaac alan.isaac@gmail....
Wed Jun 1 21:10:02 CDT 2011


> On Thu, Jun 2, 2011 at 1:49 AM, Mark Miller<markperrymiller@gmail.com>  wrote:
>> Not quite. Bincount is fine if you have a set of approximately
>> sequential numbers. But if you don't....


On 6/1/2011 9:35 PM, David Cournapeau wrote:
> Even worse, it fails miserably if you have sequential numbers but with a high shift.
> np.bincount([100000001, 100000002]) # will take a lot of memory
> Doing bincount with dict is faster in those cases.
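
To illustrate the dict approach mentioned above: a minimal sketch, assuming the inputs are hashable integers. The helper name dict_bincount is made up for this example and is not part of NumPy. Memory use grows with the number of distinct values rather than with the largest value.

```python
from collections import defaultdict

def dict_bincount(values):
    """Count occurrences of each value using a dict (hypothetical helper)."""
    counts = defaultdict(int)
    for v in values:
        counts[v] += 1  # one entry per distinct value, regardless of magnitude
    return dict(counts)

counts = dict_bincount([100000001, 100000002, 100000001])
```

Here the result holds two entries, whereas np.bincount on the same input would allocate an array with more than 100 million slots.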


Since this discussion has turned to shortcomings of bincount,
may I ask why np.bincount([]) is not an empty array?
Even more puzzling, why is np.bincount([],minlength=6)
not a 6-array of zeros?

Use case: bincount of infected individuals by number of contacts.
(In some periods there may be no infections.)
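
A possible workaround for the empty-input case, sketched under the assumption that the desired result is an array of zeros of length minlength. The wrapper name safe_bincount is hypothetical and not a NumPy function. Note that the cast to an integer dtype would silently truncate non-integer input.

```python
import numpy as np

def safe_bincount(values, minlength=0):
    """bincount that returns zeros for empty input (hypothetical wrapper)."""
    values = np.asarray(values, dtype=np.intp)
    if values.size == 0:
        # No observations: every bin has a count of zero.
        return np.zeros(minlength, dtype=np.intp)
    return np.bincount(values, minlength=minlength)

# A period with no infections still yields one count per contact number.
no_infections = safe_bincount([], minlength=6)
some_infections = safe_bincount([1, 1, 3], minlength=6)
```

Handling the empty case explicitly keeps the result shape the same across periods, whatever the installed NumPy does for empty input.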

Thank you,
Alan Isaac

PS A collections.Counter works pretty nicely for Mark and David's cases,
aside from the fact that the keys are not sorted.
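
A short sketch of the Counter approach, with the keys sorted afterwards. The helper name sorted_counts is made up for illustration.

```python
from collections import Counter

def sorted_counts(values):
    """Return (value, count) pairs ordered by value (hypothetical helper)."""
    return sorted(Counter(values).items())

pairs = sorted_counts([100000002, 100000001, 100000002])
```

Sorting the items costs O(k log k) for k distinct values, which is small when the values are sparse.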






