[Numpy-discussion] bincount limitations
Alan G Isaac
Wed Jun 1 21:10:02 CDT 2011
> On Thu, Jun 2, 2011 at 1:49 AM, Mark Miller<firstname.lastname@example.org> wrote:
>> Not quite. Bincount is fine if you have a set of approximately
>> sequential numbers. But if you don't....
On 6/1/2011 9:35 PM, David Cournapeau wrote:
> Even worse, it fails miserably if you have sequential numbers but with a high shift.
> np.bincount([100000001, 100000002]) # will take a lot of memory
> Doing bincount with a dict is faster in those cases.
Since this discussion has turned to the shortcomings of bincount,
may I ask why np.bincount([]) is not an empty array?
Even more puzzling, why is np.bincount([], minlength=6)
not a length-6 array of zeros?
Use case: bincount of infected individuals by number of contacts.
(In some periods there may be no infections.)
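A minimal sketch of a workaround for this use case, assuming all that is wanted is an array of zeros when there is nothing to count. The name bincount_allow_empty is hypothetical, not part of NumPy; the wrapper handles the empty case itself, so it does not depend on how the installed NumPy treats empty input.

```python
import numpy as np

def bincount_allow_empty(values, minlength=0):
    """Hypothetical wrapper: like np.bincount, but empty input gives zeros."""
    values = np.asarray(values, dtype=np.intp)
    if values.size == 0:
        # Nothing to count: return minlength zeros (an empty array if minlength is 0).
        return np.zeros(minlength, dtype=np.intp)
    return np.bincount(values, minlength=minlength)

# A period with no infections: six bins, all zero.
no_infections = bincount_allow_empty([], minlength=6)

# A period with three infected individuals having 1, 1 and 3 contacts.
some_infections = bincount_allow_empty([1, 1, 3], minlength=6)
```

Guarding the empty case in the wrapper keeps the calling code free of special cases for periods without infections.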
PS A collections.Counter works pretty nicely for Mark and David's cases,
aside from the fact that the keys are not sorted.
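A rough sketch of the Counter approach with the ordering issue handled by sorting the items afterwards. The function name sparse_bincount is made up for illustration; the result is a list of (value, count) pairs rather than a dense array, so memory use depends on the number of distinct values and not on their magnitude.

```python
from collections import Counter

def sparse_bincount(values):
    """Hypothetical helper: count occurrences, return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

# Large, shifted values do not force a huge dense array.
pairs = sparse_bincount([100000002, 100000001, 100000002])
for value, count in pairs:
    print(value, count)
```

An empty input simply gives an empty list here, so no special case is needed.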