[Numpy-discussion] bincount limitations

Skipper Seabold jsseabold@gmail....
Thu Jun 2 12:08:32 CDT 2011


On Wed, Jun 1, 2011 at 10:10 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
>> On Thu, Jun 2, 2011 at 1:49 AM, Mark Miller <markperrymiller@gmail.com> wrote:
>>> Not quite. Bincount is fine if you have a set of approximately
>>> sequential numbers. But if you don't....
>
>
> On 6/1/2011 9:35 PM, David Cournapeau wrote:
>> Even worse, it fails miserably if you have sequential numbers but with a high shift.
>> np.bincount([100000001, 100000002]) # will take a lot of memory
>> Doing bincount with dict is faster in those cases.
>
>
> Since this discussion has turned to shortcomings of bincount,
> may I ask why np.bincount([]) is not an empty array?
> Even more puzzling, why is np.bincount([],minlength=6)
> not a 6-array of zeros?
>

Just looks like it wasn't coded that way, but it's low-hanging fruit.
Any objections to adding this behavior? This commit should take care
of it. Tests pass. Comments welcome, as I'm just getting my feet wet
here.

https://github.com/jseabold/numpy/commit/133148880bba5fa3a11dfbb95cefb3da4f7970d5
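For anyone who wants to see the behavior being asked for without reading the commit, here is a rough pure-Python sketch of it. This is not the code in the commit above; the wrapper name bincount_allow_empty is made up for illustration, and the choice of np.intp as the result dtype for the empty case is an assumption.

```python
import numpy as np

def bincount_allow_empty(x, minlength=0):
    """Hypothetical wrapper: like np.bincount, but defined for empty input."""
    x = np.asarray(x)
    if x.size == 0:
        # Nothing to count: return `minlength` zero-filled bins
        # (an empty array when minlength is 0). dtype is an assumption.
        return np.zeros(minlength, dtype=np.intp)
    return np.bincount(x, minlength=minlength)

empty = bincount_allow_empty([])
padded = bincount_allow_empty([], minlength=6)
normal = bincount_allow_empty([0, 1, 1, 3])
```

With this, the empty call gives a length-0 array and the minlength=6 call gives six zeros, which is what Alan expected.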

Skipper
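On the earlier point in the thread about large offsets: a dict-based count along the lines David mentions might look like the sketch below. It uses collections.Counter from the standard library; the function name sparse_bincount is hypothetical, and the result is a plain dict keyed by value rather than a dense array, so memory scales with the number of distinct values instead of the largest one.

```python
from collections import Counter

def sparse_bincount(values):
    """Hypothetical helper: count occurrences without allocating
    one bin for every integer up to max(values)."""
    return dict(Counter(values))

counts = sparse_bincount([100000001, 100000002, 100000002])
```

Here counts holds two entries, whereas np.bincount on the same input would allocate roughly 100 million bins.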

