[Numpy-discussion] bincount limitations
Thu Jun 2 12:11:48 CDT 2011
On Thu, Jun 2, 2011 at 12:08, Skipper Seabold <email@example.com> wrote:
> On Wed, Jun 1, 2011 at 10:10 PM, Alan G Isaac <firstname.lastname@example.org> wrote:
>>> On Thu, Jun 2, 2011 at 1:49 AM, Mark Miller<email@example.com> wrote:
>>>> Not quite. Bincount is fine if you have a set of approximately
>>>> sequential numbers. But if you don't....
>> On 6/1/2011 9:35 PM, David Cournapeau wrote:
>>> Even worse, it fails miserably if you have sequential numbers but with a high shift.
>>> np.bincount([100000001, 100000002]) # will take a lot of memory
>>> Doing bincount with dict is faster in those cases.
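To illustrate the point about shifted values, here is a minimal sketch of counting without allocating one bin per integer up to the maximum. It assumes a NumPy recent enough that np.unique accepts return_counts; the names values, dict_counts, and offset are made up for this example, and the offset trick is only one possible workaround, not something proposed in the thread.

```python
from collections import Counter

import numpy as np

# Sequential values with a high shift (illustrative data).
values = np.array([100000001, 100000002, 100000002])

# Dict-based counting: memory grows with the number of distinct values,
# not with the largest value.
dict_counts = Counter(values.tolist())

# np.unique returns the distinct values and their counts as two small arrays.
uniq, counts = np.unique(values, return_counts=True)

# If the values are dense but offset, subtracting the minimum keeps
# bincount's output short; index i then corresponds to the value offset + i.
offset = values.min()
shifted_counts = np.bincount(values - offset)
```

By contrast, calling np.bincount(values) directly would allocate an output with values.max() + 1 entries.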
>> Since this discussion has turned to shortcomings of bincount,
>> may I ask why np.bincount([]) is not an empty array?
>> Even more puzzling, why is np.bincount([], minlength=6)
>> not a 6-array of zeros?
> Just looks like it wasn't coded that way, but it's low-hanging fruit.
> Any objections to adding this behavior? This commit should take care
> of it. Tests pass. Comments welcome, as I'm just getting my feet wet
I would use np.zeros(5, dtype=int) in test_empty_with_minlength(), but
otherwise, it looks good.
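
For concreteness, a rough sketch of what such tests might look like. The commit itself is not reproduced here, so the function bodies are an assumption; only the name test_empty_with_minlength() and the np.zeros(5, dtype=int) expectation come from this message. The sketch also assumes a NumPy in which the proposed empty-input behavior is present.

```python
import numpy as np
from numpy.testing import assert_array_equal


def test_empty():
    # Hypothetical companion test: an empty input should give an empty result.
    x = np.array([], dtype=int)
    y = np.bincount(x)
    assert_array_equal(y, np.array([], dtype=int))


def test_empty_with_minlength():
    # An empty input with minlength=5 should give five zero counts.
    x = np.array([], dtype=int)
    y = np.bincount(x, minlength=5)
    assert_array_equal(y, np.zeros(5, dtype=int))
```

Spelling out the expected array with np.zeros(5, dtype=int) makes the intended length and integer dtype explicit in the test.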
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco