[Numpy-discussion] ANN: carray 0.3 released

Francesc Alted faltet@pytables....
Fri Dec 24 09:09:32 CST 2010


2010/12/24, Kevin Jacobs <jacobs@bioinformed.com> <bioinformed@gmail.com>:
> On Wed, Dec 22, 2010 at 1:58 PM, Francesc Alted <faltet@pytables.org> wrote:
>
>>  >>> %time b = ca.zeros(1e12)
>>  CPU times: user 54.76 s, sys: 0.03 s, total: 54.79 s
>>  Wall time: 55.23 s
>>
>
> I know this is somewhat missing the point of your demonstration, but 55
> seconds to create an empty 3 GB data structure to represent a multi-TB dense
> array doesn't seem all that fast to me.

Yes, the point of the demo was not speed but showing 64-bit
addressing (a feature that I implemented recently and was eager to
show).  But, agreed, I'm guilty of showing times, so your observation is
pertinent.  Mind, though, that I'm not creating an *empty* structure but a
*zeroed* one; that's a bit different (which does not mean that
the process cannot be sped up, but we all surely agree that there is
little sense in optimizing this scenario ;-).
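
To illustrate the empty/zeroed distinction, here is a minimal sketch.
It is not carray's actual code: zlib from the standard library stands in
for the real compressor, and the function name `zeros_compressed` and the
chunk size are made up for the example.  The idea is that a zeroed
container still has to produce and compress every chunk, so creation time
grows with the number of items even though the stored size stays tiny.

```python
import zlib

ITEMSIZE = 8  # bytes per item, assuming 64-bit floats


def zeros_compressed(n_items, chunk_items=1 << 16):
    """Return a list of zlib-compressed chunks that together hold n_items zeros.

    Hypothetical helper: every chunk is materialized and compressed,
    which is the work an *empty* (unallocated) structure would skip.
    """
    chunks = []
    remaining = n_items
    while remaining > 0:
        size = min(chunk_items, remaining)
        raw = bytes(size * ITEMSIZE)       # a buffer of zero bytes
        chunks.append(zlib.compress(raw))  # all-zero data compresses very well
        remaining -= size
    return chunks


n_items = 1_000_000
chunks = zeros_compressed(n_items)
raw_bytes = n_items * ITEMSIZE
stored_bytes = sum(len(c) for c in chunks)
print("chunks:", len(chunks))
print("raw bytes:", raw_bytes, "stored bytes:", stored_bytes)
```

The stored size is a small fraction of the raw size, but the loop still
touches every chunk, which is why zeroing a huge array takes time.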

>  Compression can do a lot of things,
> but isn't this a case where a true sparse data structure would be the right
> tool for the job?  I'm more interested in seeing what a carray can do with
> census data, web logs, or something vaguely real world where direct binary
> representations are used by default and assumed to be reasonably optimal
> (i.e., anything sensibly stored in sqlite tables).

Well, I'm just creating the tool; it is up to the users to find
real-world applications.  I'm pretty sure that some of you will find
some good ones.

Cheers!

-- 
Francesc Alted

