[IPython-User] ipython crash when trying to read 500M txt data in interactive mode

Wes McKinney wesmckinn@gmail....
Tue Mar 20 10:29:39 CDT 2012


On Tue, Mar 20, 2012 at 9:17 AM, Chao YUE <chaoyuejoy@gmail.com> wrote:
> I'm working on a Linux server. The memory looks like this:
>
> ychao@obelix2 - ...CRU_NEW - 29 >free -m
>              total       used       free     shared    buffers     cached
> Mem:          5850       5637        212          0          6       4589
> -/+ buffers/cache:       1041       4809
> Swap:        16383        386      15997
>
> There is no specific information about why IPython crashed.
>
> My workaround for now is to use awk to split the file into ~1000 files, one
> per year.
>
> cheers,
>
> Chao
>
>
>
> 2012/3/20 Wes McKinney <wesmckinn@gmail.com>
>>
>> On Tue, Mar 20, 2012 at 8:32 AM, Chao YUE <chaoyuejoy@gmail.com> wrote:
>> > Dear all,
>> >
>> > I received a file from someone else that contains ~30 million lines and
>> > is ~500 MB in size.
>> > I tried to read it with numpy.genfromtxt in IPython interactive mode,
>> > and IPython crashed.
>> > The data contains lat, lon, var1, year; the year ranges from 1001 to 2006.
>> > Eventually I want to write the data to netCDF, one file per year, and
>> > feed those files into the model. I guess I need a better way to do this?
>> > Any ideas would be highly appreciated.
>> >
>> >
>> > lon,lat,year,area_burned
>> > -180.0,65.0,1001,0
>> > -180.0,65.0,1002,0
>> > -180.0,65.0,1003,0
>> > -180.0,65.0,1004,0
>> > -180.0,65.0,1005,0
>> > -180.0,65.0,1006,0
>> > -180.0,65.0,1007,0
>> >
>> > thanks and cheers,
>> >
>> > Chao
>> > --
>> >
>> > ***********************************************************************************
>> > Chao YUE
>> > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
>> > UMR 1572 CEA-CNRS-UVSQ
>> > Batiment 712 - Pe 119
>> > 91191 GIF Sur YVETTE Cedex
>> > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>> >
>> > ************************************************************************************
>> >
>> >
>> > _______________________________________________
>> > IPython-User mailing list
>> > IPython-User@scipy.org
>> > http://mail.scipy.org/mailman/listinfo/ipython-user
>> >
>>
>> You're definitely going to have memory problems reading a file of that
>> size (30MM rows x 4 columns = 120MM Python objects, 2-3 GB minimum on
>> a 64-bit platform). I've been beating the drum about doing something
>> about this, but it's still a ways off. So I suspect your machine is
>> swapping, but I'm not sure why that is causing IPython to crash.
>

Well, you don't have very much memory free. The crash might be caused by
memory errors inside IPython itself.

- Wes
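
For reference, the per-year split mentioned earlier in the thread (done there
with awk) could also be done in Python without loading the whole file. The
following is only a minimal sketch using the standard csv module, not what
anyone in the thread actually ran. The function name split_by_year, the
batch_rows parameter, and the "<year>.csv" output naming are all hypothetical;
it assumes the file has a header row containing a "year" column, as in the
sample above.

```python
import csv
import os
from collections import defaultdict


def split_by_year(src_path, out_dir, batch_rows=1_000_000):
    """Stream a CSV with a 'year' column into one CSV file per year.

    At most `batch_rows` rows are held in memory at a time, and only one
    output file is open at any moment.
    """
    os.makedirs(out_dir, exist_ok=True)
    seen_years = set()

    def flush(batch, header):
        # Write each year's buffered rows; create the file on first use,
        # append to it afterwards.
        for year, rows in batch.items():
            path = os.path.join(out_dir, f"{year}.csv")  # hypothetical naming
            is_new = year not in seen_years
            with open(path, "w" if is_new else "a", newline="") as out:
                writer = csv.writer(out)
                if is_new:
                    writer.writerow(header)
                writer.writerows(rows)
            seen_years.add(year)

    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        year_col = header.index("year")
        batch = defaultdict(list)
        pending = 0
        for row in reader:
            if not row:          # skip blank lines
                continue
            batch[row[year_col]].append(row)
            pending += 1
            if pending >= batch_rows:
                flush(batch, header)
                batch.clear()
                pending = 0
        flush(batch, header)     # write whatever is left over

    return sorted(seen_years, key=int)
```

Buffering rows per batch and reopening each output file in append mode avoids
keeping ~1000 file handles open at once, at the cost of some extra open/close
calls. Each per-year file is then small enough to load and convert on its own.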

