[IPython-User] ipython crash when trying to read 500M txt data in interactive mode

Chao YUE chaoyuejoy@gmail....
Tue Mar 20 08:17:10 CDT 2012


I'm working on a Linux server. The memory looks like this:

ychao@obelix2 - ...CRU_NEW - 29 >free -m
             total       used       free     shared    buffers     cached
Mem:          5850       5637        212          0          6       4589
-/+ buffers/cache:       1041       4809
Swap:        16383        386      15997

IPython gave no specific information on why it crashed.

My workaround for now is to use awk to split the file into ~1000 files,
one per year.
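
For reference, the same per-year split can be sketched in Python without
loading the whole file. This is only a sketch under assumptions: the
function name, the `<year>.csv` output naming, and the `batch_rows` default
are made up for illustration, and it assumes the file has the
lon,lat,year,area_burned header shown in the quoted message. Rows are
buffered and appended in batches so that only one output file is open at a
time, which avoids hitting the per-process open-file limit with ~1000 years.

```python
import csv
import os
from collections import defaultdict


def split_csv_by_year(src_path, out_dir, year_column="year", batch_rows=1_000_000):
    """Stream src_path and write each row to out_dir/<year>.csv.

    Hypothetical helper: names and defaults are illustrative only.
    Rows are buffered per year and flushed every batch_rows rows, so memory
    use stays bounded and only one output file is open at any moment.
    """
    os.makedirs(out_dir, exist_ok=True)
    seen_years = set()

    def flush(buffers, header):
        for year, rows in buffers.items():
            path = os.path.join(out_dir, f"{year}.csv")
            first_write = year not in seen_years
            # Create the file (with header) the first time, append afterwards.
            with open(path, "w" if first_write else "a", newline="") as out:
                writer = csv.writer(out, lineterminator="\n")
                if first_write:
                    writer.writerow(header)
                writer.writerows(rows)
            seen_years.add(year)
        buffers.clear()

    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        year_idx = header.index(year_column)
        buffers = defaultdict(list)
        buffered = 0
        for row in reader:
            buffers[row[year_idx]].append(row)
            buffered += 1
            if buffered >= batch_rows:
                flush(buffers, header)
                buffered = 0
        flush(buffers, header)  # write whatever is left

    return sorted(seen_years, key=int)
```

Calling `split_csv_by_year("data.txt", "by_year")` (both paths are
placeholders) would leave one small CSV per year in `by_year/`, each of
which should be easy to read with numpy afterwards.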

cheers,

Chao


2012/3/20 Wes McKinney <wesmckinn@gmail.com>

> On Tue, Mar 20, 2012 at 8:32 AM, Chao YUE <chaoyuejoy@gmail.com> wrote:
> > Dear all,
> >
> > I received a file from someone else which contains ~30 million lines
> > and is ~500M in size.
> > I tried to read it with numpy.genfromtxt in IPython interactive mode.
> > Then IPython crashed.
> > The data contains lat,lon,var1,year; the year ranges from 1001 to 2006.
> > Eventually I want to write the data to netCDF, one file per year, and
> > feed them into the model. I guess I need a better way to do this?
> > Any ideas would be highly appreciated.
> >
> >
> > lon,lat,year,area_burned
> > -180.0,65.0,1001,0
> > -180.0,65.0,1002,0
> > -180.0,65.0,1003,0
> > -180.0,65.0,1004,0
> > -180.0,65.0,1005,0
> > -180.0,65.0,1006,0
> > -180.0,65.0,1007,0
> >
> > thanks and cheers,
> >
> > Chao
> > --
> >
> > ***********************************************************************************
> > Chao YUE
> > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> > UMR 1572 CEA-CNRS-UVSQ
> > Batiment 712 - Pe 119
> > 91191 GIF Sur YVETTE Cedex
> > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
> >
> > ************************************************************************************
> >
> >
> > _______________________________________________
> > IPython-User mailing list
> > IPython-User@scipy.org
> > http://mail.scipy.org/mailman/listinfo/ipython-user
> >
>
> You're definitely going to have memory problems reading a file of that
> size (30 million rows x 4 columns = 120 million Python objects, 2-3 GB
> minimum on a 64-bit platform). I've been beating the drum about doing
> something about this, but it's still a ways off. So I suspect your
> machine is swapping, but I'm not sure why that is causing IPython to
> crash.
>
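
To make the arithmetic in the quoted reply concrete, here is a rough sketch
comparing the cost of 120 million separate Python float objects with a
packed NumPy record array. The column types (`f4`, `f4`, `i2`, `f4`) and
the `load_compact` helper are assumptions for illustration, not something
from the thread; in particular it assumes area_burned fits in a 32-bit
float and the year fits in a 16-bit integer. On 64-bit CPython a float
object is typically 24 bytes, which gives roughly 2.9 GB for the objects
alone, consistent with the 2-3 GB estimate, while the packed layout needs
14 bytes per row, or about 0.42 GB for 30 million rows.

```python
import sys

import numpy as np

N_ROWS = 30_000_000  # approximate row count from the original post
N_COLS = 4

# Cost of storing every value as its own Python float object. This excludes
# the list/tuple slots that point at the objects, so it is a lower bound.
per_object = sys.getsizeof(1.0)
as_objects_gb = N_ROWS * N_COLS * per_object / 1e9

# Assumed compact layout: two 32-bit floats, a 16-bit year, a 32-bit float.
record = np.dtype(
    [("lon", "f4"), ("lat", "f4"), ("year", "i2"), ("area_burned", "f4")]
)
as_array_gb = N_ROWS * record.itemsize / 1e9


def load_compact(path, dtype=record):
    """Read the CSV line by line into a preallocated record array.

    Hypothetical helper: two passes over the file (count rows, then fill),
    so peak memory is about the size of the final array.
    """
    with open(path) as f:
        n_rows = sum(1 for line in f if line.strip()) - 1  # minus header
    out = np.empty(n_rows, dtype=dtype)
    with open(path) as f:
        next(f)  # skip the header line
        i = 0
        for line in f:
            if not line.strip():
                continue  # ignore blank lines
            lon, lat, year, area = line.split(",")
            out[i] = (float(lon), float(lat), int(year), float(area))
            i += 1
    return out


print(f"as Python objects: ~{as_objects_gb:.2f} GB")
print(f"as packed records: ~{as_array_gb:.2f} GB")
```

The pure-Python loop will be slow on 30 million lines, but memory stays
close to the size of the final array, which should fit comfortably in the
memory shown by `free -m` above.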



-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************