[Numpy-discussion] Numpy 2D array from a list error
Dave Wood
davejwood@gmail....
Wed Sep 23 11:56:20 CDT 2009
Apologies for the multiple posts, people. My posting to the forum was
pending for a long time, so I deleted it and tried emailing directly. I
didn't think they'd all be sent out.
Gökhan, thanks for the reply, I hope you get this one.
"Here I use loadtxt to read ~89 MB txt file. Can you use loadtxt and share
your results?
I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float',
skiprows=83).T
I[15]: len data
-----> len(data)
O[15]: 66
I[16]: len data[0]
-----> len(data[0])
O[16]: 117040
I[17]: whos
Variable Type Data/Info
--------------------------------
data ndarray 66x117040: 7724640 elems, type `float64`, 61797120
bytes (58 Mb)
[gsever@ccn various]$ python sysinfo.py
================================================================================
Platform :
Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas
Python : ('CPython', 'tags/r26', '66714')
IPython : 0.10
NumPy : 1.4.0.dev
Matplotlib : 1.0.svn
================================================================================
--
Gökhan"
I tried using loadtxt and got the same error as before (with a little more
information).
"
Traceback (most recent call last):
File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 140,
in <module>
main()
File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 45,
in main
data = loadtxt("inputfile.txt",dtype='string')
File
"/apps/python/2.5.4/rhel4/lib/python2.5/site-packages/numpy/lib/io.py", line
505, in loadtxt
X = np.array(X, dtype)
ValueError: setting an array element with a sequence
"
@Christopher Barker
Thanks for the information. To fix my problem, I tried taking out the row
names (leaving only numerical information), and converting the 2D list to
floats. I still had the same problem.
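
One common cause of "setting an array element with a sequence" (not confirmed for this particular file) is that some rows have a different number of fields than the rest, so the list of rows cannot form a rectangular 2D array. A minimal sketch for checking that before building the array follows; the helper names field_counts and odd_lines and the sample rows are made up for illustration, and whitespace-delimited input is assumed.

```python
from collections import Counter


def field_counts(lines, delimiter=None):
    """Map number-of-fields -> how many non-blank lines have that many fields."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[len(line.split(delimiter))] += 1
    return counts


def odd_lines(lines, delimiter=None):
    """Return 1-based numbers of lines whose field count differs from the most common one.

    `lines` must be a list (it is scanned twice), e.g. open(path).readlines().
    """
    counts = field_counts(lines, delimiter)
    if not counts:
        return []
    expected = counts.most_common(1)[0][0]
    return [
        number
        for number, line in enumerate(lines, 1)
        if line.strip() and len(line.split(delimiter)) != expected
    ]


# Hypothetical sample: the third row is one field short.
sample = ["a 1 2 3", "b 4 5 6", "c 7 8"]
print(field_counts(sample))
print(odd_lines(sample))
```

If odd_lines returns anything for the real file, those are the rows to inspect first; if it returns an empty list, the cause is probably something else.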
On 9/23/09, Christopher Barker <Chris.Barker@noaa.gov> wrote:
>
> Dave Wood wrote:
> > Well, I suppose they are all considered to be strings here. I haven't
> > tried to convert the numbers to floats yet.
>
> This could be an issue. For strings, numpy creates an array of strings,
> all of the same length, so each element is as big as the largest one:
>
> In [13]: l
> Out[13]: ['5', '34', 'this is a much longer string']
>
> In [14]: np.array(l)
> Out[14]:
> array(['5', '34', 'this is a much longer string'],
> dtype='|S28')
>
>
> Note that each element is 28 bytes (that's what the S28 means).
>
> this means that your array would be much larger than the text file if
> you have even one long string in it. Also, as mentioned in this thread,
> in order to figure out how big to make each string element, the array()
> constructor has to scan through your entire list first, and I don't know
> how much intermediate memory it may use in that process.
>
> This really isn't how numpy is meant to be used -- why would you want a
> big ol' array of mixed numbers and strings, all stored as strings?
>
> structured arrays were meant for this, and np.loadtxt() is the easiest
> way to get one.
>
> > I just tried preallocating the array and updating it one line at a time,
> > and that works fine.
>
> what dtype do you end up with?
>
> > This doesn't seem like the expected behaviour though and the error
> > message seems wrong.
>
> yes, not a good error message at all -- it's hard to make sure good
> errors get triggered every time!
>
>
> HTH,
>
> -Chris
>
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chris.Barker@noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
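
To make Chris's two points concrete, here is a small sketch, not the code from my script. The first part shows that a string array pads every element to the longest one; the second reads mixed text and numbers into a structured array with np.loadtxt. The field names ('name', 'x', 'y'), the 10-character width and the in-memory sample data are assumptions for illustration only.

```python
import io

import numpy as np

# Part 1: fixed-width string arrays.
# Byte strings are used so the dtype matches the S28 shown in the thread.
words = [b'5', b'34', b'this is a much longer string']
arr = np.array(words)
# itemsize is the bytes reserved per element, set by the longest string;
# nbytes is itemsize times the number of elements.
print(arr.dtype, arr.itemsize, arr.nbytes)

# Part 2: a structured array keeps row names as text and the rest as floats.
# The StringIO object stands in for a real file on disk.
text = io.StringIO("rowA 1.0 2.5\nrowB 3.0 4.5\n")
row_dtype = np.dtype([('name', 'U10'), ('x', 'f8'), ('y', 'f8')])
table = np.loadtxt(text, dtype=row_dtype)

print(table['name'])  # the text column
print(table['x'])     # a float64 column, usable in arithmetic directly
```

With a structured dtype the numeric columns are stored as 8-byte floats instead of padded strings, so one long row name no longer inflates every cell in the array.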