[Numpy-discussion] Help using numPy to create a very large multi dimensional array
Bruno Santos
bacmsantos@gmail....
Wed Apr 18 06:04:37 CDT 2007
I was finally able to read the data by using the command you suggested, with
some small changes:
matrix = numpy.array([[float(x) for x in line.split()[1:]] for line in
vecfile])
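For anyone reproducing this, here is a minimal, self-contained sketch of that reading step. It assumes the file layout quoted further down in this message (a '#' header line, then a file name followed by numeric columns). The function name read_matrix is hypothetical and not part of NumPy.

```python
import numpy as np

def read_matrix(lines):
    """Build a 2-D float array from lines shaped like 'name v1 v2 ... vN'."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '# num rows=... num columns=...' header.
        if not line or line.startswith("#"):
            continue
        # Drop the leading file name and convert the remaining fields.
        rows.append([float(x) for x in line.split()[1:]])
    return np.array(rows)
```

It accepts any iterable of lines, so an open file object can be passed in directly.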
But that doesn't solve my speed problem: the slow step, which used to take
40 seconds, now takes 1 minute and 10 seconds :(
The slow step is this loop:
for j in range(0, clust):
    list_j = numpy.asarray(matrix[j])
    for k in range(j+1, clust):
        list_k = numpy.asarray(matrix[k])
        dist = 0
        for e in range(0, columns):
            result = list_j[e] - list_k[e]
            dist += result * result
        if (dist < min):
            ind[0] = j
            ind[1] = k
            min = dist
I also tried list_j = numpy.array instead, but that only slowed the
calculation down even more.
Does anyone have any idea how I can speed up this step?
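One possible direction, as a sketch rather than a definitive answer: the innermost loops over k and e can be replaced by whole-array operations, so that only the loop over j stays in Python. The sketch assumes matrix is a 2-D float array with one row per document; the function name closest_pair is hypothetical. It keeps the strict "<" comparison of the original loop, so on ties the first pair found wins.

```python
import numpy as np

def closest_pair(matrix):
    """Return (j, k, squared_distance) for the two closest rows, with j < k."""
    matrix = np.asarray(matrix, dtype=float)
    best_dist, best_j, best_k = np.inf, -1, -1
    for j in range(len(matrix) - 1):
        # Differences between row j and every later row, via broadcasting.
        diff = matrix[j + 1:] - matrix[j]
        # Squared Euclidean distance to each later row.
        d2 = (diff * diff).sum(axis=1)
        k = int(np.argmin(d2))
        if d2[k] < best_dist:
            best_dist, best_j, best_k = float(d2[k]), j, j + 1 + k
    return best_j, best_k, best_dist
```

With the array from above, j, k, dist = closest_pair(matrix) would give the same pair that the loop stores in ind[0] and ind[1].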
2007/4/18, Christian K. <ckkart@hoc.net>:
>
> Bruno Santos wrote:
> > I tried to use the expression as you said, but I'm not getting the desired
> > result.
> > My text file looks like this:
> >
> > # num rows=115 num columns=2634
> > AbassiM.txt 0.033023 0.033023 0.033023 0.165115 0.462321....0.000000
> > AgricoleW.txt 0.038691 0.038691 0.038691 0.232147 0.541676....0.215300
> > AliR.txt 0.041885 0.041885 0.041885 0.125656 0.586395....0.633580
> > .....
> > ....
> > ....
> > ZhangJ.txt 0.047189 0.047189 0.047189 0.155048 0.613452....0.000000
>
> I guess N.fromfile can't handle non numeric data. Use something like
> this instead (not tested):
>
> import numpy as N
>
> data = open('name of file').readlines()
>
> data = N.array([[float(x) for x in row.split(' ')[1:]] for row in
> data[1:]])
>
> (the above expression should be one line)
>
> Christian