mysql -> record array

Tim Hochberg tim.hochberg at ieee.org
Tue Nov 14 22:55:56 CST 2006


Tim Hochberg wrote:
> [CHOP]
>
> The timings of these are pretty consistent with each other and with the 
> previous runs, except that the difference between retrieve1 and retrieve2 
> has disappeared. In fact, all of the runs that produce lists have gotten 
> faster by about the same amount. Odd! A little digging reveals that 
> timeit turns off garbage collection to make things more repeatable. 
> Turning gc back on yields the following numbers for repeat(3,1):
>
>     retrieve1 [0.92517736192728406, 0.92109667569481601, 0.92390960303614023]
>     retrieve2 [1.3018456256311914, 1.2277141368525903, 1.2929785768861706]
>     retrieve3 [1.5309831277438946, 1.4998853206203577, 1.5601200711263488]
>     retrieve4 [8.6400394463542227, 8.7022300320292061, 8.6807761880350682]
>
> So there we are, back to our original numbers. This also reveals that 
> the majority of the time difference between retrieve1 and retrieve2 *is* 
> memory related. However, it's the deallocation (or more precisely 
> garbage collection) of all those floats that is the killer. 
I just realized that this sounds somewhat misleading. In both cases a 
million floats are allocated and deallocated. However, in retrieve1 only 
two of those million are alive at any one time, so Python just keeps 
reusing the same two chunks of memory for all 500,000 pairs (ditto for 
the 500,000 tuples that are created). In the other cases, all million 
floats are alive at once, which requires much more memory and possibly 
swapping to disk. Unsurprisingly, those cases are slower, but the 
details aren't clear. In particular, why is it the deallocation that is slow?
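
For anyone who wants to reproduce the gc-on versus gc-off comparison without a database, here is a rough sketch. The names build_pairs and time_build are made up for illustration, and build_pairs is only a stand-in for a list-producing function like retrieve2, not the actual code. It relies on the documented timeit behaviour that gc is disabled during timing unless the setup statement re-enables it. It assumes a modern Python 3.

```python
import timeit

def build_pairs(n=100_000):
    # Hypothetical stand-in for a list-producing retrieve function:
    # every tuple of floats stays alive until the list is dropped.
    return [(float(i), float(i) + 0.5) for i in range(n)]

def time_build(enable_gc, repeat=3):
    # timeit disables garbage collection while timing; running
    # gc.enable() in the setup statement turns it back on.
    setup = "import gc; gc.enable()" if enable_gc else "pass"
    timer = timeit.Timer(build_pairs, setup=setup)
    # number=1 mirrors the repeat(3,1) calls quoted above.
    return timer.repeat(repeat=repeat, number=1)

if __name__ == "__main__":
    print("gc off:", time_build(enable_gc=False))
    print("gc on: ", time_build(enable_gc=True))
```

The absolute numbers will depend on the machine and the Python version, so treat any gap between the two runs as indicative only.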

Another mystery is why gc matters at all. None of the obvious actors are 
involved in cycles, so they would normally go away through reference 
counting even with gc turned off. My rather uninformed guess is that the 
cursor or the connection holds onto the list (caching it for later, 
perhaps) and that the cursor/connection is involved in some sort of cycle. 
This would keep the list alive until the gc ran.
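
To illustrate the guess above (and only the guess, since I have not checked what the real driver does), here is a minimal sketch. Cursor, Connection and Row are hypothetical classes invented for the example, not the MySQLdb ones. It shows that in CPython a list held by an object that sits in a reference cycle survives `del` and is only freed once the cycle collector runs.

```python
import gc
import weakref

class Row(list):
    """Plain list subclass, used only so the list can be weakly referenced."""

class Cursor:
    """Hypothetical cursor that caches the last result set."""
    def __init__(self):
        self.connection = None
        self.cached_rows = None

class Connection:
    """Hypothetical connection; it and its cursor refer to each other."""
    def __init__(self):
        self.cursor = Cursor()
        self.cursor.connection = self  # creates the reference cycle

def rows_survive_without_gc():
    gc_was_enabled = gc.isenabled()
    gc.disable()  # mimic what timeit does while timing
    try:
        conn = Connection()
        rows = Row((float(i), float(i)) for i in range(1000))
        conn.cursor.cached_rows = rows
        probe = weakref.ref(rows)
        del rows, conn  # refcounts never reach zero because of the cycle
        alive_before = probe() is not None
        gc.collect()  # the cycle collector finally frees everything
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if gc_was_enabled:
            gc.enable()

if __name__ == "__main__":
    print(rows_survive_without_gc())
```

If the cycle is removed (for instance by not setting cursor.connection), reference counting alone frees the list at the `del`, which is the behaviour one would expect when no cycle is involved.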

-tim


[CHOP]








More information about the Numpy-discussion mailing list