[Numpy-discussion] Emulate left outer join?

David Carmean dlc@halibut....
Tue Feb 9 15:52:07 CST 2010


I've been working with numpy for less than a month, having learned about 
it after finding matplotlib.  My foundation in things like set theory is...
weak to nonexistent, so I need a little help mapping SQL-like thoughts into 
set-theory thinking :)

Some context to help me explain:  I'm trying to store, chart, and analyze 
Unix system performance data (sar/sadf output).  On a typical system I have 
about 75 fields/variables, all floats, with identical timestamps... or so 
we hope.   What I want to do in order to save memory/disk space is to stack 
the timeseries data all into three or four different arrays, and use a single 
timestamp field for each set.

My problem is: I don't know that I can guarantee that the shape of all the 
individual arrays will be identical along the time axis.  I may receive 
truncated text files to parse, or new variables may appear and disappear from 
the set being reported/recorded.

If these were in flat files or database tables, I'd do a left outer join between 
a master timestamp table and each individual variable's table.   But... I don't 
know the keywords to search for in the numpy docs/web chatter.  A thread from 
just about one year ago left the question hanging:


Examples? Pointers?  Shoves toward the correct sections of the docs?
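
For illustration, here is a minimal sketch of the kind of left outer join 
described above, not a definitive answer.  It assumes each variable arrives 
as a pair of 1-D arrays (timestamps and float values), that timestamps are 
unique within a variable, and that they compare exactly (e.g. integer epoch 
seconds).  The function name left_join_on_time and the sample data are made 
up for the example; only np.argsort, np.searchsorted, np.full and 
np.column_stack are real NumPy calls.

```python
import numpy as np

def left_join_on_time(master_ts, ts, values):
    """Align `values` (sampled at `ts`) to `master_ts`.

    Hypothetical helper: returns one float per master timestamp,
    with NaN wherever the variable has no sample at that time.
    """
    master_ts = np.asarray(master_ts)
    ts = np.asarray(ts)
    values = np.asarray(values, dtype=float)

    # Start with "no match" everywhere, like NULLs in a left outer join.
    out = np.full(master_ts.shape, np.nan)
    if ts.size == 0:
        return out

    # searchsorted needs the right-hand side sorted by timestamp.
    order = np.argsort(ts)
    ts_sorted = ts[order]
    vals_sorted = values[order]

    # Position where each master timestamp would sit in the sorted series.
    pos = np.searchsorted(ts_sorted, master_ts)
    # Clip so that timestamps past the end still index safely.
    pos = np.minimum(pos, ts_sorted.size - 1)

    # A master timestamp matches only if the sample found there is equal.
    matched = ts_sorted[pos] == master_ts
    out[matched] = vals_sorted[pos[matched]]
    return out

# Made-up sample data: one master time axis, two variables with gaps.
master = np.array([0, 60, 120, 180, 240])
cpu_ts = np.array([0, 60, 120])          # truncated file
cpu = np.array([1.5, 2.0, 2.5])
mem_ts = np.array([60, 120, 180, 240])   # variable appeared late
mem = np.array([10.0, 11.0, 12.0, 13.0])

# One column per variable, one row per master timestamp.
stacked = np.column_stack([
    left_join_on_time(master, cpu_ts, cpu),
    left_join_on_time(master, mem_ts, mem),
])
```

With this layout the single master array serves as the timestamp field for 
the whole stack, and gaps show up as NaN, which np.isnan can mask out before 
charting or averaging.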

