[Numpy-discussion] Emulate left outer join?
Tue Feb 9 15:52:07 CST 2010
I've been working with numpy for less than a month, having learned about
it after finding matplotlib. My foundation in things like set theory is...
weak to nonexistent, so I need a little help mapping SQL-like thoughts into
set-theory thinking :)
Some context to help me explain: I'm trying to store, chart, and analyze
Unix system performance data (sar/sadf output). On a typical system I have
about 75 fields/variables, all floats, with identical timestamps... or so
we hope. What I want to do in order to save memory/disk space is to stack
the timeseries data all into three or four different arrays, and use a single
timestamp field for each set.
My problem is: I can't guarantee that all the individual arrays will have
the same length along the time axis. I may receive
truncated textfiles to parse, or new variables may appear and disappear from
the set being reported/recorded.
If these were in flat files or database tables, I'd do a left outer join between
a master timestamp table and each individual variable's table. But... I don't
know the keywords to search for in the numpy docs/web chatter. A thread from
just about one year ago left the question hanging:
Examples? Pointers? Shoves toward the correct sections of the docs?
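
To make the question concrete, here is a minimal sketch of the kind of left
outer join described above, aligning one variable's samples onto a master
timestamp array. The names (left_outer_join, master_ts, var_ts, var_values)
are hypothetical, the use of NaN to mark missing samples is an assumption,
and the sketch assumes timestamps are unique within each variable and
compare exactly equal when they match. It is one possible approach built on
np.searchsorted, not necessarily the recommended one.

```python
import numpy as np

def left_outer_join(master_ts, var_ts, var_values, fill=np.nan):
    """Align var_values onto master_ts; unmatched rows get `fill`.

    Hypothetical helper: assumes var_ts has no duplicate timestamps.
    """
    master_ts = np.asarray(master_ts)
    var_ts = np.asarray(var_ts)
    var_values = np.asarray(var_values, dtype=float)

    # Start with every master row marked as missing.
    out = np.full(master_ts.shape, fill, dtype=float)
    if var_ts.size == 0:
        return out

    # Sort the variable's timestamps so searchsorted can be used.
    order = np.argsort(var_ts)
    sorted_ts = var_ts[order]

    # For each master timestamp, find where it would sit in sorted_ts,
    # then clip so every position is a valid index.
    pos = np.searchsorted(sorted_ts, master_ts)
    pos = np.clip(pos, 0, sorted_ts.size - 1)

    # A match exists only where the timestamp at that position is equal.
    matched = sorted_ts[pos] == master_ts
    out[matched] = var_values[order][pos[matched]]
    return out

# Master timestamps (seconds) and a variable with two samples missing.
master = np.array([0, 60, 120, 180, 240])
cpu_ts = np.array([0, 60, 180])
cpu = np.array([1.5, 2.5, 3.5])

aligned = left_outer_join(master, cpu_ts, cpu)
```

Each variable aligned this way has the same length as the master timestamp
array, so the results could then be stacked (for example with
np.column_stack) next to a single timestamp field.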