[SciPy-user] Recommendations for Distribution Class

Tom Johnson tjhnson@gmail....
Thu Sep 20 04:05:02 CDT 2007


Hi,

I'd like to hear thoughts on a good representation for discrete probability
distributions.  Currently, I am using dictionaries as they can be sparse and
they give access to the probabilities via keys (this is desired).  For example,

>>> p = {'a':.3,'c':.7}
>>> print p['a']
0.3

This is nice and fine, but I'd like to add more functionality.  For example,

>>> print p['b']
0
>>> q = scipy.log2(p)
# or perhaps q = p.aslog2()
>>> print q['b']
-inf

All this says that I should think about subclassing dict.  However, I also
want to be able to compute marginal distributions.  With a dictionary of
dictionaries, p['a']['b'], it is not convenient to sum over the second
index.  With two random variables, I can store two dictionaries to solve
this problem...but I need a general solution and N-dimensional scipy arrays
seem like a possibility. But alas, they are not sparse...and scipy.sparse is
only for matrices (?).
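
To make the idea concrete, here is a minimal sketch of one possible direction, not a settled design. It assumes a joint distribution is stored as a single flat dict keyed by tuples of outcomes (rather than nested dicts), so that it stays sparse and any index can be summed over. The names Distribution, LogDistribution and marginal are hypothetical, aslog2 is taken from the comment above, and the numbers are made up.

```python
import math
from collections import defaultdict


class LogDistribution(dict):
    """Sparse log-probabilities: outcomes that are not stored read as -inf."""

    def __missing__(self, key):
        return float("-inf")


class Distribution(dict):
    """Sparse discrete distribution: outcomes that are not stored read as 0."""

    def __missing__(self, key):
        # dict.__getitem__ calls this for absent keys; nothing gets stored.
        return 0.0

    def aslog2(self):
        """Return base-2 log-probabilities of the stored outcomes."""
        return LogDistribution(
            (k, math.log(v, 2) if v > 0 else float("-inf"))
            for k, v in self.items()
        )

    def marginal(self, axes):
        """Keep the tuple positions listed in `axes`; sum over all others."""
        out = defaultdict(float)
        for key, prob in self.items():
            out[tuple(key[i] for i in axes)] += prob
        return Distribution(out)


# Joint distribution over two variables, keyed by (first, second).
p = Distribution({('a', 'x'): 0.25, ('c', 'x'): 0.25, ('c', 'y'): 0.5})

zero = p[('b', 'x')]          # 0.0, and ('b', 'x') is still not a key of p
p_first = p.marginal([0])     # sums over the second variable
p_second = p.marginal([1])    # sums over the first variable
q = p.aslog2()
neg_inf = q[('b', 'x')]       # -inf for an outcome that was never stored
```

The cost of this layout is that a marginal is a full pass over the stored entries, which is probably acceptable when the distribution is genuinely sparse.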

Finally, there is the question of a good representation for conditional
probabilities.
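
As a starting point for discussion, one candidate representation is a dict that maps each value of the conditioning variables to an ordinary distribution over the remaining ones. The sketch below derives it from a tuple-keyed joint; the function name conditional and its given parameter are hypothetical, and the numbers are made up.

```python
from collections import defaultdict


def conditional(joint, given):
    """Build p(rest | given) from a joint stored as {tuple_key: probability}.

    `given` lists the tuple positions to condition on.  The result maps each
    observed value of those positions to a dict over the remaining positions.
    """
    # Marginal probability of each value of the conditioning variables.
    totals = defaultdict(float)
    for key, prob in joint.items():
        totals[tuple(key[i] for i in given)] += prob

    cond = defaultdict(dict)
    for key, prob in joint.items():
        g = tuple(key[i] for i in given)
        rest = tuple(k for i, k in enumerate(key) if i not in given)
        if totals[g] > 0:  # conditioning on a zero-probability event is undefined
            cond[g][rest] = prob / totals[g]
    return dict(cond)


joint = {('a', 'x'): 0.25, ('a', 'y'): 0.25, ('c', 'x'): 0.5}
p_second_given_first = conditional(joint, given=[0])
# p_second_given_first[('a',)] is {('x',): 0.5, ('y',): 0.5}
# p_second_given_first[('c',)] is {('x',): 1.0}
```

Values of the conditioning variables that never occur simply have no entry, which keeps the structure sparse but means a lookup for them raises KeyError.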

Any thoughts on this would be very helpful.