[SciPy-user] Making faster statistical distributions
John Hunter
jdhunter at ace.bsd.uchicago.edu
Thu Jan 29 09:54:44 CST 2004
>>>>> "Christopher" == Christopher Fonnesbeck <chris at fonnesbeck.org> writes:
Christopher> I am already using pieces of SciPy in my Markov chain
Christopher> Monte Carlo package (PyMC), mostly for plotting
Christopher> functionality. I would also like to exploit the
Christopher> distributions implemented in scipy.stats, but they
Christopher> are far too slow for use in statistical simulation
Christopher> applications like MCMC, where millions of random
Christopher> draws may be taken. Therefore, I am thinking of
Christopher> implementing many of these distributions (at least
Christopher> the common ones) as C or Fortran extensions. I am
Christopher> unsure whether to use Fortran through f2py for this
Christopher> task, or C through weave.inline (for example). I have
Christopher> used both in the past for various tasks, and was
Christopher> generally happy with both. Any suggestions?
I've wrote some code to fill Numeric arrays with levy distributions
from gsl. GSL has many, many distributions -
http://www.gnu.org/software/gsl/manual/gsl-ref_19.html#SEC284
Here's a sample test script
from levy import randlevy, seed
gamma = 0.003501
alpha = 1.341702
seed(1235)
x = randlevy(100000, alpha, gamma)
print min(x), max(x)
Seeding is optional - the rng is seeded by the clock if you don't
explicitly call seed. Below is the module code which you can use as a
template if you're interested. I think it would be nice to extend
this code to include all the gsl distributions, but haven't taken the
time. pygsl wraps these distributions, but suffers from the repeated
call overhead that is hurting you. It would be a natural extension of
pygsl to include initializing numeric arrays from the distributions.
#include "Python.h"
#include "Numeric/arrayobject.h"
#include <stdio.h>
#include <time.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_errno.h>
static PyObject *ErrorObject;
// get x[i] from a 1d array of type xtype
#define get1d(x,i,xtype) \
*(xtype *)(x->data+i*x->strides[0])
int _levymodRNGSeeded = 0;
gsl_rng * _levymodRNG;
static PyObject *
seed(PyObject *self, PyObject *args)
{
const gsl_rng_type * T;
int N;
T = gsl_rng_default;
_levymodRNG = gsl_rng_alloc(T);
gsl_rng_env_setup();
if (args==Py_None) {
//printf("Default seed\n");
time_t t1;
(void) time(&t1);
srand48((long) t1);
N = (int)t1;
}
else if (!PyArg_ParseTuple(args, "i", &N))
return NULL;
//printf("Setting seed %d\n", N);
gsl_rng_set (_levymodRNG, N);
_levymodRNGSeeded = 1;
Py_INCREF(Py_None);
return Py_None;
}
static PyObject *
randlevy(PyObject *self, PyObject *args)
{
if (_levymodRNGSeeded == 0) {
seed(self, Py_None);
}
int dimensions[1];
int i;
PyArrayObject *x;
int N;
float alpha;
float gamma;
if (!PyArg_ParseTuple(args, "iff", &N, &alpha, &gamma))
return NULL;
dimensions[0] = N;
x = (PyArrayObject *)PyArray_FromDims(1,dimensions,PyArray_FLOAT);
for (i=0; i<N; ++i) {
get1d(x,i,float) = gsl_ran_levy(_levymodRNG, gamma, alpha);
}
return Py_BuildValue("N", x);
}
static struct PyMethodDef _levy_methods[] = {
{"randlevy", (PyCFunction)randlevy, METH_VARARGS,
"Return a numeric array of levy numbers."},
{"seed", (PyCFunction)seed, METH_VARARGS,
"Seed the random number generator."},
{NULL, NULL} /* sentinel */
};
DL_EXPORT(void) init_levy(void)
{
PyObject *m, *d;
/* Create the module and add the functions */
m = Py_InitModule4("_levy", _levy_methods,
"Fill Numeric arrays with random numbers ",
(PyObject*)NULL,PYTHON_API_VERSION);
/* Import the array object */
import_array();
/* Add some symbolic constants to the module */
d = PyModule_GetDict(m);
ErrorObject = PyString_FromString("_levy.error");
PyDict_SetItemString(d, "error", ErrorObject);
/* Check for errors */
if (PyErr_Occurred())
Py_FatalError("can't initialize module _levy");
}
More information about the SciPy-user
mailing list