[SciPy-User] Proposal for a new data analysis toolbox

Sebastian Haase seb.haase@gmail....
Thu Nov 25 02:30:15 CST 2010

On Thu, Nov 25, 2010 at 8:32 AM, David <david@silveregg.co.jp> wrote:
> On 11/25/2010 03:40 PM, Dag Sverre Seljebotn wrote:
>> On 11/24/2010 07:09 PM, Matthew Brett wrote:
>>> Hi,
>>> On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn
>>> <dagss@student.matnat.uio.no>   wrote:
>>>> For the time being, for something like this I'd definitely go with a
>>>> template language to generate Cython code if you are not already. Myself
>>>> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in
>>>> extension and it works pretty well. Using Bento one can probably chain
>>>> Tempita so that this gets built automatically (but I haven't tried that
>>>> yet).
>>> Thanks for the update - it's excellent news that you are working on
>>> this.  If you ever have spare time, would you consider writing up your
>>> experiences in a blog post or similar?  I'm sure it would be very
>>> useful for the rest of us who have idly thought we'd like to do this,
>>> and then started waiting for someone with more expertise to do it...
>> I don't have a blog, and it'd take too much time to create one, but
>> here's something less polished:
>> What I'm really doing is to modify fwrap so that it detects functions
>> with the same functionality (but different types) in the LAPACK wrapper
>> in scipy.linalg, and emits a Cython template for that family of
>> functions. But I'll try to step into your shoes here.
>> There are A LOT of template engines out there. I chose Tempita, which has
>> the advantages of a) being recommended by Robert Kern, b) pure Python,
>> no compiled code, c) very small and simple so that it can potentially be
>> bundled with other projects in the build system without a problem.
>> Then, simply write templated code like the following. It becomes less
>> clear to read, but a lot easier to fix bugs etc. when they must only be
>> fixed in one spot.
>> {{py:
>> dtype_values = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128']
>> dtype_t_values = ['%s_t' % x for x in dtype_values]
>> funcletter_values = ['f', 'd', 'c', 'z']
>> NDIM_MAX = 5
>> }}
>> ...
>> {{for ndim in range(5)}}
>> {{for dtype, dtype_t, funcletter in zip(dtype_values, dtype_t_values,
>> funcletter_values)}}
>> def {{prefix}}sum_{{ndim}}{{funcletter}}(np.ndarray[{{dtype_t}},
>> ndim={{ndim}}] x,
>> np.ndarray[{{dtype_t}}, ndim={{ndim}}] y,
>> np.ndarray[{{dtype_t}}, ndim={{ndim}}] out=None):
>>         ... and so on...inside here everything looks about the same as
>> normal...
>> {{endfor}}
>> {{endfor}}
>> For integrating this into a build, David C.'s Bento is probably the best
>> way once a bug is fixed (see recent "Cython distutils" thread on
>> cython-dev where this is specifically discussed, and David points to
>> examples in the Bento distribution). For my work on fwrap I use the
>> "waf" build tool, where it is a simple matter of:
>> def run_tempita(task):
>>       import tempita
>>       assert len(task.inputs) == len(task.outputs) == 1
>>       tmpl = task.inputs[0].read()
>>       result = tempita.sub(tmpl)
>>       task.outputs[0].write(result)
>> ...
>> bld(
>>           name = 'tempita',
>>           rule = run_tempita,
>>           source = ['foo.pyx.in'],
>>           target = ['foo.pyx']
>>           )
> You may want to look at the flex example in waf tools subdir to see how
> to chain builders together.
> As for bento, I unfortunately won't be able to work on it much if at all
> until the end of the year, so I don't think I will have time to fix the
> issue until then,
> cheers,
> David

As I mentioned, I have a setup based on SWIG: it lets me do most
of the heavy lifting with SWIG's C++ template support, to write
"general" functions that support multiple dtypes. With the help of a
C preprocessor macro it instantiates the functions (which is needed
when building dynamic libs) for a standard set of dtypes. For my image
processing needs I have: uint8, uint16, int16, int32, float32,
float64, and long. This is also a compromise to avoid getting the dlls
bloated with dtypes I never use (e.g. bool can be cast to uint8 in a
Python wrapper).
My point here is that, as far as I know, Cython is missing such
template support, right? How hard would it be to add this,
concentrating on this special purpose of dtype support?
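
To make concrete what I mean by dtype-only template support, here is a
minimal sketch in plain Python that expands one function template over
a fixed dtype set. It uses the standard library's string.Template, not
Tempita and not anything Cython provides. The names DTYPES,
FUNC_TEMPLATE and expand_template, and the generated clip_<dtype>
functions, are hypothetical and only there for illustration; the output
is Cython-like text that I have not tried to compile.

```python
from string import Template

# Dtype set from the message above; "long" is left out to keep the sketch short.
DTYPES = ["uint8", "uint16", "int16", "int32", "float32", "float64"]

# Hypothetical function stub; ${dtype} is the only thing that varies.
FUNC_TEMPLATE = Template('''\
def clip_${dtype}(np.ndarray[np.${dtype}_t, ndim=2] img,
        np.${dtype}_t lo, np.${dtype}_t hi):
    ...
''')


def expand_template(template, dtypes):
    """Render one copy of `template` per dtype and join them into one string."""
    return "\n".join(template.substitute(dtype=d) for d in dtypes)


source = expand_template(FUNC_TEMPLATE, DTYPES)
print(source)
```

This is roughly what the preprocessor macro does on the SWIG side: one
generic definition, stamped out once per dtype in a list I control.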

