[SciPy-User] Proposal for a new data analysis toolbox

David david@silveregg.co...
Thu Nov 25 01:32:39 CST 2010


On 11/25/2010 03:40 PM, Dag Sverre Seljebotn wrote:
> On 11/24/2010 07:09 PM, Matthew Brett wrote:
>> Hi,
>>
>> On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn
>> <dagss@student.matnat.uio.no>   wrote:
>>
>>
>>> For the time being, for something like this I'd definitely go with a
>>> template language to generate Cython code if you are not already. Myself
>>> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in
>>> extension and it works pretty well. Using Bento one can probably chain
>>> Tempita so that this gets built automatically (but I haven't tried that
>>> yet).
>>>
>> Thanks for the update - it's excellent news that you are working on
>> this.  If you ever have spare time, would you consider writing up your
>> experiences in a blog post or similar?  I'm sure it would be very
>> useful for the rest of us who have idly thought we'd like to do this,
>> and then started waiting for someone with more expertise to do it...
>>
>
> I don't have a blog, and it'd take too much time to create one, but
> here's something less polished:
>
> What I'm really doing is to modify fwrap so that it detects functions
> with the same functionality (but different types) in the LAPACK wrapper
> in scipy.linalg, and emits a Cython template for that family of
> functions. But I'll try to step into your shoes here.
>
> There are A LOT of template engines out there. I chose Tempita, which has
> the advantages of a) being recommended by Robert Kern, b) pure Python,
> no compiled code, c) very small and simple so that it can potentially be
> bundled with other projects in the build system without a problem.
>
> Then, simply write templated code like the following. It becomes
> harder to read, but bugs are a lot easier to fix when each one only
> has to be fixed in one spot.
>
> {{py:
> dtype_values = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128']
> dtype_t_values = ['%s_t' % x for x in dtype_values]
> funcletter_values = ['f', 'd', 'c', 'z']
> NDIM_MAX = 5
> }}
>
> ...
>
> {{for ndim in range(NDIM_MAX)}}
> {{for dtype, dtype_t, funcletter in zip(dtype_values, dtype_t_values, funcletter_values)}}
> def {{prefix}}sum_{{ndim}}{{funcletter}}(np.ndarray[{{dtype_t}}, ndim={{ndim}}] x,
>         np.ndarray[{{dtype_t}}, ndim={{ndim}}] y,
>         np.ndarray[{{dtype_t}}, ndim={{ndim}}] out=None):
>     ... and so on...inside here everything looks about the same as normal...
> {{endfor}}
> {{endfor}}
>
>
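
In case it helps to see what the quoted template expands to without
installing Tempita, here is a rough stand-in that uses only plain Python
string formatting. It is not Tempita's API and not what fwrap does;
`expand_signatures`, `SIGNATURE` and the `my_` prefix are names made up
for illustration, and only the signature lines are generated because the
function bodies are elided in the template.

```python
# Plain-Python sketch of the template expansion; this is NOT Tempita.
# The value lists mirror the {{py: ...}} block in the quoted template.
DTYPE_VALUES = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128']
DTYPE_T_VALUES = ['%s_t' % x for x in DTYPE_VALUES]
FUNCLETTER_VALUES = ['f', 'd', 'c', 'z']
NDIM_MAX = 5

# Hypothetical one-line form of the templated Cython signature.
SIGNATURE = (
    "def {prefix}sum_{ndim}{funcletter}("
    "np.ndarray[{dtype_t}, ndim={ndim}] x, "
    "np.ndarray[{dtype_t}, ndim={ndim}] y, "
    "np.ndarray[{dtype_t}, ndim={ndim}] out=None):"
)


def expand_signatures(prefix=''):
    """Return one signature per (ndim, dtype) pair, like the nested for loops."""
    lines = []
    for ndim in range(NDIM_MAX):
        for dtype_t, funcletter in zip(DTYPE_T_VALUES, FUNCLETTER_VALUES):
            lines.append(SIGNATURE.format(prefix=prefix, ndim=ndim,
                                          funcletter=funcletter,
                                          dtype_t=dtype_t))
    return lines
```

Calling `expand_signatures(prefix='my_')` gives 5 x 4 = 20 signatures,
one per combination of dimension and dtype, which is the family of
functions the template would otherwise have to spell out by hand.
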
> For integrating this into a build, David C.'s Bento is probably the best
> way once a bug is fixed (see recent "Cython distutils" thread on
> cython-dev where this is specifically discussed, and David points to
> examples in the Bento distribution). For my work on fwrap I use the
> "waf" build tool, where it is a simple matter of:
>
> def run_tempita(task):
>     import tempita
>     assert len(task.inputs) == len(task.outputs) == 1
>     tmpl = task.inputs[0].read()
>     result = tempita.sub(tmpl)
>     task.outputs[0].write(result)
>
> ...
> bld(
>     name = 'tempita',
>     rule = run_tempita,
>     source = ['foo.pyx.in'],
>     target = ['foo.pyx']
>     )

You may want to look at the flex example in the waf tools subdir to see
how to chain builders together.

As for Bento, I unfortunately won't be able to work on it much, if at
all, until the end of the year, so I don't think I will have time to fix
the issue before then.

cheers,

David

