# [Numpy-discussion] [ANN]: Taylorpoly, an implementation of vectorized Taylor polynomial operations and request for opinions

Sebastian Walter sebastian.walter@gmail....
Sun Feb 28 08:22:16 CST 2010

On Sun, Feb 28, 2010 at 12:30 AM, Friedrich Romstedt
<friedrichromstedt@gmail.com> wrote:
> 2010/2/27 Sebastian Walter <sebastian.walter@gmail.com>:
>> I'm sorry this comment turns out to be confusing.
>
> Maybe it's not important.
>
>> It has apparently quite the contrary effect of what I wanted to achieve:
>> Since there is already a polynomial module  in numpy I wanted to
>> highlight their difference
>> and stress that they are used to do arithmetic, e.g. compute
>>
>> f([x],[y]) = [x] * (sin([x])**2 + [y])
>>
>> in Taylor arithmetic.
>
> That's cool!  You didn't mention that.  Now I step by step find out
> what your module (package?) is for.  You are a mathematician?  Many
> physicists complain that mathematicians cannot make their point ;-)

I studied physics but switchted to applied maths.

>
> I think I can use that to make my upy accept arbitrary functions, but
> how do you apply sin() to a TTP?

perform truncated Taylor expansion of  [y]_D = sin([x]_D), i.e.
y_d = d^d/dt^d  sin( \sum_{k=0}^{D-1} x_k t^k) |_{t=0}
to obtain an explicit algorithm.

>
> One more question: You said, t is an "external paramter".  I, and
> maybe not only me, interpreted this as a complicated name for
> "variable".  So I assumed it will be a parameter to some method of the
> TTP.  But it isn't?  It's just the way to define the ring?  You could
> define it the same in Fourier space, except that you have to make the
> array large enough from the beginning?  Why not doing that, and
> saying, your computation relies on the Fourier transform of the
> representation?  Can this give insight why TTPs are a ring and why
> they have zero divisors?
it has zero divisors because for instance multiplication of the two
polynomials t*t^{D-1}
truncated at t^{D-1} yields is zero.

>
>>>>>  In fact, you /have/ to provide
>>>>> external binary operators, because I guess you also want to have
>>>>> numpy.ndarrays as left operand.  In that case, the undarray will have
>>>>> higher precedence, and will treat your data structure as a scalar,
>>>>> applying it to all the undarray's elements.
>>>>
>>>> well, actually it should treat it as a scalar since the Taylor
>>>> polynomial is something like a real or complex number.
>>>
>>> Maybe I misunderstood you completely, but didn't you want to implement
>>> arrays of polynomials using numpy?  So I guess you want to apply a
>>> vector from numpy pairwise to the polynomials in the P-object?
>>
>> no, the vectorization is something different. It's purpose becomes
>> only clear when applied in Algorithmic Differentiation.
>
> Hey folks, here's a cool package, but the maintainer didn't tell us! ;-)

well, thanks :)

>
>>  E.g. if you have a function
>> f: R^N -> R
>> x -> y=f(x)
>> where x = [x1,...,xN]
>>
>> and you want to compute the gradient g(x) of f(x), then you can compute
>> df(x)/dxn by propagating  the following array of Taylor polynomials:
>>
>> x = numpy.array( UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ...,
>> UTPS([xN_0,0]), dtype=object)
>> y = f(x)
>
> So what is the result of applying f to some UTPS instance, is it a
> plain number, is an UTPS again?  How do you calculate?
>
> Can one calculate the derivative of some function using your method at
> a certain point without knowledge of the analytical derivative?  I
> guess that's the purpose.
Yes, that's the whole point: Obtaining (higher order) derivatives of
functions at machine precision for which no symbolic representation is
That includes computer codes with recursions (e.g. for loops) that are
a no-go for symbolic differentiation. Supposedly (I've never done
that) you can even differentiate Monte Carlo simulations in that way.

>
>> if you want to have the complete gradient, you will have to repeat N
>> times. Each time for the same zero'th coefficients [x1,...,xN].
>>
>> Using the vecorized version, you would do only one propagation
>> x = numpy.array( UTPS([x1_0, 1,0,...,0]), ..., UTPS([xn_0,
>> 0,...,1,...0]), ..., UTPS([xN_0,0,....,1]), dtype=object)
>> y = f(x)
>>
>> i.e. you save the overhead of calling the same function N times.
>
> Ok, I understand.  Today it's too late, I will reason tomorrow about it.
>
>>> Why don't you use multidimensional arrays?  Has it reasons in the C
>>> accessibility?  Now, as I see it, you implement your strides manually.
>>>  With a multidimensional array, you could even create arrays of shape
>>> (10, 12) of D-polynomials by storing an ndarray of shape (10, 12, D)
>>> or (D, 10, 12).
>>
>> the goal is to have several Taylor polynomials evaluated in the same
>> base point, e.g.
>> [x_0, x_{11}, x_{21}, x_{31}]
>> [x_0, x_{12}, x_{22}, x_{32}]
>> [x_0, x_{13}, x_{23}, x_{33}]
>>
>> i.e. P=3, D=3
>> One could use an (P,D) array. However, one would do unnecessary
>> computations since x_0 is the same for all P polynomials.
>> I.e. one implements the data structure as
>> [x]_{D,P} := [x_0, x_{1,1},...,x_{1,D-1},x_{2,1},...,x_{P,D-1}].
>>
>> This results in a non-const stride access.
>
> No?  I think, the D axis stride shold be P with offset 1 and the P
> axis stride 1?  Is there a specific reason to store it raveled?  And
> not x_0 and ndarray(shape = (P, D - 1))?

1) cosmetic reasons
2) easier to interface C to Python.

>
>>> Furthermore, I believe there is some elegant way formulating the
>>> product for a single polynomial.  Let me think a bit ...
>>>
>>> For one entry of [z]_E, you want to sum up all pairs:
>>>    x_{0} y{E} + ... + x{D} y{E - D} ,     (1)
>>> right?  In a matrix, containing all mutual products x_{i} y{j}, this
>>> are diagonals.  Rotating the matrix by 45° counterclockwise, they are
>>> sums in columns.  Hmmm... how to rotate a matrix by 45°?
>>>
>>> Another fresh look: (1) looks pretty much like the discrete
>>> convolution of x_{i} and y_{i} at argument E.  Going into Fourier
>>> space, this becomes a product.  x_i and y_i have finite support,
>>> because one sets x_{i} and y_{i} = 0 outside 0 <= i <= D.  The support
>>> of (1) in the running index is at most [0, D].  The support in E is at
>>> most [0, 2 D].  Thus you don't make mistakes by using DFT, when you
>>> pad x and y by D zeros on the right.  My strategy is: Push the padded
>>> versions of x and y into Fourier space, calculate their product,
>>> transform back, cut away the D last entries, and you're done!
>>>
>>> I guess you mainly want to parallelise the calculation instead of
>>> improving the singleton calculation itself?  So then, even non-FFT
>>> would incorporate 2 D explicit python sums for Fourier transform, and
>>> again 2 D sums for transforming back, is this an improvement over the
>>> method you are using currently?  And you could code it in pure Python
>>> :-)
>>>
>>> I will investigate this tonight.  I'm curious.  Irrespective of that I
>>> have also other fish to fry ... ;-)
>>>
>>> iirc, in fact DFT is used for efficient polynomial multiplication.
>>> Maybe Chuck or another core developer of numpy can tell whether
>>> numpy.polynomial does it actually that way?
>>
>> I believe the degree D is typically much to small (i.e. D <= 4) to
>> though there may be cases when really high order polynomials are used.
>
> I guess it's not overhead.  The number of calculations should be in
> equilibrium at very low D, am I wrong?  And you win to not have to
> compile a C library but only native text Python code.
Well, I'm really no expert on the DFT. But doesn't the  DFT compute on
the complex numbers? you'll have extra overhead (let's say factor >=
2?)
And as far as I can tell,  you do the computations on padded arrays
which possibly introduces cache misses (maybe another factor 2?)
Isn't the advantage of the DFT not that you can use the FFT which
would reduce the runtime from O(D^2) to O(D log(D))?
I'm pretty sure that only pays off for D larger than 10.

> E.g., your
> optimisation package is quite interesting for me, but I'm using
> Windows as my main system, so it will be painful to compile.  And the
> code is more straightforward, more concise, easier to maintain and
> easier to understand, ... :-)  I really do not want to diminish your
> programming skills, please do not misunderstand!  I only mean the
> subject.
The project uses scons which is available for windows as binaries.
I haven't tried it myself but I'm confident that it's a 1 minutes job
on windows.

I have implemented some of the algorithms just as you explained in
another package
(http://github.com/b45ch1/algopy/blob/master/algopy/utp/utps.py).
But I don't think the code looks easier to maintain than the C code
and it's also slower.

>
> Friedrich
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>