[SciPy-dev] Module Submission: Orthogonal Distance Regression

Robert Kern kern at caltech.edu
Wed Nov 20 02:58:44 CST 2002

```On Tue, Nov 19, 2002 at 06:25:45PM +0200, Pearu Peterson wrote:

[snip]

> Here follows more SciPy issues:
>  - odrpack wrapper will probably work fine with g77 compiler but not
>    necessarily with  other compilers (this is related to Fortran names in
>    C issue). This can be easily solved using F_FUNC macro though.

I'll look into this.

>  - odrpack wraper is handwritten and the code base is quite large. This
>    may cause some maintainance issues.
>  - Using the wrapper is sensitive to Fortran/C multi-dimensional array
>    issue (the different data ordering stuff).
>
> I think that the last one is the most serious one. Probably the
> easiest way to solve this is to use f2py generated wrappers. This would
> also reduce the maintainance issues.

There are a couple of reasons why I didn't use f2py. I do have an f2py
version that mostly works, however,

* ODRPACK allows "structured arguments." For example, say we have 5
observations of a 2-dimensional input variable and each observation
has the same weights, but the second dimension is weighted twice as
much as the first, we can either input wd=[1,2] or
wd=[[1,1,1,1,1],[2,2,2,2,2]]. The flexibility of these arguments is
difficult to recreate with f2py. My f2py implementation expands these
arguments in Python to full arrays.

* With my usage of the module, I prefer using arrays already in FORTRAN
order. N=number of observations, M=number of dimensions per
observation in the input variable. ODRPACK expects (N,M) FORTRAN
arrays which are (M,N) Python arrays. These get passed to the model
functions. For the most part, one wants to implement y=f(x) without
explicit looping over each observation; I find it convenient to use
x[i] rather than x[:,i]. I have to put explicit transpose's to get
the f2py'd code to accept the arrays I'm passing around. I think
there's some combination of options to get f2py to do what I want,
but I haven't found it yet.

"N-first" indexing works well with Python's slicing abilities.

* The FORTRAN driver routine takes a single function as a callback.
This function calculates either the function, the Jacobian wrt
parameters or the Jacobian wrt the input variables based on an
integer argument. The f2py implementation generates a Python function
which calls the appropriate Python functions (if supplied, exception
if not provided; the Jacobians are optional). The handmade
implementation has a C function that does the same thing.

* This is the showstopper: one of the examples exhibits numerical
problems in the f2py version and none other. I have no explanation,
but one of the Jacobians is consistently wrong and generates
results that are incorrect beyond machine eps. The driver routine can
check the provided Jacobians by numerically calculating the
derivatives; they are flagged as suspect.

> Otherwise, odrpack looks good and its proper place would probably be the
> interpolate module. I think that we should consider odrpack inclusion
> to SciPy after releasing SciPy-0.2 as also the interpolate module needs
> some revison and odrpack can be added while doing that.

I would have thought optimize. It does various forms of nonlinear least
squares. I agree about waiting until post-SciPy-0.2.

> Thanks,
> 	Pearu

--
Robert Kern
Ruddock House President
kern at caltech.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

```