[SciPy-Dev] Cython as build dependency, file/dll size and current issues

Ralf Gommers ralf.gommers@googlemail....
Thu Jul 5 14:53:50 CDT 2012


On Thu, Jul 5, 2012 at 9:45 PM, Matthew Brett <matthew.brett@gmail.com> wrote:

> Hi,
>
> On Thu, Jul 5, 2012 at 12:37 PM, Ralf Gommers
> <ralf.gommers@googlemail.com> wrote:
> >
> >
> > On Thu, Jul 5, 2012 at 9:26 PM, Matthew Brett <matthew.brett@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> On Thu, Jul 5, 2012 at 12:18 PM, Ralf Gommers
> >> <ralf.gommers@googlemail.com> wrote:
> >> >
> >> >
> >> > On Thu, Jul 5, 2012 at 8:57 PM, Matthew Brett <matthew.brett@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> On Thu, Jul 5, 2012 at 11:35 AM, Ralf Gommers
> >> >> <ralf.gommers@googlemail.com> wrote:
> >> >> >
> >> >> >
> >> >> > On Thu, Jul 5, 2012 at 8:31 PM, Matthew Brett
> >> >> > <matthew.brett@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> On Thu, Jul 5, 2012 at 11:25 AM, Ralf Gommers
> >> >> >> <ralf.gommers@googlemail.com> wrote:
> >> >> >> > Hi all,
> >> >> >> >
> >> >> >> > On https://github.com/scipy/scipy/pull/261 the problem with the
> >> >> >> > large size of generated C files from Cython came up again, and
> >> >> >> > Matthew suggested adding Cython as a build-time dependency. He
> >> >> >> > also pointed out that this was discussed before, with most people
> >> >> >> > being in favor:
> >> >> >> >
> >> >> >> > http://mail.scipy.org/pipermail/scipy-dev/2009-November/013272.html
> >> >> >> > http://mail.scipy.org/pipermail/scipy-dev/2009-November/013308.html
> >> >> >> >
> >> >> >> > We discussed the same issue on
> >> >> >> > https://github.com/scipy/scipy/pull/211
> >> >> >> > recently, and also the size of the binary.
> >> >> >> >
> >> >> >> > This is probably also the right moment to point out other recent
> >> >> >> > Cython issues we've had:
> >> >> >> > 1. A memoryview issue with Python 2.4, either a Cython or NumPy
> >> >> >> > bug: https://github.com/numpy/numpy/pull/307
> >> >> >> > 2. We had to manually patch the generated C files when using
> >> >> >> > Cython 0.16, to make them work with MinGW:
> >> >> >> > http://projects.scipy.org/scipy/ticket/1673
> >> >> >> > 3. According to Ray, there's also an indexing bug in Cython 0.16,
> >> >> >> > which requires using 0.17-dev for
> >> >> >> > https://github.com/scipy/scipy/pull/261
> >> >> >> >
> >> >> >> > I think it's clear that PRs like #261 above (Ray's ndimage.label
> >> >> >> > rewrite) are in principle a good thing: faster and more general
> >> >> >> > code which is easier to maintain. Now the question is what to do,
> >> >> >> > though. Here are some options that I see:
> >> >> >> >
> >> >> >> > a) Keep things as is for now. Accept large file/binary sizes.
> >> >> >> > Manually patch the generated C if necessary.
> >> >> >> > b) Keep things as is for now. Either go back to Cython 0.15, or
> >> >> >> > bump required numpy version to latest dev version to not have to
> >> >> >> > manually patch the generated C files.
> >> >> >> > c) Keep things as they are now, without accepting too large
> >> >> >> > file/binary sizes. What counts as too large is to be defined. This
> >> >> >> > means we can't get the full benefits of fused types, for example.
> >> >> >> > d) Move to Cython as a build dependency. Write down the required
> >> >> >> > versions and incompatibilities in the docs.
> >> >> >> > e) Include a Cython version in the scipy git repo, patch it to
> >> >> >> > solve the above issues 2 and 3 (and any other ones that come
> >> >> >> > along).
> >> >> >> > f) Some combination of the above.
> >> >> >> > g) Any other options?
> >> >> >>
> >> >> >> Am I right in thinking that Cython 0.17dev will generate usable C
> >> >> >> files without patching?
> >> >> >
> >> >> >
> >> >> > Yes.
> >> >>
> >> >> How about making Cython 0.17 a developer build-time dependency?
> >> >
> >> >
> >> > That's an option. Requiring a dev version will mean broken builds for
> >> > some of the users that don't read the docs well but simply do
> >> > "easy_install cython". I'm not sure how acceptable that is.
> >>
> >> We could surely raise an informative error for that case?  I hope that
> >> it won't be long before the 0.17 release - but we should check with
> >> the Cython folks.
> >
> >
> > True. Perhaps it's not a big issue.
> >>
> >>
> >> On the plus side, lowering the barrier to rewriting in Cython seems
> >> like a really big win, especially with memoryviews and fused types
> >> available.
> >
> >
> > Agreed about lower barrier and fused types.
> >
> > Memoryviews are still not OK, because of
> > https://github.com/numpy/numpy/pull/307.
>
> I'm afraid I didn't understand that discussion very well.   Does that
> only apply to Python 2.4?   I had the impression we were dropping 2.4
> compatibility, but I may be remembering wrong.
>

Yes, only for Python 2.4. But no, we're not dropping it. I'd like to, but
each time it's brought up the result is the same. I'd rather not mix that
discussion with this one.
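
On the suggestion quoted above of raising an informative error when the installed Cython is too old: a minimal sketch of what such a build-time check might look like. This is only an illustration, not code from scipy. The function names and the error text are made up for this example, and the 0.17 minimum is simply the version mentioned in this thread. It treats "0.17-dev" as satisfying 0.17, which may or may not be what we want.

```python
import re

# Assumed minimum, taken from the 0.17 figure discussed in this thread.
MIN_CYTHON_VERSION = "0.17"


def numeric_prefix(version):
    """Return the leading dotted-integer part of a version string as a tuple.

    Suffixes such as "-dev" are ignored, so a development snapshot counts
    as meeting the release it leads up to.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("cannot parse version string %r" % (version,))
    return tuple(int(part) for part in match.group(0).split("."))


def check_cython_version(installed, minimum=MIN_CYTHON_VERSION):
    """Raise RuntimeError with a readable message if Cython is missing or too old."""
    if installed is None:
        raise RuntimeError(
            "Cython >= %s is required to build from a git checkout, "
            "but Cython is not installed." % minimum)
    if numeric_prefix(installed) < numeric_prefix(minimum):
        raise RuntimeError(
            "Cython >= %s is required to build from a git checkout, "
            "but version %s was found. Please upgrade Cython."
            % (minimum, installed))


def installed_cython_version():
    """Return the installed Cython version string, or None if unavailable."""
    try:
        import Cython
    except ImportError:
        return None
    return getattr(Cython, "__version__", None)
```

A hypothetical setup script would call check_cython_version(installed_cython_version()) before trying to generate any C files, so that the user sees one clear message instead of a compiler traceback.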


>
> >>
> >> > That still leaves the (mostly orthogonal) question about binary size. I
> >> > just built Ray's PR, _ni_label.so is 1.4 MB. For one function.
> >>
> >> Hmm.   1.4 MB seems OK to me for the binary
> >
> >
> > Really? For one function? If we do that for each function, we end up with
> > 4 GB.
> >
> >>
> >> - but I can see that we'd
> >> have to watch that.  Maybe it would be worth asking on the Cython list
> >> whether there is any way of reducing this, maybe by sharing across
> >> extensions.   How is the load time for that extension?
> >
> >
> > Very poor. (all hot cache):
> >
> > $ time python -c ""
> > real    0m0.039s
> > user    0m0.017s
> > sys    0m0.017s
> >
> > $ time python -c "import numpy"
> > real    0m0.187s
> > user    0m0.080s
> > sys    0m0.100s
> >
> > $ time python -c "import _ni_label"
> > real    0m0.206s
> > user    0m0.081s
> > sys    0m0.109s
>
> I guess that's much slower than the original C extension?  I'd tend to
> prefer a slow loading but fast running and maintainable ndimage, but
> it's unfortunate we have to keep these tradeoffs in mind...
>

The whole ndimage module now imports in 0.280s. Not sure what the old
version with label() included was, but probably similar.
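
For anyone who wants to repeat the import timings quoted above in a more systematic way, here is a rough sketch. It assumes nothing beyond the standard library: it starts a fresh interpreter for each measurement, like the "time python -c ..." commands, and subtracts the bare interpreter startup time. The helper names are made up for this example, and "json" is only a stand-in module so the script runs anywhere.

```python
import subprocess
import sys
import time


def time_command(code, repeats=3):
    """Best wall-clock time in seconds of running `python -c code` in a fresh process."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.check_call([sys.executable, "-c", code])
        best = min(best, time.perf_counter() - start)
    return best


def import_overhead(module, repeats=3):
    """Estimate the time spent importing `module`, excluding interpreter startup."""
    baseline = time_command("", repeats)
    total = time_command("import " + module, repeats)
    return total - baseline


if __name__ == "__main__":
    # "json" is a placeholder; substitute the extension module of interest.
    for name in ("json",):
        print("%s: %.3f s" % (name, import_overhead(name)))
```

Taking the best of several runs reduces noise from a cold cache. To isolate the cost of one extension that itself imports numpy, one could subtract the numpy figure instead of the bare-interpreter baseline.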

Ralf