[SciPy-Dev] Cython as build dependency, file/dll size and current issues
Ralf Gommers
ralf.gommers@googlemail....
Fri Jul 6 11:11:10 CDT 2012
> >> >> >> Am I right in thinking that Cython 0.17dev will generate usable C
> >> >> >> files without patching?
> >> >> >
> >> >> >
> >> >> > Yes.
> >> >>
> >> >> How about making Cython 0.17 a developer build-time dependency?
> >> >
> >> >
> >> > That's an option. Requiring a dev version will mean broken builds for
> >> > some of the users that don't read the docs well but simply do
> >> > "easy_install cython". I'm not sure how acceptable that is.
> >>
> >> We could surely raise an informative error for that case? I hope that
> >> it won't be long before the 0.17 release - but we should check with
> >> the Cython folks.
> >
> >
> > True. Perhaps it's not a big issue.
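A minimal sketch of such an informative error, assuming the check would run early in a build script. The minimum version tuple, the helper names and the message wording are illustrative only, not something SciPy actually ships:

```python
import re

# Assumed minimum; the thread only says "0.17", nothing more specific.
MIN_CYTHON = (0, 17)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    '0.17dev' and '0.17.1' both start with (0, 17); parsing stops at the
    first dot-separated component that does not begin with a digit.
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_cython(min_version=MIN_CYTHON):
    """Exit with a readable message if Cython is missing or too old."""
    wanted = ".".join(str(n) for n in min_version)
    try:
        import Cython
    except ImportError:
        raise SystemExit("Cython >= %s is needed to build from a source "
                         "checkout, but Cython is not installed." % wanted)
    found = parse_version(Cython.__version__)
    if found < min_version:
        raise SystemExit("Cython >= %s is needed to build from a source "
                         "checkout, found %s." % (wanted, Cython.__version__))
    return found
```

Note that a "0.17dev" version string compares equal to (0, 17) here, so a development snapshot passes the check.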
> >>
> >>
> >> On the plus side, lowering the barrier to rewriting in Cython seems
> >> like a really big win, especially with memoryviews and fused types
> >> available.
> >
> >
> > Agreed about lower barrier and fused types.
> >
> > Memoryviews are still not OK, because of
> > https://github.com/numpy/numpy/pull/307.
> >
> >>
> >> > That still leaves the (mostly orthogonal) question about binary size.
> >> > I just built Ray's PR, _ni_label.so is 1.4 Mb. For one function.
> >>
> >> Hmm. 1.4 Mb seems OK to me for the binary
> >
> >
> > > Really? For one function? If we do that for each function, we end up
> > > with 4 Gb.
> >
> >>
> >> - but I can see that we'd
> >> have to watch that. Maybe it would be worth asking on the Cython list
> >> whether there is any way of reducing this, maybe by sharing across
> >> extensions. How is the load time for that extension?
> >
> >
> > Very poor. (all hot cache):
> >
> > $ time python -c ""
> > real 0m0.039s
> > user 0m0.017s
> > sys 0m0.017s
> >
> > $ time python -c "import numpy"
> > real 0m0.187s
> > user 0m0.080s
> > sys 0m0.100s
> >
> > $ time python -c "import _ni_label"
> > real 0m0.206s
> > user 0m0.081s
> > sys 0m0.109s
>
> To be fair, the _ni_label module also imports numpy.
Sorry, I should have mentioned that.
> So the delta is around 0.019 s, still not great, but not as bad as the
> test seems to show. (Unless I was missing something, and 0.019 is actually
> that bad.)
>
Depends how you look at it. At the current rate of cythonizing, the damage
is probably fairly limited. Although 10% of the import time of numpy may
still be considered a problem by some.
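For anyone who wants to repeat the comparison, here is a rough sketch that times a fresh interpreter with and without the extra import and reports the difference. The function names are made up for this example, and taking the best of a few runs is my assumption about how to approximate the hot-cache numbers above:

```python
import subprocess
import sys
import time


def time_command(code, repeats=5):
    """Best wall-clock time, in seconds, of `python -c code` in a fresh process."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def import_delta(module, baseline=""):
    """Extra start-up time from importing `module` on top of `baseline`.

    `baseline` is code that runs in both measurements, for example
    "import numpy" when the module under test imports numpy itself.
    """
    base = time_command(baseline)
    with_module = time_command(baseline + "\nimport " + module)
    return with_module - base
```

With the extension on the path, import_delta("_ni_label", baseline="import numpy") would correspond to the 0.019 s figure quoted above. The result is noisy and can even come out slightly negative for very cheap imports.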
If we converted a significant fraction of the code to Cython, though, this
would incur a huge penalty in load time and memory usage. Scipy has a total
of 1073 functions and objects at the moment - determined by the sum of
len(module.__all__) for all modules. Therefore 20 ms load time and O(100
kB) binary size for one function is a bit much.
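As a sketch of where such numbers come from: the count can be approximated by summing len(__all__) over a package and its direct submodules, and the worst case is a plain multiplication. The helper names are hypothetical, and walking only the direct submodules is my assumption about how "all modules" was counted. For 1073 names, 20 ms each comes to about 21 s of load time and 100 kB each to about 107 MB, assuming one extension per name with nothing shared, which is the extreme case:

```python
import importlib
import pkgutil


def count_public_names(package_name):
    """Sum len(__all__) over a package and its direct submodules.

    Modules without __all__ count as zero, and submodules that fail to
    import are skipped.
    """
    package = importlib.import_module(package_name)
    total = len(getattr(package, "__all__", ()))
    for info in pkgutil.iter_modules(getattr(package, "__path__", [])):
        try:
            module = importlib.import_module(package_name + "." + info.name)
        except Exception:
            continue
        total += len(getattr(module, "__all__", ()))
    return total


def extrapolate(n_items, cost_per_item):
    """Total cost if each of n_items paid cost_per_item (a worst case)."""
    return n_items * cost_per_item


if __name__ == "__main__":
    n = 1073  # figure quoted in this mail
    print("load time: %.1f s" % extrapolate(n, 0.020))
    print("binary size: %.0f MB" % (extrapolate(n, 100) / 1000.0))
```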
Note that the above is not a criticism of your PR. _ni_label now has a
similar footprint to other Cython code in scipy, so this discussion
shouldn't hold up merging it in.
Ralf
