[Numpy-discussion] Announcing toydist, improving distribution and packaging situation

René Dudfield renesd@gmail....
Tue Dec 29 07:27:18 CST 2009


In the toydist proposal/release notes, I would address 'what does
toydist do better' more explicitly.

**** A big problem for science users is that numpy does not work with
pypi + (easy_install, buildout or pip) and python 2.6. ****

Working with the rest of the python community as much as possible is
likely a good goal.  At least getting numpy to work with the latest
tools would be great.

An interesting read is the history of python packaging here:

Buildout is what a lot of the python community are using now.  Getting
numpy to work nicely with buildout and pip would be a good start.
numpy used to work with buildout in python2.5, but not with 2.6.
buildout lets other team members get up to speed with a project by
running one command.  It installs things in the local directory, not
system wide.  So you can have different dependencies per project.

Plenty of good work is going on with python packaging.  Lots of the
python community are not using compiled packages however, so the
requirements are different.

There are a lot of people (thousands) working with the python
packaging system, improving it, and building tools around it.
Distribute for example has many committers, as do buildout/pip.  E.g.,
there are fifty or so buildout plugins, which people use to customise
their builds (see the buildout recipe list on pypi at
http://pypi.python.org/pypi?:action=browse&show=all&c=512 ).

There are build farms that build windows and OSX packages and upload them to pypi.
Start uploading pre releases to pypi, and you get these for free (once
you make numpy compile out of the box on those compile farms).  There
are compile farms for other OSes too... like ubuntu/debian, macports
etc.  Some distributions even automatically download, compile and
package new releases once they spot a new file on your ftp/web site.

Speeding up the release cycle to be continuous lets people take
advantage of these tools working together.  If you get your tests
running after the build step, all of these distributions also turn
into test farms :)

pypm:  http://pypm.activestate.com/list-n.html#numpy
ubuntu PPA: https://launchpad.net/ubuntu/+ppas
the snakebite project : http://www.snakebite.org/ (seems mostly
dead... but they have a lot of hardware)
suse build service: https://build.opensuse.org/
pony-build: http://wiki.github.com/ctb/pony-build

zope and pygame, two other projects with compiled python packages,
also have their own build/test farms, as do a number of other
python projects (e.g. twisted...).  Projects like pony-build should
hopefully make it easier for people to run their own build farms,
independently of the main projects.  You just really need a script to:
(download, build, test, post results), and then post a link to your
mailing list... and someone will be able to run a build farm.
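
To make the (download, build, test, post results) idea concrete, here is a rough sketch of such a script. It is not taken from pony-build or any existing farm; the function names and the step layout are made up for illustration, and "post results" is reduced to producing a text report that you could mail or upload however you like.

```python
import subprocess
import sys
import time


def run_step(name, command, cwd=None):
    """Run one command and record whether it succeeded, its output and duration."""
    started = time.time()
    try:
        proc = subprocess.run(
            command,
            cwd=cwd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        ok, output = proc.returncode == 0, proc.stdout
    except OSError as exc:
        # The command could not be started at all (e.g. the tool is not installed).
        ok, output = False, str(exc)
    return {"name": name, "ok": ok, "seconds": time.time() - started, "output": output}


def run_pipeline(steps, default_cwd=None):
    """Run (name, command, cwd) steps in order, stopping at the first failure."""
    results = []
    for name, command, cwd in steps:
        result = run_step(name, command, cwd=cwd or default_cwd)
        results.append(result)
        if not result["ok"]:
            break
    return results


def format_report(results):
    """Turn the step results into a short plain-text report ready to be posted."""
    lines = []
    for r in results:
        status = "ok" if r["ok"] else "FAILED"
        lines.append("%-10s %-6s %.1fs" % (r["name"], status, r["seconds"]))
    return "\n".join(lines)


# Hypothetical steps: the repository URL and the build/test commands are
# placeholders, not the real commands of any particular project.
EXAMPLE_STEPS = [
    ("download", ["git", "clone", "--depth", "1", "https://example.org/project.git", "src"], None),
    ("build", [sys.executable, "setup.py", "build"], "src"),
    ("test", [sys.executable, "setup.py", "test"], "src"),
]
```

Calling print(format_report(run_pipeline(EXAMPLE_STEPS))) from a scratch directory would run the three steps and print one status line per step.  A real farm would replace the print with whatever posting mechanism it uses.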

Documentation projects are being worked on to document, give tutorials
and make python packaging easier all round.  As witnessed by 20 or
so releases on pypi every day (and growing), lots of people are using
the python packaging tools successfully.  Documenting how people can
make numpy addon libraries (plugins) would encourage people to do so.
Currently there is no documentation from the numpy community, or
encouragement to do so.  This combined with numpy being broken with
python2.6+pypi will result in fewer science related packages.

There is still a whole order of magnitude of people not releasing on
pypi though.  There are thousands of projects on the pygame.org website
that are not on the pypi website for example.  There are likely many
hundreds or thousands of scientific projects not listed there
either.  Given all of these projects not on pypi, obviously things
could be improved.  The pygame.org website also shows that community
specific websites are very helpful.  A science view of pypi would make
it much more useful - so people don't have to look through
web/game/database etc packages.

Here is a view of 535 science/engineering related packages on pypi now:

458 science/research packages on pypi:

So there are already hundreds of science related packages and hundreds
of people making those science related packages for pypi.  Not too

Distribution of Applications is another issue that needs improving.
That is so that people can share applications without needing to
install a whole bunch of things.  Think about sending applications to
your grandma.  Do you ask her to download python, grab these
libraries, do this... do that?  It would be much better if you could
give her a url, and away you go!

Bug tracking, and diff tracking between distributions is an area where
many projects can improve.  Searching through the distributions' bug
trackers, and finding diffs to apply to the core, dramatically helps
packages get updated.  So does maintaining good communication with
different distribution packagers.

I'm not sure making a separate build tool is a good idea.  I think
going with the rest of the python community, and improving the tools
there is a better idea.


pps. some notes on toydist itself.
- toydist convert is cool for people converting a setup.py.  This
means that most people can try out toydist right away.  But what do
people gain by converting their setup.py files?
- a toydist convert that generates a setup.py file might be cool :)
It could also generate a Makefile and a configure script :)
- arbitrary code execution still happens when building or testing with
toydist, although not in the source packaging step.  Compiling,
running and testing the code happens most of the time anyway, so moving
the sandboxing to the OS is more useful, as are reviews, trust and
reputation of different packages.
- it should be possible to build this toydist functionality as a
distutils/distribute/buildout extension.
- extending toydist?  How are extensions made?  There are 175 buildout
packages which extend buildout, and many that extend
distutils/setuptools - so extension of build tools is necessary.
- scripting builds in python for python developers is easier than
scripting a different new language.
