[Numpy-discussion] Merging the refactor.

Pauli Virtanen pav@iki...
Thu Nov 11 15:08:55 CST 2010

On Thu, 11 Nov 2010 12:38:53 -0700, Charles R Harris wrote:
> I'd like to open a discussion about the steps to be followed in merging
> the numpy refactor. I have two concerns about this. First, the refactor
> repository branched off some time ago and I'm concerned about code
> divergence, not just in the refactoring, but in fixes going into the
> master branch on github. Second, it is likely that a flag day will look
> like the easiest solution and I think we should avoid that. 

What is a "flag day"?

> At the moment it seems to me that the changes can be broken up into
> three categories:
> 1) Movement of files and resulting changes to the build process. 
> 2) Refactoring of the files for CPython. 
> 3) Addition of an IronPython interface.
> I'd like to see 1) go into the master branch as soon as possible,
> followed by 2) so that the changes can be tested and fixes will go into
> a common repository. The main github repository can then be branched for
> adding the IronPython stuff. In short, I think it would be useful to
> abandon the teoliphant fork at some point and let the work continue in a
> fork of the numpy repository.

The first step I would like to see is to re-graft the teoliphant branch 
onto the current Git history -- currently, it's still based on Git-SVN. 
Re-grafting would make incremental merging and tracking easier. Luckily, 
this is easy to do thanks to Git's data model (I have a script for it),
and I believe it could be useful to do it ASAP.

Pauli Virtanen

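# Graft changesets $OLD_START..$OLD_BRANCH onto $NEW_START, into a new
# branch $NEW_BRANCH.

set -e

OLD_START=7e1e5da84fc110936035660974167cd33f9e4831   # last SVN commit in old repo
NEW_START=b056b23a27fe4f56f923168bb9931429765084d1   # corresponding Git commit in new repo

# OLD_BRANCH and NEW_BRANCH are used below but are not set in this script;
# require them from the environment instead of guessing branch names.
: "${OLD_BRANCH:?set OLD_BRANCH to the branch to graft}"
: "${NEW_BRANCH:?set NEW_BRANCH to the name of the grafted branch}"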

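# Echo a command, then run it.
run() { echo "$ $*"; "$@"; }

# Add the upstream remote only if it is not configured yet, then fetch it.
if ! git remote | grep -q '^numpy-upstream$'; then
    run git remote add numpy-upstream git://github.com/numpy/numpy.git
fi
run git fetch numpy-upstream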

run git checkout $OLD_BRANCH
run git branch -D $NEW_BRANCH || true
run git checkout -b $NEW_BRANCH $OLD_BRANCH

# Refilter
# - reparent the root commits
# - prune unnecessary (and huge) .sdf files from history
rm -rf .git/refs/original
run git filter-branch \
    --index-filter 'git rm --cached --ignore-unmatch **.sdf' \
    --parent-filter "sed -e '
	s/-p $OLD_START/-p $NEW_START/g;
	s/-p c3f10ec730a5d066838b10cd7f6c9c104eb9f1cf/-p a839a427939f0c29fe4757011f86bb068ab66569/g;
	'" \
    -- "$NEW_BRANCH"

# Make a few sanity checks

git diff $OLD_START $OLD_BRANCH > old.diff
git diff $NEW_START $NEW_BRANCH > new.diff
git diff $NEW_BRANCH $OLD_BRANCH > heads.diff

test -s heads.diff && { echo "ERROR: New heads do not match!"; exit 1; }
diff -u old.diff new.diff || { echo "ERROR: patches from start do not match!"; exit 1; }
(git log $NEW_BRANCH | grep -q 'git-svn-id') && { echo "ERROR: some git-svn-id commits remain in the history"; exit 1; }

echo "Everything seems OK!"
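
The re-parenting step in the script above depends on how git filter-branch
drives --parent-filter: each commit's parent list arrives on stdin as
"-p <sha>" (repeated for merges), and whatever the filter prints becomes the
new parent list. The following is only a minimal sketch of that substitution,
run outside of git. The helper name rewrite_parents and the second hash are
made up for illustration; the two START hashes are copied from the script.

```shell
# Hashes copied from the grafting script.
OLD_START=7e1e5da84fc110936035660974167cd33f9e4831
NEW_START=b056b23a27fe4f56f923168bb9931429765084d1

# Hypothetical helper: the same sed substitution the script passes to
# --parent-filter, reading a parent list on stdin.
rewrite_parents() {
    sed -e "s/-p $OLD_START/-p $NEW_START/g"
}

# A commit whose parent is the last SVN commit gets re-parented.
grafted=$(echo "-p $OLD_START" | rewrite_parents)

# Any other parent list passes through unchanged (hash is made up).
other="-p 0123456789abcdef0123456789abcdef01234567"
untouched=$(echo "$other" | rewrite_parents)

echo "$grafted"
echo "$untouched"
```

Only commits that sat directly on top of the old SVN-based root are touched;
every later commit keeps its parents, which filter-branch then maps to the
rewritten commits on its own.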
