[Numpy-discussion] Created NumPy 1.7.x branch
Charles R Harris
Fri Jun 22 22:14:05 CDT 2012
On Fri, Jun 22, 2012 at 2:42 PM, Travis Oliphant <firstname.lastname@example.org> wrote:
> The usual practice is to announce a schedule first.
> I just did announce the schedule.
What has been done in the past is that an intent to fork is announced some
two weeks in advance so that people can weigh in on what needs to be done
before the fork. The immediate fork was a bit hasty. Likewise, when I
suggested going to the github issue tracking, I opened a discussion on
needed tags, but voila, there it was with an incomplete set and no
discussion. That too seemed hasty.
>> There is time before the first Release candidate to make changes on the
>> 1.7.x branch. If you want to make the changes on master, and just
>> indicate the Pull requests, Ondrej can make sure they are added to the
>> 1.7.x branch by Monday. We can also delay the first Release Candidate
>> by a few days to next Wednesday and then bump everything 3 days if that
>> will help. There will be a follow-on 1.8 release before the end of the
>> year --- so there is time to make changes for that release as well. The
>> next release will not take a year to get out, so we shouldn't feel
>> pressured to get *everything* in this release.
> What are we going to do for 1.8?
> Let's get 1.7 out the door first.
Mark proposed a schedule for the next several releases, I'd like to know if
we are going to follow it.
> Yes, the functions will give warnings otherwise.
> I think this needs to be revisited. I don't think these changes are
> necessary for *every* use of macros. It can cause a lot of effort for
> people downstream without concrete benefit.
The idea is to slowly move towards hiding the innards of the array type.
This has been under discussion since 1.3 came out. It is certainly the case
that not all macros need to go away.
>> That's not as nice to type.
> So? The point is to have correctness, not ease of typing.
> I'm not sure if a pun was intended there or not. C is not a safe and
> fully-typed system. That is one of its weaknesses according to many.
> But, I would submit that not being forced to give everything a "type" (and
> recognizing the tradeoffs that implies) is also one reason it gets used.
C was famous for bugs due to the lack of function prototypes. This was
fixed with C99 and the stricter typing was a great help.
>> Is that assuming that PyArray_NDIM will become a function and need a
>> specific object type for its argument (and everything else cast....).
>> That's one clear disadvantage of inline functions versus macros in my mind:
>> no automatic polymorphism.
> That's a disadvantage of Python. The virtue of inline functions is
> precisely type checking.
> Right, but we need to be more conscientious about this. Not every use of
> Macros should be replaced by inline function calls and the requisite
> *forced* type-checking. Type-checking is not *universally* a virtue ---
> if it were, nobody would use Python.
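
The trade-off argued above (a casting macro accepts any object pointer, while an inline function makes the compiler check the argument type) can be sketched as follows. This is a minimal illustration only: `demo_object`, `demo_array`, `DEMO_NDIM`, and `demo_ndim` are hypothetical stand-ins, not NumPy's actual types or headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a generic object and an array object.
 * These are assumptions for illustration, not NumPy's real layout. */
typedef struct { int refcount; } demo_object;
typedef struct { demo_object base; int nd; } demo_array;

/* Macro style: casts whatever pointer it is handed, so the same spelling
 * works for demo_object* and demo_array*, with no type checking at all. */
#define DEMO_NDIM(obj) (((demo_array *)(obj))->nd)

/* Inline-function style: the compiler checks the argument type. */
static inline int demo_ndim(const demo_array *arr)
{
    assert(arr != NULL);
    return arr->nd;
}

/* A caller holding only a generic pointer can use the macro directly. */
static int ndim_via_macro(demo_object *obj)
{
    return DEMO_NDIM(obj);
}

/* With the function, writing demo_ndim(obj) here would draw an
 * incompatible-pointer-type diagnostic, so the caller must add a cast. */
static int ndim_via_function(demo_object *obj)
{
    return demo_ndim((const demo_array *)obj);
}
```

Downstream code written like `ndim_via_macro` is what would need touching if the macro became a typed function; whether the added checking is worth that edit is the point in dispute.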
>> I don't think type safety is a big win for macros like these. We need
>> to be more judicious about which macros are scheduled for function
>> inlining. Some just don't benefit from the type-safety implications as
>> much as others do, and you end up requiring everyone to change their code
>> downstream for no real reason.
>> These sorts of changes really feel to me like unnecessary spelling
>> changes that require work from extension writers who now have to modify
>> their code with no real gain. There seems to be a lot of that going on in
>> the code base and I'm not really convinced that it's useful for end-users.
> Good style and type checking are useful. Numpy needs more of both.
> You can assert it, but it doesn't make it so. "Good style" depends on
> what you are trying to accomplish and on your point of view. NumPy's style
> is not the product of one person, it's been adapted from multiple styles
> and inherits quite a bit from Python's style. I don't make any claims for
> it other than it allowed me to write it with the time and experience I had
> 7 years ago. We obviously disagree about this point. I'm sorry about
> that. I'm pretty flexible usually --- that's probably one of your big
> criticisms of my "style".
Curiously, my criticism would be more that you are inflexible, slow to
change old habits.
> But, one of the things I feel quite strongly about is how hard we make it
> for NumPy users to upgrade. There are two specific things I disagree
> with pretty strongly:
> 1) Changing defined macros that should work the same on PyArrayObjects or
> PyObjects to now *require* types --- if we want to introduce new macros
> that require types then we can --- as long as it just provides warnings but
> still compiles then I suppose I could find this acceptable.
> 2) Changing MACROS to require semicolons when they were previously not
> needed. I'm going to be very hard-nosed about this one.
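
To make point 2 concrete, here is a small sketch of the two macro styles being contrasted. The macro names are hypothetical and chosen for illustration; they are not the macros that were changed in NumPy.

```c
#include <assert.h>
#include <stddef.h>

/* Old style (hypothetical): the macro carries its own semicolon,
 * so existing call sites may legally omit one. */
#define DEMO_INCR_OLD(x) (x)++;

/* New style (hypothetical): expands to exactly one statement and
 * relies on the caller to supply the semicolon. */
#define DEMO_INCR_NEW(x) do { (x)++; } while (0)

static int count_old_style(int n)
{
    int hits = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        DEMO_INCR_OLD(hits)   /* no semicolon: fine with the old macro,
                                 a syntax error with the new one */
    }
    return hits;
}

static int count_even_new_style(int n, int *odd_out)
{
    int even = 0, odd = 0;
    assert(n >= 0 && odd_out != NULL);
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0)
            DEMO_INCR_NEW(even);  /* "DEMO_INCR_OLD(even);" here would expand
                                     to two statements and orphan the else */
        else
            DEMO_INCR_NEW(odd);
    }
    *odd_out = odd;
    return even;
}
```

The new form behaves like an ordinary statement inside unbraced `if`/`else`, which is the usual argument for it. The cost is that any call site written like the loop in `count_old_style` stops compiling once the macro under the old name changes its expansion.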
>> I'm going to be a lot more resistant to that sort of change in the code
>> base when I see it.
> Numpy is a team effort. There are people out there who write better code
> than you do, you should learn from them.
> Exactly! It's a team effort. I'm part of that team as well, and while I
> don't always have strong opinions about things, when I do, I'm going to
> voice them.
> I've learned long ago there are people that write better code than me.
> There are people that write better code than you.
Of course. Writing code is not my profession, and even if it were, there
are people out there who would be immeasurably better. I have tried to
improve my style over the years by reading books and browsing code by
people who are better than me. I also recognize common bad habits naive
coders tend to pick up when they start out, not least because I have at one
time or another had many of the same bad habits.
> That is not the question here at all. The question here is not
> requiring a *re-write* of code in order to get their extensions to compile
> using NumPy headers. We should not be making people change their code to
> get their extensions to compile in NumPy 1.X
I think a bit of rewrite here and there along the way is more palatable
than a big change coming in as one big lump, especially if the changes are
done with a long term goal in mind. We are working towards a Numpy 2, but
we can't just go off for a year or two and write it, we have to get there
step by step. And that requires a plan.
>> One particularly glaring example to my lens on the world: I think it
>> would have been better to define new macros which require semicolons than
>> changing the macros that don't require semicolons to now require them.
>> That feels like a gratuitous style change that will force users of those
>> macros to re-write their code.
> It doesn't seem to be much of a problem.
> Unfortunately, I don't trust your judgment on that. My experience and
> understanding tells a much different story. I'm sorry if you disagree
> with me.
I'm sorry I made you sorry ;) The problem here is that you don't come forth
with specifics. People tell you things, but you don't say who or what their
specific problem was. Part of working with a team is keeping folks
informed, it isn't that useful to appeal to authority. I watch the list,
which is admittedly a small window into the community, and I haven't seen
show stoppers. Bugs, sure, but that isn't the same thing.
>> Sure, it's a simple change, but it's a simple change that doesn't do
>> anything for you as an end user. I think I'm going to back this change
>> out, in fact. I can't see requiring people to change their C-code like
>> this will require without a clear benefit to them. I'm quite sure there
>> is code out there that uses these documented APIs (without the semicolon).
>> If we want to define new macros that require semicolons, then we do that, but
>> we can't get rid of the old ones --- especially in a 1.x release.
>> Our policy should not be to allow gratuitous style changes just because
>> we think something is prettier another way. The NumPy code base has come
>> from multiple sources and reflects several styles. It also follows an
>> older style of C-programming (that is quite common in the Python code
>> base). It can be changed, but those changes shouldn't be painful for a
>> library user without some specific gain for them that the change allows.
> You use that word 'gratuitous' a lot, I don't think it means what you
> think it means. For instance, the new polynomial coefficient order wasn't
> gratuitous, it was doing things in a way many found more intuitive and
> generalized better to different polynomial bases. People
> have different ideas, that doesn't make them gratuitous.
> That's a slightly different issue. At least you created a new object
> and api which is a *little* better. My complaint about the choice there
> is now there *must* be two interfaces and added confusion as people will
> have to figure out which assumption is being used. I don't really care
> about the coefficient order --- really I don't. Either one is fine in my
> mind. I recognize the reasons. The problem is *changing* it without a
> *really* good reason. Now, we have to have two different APIs. I would
> have much preferred to have poly1d disappear and just use your much nicer
> polynomial classes. Now, it can't and we are faced with a user-story
> that is either difficult for someone transitioning from MATLAB
Most folks aren't going to transition from MATLAB or IDL. Engineers tend to
stick with the tools they learned in school, they aren't interested in the
tool itself as long as they can get their job done. And getting the job
done is what they are paid for. That said, I doubt they would have much
problem making the adjustment if they were inclined to switch tools.
> or a "why did you do that?" puzzled look from a new user as to why we
> support both coefficient orders. Of course, that could be our story ---
> hey we support all kinds of orders, it doesn't really matter, you just have
> to tell us what you mean when passing in an unadorned array of
> coefficients. But, this is a different issue.
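
The ambiguity of an "unadorned array of coefficients" can be shown in a few lines. The sketch below is illustrative and uses hypothetical function names: it evaluates the same array under both conventions, highest degree first (as `poly1d` reads it) and lowest degree first (as the newer polynomial classes read it).

```c
#include <assert.h>
#include <stddef.h>

/* Highest-degree coefficient first:
 * c[0]*x^(n-1) + c[1]*x^(n-2) + ... + c[n-1]. */
static double eval_high_first(const double *c, size_t n, double x)
{
    double acc = 0.0;
    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        acc = acc * x + c[i];       /* Horner's rule, left to right */
    return acc;
}

/* Lowest-degree coefficient first:
 * c[0] + c[1]*x + ... + c[n-1]*x^(n-1). */
static double eval_low_first(const double *c, size_t n, double x)
{
    double acc = 0.0;
    assert(c != NULL || n == 0);
    for (size_t i = n; i-- > 0; )
        acc = acc * x + c[i];       /* Horner's rule, right to left */
    return acc;
}
```

For the array `{1, 2, 3}` at `x = 2`, the first reading is x^2 + 2x + 3 and gives 11, while the second is 1 + 2x + 3x^2 and gives 17. Nothing in the array itself says which one was intended, which is why the caller has to state the convention.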
> I'm using the word 'gratuitous' to mean that it is "uncalled for and lacks
> a good reason". There needs to be much better reasons given for code
> changes that require someone to re-write working code than "it's better
> style" or even "it will help new programmers avoid errors". Let's write
> another interface that new programmers can use that fits the world the way
> you see it, don't change what's already working just because you don't like
> it or wish a different choice had been made.
Well, and that was exactly what you meant when you called the coefficient
order 'gratuitous' in your first post to me about it. The problem was that
you didn't understand why I made the change until I explained it, but
rather made the charge sans explanation. It might be that some of the other
things you call gratuitous are less so than you think. These are hasty
judgements I think.
>> There are significant users of NumPy out there still on 1.4. Even the
>> policy of deprecation that has been discussed will not help people trying
>> to upgrade from 1.4 to 1.8. They will be forced to upgrade multiple
>> times. The easier we can make this process for users the better. I
>> remain convinced that it's better and am much more comfortable with making
>> a release that requires a re-compile (that will succeed without further
>> code changes --- because of backward compatibility efforts) than to have
>> supposed ABI compatibility with subtle semantic changes and required C-code
>> changes when you do happen to re-compile.
> Cleanups need to be made bit by bit. I don't think we have done anything
> that will cause undue trouble.
> I disagree substantially on the impact of these changes. You can disagree
> about my awareness of NumPy users, but I think I understand a large number
> of them and why NumPy has been successful in getting users. I agree that
> we have been unsuccessful at getting serious developers and I'm convinced
> by you and Mark as to why that is. But, we can't sacrifice users for the
> sake of getting developers who will spend their free time trying to get
> around the organic pile that NumPy is at this point.
> Because of this viewpoint, I think there is some adaptation and cleanup
> right now, needed, so that significant users of NumPy can upgrade based on
> the changes that have occurred without causing them annoying errors (even
> simple changes can be a pain in the neck to fix).
> I do agree changes can be made. I realize you've worked hard to keep
> the code-base in a state that you find more adequate. I think you go
> overboard on that front, but I acknowledge that there are people that
> appreciate this. I do feel very strongly that we should not require
> users to have to re-write working C-code in order to use a new minor
> version number in NumPy, regardless of how the code "looks" or how much
> "better" it is according to some idealized standard.
> The macro changes are border-line (at least I believe code will still
> compile --- just raise warnings, but I need to be sure about this). The
> changes that require semi-colons are not acceptable at all.
I was tempted to back them out myself, but I don't think the upshot will be
> Look Charles, I believe we can continue to work productively together and
> our differences can be a strength to the community. I hope you feel the
> same way. I will continue to respect and listen to your perspective ---
> especially when I disagree with it.
Sounds like a threat to me. Who are you to judge? If you are going to be
the dictator, let's put that out there and make it official.