[Numpy-discussion] Linker script, smaller source files and symbol visibility

David Cournapeau cournape@gmail....
Wed Apr 22 01:47:08 CDT 2009


On Wed, Apr 22, 2009 at 2:24 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Mon, Apr 20, 2009 at 11:06 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
>>
>>
>> On Mon, Apr 20, 2009 at 10:13 PM, David Cournapeau
>> <david@ar.media.kyoto-u.ac.jp> wrote:
>>>
>>> Charles R Harris wrote:
>>>
>>> >
>>> > Here is a link to the start of the old discussion
>>> >
>>> > <http://article.gmane.org/gmane.comp.python.numeric.general/12974/match=exported+symbols+code+reorganization>.
>>> > You took part in it also.
>>>
>>> Thanks, I remembered we had the discussion, but could not find it. The
>>> difference is that I am much more familiar with the technical details and
>>> numpy codebase now :) I know how to control exported symbols on most
>>> platforms that matter (I can't test for AIX or HP-UX unfortunately - but
>>> I am perfectly fine with ignoring namespace pollution on those anyway),
>>> and I would guess that the only platforms which do not support symbol
>>> visibility in one way or the other do not support shared libraries anyway
>>> (some CRAY stuff, for example).
>>>
>>> Concerning the file size, I don't think anyone would disagree that they
>>> are too big, but we don't need to go the "java-way" of one
>>> file/class-function either. One first split which I personally like is
>>> API/implementation. For example, for multiarray.c, we would only keep
>>> the public PyArray_* functions, and put everything else in another file.
>>> The other very big file is arrayobject.c, and this one is already mostly
>>> organized in independent parts (buffer protocol, number protocol, etc.).
>>>
>>> Another thing I would like to do is to make the global C API array
>>> pointer a 'true' global variable instead of a static one. It took me a
>>> while when I was working on the hashing protocol for dtype to understand
>>> why it was crashing (the array pointer being static, every file has its
>>> own copy, so it was never initialized in the hashdescr.c file). I think
>>> a true global variable, hidden through a symbol map, is easier to
>>> understand and more reliable.
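The static-versus-global point above can be illustrated with a minimal sketch. Everything in it is hypothetical (demo_fn, demo_table, file_a_api, file_b_api, demo_shared_api and the helper functions are made-up names, not numpy's), and the two "files" are simulated inside a single translation unit: each static variable stands in for the private copy that a separate .c file would get from a header declaring the pointer static.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the C API function table. */
typedef int (*demo_fn)(int, int);

static int demo_add(int a, int b) { return a + b; }
static demo_fn demo_table[] = { demo_add };

/* Static variant: a header declaring `static demo_fn *api;` gives every
 * .c file that includes it a private copy. The two variables below
 * simulate the copies seen by two different files. */
static demo_fn *file_a_api = NULL;  /* copy in the file that runs the import */
static demo_fn *file_b_api = NULL;  /* copy in a file that never runs it */

/* Runs only in "file A", so only file A's copy gets set. */
void file_a_import(void) { file_a_api = demo_table; }

int file_a_ready(void) { return file_a_api != NULL; }
int file_b_ready(void) { return file_b_api != NULL; }

/* Global variant: defined once in one .c file and declared `extern` in
 * the header, so every file refers to the same object. */
demo_fn *demo_shared_api = NULL;

void shared_import(void) { demo_shared_api = demo_table; }

/* Code living in "file B" that goes through the shared pointer. */
int file_b_call_shared(int a, int b)
{
    assert(demo_shared_api != NULL);  /* the import must have run first */
    return demo_shared_api[0](a, b);
}
```

After file_a_import(), file A's copy is set while file B's is still NULL, which is the kind of crash described above. After shared_import(), code in any file can call through demo_shared_api. In a real extension the shared pointer would additionally need to be kept out of the exported symbol list.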
>>
>> I made an experiment along those lines a couple of years ago. There were
>> compilation problems because the needed include files weren't available. No
>> doubt that could be fixed in the build, but at some point I would like to
>> have real include files, not the generated variety. Generated include files
>> are kind of bogus IMHO, as they don't define an interface but rather reflect
>> whatever the function definition happens to be. So as part of any split I
>> would also suggest writing the associated include files. That would also
>> make separate compilation possible, which would make it easier to do test
>> compilations while doing development.
>
> The list of visible symbols has grown ;)

Yes. Apart from PyArray_DescrHash, which is a mistake of my own, the new
ones are all npy_* symbols, and there is nothing we can do about those
ATM because they come from a pure C (static) library. That's one of the
rationales in the original email :)
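
For readers unfamiliar with symbol visibility, here is a minimal sketch of the per-symbol mechanism on GCC-compatible compilers. The names DEMO_VISIBLE, DEMO_HIDDEN, demo_internal_helper and DemoArray_Double are hypothetical and are not part of numpy. The sketch only covers symbols compiled directly into a shared object. It does not address the npy_* static library case mentioned above. Other options with a similar effect are compiling with -fvisibility=hidden or passing a GNU ld version script.

```c
/* Visibility attributes are available on GCC >= 4 and compatible
 * compilers; elsewhere the macros expand to nothing in this sketch. */
#if defined(__GNUC__) && __GNUC__ >= 4
#define DEMO_VISIBLE __attribute__((visibility("default")))
#define DEMO_HIDDEN  __attribute__((visibility("hidden")))
#else
#define DEMO_VISIBLE
#define DEMO_HIDDEN
#endif

/* Implementation detail: callable from any file of the same shared
 * object, but not listed among its exported dynamic symbols. */
DEMO_HIDDEN int demo_internal_helper(int x)
{
    return x * 2;
}

/* Public entry point: stays in the exported symbol list. */
DEMO_VISIBLE int DemoArray_Double(int x)
{
    return demo_internal_helper(x);
}
```

Unlike static, a hidden function can still be shared between several source files of the same extension, which is what makes splitting large files possible without growing the list of exported symbols.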

David

