[Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions

Raul Cota raul@virtualmaterials....
Fri Jan 4 13:10:02 CST 2013


In my previous email I sent an image, but it occurred to me that the 
mailing list may not accept attachments, or may need approval for them.

I put a couple of images from my profiling results (the ones referenced 
in my previous email) here.


Sorted by time per function with a graph of calls at the bottom

http://raul-playground.appspot.com/static/images/numpy-profile-time.png


Sorted by Time with Children
http://raul-playground.appspot.com/static/images/numpy-profile-timewchildren.png


The test is a loop of
val = float64 * float64 * float64 * float64
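
For anyone who wants to time a loop of this kind without a C profiler, 
here is a minimal sketch. It is not the profiling setup used for the 
images above; it assumes a current Python 3 and NumPy install, and the 
helper name time_product and the repeat count are made up for 
illustration.

```python
import timeit

import numpy as np


def time_product(x, number=100000):
    """Return the seconds taken by `number` evaluations of x * x * x * x.

    Hypothetical helper: the name and the default repeat count are
    illustrative assumptions, not taken from the original test.
    """
    return timeit.timeit("x * x * x * x", globals={"x": x}, number=number)


# Same expression, once with a numpy scalar and once with a Python float.
np_time = time_product(np.float64(1.2345))
py_time = time_product(1.2345)

print("numpy float64: %.4f s" % np_time)
print("python float : %.4f s" % py_time)
```

The absolute numbers depend on the machine and the NumPy version, so 
only the ratio between the two timings is worth comparing.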


Raul




On 02/01/2013 7:56 AM, Nathaniel Smith wrote:
> On Fri, Dec 21, 2012 at 7:20 PM, Raul Cota <raul@virtualmaterials.com> wrote:
>> Hello,
>>
>>
>> On Dec/2/2012 I sent an email about some meaningful speed problems I was
>> facing when porting our core program from Numeric (Python 2.2) to Numpy
>> (Python 2.6). Some of our tests went from 30 seconds to 90 seconds for
>> example.
>
> Hi Raul,
>
> This is great work! Sorry you haven't gotten any feedback yet -- I
> guess it's a busy time of year for most people; and, the way you've
> described your changes makes it hard for us to use our usual workflow
> to discuss them.
>
>> These are the actual changes to the C code,
>> For bottleneck (a)
>>
>> In general,
>> - avoid calls to PyObject_GetAttrString when I know the type is
>> List, None, Tuple, Float, Int, String or Unicode
>>
>> - avoid calls to PyObject_GetBuffer when I know the type is
>> List, None or Tuple
>
> This definitely seems like a worthwhile change. There are possible
> quibbles about coding style -- the macros could have better names, and
> would probably be better as (inline) functions instead of macros --
> but that can be dealt with.
>
> Can you make a pull request on github with these changes? I guess you
> haven't used git before, but I think you'll find it makes things
> *much* easier (in particular, you'll never have to type out long
> awkward English descriptions of the changes you made ever again!) We
> have docs here:
>    http://docs.scipy.org/doc/numpy/dev/gitwash/git_development.html
> and your goal is to get to the point where you can file a "pull request":
>    http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#asking-for-your-changes-to-be-merged-with-the-main-repo
> Feel free to ask on the list if you get stuck of course.
>
>> For bottleneck (b)
>>
>> b.1)
>> I noticed that PyFloat * Float64 resulted in an unnecessary "on the fly"
>> conversion of the PyFloat into a Float64 to extract its underlying C
>> double value. This happened in the function
>> _double_convert_to_ctype which comes from the pattern,
>> _@name@_convert_to_ctype
>
> This also sounds like an excellent change, and perhaps should be
> extended to ints and bools as well... again, can you file a pull
> request?
>
>> b.2) This is the change that may not be very popular among Numpy users.
>> I modified Float64 operations to return a Float instead of Float64. I
>> could not think of or see any ill effects and I got a fairly decent speed
>> boost.
>
> Yes, unfortunately, there's no way we'll be able to make this change
> upstream -- there's too much chance of it breaking people's code. (And
> numpy float64's do act different than python floats in at least some
> cases, e.g., numpy gives more powerful control over floating point
> error handling, see np.seterr.)
>
> But, it's almost certainly possible to optimize numpy's float64 (and
> friends), so that they are themselves (almost) as fast as the native
> python objects. And that would help all the code that uses them, not
> just the ones where regular python floats could be substituted
> instead. Have you tried profiling, say, float64 * float64 to figure
> out where the bottlenecks are?
>
> -n
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
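
On the np.seterr point in the quoted message, here is a small sketch of 
one case where float64 and Python floats behave differently: division 
by zero. It uses np.errstate, the context-manager form of np.seterr. 
The function name divide_by_zero_behaviour is made up for illustration, 
and this is only one example of the difference, not a full list.

```python
import numpy as np


def divide_by_zero_behaviour():
    """Collect how 1.0 / 0.0 behaves for Python floats and numpy float64.

    Hypothetical helper for illustration only.
    """
    results = {}

    # A Python float always raises on division by zero.
    try:
        1.0 / 0.0
    except ZeroDivisionError:
        results["python"] = "ZeroDivisionError"

    # A numpy float64 follows the floating point error settings.
    # With divide="ignore" it silently returns inf.
    with np.errstate(divide="ignore"):
        results["numpy_ignore"] = np.float64(1.0) / np.float64(0.0)

    # With divide="raise" the same operation raises FloatingPointError.
    with np.errstate(divide="raise"):
        try:
            np.float64(1.0) / np.float64(0.0)
        except FloatingPointError:
            results["numpy_raise"] = "FloatingPointError"

    return results


print(divide_by_zero_behaviour())
```

Code that relies on these settings would see different behaviour if 
float64 operations started returning plain Python floats.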

