[Numpy-discussion] [help needed] associativity and precedence of '@'

josef.pktd@gmai... josef.pktd@gmai...
Thu Mar 20 12:43:37 CDT 2014


On Thu, Mar 20, 2014 at 1:25 PM, Nathaniel Smith <njs@pobox.com> wrote:

> On Wed, Mar 19, 2014 at 7:45 PM, Nathaniel Smith <njs@pobox.com> wrote:
>
>> Okay, I wrote a little script [1] to scan Python source files and look for
>> things like 'dot(a, dot(b, c))' or 'dot(dot(a, b), c)', or the ndarray.dot
>> method equivalents. So what we get out is:
>> - a count of how many 'dot' calls there are
>> - a count of how often we see left-associative nestings: dot(dot(a, b), c)
>> - a count of how often we see right-associative nestings: dot(a, dot(b, c))
>>
>> Running it on a bunch of projects, I get:
>>
>> | project      | dots | left | right | right/left |
>> |--------------+------+------+-------+------------|
>> | scipy        |  796 |   53 |    27 |       0.51 |
>> | nipy         |  275 |    3 |    19 |       6.33 |
>> | scikit-learn |  472 |   11 |    10 |       0.91 |
>> | statsmodels  |  803 |   46 |    38 |       0.83 |
>> | astropy      |   17 |    0 |     0 |        nan |
>> | scikit-image |   15 |    1 |     0 |       0.00 |
>> |--------------+------+------+-------+------------|
>> | total        | 2378 |  114 |    94 |       0.82 |
>>
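The counting described above can be sketched roughly as follows. This is not
the script referenced as [1], only a minimal illustration using the standard
`ast` module; the helper names `is_dot_call` and `count_dot_nestings` are
made up for this example, and it only handles the two-argument function form
(`dot(a, b)` or `np.dot(a, b)`), not the `a.dot(b)` method form.

```python
import ast


def is_dot_call(node):
    """True for calls like dot(a, b) or np.dot(a, b) with two positional args."""
    if not isinstance(node, ast.Call) or len(node.args) != 2:
        return False
    func = node.func
    # Plain name (dot) or attribute access (np.dot, numpy.dot, ...).
    name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
    return name == "dot"


def count_dot_nestings(source):
    """Count dot calls and direct left/right nestings in Python source text."""
    counts = {"dots": 0, "left": 0, "right": 0}
    for node in ast.walk(ast.parse(source)):
        if not is_dot_call(node):
            continue
        counts["dots"] += 1
        first, second = node.args
        if is_dot_call(first):    # dot(dot(a, b), c)
            counts["left"] += 1
        if is_dot_call(second):   # dot(a, dot(b, c))
            counts["right"] += 1
    return counts


sample = """
import numpy as np
a = np.dot(np.dot(x, y), z)
b = np.dot(x, np.dot(y, z))
c = dot(x, y)
"""
result = count_dot_nestings(sample)
print(result)
```

The sample source is only parsed, never executed, so the undefined names in
it do not matter.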
>
> Another way to visualize this, converting each contiguous "chain" of calls
> to np.dot into a parenthesized expression, and then counting how often we
> see each pattern.
>
>       1943  (_ @ _)
>        100  ((_ @ _) @ _) # left
>         86  (_ @ (_ @ _)) # right
>          2  (_ @ ((_ @ _) @ _))
>          2  (((_ @ _) @ _) @ _) # left
>          1  ((_ @ (_ @ _)) @ _)
>          1  ((_ @ _) @ (_ @ _))
>          1  (((_ @ _) @ _) @ (_ @ _))
>          1  ((_ @ ((_ @ _) @ _)) @ _)
>          1  ((_ @ _) @ (_ @ (_ @ _)))
>
> (This is pooling scipy/nipy/scikit-learn/statsmodels.) I've noted the 3
> different patterns that have a consistent associativity.
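One way to produce this kind of pattern tally, as a hedged sketch rather than
the code actually used for the numbers above: treat each outermost dot call as
the root of a chain, render it recursively with `_` for every non-dot operand,
and count the resulting strings. The names `pattern` and `chain_patterns` are
hypothetical, and only the two-argument function form of dot is recognized.

```python
import ast
from collections import Counter


def is_dot_call(node):
    """True for calls like dot(a, b) or np.dot(a, b) with two positional args."""
    if not isinstance(node, ast.Call) or len(node.args) != 2:
        return False
    func = node.func
    name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
    return name == "dot"


def pattern(node):
    """Render a chain of directly nested dot calls as a parenthesized @ pattern."""
    if is_dot_call(node):
        left, right = node.args
        return "(%s @ %s)" % (pattern(left), pattern(right))
    return "_"  # anything that is not a dot call is an opaque operand


def chain_patterns(source):
    """Tally the pattern of every contiguous chain of dot calls in the source."""
    tree = ast.parse(source)
    # Dot calls that are direct arguments of another dot call belong to that
    # outer chain, so they must not be counted as chains of their own.
    nested = set()
    for node in ast.walk(tree):
        if is_dot_call(node):
            nested.update(id(arg) for arg in node.args if is_dot_call(arg))
    return Counter(
        pattern(node)
        for node in ast.walk(tree)
        if is_dot_call(node) and id(node) not in nested
    )


sample = """
a = dot(dot(x, y), z)
b = dot(x, dot(y, z))
c = dot(x, y)
d = np.dot(p, q)
"""
patterns = chain_patterns(sample)
for text, n in patterns.most_common():
    print("%6d  %s" % (n, text))
```

A dot call wrapped in some other function, as in `dot(f(dot(a, b)), c)`, starts
a new chain under this definition, which is one possible reading of
"contiguous".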
>
> From this I'm leaning towards the conclusions that:
>
> - Expressions with complex parenthesization do happen, but probably not
> often enough to justify elaborate stuff like my 'chaining' proposal -- only
> 8.7% of these cases involve more than one @.
>

Just for statsmodels:

We do have a very large amount of chaining, but in many cases parts of the
chain have been taken out of a single expression into a temporary or
permanent variable (similar to the quadratic form example in the PEP):
either for clarity (a temp variable), because one dot product shows up
several times in the same expression (quadratic forms), or because we need
to keep it around for reuse in other expressions.
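
A small illustration of the quadratic form case, as a sketch only. To keep it
free of dependencies, a tiny pure-Python `dot` stands in for `np.dot`, and
`transpose`, `x`, `b` and `w` are made-up names and values, not statsmodels
code.

```python
def dot(a, b):
    """Matrix product of two nested lists; a stand-in for np.dot."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose of a nested-list matrix; a stand-in for the .T attribute."""
    return [list(col) for col in zip(*a)]


x = [[1, 2], [3, 4], [5, 6]]             # 3x2 design matrix
b = [[1], [-1]]                          # 2x1 coefficient vector
w = [[2, 0, 0], [0, 1, 0], [0, 0, 3]]    # 3x3 weight matrix

# Everything in one expression: dot(x, b) has to be written (and computed) twice.
q_inline = dot(dot(transpose(dot(x, b)), w), dot(x, b))

# The repeated product pulled out into a named temporary, which is the kind
# of split an infix operator would not make unnecessary.
xb = dot(x, b)
q_named = dot(dot(transpose(xb), w), xb)

print(q_inline, q_named)
```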

That's what I tried to explain before: chaining and breaking up larger
multi-dot expressions is most of the time an intentional choice, and not just
something random that the dot function forces on us.

The most convincing argument for @, for me, is that it makes the parentheses
visible (until I realized that I didn't really care about @).
This reduces the cases where we separate out a dot product for clarity and
readability, but still leaves us with the other two cases, where our
chaining won't change, whatever numpy provides in addition.

Josef



>
> - There's very little support here for the intuition that
> right-associativity is more useful than left-associativity on a day-to-day
> basis.
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org
>