Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upA new PEP for infix matrix multiplication #4351
Conversation
| rejected by BDFL fiat, this would be using a sledgehammer to smash | ||
| a fly. There is a strong consensus in the scientific python | ||
| community that matrix multiplication really is the only missing | ||
| infix operator that matters enough to bother about. (In |
efiring
Feb 22, 2014
Contributor
I would really like to see && and || for the element-wise logical operations; faking them with the bitwise operators and parentheses is ugly.
pv
Feb 22, 2014
Member
The overloadable boolean ops PEP was recently rejected: https://mail.python.org/pipermail/python-dev/2012-March/117510.html
I think it's best to not divert attention to that topic, but just focus on the matmul operation.
| ``>>=`` 0 0 0 0 | ||
| ======= ======= ======= ======= ======== | ||
|
|
||
| We see that sklearn and nipy together contain nearly 700 uses of |
jtratner
Feb 22, 2014
This table is slightly confusing (especially the combined column) - could you label it to say something like "Instances per 10K SLOC" at the top? It gets a little buried in the previous paragraph. I might also suggest sorting it by sklearn to drive home the point about dot products.
mforbes
Feb 23, 2014
Contributor
I also found this table very confusing. When scanning a document like this, one often skips straight to tables, and without a caption, my first reaction was that there are only 80 uses of dot (which of course makes no sense). Is there a way of adding a "caption" or similar?
Since the point of this discussion dot compared to the rest, I would put the dot row at the top. Of course it does not sort there, but it should be clear that this is what we are comparing too. (It can be left in its sorted order too.)
pv
Feb 23, 2014
Member
Should the table be restricted only to binary ops (and inplace ops). (, { are not binary ops and having fewer entries in the table would make it easier to read.
njsmith
Feb 24, 2014
Author
Member
Check out the table again, I think I handled everyone's concerns. (Partly by giving up on using the ReST table support, which has no way to get readable alignment.)
|
|
|
Hey @njsmith, we had some discussions ages ago at SciPy'07 or '08, I did my best to summarize them here. That stuff is in an old bazaar repo, but you can get the reST sources here. I never really followed this up beyond summarizing the conversations in that document. Just mentioning it here in case any of that ancient history is useful, feel free to ignore it in light of the last 5-6 years' worth of experience :) |
|
BTW, there's a trivial makefile too that I was using to make rendered versions of the rst sources; it would be handy to have such a view somewhere, as it does make it much nicer to read. |
|
|
||
| The main motivation for this PEP is the addition of a binary operator | ||
| ``@`` for matrix multiplication. No-one cares terribly much about the | ||
| matrix power operator ``@@`` -- it's useful and well-defined, but not |
pv
Feb 22, 2014
Member
I'd make the discussion of @@ much less prominent.
It can be mentioned somewhere toward the end, but I think it should not be proposed in the PEP.
shoyer
Feb 23, 2014
Member
I agree. Including @@ in this PEP is a distraction which makes it marginally more likely to be rejected. In my experience, matrix powers with arbitrary bases are rare (usually the matrix exponential is preferred).
njsmith
Feb 24, 2014
Author
Member
I moved the motivation of @@ to the end, but my feeling is that it can't hurt and might help. Remember, we're not trying to ask the BDFL for some favor -- we're helping him see the facts about how much more elegant and nice Python will become after making these changes. Suggesting that we have @, *, and ** but not @@ is actually asymmetric and inelegant.
And anyway, if he likes one half of a PEP but not the other than he'll just tell us to change it. It's still clear from the motivation that we'd be totally happy with just @.
| single duck type for all matrix-like objects. | ||
|
|
||
|
|
||
| Summary |
pv
Feb 23, 2014
Member
I think this summary would be better to put before the detailed items --- on top of the motivation section.
| among the numerical community that the best solution is to add a | ||
| single infix operator for matrix multiply (together with any other new | ||
| operators this implies like ``@=``). | ||
|
|
pv
Feb 23, 2014
Member
I wonder if we need to counter "too late, after all these years".
One argument is that adoption of Python3 in the numerical world is still not yet so big.
There's still a time window left to to introduce this feature.
njsmith
Feb 24, 2014
Author
Member
IME that's not usually an argument that people make -- otherwise it would apply to every change anyone ever considered making to Python. Why did we do this back in 1.5? :-) And it's obvious to everyone that Python's use in numerics has grown hugely over the last 5 years...
| matrix``; ``scalar * matrix`` is well-defined, but ``scalar @ | ||
| matrix`` should raise an error. | ||
|
|
||
| Therefore, we implement these methods so that numbers.Number objects |
mforbes
Feb 23, 2014
Contributor
Should this not be numbers.Numbers (as code? It looked like strange grammer with a missing space after the "period" when I first read this.) Similarly with NotImplemented below etc.
|
+1 from me. Nice work. Thanks for putting this together. |
|
This leaves out things like cross product and matrix division. What about using something like @ as a prefix for an operator to define it as a matrix operation? So @, @/, @__. You could then use @@ for the "reverse" version of that operator, so @@ would be cross product, @@/ would be back division. |
|
@toddrjen: the suggestion of using a prefix for multiple operators is similar to PEP-225, and it has not made progress. Also, I think cross product and matrix division are below in priority to matrix multiplication in necessity. |
|
Probably, but I would say matrix exponent is below all of those in importance. I think the argument for including @@ is applicable to other mathematical operators as well, and I would say they are far more useful. So I would either cut out @@, or better justify including it but not any of other operators. |
| more often than ``//`` or other integer operations. | ||
|
|
||
|
|
||
| But isn't it weird to add an operator with no stdlib uses? |
shoyer
Feb 23, 2014
Member
There is precedent for adding syntax to Python not used (to my knowledge) in the stdlib in the form of x[...] which is equivalent to x[Ellipsis].
|
It will be helpful to include a survey of what other languages use. For example, there is a precedent for using |
|
@fperez You can view a |
|
Here's the link to the rendered file: https://github.com/njsmith/numpy/blob/matmul-pep/doc/neps/return-of-revenge-of-matmul-pep.rst |
| When the going gets tough, the tough get empirical. To get a rough | ||
| estimate of how useful the ``@`` operator will be, this table shows | ||
| the rate at which different Python operators are used in the stdlib, | ||
| and also in two high-profile numerical projects -- the sklearn machine |
larsmans
Feb 23, 2014
Contributor
It's called scikit-learn, not sklearn (that's just the package name).
| The ``dot`` row counts matrix multiply operations, estimated by | ||
| assuming there to be zero matrix multiplies in the stdlib, and in | ||
| sklearn/nipy assuming -- reasonably -- that all instances of the token | ||
| ``dot`` are calls to ``np.dot``. |
larsmans
Feb 23, 2014
Contributor
This is an underestimate for scikit-learn. To get a more accurate estimate, include the ~50 calls to safe_sparse_dot, which works around the fact that matrix multiplication is spelled * in scipy.sparse :( and the dozen or so calls to fast_dot.
njsmith
Feb 24, 2014
Author
Member
Awesome, thanks! That bumps dot up above |, which makes the exposition even easier :-)
| In addition, the following tutorials could easily deal with | ||
| matrices: | ||
| * Introduction to game programming |
larsmans
Feb 23, 2014
Contributor
Speaking of which, has anyone been in touch with the PyOpenGL about this PEP?
njsmith
Feb 24, 2014
Author
Member
AFAICT PyOpenGL doesn't actually define a matrix type, it just has a few functions that can accept matrices as arguments (possibly represented as list-of-lists I guess). If you want matrix operators they refer you to numpy.
|
In addition to the APL, Kx, and R syntax given by @abalkin above, I skimmed some from Rosetta Code:
I'd say APL's use of non-ASCII characters was historically unpopular. IDL's hash clashes with Python's comment. J uses an ASCII reworking of APL, I think, so it's got a much more elaborate idea of combining multiple infix operators together; powerful but perhaps not really Pythonic. The dot of Mathematica and Maxima is nice but clashes with Python's object attribute/method/module accessor. Julia, Octave, Pari/GP, and TI-83 BASIC's star is nice but would be backward-incompatible with NumPy's elementwise multiplication; Ruby's star relies on the subclass idea eloquently rebutted by the original poster. R's and K's multicharacter things are pretty ugly. |
| * The mATrices mnemonic is cute. | ||
| * The swirly shape is reminiscent of the simultaneous sweeps over rows | ||
| and columns that define matrix multiplication. | ||
|
|
gdmcbain
Feb 24, 2014
Another mnemonic motivation for the at-sign: it slightly resembles the centred dot (LaTeX \cdot) often used in printed mathematics to represent an algebraic product.
| infix operator. There almost certainly don't exist any other | ||
| binary operations that will ever justify adding another infix | ||
| operator. | ||
|
|
gdmcbain
Feb 24, 2014
What about function composition? Haskell uses dot for this, as in
g . f
I got to that by wondering were there such an operator as proposed, besides making heavy use of its intended purpose, how might I overload it? The first thing that springs to mind is functional composition and application. The fundamental idea of a matrix is as a function on vectors; matrix-matrix multiplication is defined subsequently by associativity. If I had arbitrary compatible functions f and g and an argument x, would I like to write f @ x for f (x) and g @ f for lambda x: g (f (x))? Maybe not the former so much but the latter definitely.
njsmith
Feb 24, 2014
Author
Member
It's an interesting idea, but IMO has about a snowball's chance in a sauna of actually getting added. Python has moved away from functional programming in general, and doesn't even bother to include a compose function in the functools module.
| better than dot(A, B). But it's still much less readable than | ||
| real infix notation, and in particular still suffers from an | ||
| extreme overabundance of parentheses. See `Motivation`_ above. | ||
|
|
gdmcbain
Feb 24, 2014
Another 'alternative to adding a new operator at all': given the interpretation of a matrix multiplication as the operation of a matrix on a vector, it's a lot like a function operating on a vector (that just happens to be linear), so what about defining numpy.ndarray.call as matrix multiplication? That would allow A (B) instead of A.dot(B). It doesn't shed any parentheses but is pretty compact.
An advantage of this approach would be that it wouldn't require changing Python, only NumPy. It can nearly be done in user-code, except that
>>> np.ndarray.__call__ = np.ndarray.dot
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'numpy.ndarray'
njsmith
Feb 24, 2014
Author
Member
I'm going to leave this out unless someone brings it up on python-dev... if you can look at
S = (H(beta) - r).T(inv(H(V)(H.T)))(H(beta) - r)
without your eyes starting to bleed then you have higher ocular robustness than me.
charris
Feb 24, 2014
Member
If you look at the linked past conversations you will see that proposal and some criticisms of it.
|
I would like to see a more complete examination of np.Matrix before There are perhaps warts with np.Matrix, can these be identified and Colin W. On 15-Mar-2014 3:33 PM, Robert Bradshaw wrote:
|
|
@cjw43 There have been several attempts over the years. The PEP acknowledges that history without attempting to recapitulate it, proposal for proposal. The PEP is not the beginning of the examination of the problems with a two-type solution, but the culmination of many years of experience with them (yes, plural). The past threads on the subject can be found in the mailing list archive. By all means, please make concrete proposals if you have them. In the absence of concrete proposals, I think we have a reasonable case not to hold out hope that a good one will show up after all of these years. |
Every user of a library that hijacks these operators needs to be aware of both the hijacked and original meanings. Anybody who is used to a language where the power operator is written '^' gets painfully acquainted with Any new operator increases the total complexity of the language, takes valuable developer time to implement, uses up some memory in the interpreter, etc. There are costs. They are small but they affect everyone in the Python ecosystem. Given that the benefits will only be felt by a small subset of the Python community, they had better be significant. |
|
On Mar 19, 2014 5:13 AM, "Ronan Lamy" notifications@github.com wrote:
What specific confusions ate likely with this proposal? Yes, in a general sense that could be an issue with singer operators, but
First, I don't think the scientific community is a small subset, I think it Second, you need to demonstrate that these changes will be anything other Third, in terms of developer time, you can't just look at the time needed |
|
@rlamy I think that it's worth noting that no one on python-ideas, except perhaps yourself, has objected to the PEP on these grounds or asked us to do more soul searching about whether we really need the operator. These are people most of whom will never use matrix multiplication. It's nice to be concerned for them, but they can speak for themselves and have spoken in favor of the PEP. |
|
@toddrjen Implementing |
@pchanial they are suggesting left precedence and asking for compelling arguments against it on the mailing list |
|
I've just added text for "same-left" associativity/precedence: |
| converting back and forth all the time, is incredibly cumbersome and | ||
| impossible to get right at any scale. Functions that defensively try | ||
| to handle both types as input and DTRT, find themselves floundering | ||
| into a swamp of ``isinstance`` and ``if`` statements. |
larsmans
Apr 11, 2014
Contributor
Actually handling both types as input is not so hard if you do it like scikit-learn, were matrices may come in, but never come out. The trouble starts when you also produce np.matrix when you get it as input.
stefanv
Apr 11, 2014
Contributor
The point is still that libraries have to be aware of this double-class
convention, otherwise only the broken sub-class solutions works (or
doesn't).
rkern
Apr 11, 2014
Member
Dealing with matrix in this fashion is not really an additional overhead to allowing lists or other sequences as inputs: you just always use np.asarray() on the inputs. If everyone did this, though, matrix would be useless since it would disappear whenever you passed one to a function. scikit-learn accepts this; it doesn't care about supporting matrix users except to transparently consume their data. It also doesn't have many functions that one would really use in expressions, so it's not much loss for matrix users. But there are a lot of other functions that do need to output matrix objects to be useful to matrix users, and this is difficult so few people do it right. That's why the status quo sucks. You can choose between having an operator for matrix multiplication by using the matrix type or having the full range of the community's code available to you. You can't have both. And just to be clear, there is nothing about this situation that depends on the implementation details of numpy.matrix. Subclass or not, all of these issues remain as long as you have two types that you want to interoperate.
larsmans
Apr 11, 2014
Contributor
@rkern I agree with you. What I was really trying to point out is that DTRT is ambiguous in this context, because to a new user there may seem to be two "obvious" ways of "doing it right".
njsmith
Apr 11, 2014
Author
Member
Ah, I see. That sentence was meant as an alternative to the previous one,
i.e., the first point is that keeping track of which functions expect what
is impossible, so the obvious alternative is to write functions in such a
way that you don't need to keep track, but that is also impossible...
On Fri, Apr 11, 2014 at 1:27 PM, Lars Buitinck notifications@github.comwrote:
In doc/neps/return-of-revenge-of-matmul-pep.rst:
+possible to switch between the conventions, because numpy provides two
+different types with different__mul__methods. For
+numpy.ndarrayobjects,*performs elementwise multiplication,
+and matrix multiplication must use a function call (numpy.dot).
+Fornumpy.matrixobjects,*performs matrix multiplication,
+and elementwise multiplication requires function syntax. Writing code
+usingnumpy.ndarrayworks fine. Writing code using
+numpy.matrixalso works fine. But trouble begins as soon as we
+try to integrate these two pieces of code together. Code that expects
+anndarrayand gets amatrix, or vice-versa, may crash or
+return incorrect results. Keeping track of which functions expect
+which types as inputs, and return which types as outputs, and then
+converting back and forth all the time, is incredibly cumbersome and
+impossible to get right at any scale. Functions that defensively try
+to handle both types as input and DTRT, find themselves floundering
+into a swamp ofisinstanceandifstatements.@rkern https://github.com/rkern I agree with you. What I was really
trying to point out is that DTRT is ambiguous in this context, because to a
new user there may seem to be two "obvious" ways of "doing it right".—
Reply to this email directly or view it on GitHubhttps://github.com//pull/4351/files#r11529482
.
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
cjw43
Apr 11, 2014
On 11-Apr-2014 7:39 AM, Stefan van der Walt wrote:
In doc/neps/return-of-revenge-of-matmul-pep.rst:
> +possible to switch between the conventions, because numpy provides two
+different types with different
__mul__methods. For
+numpy.ndarrayobjects,*performs elementwise multiplication,
+and matrix multiplication must use a function call (numpy.dot).
+Fornumpy.matrixobjects,*performs matrix multiplication,
+and elementwise multiplication requires function syntax. Writing code
+usingnumpy.ndarrayworks fine. Writing code using
+numpy.matrixalso works fine. But trouble begins as soon as we
+try to integrate these two pieces of code together. Code that expects
+anndarrayand gets amatrix, or vice-versa, may crash or
+return incorrect results. Keeping track of which functions expect
+which types as inputs, and return which types as outputs, and then
+converting back and forth all the time, is incredibly cumbersome and
+impossible to get right at any scale. Functions that defensively try
+to handle both types as input and DTRT, find themselves floundering
+into a swamp ofisinstanceandifstatements.
The point is still that libraries have
to be aware of this double-class
convention, otherwise only the broken sub-class solutions works
(or
doesn't).
Stefan,
Thanks for this clarification.
You say "Code that expects an ``ndarray`` and gets a ``matrix``, or
vice-versa, may crash or return incorrect
results.
"
I would look at this as an error in the existing code. Would it not
be simpler to correct this than construct code for a new operator?
Has this been explored? It's been a long time since I looked at the
code.
It would help if some example of the need for "converting back and
forth" could be given
Colin W.
—
Reply to this email directly or view
it on GitHub.
njsmith
Apr 11, 2014
Author
Member
On Fri, Apr 11, 2014 at 2:43 PM, Colin J. Williams <notifications@github.com
wrote:
Stefan, Thanks for this clarification. You say "Code that expects an
ndarrayand gets amatrix, or vice-versa, may crash or return
incorrect results. " I would look at this as an error in the existing
code. Would it not be simpler to correct this than construct code for a
new operator? Has this been explored?
No, it wouldn't be simpler. Fixing every piece of code to handle both types
as inputs and outputs means that every function must have special checks,
which creates a combinatorial explosion in code complexity and is a serious
obstacle to maintainability, testing, etc.
This is why if you look at actual code there are few (possibly zero)
libraries that actually handle mixed ndarray and matrix objects correctly
(even numpy itself doesn't in many cases) -- it's just not doable. Everyone
prefers to use ndarray objects with the cumbersome dot() function call
syntax for matrix multiplication. It's annoying, but it's still better than
checking that every single code path can handle arrays and matrices (and
combinations thereof etc). (In fact, there are some places where
scikit-learn is forced to handle matrix objects because scipy.sparse only
provides a matrix-style API, and even in these cases they don't actually
use '*' for matrix multiplication -- which is the point that started this
whole thread. Instead they have a special safe_sparse_dot() function which
provides a dot()-style API that works for both arrays and matrices.)
stefanv
Apr 11, 2014
Contributor
@cjw43 Here's a library's code snippet for squaring the elements of an input array:
def square_it(A):
return A * A
That code is now broken, even though the library author has never even heard of a Matrix class. The new, correct code, becomes:
def square_it(A):
A = np.asarray(A)
return A * A
That's not a problem as long as all consumers out there are aware of this possibility, but you are forcing a convention on programmers you don't control, and one which is auxiliary their intention. Contrast that to the following library code:
def square_it(A):
return A * A
We know what this does. It squares the elements of the array.
def square_it(A):
return A @ A
We know what this does, it performs a dot product between the two matrices.
|
This is probably a day late and a dollar short (as my Mom used to say) but has any serious consideration ever been given to defining dot (".") as an inner product operator? That's the way that APL does it, but every time it's brought up, it is dismissed out of hand. Yes, it would give an additional, different, meaning to ".", but "%" is used for very different purposes without too much confusion. Uisng the string "+._" in Python code is a syntax error, so we don't have to worry about breaking pre-existing code. I envision A+._B as syntactic sugar for
where inner_product is defined (for simple sequence objects) as equivalent to this:
The compiler would be allowed to instantiate +.* only once per sequence type, no matter how many times it appeared in the code, however non-standard sequence types (such as numpy's ndarrays) would be able to provide alternative implementations which would only be used by those types. (I envision a dict attached to the PySequenceMethods struct that would cache, for example, inner_product(operator.add, operator.mul) under the key '"+.*") |
|
@samwyse, I am not sure how this could be an option. |
|
Hi Sam, I think the challenge you'd face if you tried proposing this to -n On Mon, Apr 21, 2014 at 5:39 PM, Sam Denton notifications@github.comwrote:
Nathaniel J. Smith |
|
@seberg: I'm proposing "." as a meta-operator, which would not be used by itself. That particular character is currently used to access attributes, and also to separate the integral and fractional parts of floating point numbers. This would be a third use. |
|
@njsmith: Way back when I learned APL, I recall lots of ways that f,g could be used besides matrix multiplication. For example, '<.&' would tell you if every number in a list was smaller that the corresponding number in another list. That said, my use of '&' in my example (as a logical and) has different semantics from Python's definition ('&' is bitwise and). Really, you'd want to use '<.and' which I suspect would really only be parsable because of 'and' being a keyword; and yes, Python already allows 'all(x < y for x, y in zip(a,b))' which means the same thing (albeit not as concisely). Iverson provides several examples, but they tend to use APL's 'max' and 'min' operators, which in Python aren't even keywords, and so would look to the compiler a lot like attribute access; I suspect that a proposal to implement GCC's minimum and maximum operators ('<?' and '>?') would be met with cat-calls. |
|
I think a relevant data point here is that back in the day, Guido was On Mon, Apr 21, 2014 at 8:17 PM, Sam Denton notifications@github.comwrote:
Nathaniel J. Smith |
|
|
@cjw43 There is no "destruction of the Class" to be discussed in the PEP. If you are talking about the possible deprecation of the Please, please, please use the Github interface for commenting or else trim your email replies to just your content. But as the PEP has already been accepted, continuing to press this particular point does not appear to be a good use of anyone's time. The decisionmakers in the core Python dev team consider the amount of discussion in the PEP as it stands perfectly adequate. |
|
@njsmith Ready to merge? |
|
Oh right, this PR. Sure, I guess that makes sense to save the history, though I suppose the
|
A new PEP for infix matrix multiplication
|
History saved! Thanks for that. |
A new PEP for infix matrix multiplication
|
Super late to the party, but I'd like to cross-reference JuliaLang/julia#4774 here, where it looks like very similar discussions have taken place in the context of how to define vector transposition. |
|
Re the use of a potential @@ operator. In a lot of mathematical contexts, we actually consider exp(A) for some matrix A. exp(A)=I + A + A@@2/2! + A@@3/3! + A@@4/4! + .... continuing like the Taylor Series. Other useful contexts for the @@ operator exist, but it is definitely specifically for math people and may be noise to most non-scientific programmers. |
|
On 30 January 2016 at 17:12, owendavey notifications@github.com wrote:
In computer graphics, one multiplies matrices a lot. Considering that this |
|
@dimpase - I was not referring to matrix multiplication. I was referring to matrix exponentiation - hence the double @ operator which above seems to have been considered not important enough to be added in addition to the single @ operator of matrix multiplication for this PEP. |
|
The discussion here has ended. The PEP has been accepted and implemented. If you would like to propose or discuss the merits of further changes, please find an appropriate mailing list. Thanks. |


This is now PEP 465, and a properly rendered version can be found here here.
A poorly rendered version of the latest draft can be found here.
Let's use this PR as a central place to discuss this draft PEP for adding an infix matrix multiplication operator to numpy. The PR format is nice in that it allows line by line comments, updating, etc.
Hopefully this goes without saying, but just in case: We especially welcome comments about how the proposal will work for non-numpy projects (or could work, with fixes); this proposal started in the numpy community but the idea here is to build a consensus about how to make Python better for all projects that care about linear algebra.
Some possible points of discussion: