I have read in multiple places that numpy arrays are passed by reference. But when I execute the following code, why is there a difference between the behavior of foo and bar?
    import numpy as np

    def foo(arr):
        arr = arr - 3

    def bar(arr):
        arr -= 3

    a = np.array([3, 4, 5])
    foo(a)
    print a  # prints [3, 4, 5]

    bar(a)
    print a  # prints [0, 1, 2]
I’m using Python 2.7 and NumPy version 1.6.1.
In Python, all variable names are references to values.
When Python evaluates an assignment, the right-hand side is evaluated before the left-hand side.
arr - 3 creates a new array; it does not modify arr. arr = arr - 3 makes the local variable arr reference this new array. It does not modify the value originally referenced by arr, which was passed to foo. The variable name arr simply gets bound to the new array, arr - 3. Moreover, arr is a local variable name in the scope of the foo function. Once the foo function completes, the local name arr no longer exists, and Python is free to garbage collect the new array it referenced. As Reti43 points out, in order for arr’s value to affect a, foo must return arr and a must be assigned to that value:
    def foo(arr):
        arr = arr - 3
        return arr
        # or simply combine both lines into `return arr - 3`

    a = foo(a)
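The rebinding described above can be checked directly. The following is a minimal sketch, written with Python 3 print() calls rather than the Python 2.7 print statement used in the question. The helper name foo_rebind is made up for illustration, and np.shares_memory is assumed to be available (it was added in NumPy releases much newer than 1.6.1).

```python
import numpy as np

def foo_rebind(arr):
    # Hypothetical helper: keep a second name for the caller's array
    # so we can compare it with what `arr` refers to after rebinding.
    original = arr
    arr = arr - 3  # builds a new array and rebinds the local name only
    same_object = arr is original
    shares_data = np.shares_memory(arr, original)
    return same_object, shares_data

a = np.array([3, 4, 5])
same_object, shares_data = foo_rebind(a)

print(same_object)  # False: the local name now points at a different array
print(shares_data)  # False: the new array has its own data buffer
print(a)            # the caller's array is untouched
```

Because the new array is a separate object with its own buffer, nothing the function does to it after the rebinding can reach the caller's a.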
In contrast, arr -= 3, which Python translates into a call to the __isub__ special method, does modify the array referenced by arr in place.
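To make the in-place case concrete, here is a small sketch (Python 3 syntax; the helper name bar_inplace is hypothetical). It shows that after an augmented subtraction the local name still refers to the caller's array, and that np.subtract with its out argument is an explicit way to ask for the same in-place behavior.

```python
import numpy as np

def bar_inplace(arr):
    # Hypothetical helper: remember the object we were handed.
    original = arr
    arr -= 3  # augmented assignment: the array updates its own buffer
    return arr is original

b = np.array([3, 4, 5])
still_same = bar_inplace(b)
print(still_same)  # True: no new array was bound to the local name
print(b)           # the caller sees the subtraction

# An explicit spelling of the same idea: write the result into `c` itself.
c = np.array([3, 4, 5])
np.subtract(c, 3, out=c)
print(c)
```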
The first function calculates (arr - 3), then assigns the local name arr to it, which doesn’t affect the array data passed in. My guess is that in the second function, the NumPy array type (np.ndarray) overrides the -= operator and operates in place on the array data.
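The guess about operator overriding can be illustrated without NumPy at all. The sketch below uses a made-up Box class (not part of any library) that defines both __sub__ and __isub__ in roughly the way a mutable container might. It is only an analogy for how Python dispatches the two operators, not a description of NumPy's actual implementation.

```python
class Box:
    """Hypothetical mutable container used only to show operator dispatch."""

    def __init__(self, values):
        self.values = list(values)

    def __sub__(self, other):
        # `box - n` builds and returns a brand-new Box.
        return Box(v - other for v in self.values)

    def __isub__(self, other):
        # `box -= n` mutates this Box and returns the same object.
        for i in range(len(self.values)):
            self.values[i] -= other
        return self

def rebind(box):
    box = box - 3  # calls __sub__; only the local name changes

def in_place(box):
    box -= 3  # calls __isub__; the caller's object changes

b = Box([3, 4, 5])
rebind(b)
after_rebind = list(b.values)
in_place(b)
after_in_place = list(b.values)

print(after_rebind)    # [3, 4, 5]
print(after_in_place)  # [0, 1, 2]
```

If a class does not define __isub__, Python falls back to __sub__ and rebinds the name, which is why the same -= on an immutable value such as an int never affects the caller.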
Python passes the array by reference:
    $:python
    ...python startup message
    >>> import numpy as np
    >>> x = np.zeros((2,2))
    >>> x
    array([[0.,0.],[0.,0.]])
    >>> def setx(x):
    ...     x[0,0] = 1
    ...
    >>> setx(x)
    >>> x
    array([[1.,0.],[0.,0.]])
The top answer is referring to a phenomenon that occurs even in compiled c-code, as any BLAS events will involve a “read-onto” step where either a new array is formed which the user (code writer in this case) is aware of, or a new array is formed “under the hood” in a temporary variable which the user is unaware of (you might see this as a
However, I can clearly access the memory of the array as if it were in a more global scope than the called function (i.e., setx(...)), which is exactly what “passing by reference” is, in terms of writing code.
And let’s do a few more tests to check the validity of the accepted answer:
    (continuing the session above)
    >>> def minus2(x):
    ...     x[:,:] -= 2
    ...
    >>> minus2(x)
    >>> x
    array([[-1.,-2.],[-2.,-2.]])
Seems to be passed by reference. Let us do a calculation which will definitely compute an intermediate array under the hood, and see if x is modified as if it is passed by reference:

    >>> def pow2(x):
    ...     x = x * x
    ...
    >>> pow2(x)
    >>> x
    array([[-1.,-2.],[-2.,-2.]])
Huh, I thought x was passed by reference, but maybe it is not? No: here we have shadowed x by rebinding the local name to a brand-new array (Python has no explicit declarations, so the rebinding is easy to miss), and Python will not propagate this “shadowing” back to the caller’s scope (doing so would go against the Python use case: namely, to be a beginner-level coding language which can still be used effectively by an expert).
However, I can very easily perform this operation in a “pass-by-reference” manner by forcing the memory (which is not copied when I pass x to the function) to be modified instead:

    >>> def refpow2(x):
    ...     x *= x
    ...
    >>> refpow2(x)
    >>> x
    array([[1., 4.],[4., 4.]])
And so you see that python can be finessed a bit to do what you are trying to do.
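As a further sketch of the same idea (Python 3 syntax; the function names pow2_into and pow2_new are made up for illustration), slice assignment is another way to write a computed result back into the caller's buffer, while returning the result leaves the caller's array alone and lets the caller decide what to rebind.

```python
import numpy as np

def pow2_into(x):
    # x * x still builds a temporary array, but the slice assignment
    # copies its contents into the buffer the caller passed in.
    x[...] = x * x

def pow2_new(x):
    # Leaves the argument untouched and hands back a new array.
    return x * x

x = np.array([[-1.0, -2.0], [-2.0, -2.0]])
pow2_into(x)
print(x)  # the caller's array now holds the squares

y = np.array([[-1.0, -2.0], [-2.0, -2.0]])
z = pow2_new(y)
print(y)  # unchanged
print(z)  # squares in a separate array
```

Which of the two is preferable depends on whether the caller expects its array to be modified; the in-place form avoids keeping a second array around, while the returning form has no side effects.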