PyTorch memory model: “torch.from_numpy()” vs “torch.Tensor()”
I’m trying to get an in-depth understanding of how the PyTorch Tensor memory model works.
# input numpy array (assumes earlier: import numpy as np; import torch)
In [91]: arr = np.arange(10, dtype=np.float32).reshape(5, 2)

# construct tensors in two different ways
In [92]: t1, t2 = torch.Tensor(arr), torch.from_numpy(arr)
# their types
In [93]: type(arr), type(t1), type(t2)
Out[93]: (numpy.ndarray, torch.FloatTensor, torch.FloatTensor)
# ndarray
In [94]: arr
Out[94]:
array([[ 0.,  1.],
       [ 2.,  3.],
       [ 4.,  5.],
       [ 6.,  7.],
       [ 8.,  9.]], dtype=float32)
I know that PyTorch tensors share the memory buffer of NumPy ndarrays, so changing one will be reflected in the other. Here I’m slicing and updating some values in the tensor t2:
In [98]: t2[:, 1] = 23.0
And as expected, it’s updated in both t2 and arr, since they share the same memory buffer.
In [99]: t2
Out[99]:
0 23
2 23
4 23
6 23
8 23
[torch.FloatTensor of size 5x2]
In [101]: arr
Out[101]:
array([[  0.,  23.],
       [  2.,  23.],
       [  4.,  23.],
       [  6.,  23.],
       [  8.,  23.]], dtype=float32)
But t1 is also updated. Remember that t1 was constructed using torch.Tensor(), whereas t2 was constructed using torch.from_numpy():
In [100]: t1
Out[100]:
0 23
2 23
4 23
6 23
8 23
[torch.FloatTensor of size 5x2]
So, no matter whether we use torch.from_numpy() or torch.Tensor() to construct a tensor from an ndarray, all such tensors and ndarrays share the same memory buffer.
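(A quick way to verify this, continuing the session above, is to compare the raw data pointers; this snippet is my own sketch, not part of the original session. For a float32 array, all three should point at the same buffer.)

# arr, t1, t2 as defined above
print(arr.ctypes.data == t1.data_ptr() == t2.data_ptr())  # expected: True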
Based on this understanding, my question is: why does a dedicated function torch.from_numpy() exist when torch.Tensor() can seemingly do the job?

I looked at the PyTorch documentation, but it doesn’t mention anything about this. Any ideas/suggestions?
from_numpy() automatically inherits the input array’s dtype. On the other hand, torch.Tensor is an alias for torch.FloatTensor.

Therefore, if you pass an int64 array to torch.Tensor, the output tensor is a float tensor and they won’t share the storage. torch.from_numpy gives you a torch.LongTensor, as expected.
import numpy as np
import torch

a = np.arange(10)
ft = torch.Tensor(a)      # same as torch.FloatTensor: casts to float32
it = torch.from_numpy(a)  # keeps the array's int64 dtype

a.dtype   # == dtype('int64')
ft.dtype  # == torch.float32
it.dtype  # == torch.int64
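To see the practical consequence of that cast, here is a short continuation of the snippet above (my own sketch, not part of the original answer): mutating ft does not touch a, while mutating it does, because only the from_numpy tensor shares storage with the array.

ft[0] = 100   # ft owns a separate float32 copy
it[1] = 200   # it shares storage with a
print(a[0])   # 0   -> unaffected by the change to ft
print(a[1])   # 200 -> reflects the change to it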
The recommended way to build tensors in PyTorch is to use the following two factory functions: torch.tensor and torch.as_tensor.

torch.tensor always copies the data. For example, torch.tensor(x) is equivalent to x.clone().detach().

torch.as_tensor always tries to avoid copies of the data. One of the cases where as_tensor avoids copying the data is when the original data is a numpy array.
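A minimal sketch of that difference (my own example, assuming a float32 numpy input so that no dtype conversion forces a copy):

import numpy as np
import torch

x = np.arange(5, dtype=np.float32)
t_copy = torch.tensor(x)       # always copies the data
t_share = torch.as_tensor(x)   # reuses the numpy buffer when it can

x[0] = 42.0
print(t_copy[0])    # tensor(0.)  -> unaffected, owns its own copy
print(t_share[0])   # tensor(42.) -> shares memory with x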
This comes from _torch_docs.py; there is also a possible discussion on the “why” here.
def from_numpy(ndarray):  # real signature unknown; restored from __doc__
    """
    from_numpy(ndarray) -> Tensor

    Creates a :class:`Tensor` from a :class:`numpy.ndarray`.

    The returned tensor and `ndarray` share the same memory.
    Modifications to the tensor will be reflected in the `ndarray`
    and vice versa. The returned tensor is not resizable.

    Example::

        >>> a = numpy.array([1, 2, 3])
        >>> t = torch.from_numpy(a)
        >>> t
        torch.LongTensor([1, 2, 3])
        >>> t[0] = -1
        >>> a
        array([-1, 2, 3])
    """
    pass
Taken from the numpy docs:

Different ndarrays can share the same data, so that changes made in one ndarray may be visible in another. That is, an ndarray can be a “view” to another ndarray, and the data it is referring to is taken care of by the “base” ndarray.
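A numpy-only sketch of that view/base relationship (my own example, not taken from the docs):

import numpy as np

a = np.arange(6)
v = a[::2]             # basic slicing returns a view, not a copy

print(v.base is a)     # True: v's data is owned by the "base" array a
v[0] = 99
print(a[0])            # 99: modifying the view is visible in the base array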
From the PyTorch docs:

If a numpy.ndarray, torch.Tensor, or torch.Storage is given, a new tensor that shares the same data is returned. If a Python sequence is given, a new tensor is created from a copy of the sequence.
I tried doing what you said and it’s working as expected:
Torch 1.8.1, NumPy 1.20.1, Python 3.8.5
import numpy as np
import torch

x = np.arange(8, dtype=np.float64).reshape(2, 4)
y_4mNp = torch.from_numpy(x)
y_t = torch.tensor(x)
print(f"x={x}\ny_4mNp={y_4mNp}\ny_t={y_t}")
All variables have the same values right now, as expected:
x=[[0. 1. 2. 3.]
[4. 5. 6. 7.]]
y_4mNp=tensor([[0., 1., 2., 3.],
[4., 5., 6., 7.]], dtype=torch.float64)
y_t=tensor([[0., 1., 2., 3.],
[4., 5., 6., 7.]], dtype=torch.float64)
from_numpy() does use the same underlying memory that the np variable uses, so changing either the np array or the from_numpy tensor affects the other, but NOT the torch.tensor variable.
Conversely, changes to y_t affect only itself, not the numpy array or the from_numpy tensor.
x[0,1] = 111 ## changed the numpy variable itself directly
y_4mNp[1,:] = 500 ## changed the .from_numpy variable
y_t[0,:] = 999 ## changed the tensor variable
print(f"x={x}\ny_4mNp={y_4mNp}\ny_t={y_t}")
Output now:
x=[[ 0. 111. 2. 3.]
[500. 500. 500. 500.]]
y_4mNp=tensor([[ 0., 111., 2., 3.],
[500., 500., 500., 500.]], dtype=torch.float64)
y_t=tensor([[999., 999., 999., 999.],
[ 4., 5., 6., 7.]], dtype=torch.float64)
Dunno if this was an issue with earlier versions?
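If you want to make the sharing explicit rather than inferring it from mutations, numpy can check it directly. Here is a sketch (my addition, reusing the variable names from above) based on np.shares_memory:

import numpy as np
import torch

x = np.arange(8, dtype=np.float64).reshape(2, 4)
y_4mNp = torch.from_numpy(x)
y_t = torch.tensor(x)

print(np.shares_memory(x, y_4mNp.numpy()))  # True:  same underlying buffer
print(np.shares_memory(x, y_t.numpy()))     # False: torch.tensor made a copy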