60 Minutes of PyTorch - Part 1
Tensors and Autograd
Let's install PyTorch (without GPU support for now). For more variants visit PyTorch Get Started. This tutorial is just an interactive port from here.
pip install http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp27-cp27mu-linux_x86_64.whl
pip install torchvision
Import PyTorch.
import torch
Now let's get familiar with PyTorch tensors.
Construct a 5x3 matrix, uninitialized:
x = torch.Tensor(5, 3)
print(x)
Construct a randomly initialized matrix:
x = torch.rand(5, 3)
print(x)
Get its size:
print(x.size())
Operations
Addition (Syntax 1):
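Presumably a second random tensor y is created here (it is also reused by the addition examples below) and added with the + operator:
y = torch.rand(5, 3)
print(x + y)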
Addition (Syntax 2):
print(torch.add(x, y))
Giving a specific output tensor:
result = torch.Tensor(5, 3)
torch.add(x, y, out=result)
print(result)
Addition in-place:
# adds x to y
y.add_(x)
print(y)
Note: Any operation that mutates a tensor in-place is post-fixed with an _. For example: x.copy_(y) and x.t_() will change x.
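A quick sketch of those two in-place ops on throwaway tensors (so the x and y above stay untouched):
t = torch.rand(2, 3)
u = torch.rand(2, 3)
t.copy_(u)   # t now holds u's values
t.t_()       # t is transposed in place, now 3x2
print(t)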
For more tensor operations visit the torch doc page.
Numpy-like PyTorch
You can even use standard numpy-like indexing for PyTorch tensors:
print(x[:, 1])
Converting Torch Tensor to Numpy Array
The torch Tensor and numpy array will share their underlying memory locations, and changing one will change the other.
Converting torch Tensor to numpy Array:
a = torch.ones(5)
b = a.numpy()
print(b)
See how the numpy array changed in value:
a.add_(1)
print(a)
print(b)
Converting Numpy Array to Torch Tensor (aka the other way 'round):
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)
All the Tensors on the CPU except a CharTensor support converting to NumPy and back.
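For instance, an integer tensor round-trips through NumPy just like a FloatTensor (values here are arbitrary):
c = torch.IntTensor([1, 2, 3])
print(c.numpy())                 # int32 NumPy array
print(torch.from_numpy(c.numpy()))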
CUDA Tensors
Tensors can be moved onto the GPU using the .cuda() method.
Note: Not usable as of now since this article runs on an instance without GPU support!
# let us run this cell only if CUDA is available
if torch.cuda.is_available():
    x = x.cuda()
    y = y.cuda()
    print(x + y)
Autograd: Automatic Differentiation
Central to all neural networks in PyTorch is the autograd package. Let's first briefly visit this, and we will then go to training our first neural network.
The autograd package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
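A rough sketch of what define-by-run means in practice (this uses the Variable class introduced just below; the threshold of 1000 is arbitrary):
import torch
from torch.autograd import Variable

v = Variable(torch.randn(3), requires_grad=True)
w = v * 2
# the graph is recorded as the code runs, so data-dependent
# control flow like this loop is fine and can differ between runs
while w.data.norm() < 1000:
    w = w * 2
print(w)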
Variable
autograd.Variable is the central class of the package. It wraps a Tensor, and supports nearly all of the operations defined on it. Once you finish your computation you can call .backward() and have all the gradients computed automatically. You can access the raw tensor through the .data attribute, while the gradient w.r.t. this variable is accumulated into .grad.
There’s one more class which is very important for the autograd implementation: Function. Variable and Function are interconnected and build up an acyclic graph that encodes a complete history of computation. Each variable has a .grad_fn attribute that references the Function that created the Variable (except for Variables created by the user - their grad_fn is None).
If you want to compute the derivatives, you can call .backward() on a Variable. If the Variable is a scalar (i.e. it holds a single element of data), you don't need to specify any arguments to backward(); however, if it has more elements, you need to specify a grad_output argument that is a tensor of matching shape.
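For example, a minimal self-contained sketch of the non-scalar case (the values in grad_output are arbitrary, chosen just for illustration):
import torch
from torch.autograd import Variable

u = Variable(torch.randn(3), requires_grad=True)
v = u * 2                                    # v is not a scalar
grad_output = torch.FloatTensor([0.1, 1.0, 0.0001])
v.backward(grad_output)                      # pass a tensor of matching shape
print(u.grad)                                # each entry equals 2 * grad_output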
import torch
from torch.autograd import Variable
Create a variable:
x = Variable(torch.ones(2, 2), requires_grad=True)
print(x)
Do an operation on the variable:
y = x + 2
print(y)
Since y was created as a result of an operation, it has a grad_fn.
print(y.grad_fn)
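For contrast, x itself was created by the user, so (as noted above) its grad_fn should be None:
print(x.grad_fn)  # expected: None, since x was created by the user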
Do more operations on y:
z = y * y * 3
out = z.mean()
print(z, out)
Gradients
Let's backprop now. out.backward() is equivalent to doing out.backward(torch.Tensor([1.0])).
out.backward()
Print gradients d(out)/dx:
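print(x.grad)

Each entry should be 4.5: out = (1/4) * sum_i 3 * (x_i + 2)^2, so d(out)/dx_i = (3/2) * (x_i + 2), which equals 4.5 at x_i = 1.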
Continue with Part 2: Neural Networks