PyTorch Basics
PyTorch is a Python library for deep learning. It provides high-level abstractions that make it easy to train deep neural networks.
Let’s install PyTorch first:
!pip install torch
!pip install fastbook
For this tutorial, we will use the following version:
# Check pytorch version
import torch
print(torch.__version__)
2.0.0
Tensor
The tensor is the basic data structure in PyTorch. It is similar to a NumPy array, but it supports automatic differentiation and can run on a GPU.
Judging by adoption trends, PyTorch is currently the most popular deep learning library.
The ability to run on a GPU is very important for deep learning: it can dramatically speed up training by parallelizing the computation. Watch this demo: GPU vs CPU Demonstration
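Here is a minimal sketch of both capabilities (not part of the original tutorial); it assumes a recent PyTorch, and the device line simply falls back to the CPU when no GPU is present:
import torch
# Automatic differentiation: track operations on x, then ask for dz/dx
x = torch.tensor(2.0, requires_grad=True)
z = x ** 2 + 3 * x
z.backward()
print(x.grad)  # tensor(7.) because dz/dx = 2x + 3 = 7 at x = 2
# Running on GPU: move the tensor to the available device
device = "cuda" if torch.cuda.is_available() else "cpu"
y = torch.tensor([1.0, 2.0]).to(device)
print(y.device)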
A tensor is an n-dimensional array. When n is 0, it is a scalar; when n is 1, a vector; when n is 2, a matrix; when n is 3, a cube; when n is 4, a 4-dimensional array; and so on.
# 0 dimension tensor
import torch
x = torch.tensor(42)
print(x)
tensor(42)
# 1 dimension tensor
import torch
x = torch.tensor([42, 43])
print(x)
tensor([42, 43])
# 2 dimension tensor
import torch
x = torch.tensor([[42, 43], [44, 45]])
print(x)
tensor([[42, 43],
        [44, 45]])
# 3 dimension tensor
import torch
x = torch.tensor([[[42, 43], [44, 45]], [[46, 47], [48, 49]]])
print(x)
tensor([[[42, 43],
         [44, 45]],

        [[46, 47],
         [48, 49]]])
To check the number of dimensions of a tensor, we can use the dim() method.
x = torch.tensor([[[42, 43], [44, 45]], [[46, 47], [48, 49]]])
print(x.dim())
x = torch.tensor(42)
print(x.dim())
3
0
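Closely related to dim() is the shape attribute, which reports the size of each dimension (we will use it later in this tutorial). A quick sketch, not in the original:
x = torch.tensor([[42, 43], [44, 45]])
print(x.dim())    # 2
print(x.shape)    # torch.Size([2, 2])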
Constructing Tensor
Random Tensor
import torch
torch.rand(5, 3)
tensor([[0.2247, 0.7425, 0.7174],
        [0.6279, 0.4899, 0.3813],
        [0.5859, 0.6029, 0.3960],
        [0.3568, 0.9450, 0.2779],
        [0.8900, 0.5270, 0.5401]])
The parameters of the rand() method are the dimensions of the tensor. For example, torch.rand(2, 3) creates a 2x3 tensor with random values drawn uniformly from [0, 1).
import torch
# One dimensional tensor containing 5 random numbers
torch.rand(5)
tensor([0.3616, 0.8951, 0.4249, 0.0801, 0.5676])
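Note that rand() produces different values on every run. If you want reproducible numbers (not covered above, but a standard PyTorch facility), seed the generator first:
import torch
torch.manual_seed(0)  # fix the random seed so rand() is reproducible
torch.rand(5)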
Ones Tensor
A ones tensor is a tensor with all values equal to 1.
import torch
torch.ones(2, 3)
tensor([[1., 1., 1.],
        [1., 1., 1.]])
Quiz: what’s the output of torch.ones(1, 2)?
import torch
torch.ones(1, 2)
tensor([[1., 1.]])
Zeros Tensor
A zeros tensor is a tensor with all values equal to 0.
import torch
torch.zeros(1, 2, 3)
tensor([[[0., 0., 0.],
         [0., 0., 0.]]])
Range Tensor
If we want to create a tensor with values from 0 to n-1, we can use the torch.arange(n) method. It also accepts start, end, and step arguments, as the examples below show.
import torch
torch.arange(1, 10)
tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
torch.arange(-3, 3, 2)
tensor([-3, -1, 1])
Full
If we want to create a tensor with all values equal to a constant, we can use the torch.full() method.
import torch
torch.full((2,), 42)
tensor([42, 42])
import torch
torch.full((2, 3), 42)
tensor([[42, 42, 42],
        [42, 42, 42]])
If the full method were not available, we could get the same result with the ones method multiplied by the constant.
import torch
torch.ones(2, 3) * 42
tensor([[42., 42., 42.],
        [42., 42., 42.]])
Tensor Operations
We can do normal arithmetic operations on tensors:
x = torch.tensor(42)
print(x + 2)
print(x * 3)
print(x ** 2)
print(x / 2)
tensor(44)
tensor(126)
tensor(1764)
tensor(21.)
Note that x / 2 returns a floating-point tensor (tensor(21.)): division always promotes integer tensors to floats. We can do similar operations on higher-dimensional tensors:
x = torch.tensor([1, 2])
print(x + 1)
x = torch.tensor([[1, 2], [3, 4]])
print(x + 1)
tensor([2, 3])
tensor([[2, 3],
        [4, 5]])
Wait: in math class, didn’t we learn that addition is only defined for tensors of the same shape? How come we can add a scalar (0-dim) tensor to a 2-dim tensor?
This is because of broadcasting. PyTorch automatically broadcasts the smaller tensor to match the shape of the other tensor.
So,
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ \end{bmatrix} + 1 \]
would become
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ \end{bmatrix} + \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{bmatrix} \]
Broadcasting
So how does broadcasting work?
Taken from Pytorch Docs:
Two tensors are “broadcastable” if the following rules hold:
- Each tensor has at least one dimension.
- When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.
Broadcasting rules:
- If the number of dimensions of x and y are not equal, prepend 1 to the dimensions of the tensor with fewer dimensions to make them equal length.
- Then, for each dimension size, the resulting dimension size is the max of the sizes of x and y along that dimension.
import torch
torch.full((1,), 5) + torch.ones(3)
tensor([6., 6., 6.])
\[ \begin{bmatrix} 5 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 \\ 5 \\ 5 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]
import torch
# Error
torch.ones(3) + torch.ones(5)
\[ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \text{error} \]
Why is it an error?
Remember this rule:
When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.
The trailing dimension of the first tensor is 3, while the trailing dimension of the second tensor is 5. They are not equal, neither of them is 1, and both of them exist, so it’s an error.
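If we really do want to combine such tensors, we can insert a dimension of size 1 so the rules hold. A small sketch (not in the original): reshaping the first tensor to 3x1 lets it broadcast against the 5-element vector, producing a 3x5 result.
import torch
# (3, 1) + (5,) -> (3, 1) + (1, 5) -> (3, 5)
torch.ones(3).reshape(3, 1) + torch.ones(5)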
Let’s try again
import torch
torch.full((2, 2), 5) + torch.full((1, 2), 3)
tensor([[8, 8],
        [8, 8]])
\[ \begin{bmatrix} 5 & 5 \\ 5 & 5 \\ \end{bmatrix} + \begin{bmatrix} 3 & 3 \end{bmatrix} = \begin{bmatrix} 5 & 5 \\ 5 & 5 \\ \end{bmatrix} + \begin{bmatrix} 3 & 3 \\ 3 & 3 \\ \end{bmatrix} \]
import torch
x = torch.full((3, 4, 2), 5)
y = torch.full((2,), 3)
z = x + y
print('x =', x)
print('y =', y)
print('z =', z)
x = tensor([[[5, 5],
         [5, 5],
         [5, 5],
         [5, 5]],

        [[5, 5],
         [5, 5],
         [5, 5],
         [5, 5]],

        [[5, 5],
         [5, 5],
         [5, 5],
         [5, 5]]])
y = tensor([3, 3])
z = tensor([[[8, 8],
         [8, 8],
         [8, 8],
         [8, 8]],

        [[8, 8],
         [8, 8],
         [8, 8],
         [8, 8]],

        [[8, 8],
         [8, 8],
         [8, 8],
         [8, 8]]])
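We can verify the result against the broadcasting rules: y’s shape (2,) is first padded with ones to (1, 1, 2), then each dimension is expanded to match x, giving (3, 4, 2):
print(z.shape)
torch.Size([3, 4, 2])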
Matrix Operation
A tensor can be matrix-multiplied with another tensor using the matmul() method (mm() also works for 2-D matrices) or the @ operator.
import torch
x = torch.tensor([[3, 4]])
print(x.shape)
y = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(y.shape)
x @ y
x.matmul(y)
x.mm(y)
torch.Size([1, 2])
torch.Size([2, 3])
tensor([[19, 26, 33]])
\[ \begin{bmatrix} 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 19 & 26 & 33 \end{bmatrix} \]
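Note that matrix multiplication is only defined when the inner dimensions match (here, 1x2 times 2x3). Reversing the operands raises an error, since a 2x3 matrix cannot be multiplied by a 1x2 matrix:
# Error: inner dimensions do not match (2x3 @ 1x2)
y @ x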
Be careful not to confuse this with the * operator, which performs element-wise multiplication.
import torch
x = torch.tensor([[3, 4]])
x * x
tensor([[ 9, 16]])
Transpose
We can transpose a tensor using the T attribute (or the t() method).
import torch
x = torch.tensor([[3, 4]])
print(x)
print(x.shape)
print(x.T)
print(x.T.shape)
tensor([[3, 4]])
torch.Size([1, 2])
tensor([[3],
        [4]])
torch.Size([2, 1])
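A common use of the transpose is making shapes line up for matrix multiplication. For example (a small addition, not in the original), x @ x.T multiplies a 1x2 matrix by a 2x1 matrix, computing the dot product of x with itself:
x @ x.T  # 3*3 + 4*4
tensor([[25]])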
Neural Network with PyTorch
Now we have the required knowledge to build a neural network with PyTorch.
from fastbook import *
# Draw neurons with multiple inputs and weights
gv('''
z[shape=box3d width=1 height=0.7]
bias[shape=circle width=0.3]
x_0[label=1]
x_1[label=2]
// Subgraph to force alignment on x-axis
subgraph {
    rank=same;
    z;
    bias;
    alignmentNode [style=invis, width=0]; // invisible node for alignment
    bias -> alignmentNode [style=invis]; // invisible edge
    z -> alignmentNode [style=invis]; // invisible edge
}
x_0->z [label="3"]
x_1->z [label="4"]
bias->z [label="5" pos="0,1.2!"]
z->a
a->output [label="ReLU"]
''')
Remember the equation
\[ z = wx + b \\ a = \mathrm{ReLU}(z) \]
Let’s represent it in PyTorch:
x = torch.tensor([[1], [2]])

w = torch.tensor([3, 4])
b = torch.tensor(5)
z = w @ x + b          # 3*1 + 4*2 + 5 = 16
a = torch.relu(z)
a
tensor([16])
How about multiple layers? Easy!
from fastbook import *
gv('''
x_0[label=1]
x_1[label=2]
a_0_0[label="b=3"]
a_0_1[label="b=2"]
a_1_0[label="b=1"]
x_0 -> a_0_0 [label=-3]
x_0 -> a_0_1 [label=4]
x_1 -> a_0_0 [label=5]
x_1 -> a_0_1 [label=-6]
a_0_0 -> a_1_0 [label=7]
a_0_1 -> a_1_0 [label=8]
a_1_0 -> output
''')
x = torch.tensor([[1], [2]])

w_0 = torch.tensor([[-3, 5], [4, -6]])
b_0 = torch.tensor([[3], [2]])   # biases as a column, matching the column activations
a_0 = torch.relu(w_0 @ x + b_0)

w_1 = torch.tensor([[7, 8]])
b_1 = torch.tensor([[1]])
a_1 = torch.relu(w_1 @ a_0 + b_1)
a_1
tensor([[71]])
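In practice we rarely write these matrices by hand. As a sketch of where this is heading (not part of the original code), torch.nn provides Linear layers that store the weights and biases for us; the same two-layer network looks like this:
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(2, 2),  # hidden layer: 2 inputs -> 2 neurons
    nn.ReLU(),
    nn.Linear(2, 1),  # output layer: 2 inputs -> 1 neuron
    nn.ReLU(),
)
# Copy in the weights from the diagram (nn.Linear computes x @ W.T + b)
with torch.no_grad():
    model[0].weight.copy_(torch.tensor([[-3., 5.], [4., -6.]]))
    model[0].bias.copy_(torch.tensor([3., 2.]))
    model[2].weight.copy_(torch.tensor([[7., 8.]]))
    model[2].bias.copy_(torch.tensor([1.]))
    print(model(torch.tensor([[1., 2.]])))  # tensor([[71.]]), as before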
Exercise
!pip install rggrader
# @title #### Student Identity
= "your student id" # @param {type:"string"}
student_id = "your name" # @param {type:"string"} name
Implement the following neural network:
from fastbook import *
gv('''
x_0[label=3]
a_0_0[label="b=-2, ReLU"]
a_0_1[label="b=5, ReLU"]
a_1_0[label="b=0, ReLU"]
x_0 -> a_0_0 [label=-2]
x_0 -> a_0_1 [label=5]
a_0_0 -> a_1_0 [label=3]
a_0_1 -> a_1_0 [label=2]
a_1_0 -> output
''')
from rggrader import submit
# Put your code here
answer = 0

assignment_id = "10_pytorch-basic"
question_id = "00_single-input-nn"
submit(student_id, name, assignment_id, str(answer), question_id)
from fastbook import *
gv('''
x_0[label=3]
x_1[label=5]
a_0_0[label="b=8, ReLU"]
a_0_1[label="b=-2, ReLU"]
a_0_2[label="b=4, ReLU"]
a_1_0[label="b=3, ReLU"]
x_0 -> a_0_0 [label=-2]
x_0 -> a_0_1 [label=5]
x_0 -> a_0_2 [label=3]
x_1 -> a_0_0 [label=8]
x_1 -> a_0_1 [label=-2]
x_1 -> a_0_2 [label=4]
a_0_0 -> a_1_0 [label=3]
a_0_1 -> a_1_0 [label=2]
a_0_2 -> a_1_0 [label=8]
a_1_0 -> output
''')
from rggrader import submit
# Put your code here
answer = 0

assignment_id = "10_pytorch-basic"
question_id = "01_multi-input-nn"
submit(student_id, name, assignment_id, str(answer), question_id)