# MNIST solverimport torch# Load MNIST datafrom torchvision import datasets, transformsfrom torch.utils.data import DataLoader# Load MNIST datamnist_train = datasets.MNIST(root='./data', train=True, transform=transforms.ToTensor(), download=True)mnist_test = datasets.MNIST(root='./data', train=False, transform=transforms.ToTensor(), download=True)# Inspect dataprint(mnist_train)print(mnist_test)# Print the shape of the first image in the training setprint(mnist_train[0][0].shape)
Dataset MNIST
Number of datapoints: 60000
Root location: ./data
Split: Train
StandardTransform
Transform: ToTensor()
Dataset MNIST
Number of datapoints: 10000
Root location: ./data
Split: Test
StandardTransform
Transform: ToTensor()
torch.Size([1, 28, 28])
The data is huge, the training data consist of 60,000 entries of 28x28 images, i.e. itβs a matrix of 60,000x28x28
Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a special type of Gradient Descent where the loss is computed on a single example. This is a very common approach in Deep Learning because it is much faster than computing the loss on the whole dataset. The loss is computed on a single example and the weights are updated after each example. This is why it is called Stochastic Gradient Descent.
Mini-batch Gradient Descent
Mini-batch Gradient Descent is a compromise between SGD and Batch Gradient Descent. In Mini-batch Gradient Descent, the loss is computed on a small number of examples (typically between 8 and 256) instead of a single example. This makes it more computationally efficient than SGD because you can use vectorized operations, especially when using GPUs.
Data Loader
Pytorch has a data loader that can be used to load the data in batches. This is very useful when the data is huge and cannot be loaded in memory.
Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz