Note: This material is supplemental and not mandatory, but it will help you become a capable AI engineer.

Choosing the Right Processing Unit for Machine Learning: CPU, GPU, or TPU?

[Image] CPU vs TPU vs GPU (source: miro.medium.com)

The world of machine learning is full of challenging decisions. One of these is choosing between a Central Processing Unit (CPU), Graphics Processing Unit (GPU), and Tensor Processing Unit (TPU) for your machine learning tasks. Are you unsure about which one to pick? Don’t sweat! We’re here to help you understand the implications of this decision and guide you towards the right choice.

CPU (Central Processing Unit): The Jack-of-all-Trades

Have you ever thought about what makes your computer tick? That’s the job of the CPU!

A CPU is the primary component of a computer that does most of the processing inside the computer. Basically, it’s responsible for the “thinking”. CPUs provide computational abilities for a wide variety of tasks, not just machine learning.

So, when would you choose CPU for your machine learning journey?

Pros:

  1. Versatility: CPUs can handle virtually any command or task, making them true all-rounders.
  2. Optimized for single-threaded work: thanks to their high clock speeds, CPUs excel at tasks with complex sequential logic.

Cons:

  1. Limited parallel processing capabilities: with relatively few cores, CPUs struggle with the large-scale matrix operations at the heart of machine learning (the short sketch after this list shows how thread count affects throughput).
  2. Not ideal for large-scale machine learning: CPUs simply aren't built for massively parallel computation, which slows them down on large-scale training workloads.
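
To make the parallelism point concrete, here is a minimal sketch (in PyTorch, which we also use in the experiments below) that times a large matrix multiplication while varying how many CPU threads PyTorch may use. The matrix size and thread counts are illustrative assumptions; tune them for your machine.

# A minimal sketch: how CPU thread count affects a parallel workload
import time
import torch

x = torch.randn(2000, 2000)

for n_threads in (1, 4):
    torch.set_num_threads(n_threads)  # cap PyTorch's CPU thread pool
    start = time.time()
    _ = x @ x  # a large matrix multiplication
    print(f"{n_threads} thread(s): {time.time() - start:.4f} s")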

GPU (Graphics Processing Unit): The Hardcore Processor

Have you ever wondered how your computer displays high-quality images and videos so quickly? For that, you can thank the GPU!

GPUs, initially designed for rendering high-quality images and videos, are now being increasingly used in machine learning due to their ability to perform parallel operations.

So, when should you opt for a GPU in your machine learning journey?

Pros:

  1. High parallel processing capabilities: GPUs can run thousands of threads simultaneously, making them a natural fit for the matrix-heavy work of training neural networks.
  2. Speed: for parallelizable machine learning workloads, GPUs process data far faster than CPUs, reducing model training time significantly.

Cons:

  1. High power consumption: GPUs can be a bit power-hungry, consuming more energy than CPUs.
  2. Cost: high-performing GPUs can put a serious dent in your wallet.
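
In PyTorch, harnessing a GPU mostly comes down to moving your tensors and models onto the CUDA device. A minimal sketch of the usual pattern (the tiny linear model here is just a placeholder):

import torch
import torch.nn as nn

# Pick the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(128, 10).to(device)        # move the model's parameters to the device
batch = torch.randn(32, 128, device=device)  # create the input directly on the device

output = model(batch)
print(output.device)  # confirms where the computation happened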

TPU (Tensor Processing Unit): The Machine Learning Maestro

Ever heard of Google’s secret weapon for machine learning? It’s the TPU!

TPUs are Google’s custom-developed application-specific integrated circuits (ASICs) that are specifically designed to accelerate machine learning workloads.

So, when making the big leap into serious machine learning, is a TPU the right choice?

Pros:

  1. Highly optimized for machine learning: TPUs are purpose-built to accelerate machine learning workloads at large scale.
  2. Speed: When it comes to machine learning tasks, TPUs pack a punch and often surpass the speed of CPUs and GPUs.

Cons:

  1. Limited availability: TPUs are mainly available through Google Cloud services (including Colab), so they aren't as easy to get your hands on as CPUs and GPUs.
  2. Specialization: the flip side of their machine learning optimization is that TPUs are far less versatile than CPUs and GPUs.
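
For completeness: TPUs can be reached from PyTorch through the separate torch_xla package (for example on Google Colab or a Cloud TPU VM). A minimal sketch, assuming torch_xla is installed and a TPU is attached:

import torch
import torch_xla.core.xla_model as xm  # requires the torch_xla package

device = xm.xla_device()  # the TPU device, analogous to torch.device("cuda")
x = torch.ones((1000, 1000), device=device)
print(x.device)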

For more information, visit the following video link: https://www.youtube.com/watch

The CPU vs GPU Showdown: A Simple Experiment

Enough theory, let’s see these two in action! We will conduct a simple experiment using PyTorch and Matplotlib to illustrate the difference in execution times between CPU and GPU. But remember, your mileage may vary based on the specific task and available hardware.

Sadly, we will have to exclude the TPU from this test: PyTorch doesn't support TPUs natively, and setting up torch_xla is beyond the scope of this simple experiment.

Please run the code below on a system that has both a CPU and a CUDA-capable GPU; otherwise the script will stop with a message instead of running the GPU portion.

# Importing the necessary libraries
import torch
import time
import matplotlib.pyplot as plt

# Check for CUDA; the GPU half of the experiment requires it
if not torch.cuda.is_available():
    raise SystemExit('CUDA is not available. This experiment needs a GPU to compare against the CPU.')
device_cpu = torch.device("cpu")
device_gpu = torch.device("cuda")

# Function to test how quickly a tensor of ones can be created on a device
def speed_test(device, size=10000, runs=100):
    print(f'\nRunning on {device}')
    times = []
    for _ in range(runs):
        start = time.time()
        _ = torch.ones((size, size), device=device)
        if device.type == 'cuda':
            torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait before stopping the clock
        elapsed = time.time() - start
        times.append(elapsed)
    return times

# Running the speed test
cpu_times = speed_test(device_cpu)
gpu_times = speed_test(device_gpu)

# Printing the average execution time
print("\nAverage execution time on CPU: ", sum(cpu_times)/len(cpu_times), " seconds")
print("Average execution time on GPU: ", sum(gpu_times)/len(gpu_times), " seconds")

# Visualizing the results
plt.figure(figsize=(10, 7))

plt.hist(cpu_times, bins=30, alpha=0.5, label='CPU')
plt.hist(gpu_times, bins=30, alpha=0.5, label='GPU')

plt.xlabel('Execution Time (Seconds)', size=14)
plt.ylabel('Count', size=14)
plt.title('Execution Time Distribution by Hardware (CPU vs GPU)', size=14)

plt.legend(loc='upper right')

plt.show()

This script repeatedly creates a tensor of ones on the specified device (CPU or GPU), recording the time of each run, and then visualizes the timings in a handy histogram. Keep in mind that the very first CUDA run carries one-time warm-up overhead, so it may show up as an outlier.

But always remember: this is a simple, naive example. Real-world tasks involve far more complex computations and, crucially, data transfers between host and accelerator, which can eat into the speedup a GPU or TPU delivers.
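
One subtlety worth calling out: CUDA kernels launch asynchronously, which is why the script above calls torch.cuda.synchronize() before reading the clock. CUDA events are a more precise way to time GPU work; a minimal sketch:

import torch

if torch.cuda.is_available():
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)

    start.record()  # mark the start on the CUDA stream
    _ = torch.ones((10000, 10000), device="cuda")
    end.record()    # mark the end on the CUDA stream

    torch.cuda.synchronize()  # wait until both events have actually happened
    print(f"Elapsed: {start.elapsed_time(end):.3f} ms")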

Bringing ML into the Picture: A Natural Language Processing Task

Let’s raise the stakes a bit and include a typical machine learning task: text classification. This time, we’ll be using the Hugging Face Transformers library with DistilBERT, a lighter, faster distillation of the popular BERT model. We’ll run this model on both CPU and GPU and compare execution times.

Let’s witness the showdown between CPU and GPU in a real-life scenario!

%pip install transformers

After installing the Transformers library, it’s time to get the model humming:

# Importing the necessary libraries
import torch
from transformers import pipeline
import time
import matplotlib.pyplot as plt

# Check for CUDA; the GPU half of the experiment requires it
if not torch.cuda.is_available():
    raise SystemExit('CUDA is not available. This experiment needs a GPU to compare against the CPU.')
device_cpu = 'cpu'
device_gpu = 'cuda'

# Function to test speed and print the pipeline's output once
def speed_test(device, runs=100):
    print(f'\nRunning on {device}')
    # device=0 selects the first GPU; device=-1 runs on the CPU
    nlp = pipeline('sentiment-analysis', model="distilbert-base-uncased-finetuned-sst-2-english", device=0 if device == device_gpu else -1)
    text = "This is a sample text for sentiment analysis."
    times = []
    for i in range(runs):
        start = time.time()
        result = nlp(text)
        elapsed = time.time() - start
        times.append(elapsed)
        if i == 0:
            print("Pipeline output for run 0: ", result)
    return times

# Running the speed test
cpu_times = speed_test(device_cpu)
gpu_times = speed_test(device_gpu)

# Printing the average execution time
print("\nAverage execution time on CPU: ", sum(cpu_times)/len(cpu_times), " seconds")
print("Average execution time on GPU: ", sum(gpu_times)/len(gpu_times), " seconds")

# Visualizing the results
plt.figure(figsize=(10, 7))

plt.hist(cpu_times, bins=30, alpha=0.5, label='CPU')
plt.hist(gpu_times, bins=30, alpha=0.5, label='GPU')

plt.xlabel('Execution Time (Seconds)', size=14)
plt.ylabel('Count', size=14)
plt.title('Execution Time Distribution by Hardware (CPU vs GPU)', size=14)

plt.legend(loc='upper right')

plt.show()

This script performs sentiment analysis on a piece of text using the DistilBERT model, records the execution time on both the CPU and the GPU, and finally plots a histogram comparing the two execution time distributions.
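
If you want to squeeze more out of the GPU in this kind of experiment, a common next step is to classify many texts per call: the pipeline accepts a list of strings and a batch_size argument. A minimal sketch (the sample texts are placeholders):

from transformers import pipeline

# Assumes a GPU at index 0; use device=-1 to run on the CPU instead
nlp = pipeline('sentiment-analysis',
               model="distilbert-base-uncased-finetuned-sst-2-english",
               device=0)

texts = ["I love this!", "This is terrible.", "Not bad at all."] * 32
results = nlp(texts, batch_size=16)  # larger batches keep the GPU busier
print(results[0])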

Real-World Applications

In the real world, the choice between CPUs, GPUs, and TPUs depends largely on the task at hand and the available resources. For instance, in image rendering or deep learning tasks, which require handling a massive volume of parallel computations, GPUs and TPUs tend to be the preferred choice. On the other hand, for general purpose tasks or when the resources are limited, CPUs might be the way to go.

In a Nutshell

To summarize, the choice between CPUs, GPUs, and TPUs for machine learning tasks largely depends on the nature of the task and the resources at hand. CPUs are general-purpose processors and perform well for single-thread tasks. GPUs, originally designed for graphics rendering, excel in parallel computations, making them suitable for large-scale machine learning tasks. TPUs, specifically designed for machine learning tasks, boast incredible speed but are not as versatile as CPUs or GPUs.

We hope this guide has illuminated the differences between these three types of hardware and will help you make an informed decision for your next machine learning project!
