import matplotlib.pyplot as plt
import numpy as np

# Original vector
original_vector = np.array([2, 3])

# Direction vector
direction_vector = np.array([4, 2])

# Normalizing the direction vector
normalized_direction_vector = direction_vector / np.linalg.norm(direction_vector)

# Moving the original vector 2 units in the direction of the direction vector
new_vector = original_vector + 2 * normalized_direction_vector

# Creating the plot
fig, ax = plt.subplots()

# Original vector
ax.quiver(0, 0, original_vector[0], original_vector[1], angles='xy', scale_units='xy', scale=1, color='blue', label='Original Vector (2,3)')

# Vector that we get the direction from
ax.quiver(0, 0, direction_vector[0], direction_vector[1], angles='xy', scale_units='xy', scale=1, color='cyan', label='Vector that we get the direction from (4,2)')

# Direction from the original vector to the new vector
ax.quiver(original_vector[0], original_vector[1], normalized_direction_vector[0] * 2, normalized_direction_vector[1] * 2, angles='xy', scale_units='xy', scale=1, color='violet', label='Direction from original vector to new vector')

# New vector
ax.quiver(0, 0, new_vector[0], new_vector[1], angles='xy', scale_units='xy', scale=1, color='green', label='New Vector (~3.79, ~3.89)')

# Setting the limits
ax.set_xlim([0, 10])
ax.set_ylim([0, 10])

# Adding labels and title
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Vector Transformation')
ax.legend()

# Display the plot
plt.grid(True)
plt.show()
Note: This material is supplemental and not mandatory. However, it will help you become a skilled AI engineer.
Direction of a vector
A vector consists of two components: direction and magnitude. To get the direction of a vector, we basically “remove” its magnitude and keep only the direction. That’s why the formula is as simple as below:
\[ \text{Direction of vector } x = \frac{x}{\|x\|} \]
So the direction of a vector is the vector reduced to only its “direction” component, where the magnitude is 1. A vector with a magnitude of 1 is called a unit vector.
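For example, a quick check with numpy (a minimal sketch; the vector here is arbitrary):

import numpy as np

x = np.array([4, 2])
direction = x / np.linalg.norm(x)  # unit vector pointing the same way as x
print(direction)                   # [0.89442719 0.4472136 ]
print(np.linalg.norm(direction))   # 1.0 (up to floating point): a unit vector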
Moving in the direction of a vector
We can move a vector \(x\) several “steps” in the direction of \(y\) by using the formula below:
\[ \text{Move } x \text{ in the direction of } y = x + k \frac{y}{\|y\|} \]
Where \(k\) is the number of steps we want to move in the direction of \(y\).
Play around with the formula above by changing the value of \(k\) and see how the vector \(x\) moves in the direction of \(y\).
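If you prefer code over the plot, here is a minimal sketch (reusing the vectors from the plot above) that steps \(k\) from 0 to 3:

import numpy as np

x = np.array([2, 3])
y = np.array([4, 2])
unit_y = y / np.linalg.norm(y)

# Each increment of k moves x exactly one unit of distance along y's direction
for k in [0, 1, 2, 3]:
    moved = x + k * unit_y
    print(k, moved, np.linalg.norm(moved - x))  # the distance moved equals k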
Dot product of a vector with itself
One formula that we need for the next section is the dot product of a vector with itself. The formula is as simple as below:
\[ x \cdot x = \|x\|^2 \]
So basically the dot product of a vector with itself is the square of the magnitude of the vector.
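You can verify this with numpy (a minimal sketch; the vector is arbitrary):

import numpy as np

x = np.array([2, 3])
print(np.dot(x, x))            # 13
print(np.linalg.norm(x) ** 2)  # 13.000000000000002 (floating point)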
Deriving the margin width formula
(Figure: all the equations needed to derive the margin width formula)
So far we’ve learned that SVM is about finding the optimal hyperplane, and the optimal hyperplane is the one with the largest margin. But how do we calculate the margin width?
The width of the margin can be derived from the distance between any point on the hyperplane and its corresponding support vector.
Because it doesn’t matter which class we choose to get the width (the margin should have the same width for both classes), we can use the equation below for our calculation:
\[ \text{Support Vector for class 1}: w \cdot x + b = 1 \\ \]
Where \(w\) is the weight vector, \(x\) is the input vector, and \(b\) is the bias.
And let’s have another constraint that we’ve learned about hyperplane
\[ \text{Hyperplane}: w \cdot x + b = 0 \\ \]
Quick intuition
import matplotlib.pyplot as plt
import numpy as np
from ipywidgets import interact, widgets, Layout

coefficients = [[1, -0.6]]
intercept_ = [-0.6]

# Normalizing the direction vector
normalized_direction_vector = coefficients / np.linalg.norm(coefficients)

def update_plot(number_of_moves):
    original_vector = np.array([4, 5.67])

    # Moving the original vector in the direction of the direction vector
    new_vector = original_vector + number_of_moves * normalized_direction_vector

    # Creating the plot
    fig, ax = plt.subplots()

    # Plotting the hyperplane
    x_vals = np.linspace(-10, 10, 100)
    y_vals = (-(coefficients[0][0] * x_vals) - intercept_[0]) / coefficients[0][1]
    plt.plot(x_vals, y_vals, 'green', label='Hyperplane')

    # Plotting the line that goes through the support vector with class 1
    y_vals_1 = (1 - (coefficients[0][0] * x_vals) - intercept_[0]) / coefficients[0][1]
    plt.plot(x_vals, y_vals_1, 'purple', label='Line that goes through support vector with class 1', linestyle='--')

    # Plotting the line that goes through the support vector with class -1
    y_vals_2 = (-1 - (coefficients[0][0] * x_vals) - intercept_[0]) / coefficients[0][1]
    plt.plot(x_vals, y_vals_2, 'purple', label='Line that goes through support vector with class -1', linestyle='--')

    # Plotting the moved point (before the legend is drawn, so it gets a legend entry)
    plt.scatter(new_vector[0][0], new_vector[0][1], color='red', label='New Vector')

    # Setting the limits
    ax.set_xlim([0, 10])
    ax.set_ylim([0, 10])

    # Adding labels and title
    ax.set_xlabel('X-axis')
    ax.set_ylabel('Y-axis')
    ax.set_title('Vector Transformation')
    ax.legend()

    # Display the plot
    plt.grid(True)
    plt.show()

# Create an interactive slider for number_of_moves (the k in the formula) ranging from 0 to 10
number_of_moves_slider = widgets.FloatSlider(value=0, min=0, max=10, step=0.1, description='k: ', layout=Layout(width='50%'))
interact(update_plot, number_of_moves=number_of_moves_slider)
So on the interactive plot above, you can drag the slider to change the value of \(k\) and see how the red point moves. Several things to notice here:
- The red point is a point on the hyperplane, so it satisfies the hyperplane equation \(w \cdot x + b = 0\).
- If we move the red point in the direction of the support vector, at some point it will satisfy the support vector equation \(w \cdot x + b = 1\), meaning the red point is now a support vector.
- The distance from the red point’s starting position to its corresponding support vector is half of the margin width, because the point only moves from the hyperplane to one support vector. The full margin width is the distance between the two support vectors, which is twice that distance.
- The displacement of the red point from the hyperplane towards its corresponding support vector is \(k \frac{w}{\|w\|}\) (these points are checked numerically in the sketch after this list).
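Here is a minimal sketch that checks the points above, reusing the hyperplane from the interactive plot (\(w = [1, -0.6]\), \(b = -0.6\)); the value of \(k\) it uses is the one we derive in the next section:

import numpy as np

w = np.array([1, -0.6])
b = -0.6

x = np.array([4, 5.67])            # a point (approximately) on the hyperplane
print(np.dot(w, x) + b)            # ~0, so x satisfies w . x + b = 0

k = 1 / np.linalg.norm(w)          # half the margin width, derived below
y = x + k * w / np.linalg.norm(w)  # move k units in the direction of w
print(np.dot(w, y) + b)            # ~1, so y is now a support vector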
If you’re feeling confident with the points above, let’s move on to the next section.
Calculating
So, given \(x\), a vector on the hyperplane:
\[ \text{Hyperplane}: w \cdot x + b = 0 \\ \]
And given \(y\), the support vector counterpart of \(x\), reached when \(x\) moves several units in the direction of the support vector:
\[ y = x + k \times \text{direction towards the support vector} \\ \]
The direction itself is the same as the direction of the \(w\) vector, and \(k\) is the number of units \(x\) needs to move to reach \(y\).
\[ y = x + k \frac{w}{\|w\|} \\ \]
As said above, \(y\) is the support vector counterpart of \(x\), which means:
\[ w \cdot y + b = 1 \\ \] \[ w \cdot (x + k \frac{w}{\|w\|}) + b = 1 \\ \] \[ w \cdot x + k \frac{w \cdot w}{\|w\|} + b = 1 \\ \]
\[ w \cdot x + b + k \frac{w \cdot w}{\|w\|} = 1 \\ \]
And we already know that \(w \cdot x + b = 0\), so:
\[ k \frac{w \cdot w}{\|w\|} = 1 \\ \] \[ k \frac{\|w\|^2}{\|w\|} = 1 \\ \] \[ k \|w\| = 1 \\ \] \[ k = \frac{1}{\|w\|} \\ \]
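If you want to double-check this algebra in code, here is a sketch using sympy (sympy is not used elsewhere in this material, so treat this as optional), treating \(\|w\|\) as a single positive symbol:

import sympy as sp

k, w_norm = sp.symbols('k w_norm', positive=True)

# After substituting w . x + b = 0 and w . w = ||w||**2, the support
# vector equation reduces to k * ||w||**2 / ||w|| = 1
print(sp.solve(sp.Eq(k * w_norm**2 / w_norm, 1), k))  # [1/w_norm], i.e. k = 1/||w||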
As for the width of the margin: because the distance is the same for both classes, we can just multiply \(k\) by 2:
\[ \text{Width of the margin} = 2k = \frac{2}{\|w\|} \\ \]
Then we can see that if we want to maximize the margin width, we need to minimize \(\|w\|\) (the magnitude of the vector \(w\)), because it’s in the denominator: the lower the value of \(\|w\|\), the wider the margin.
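As a final sanity check, a minimal sketch computing the margin width with the weight vector from the interactive example above, plus a scaled-up version of it:

import numpy as np

w = np.array([1, -0.6])
print(2 / np.linalg.norm(w))      # ~1.715

# Doubling w halves the margin: a larger ||w|| means a narrower margin
w_big = np.array([2, -1.2])
print(2 / np.linalg.norm(w_big))  # ~0.857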