SVM Decision Boundary Construction with RBF Kernel
In this section, we focus on constructing decision boundaries using SVMs with the RBF kernel. Unlike the linear kernel, which can only separate classes with a straight line or hyperplane, the RBF kernel is capable of capturing complex, non-linear relationships in the data. This makes it particularly suitable for scenarios where classes are not easily separable by a straight line or plane in the feature space.
The RBF kernel, also known as the Gaussian kernel, computes the similarity between two feature vectors based on the Euclidean distance between them in the original feature space; implicitly, this corresponds to an inner product in a very high-dimensional feature space. Mathematically, the RBF kernel [Tex]K(x_i, x_j)[/Tex] between two feature vectors [Tex]x_i[/Tex] and [Tex]x_j[/Tex] is calculated as:
[Tex]K(x_i, x_j) = \exp(-\gamma \, \|x_i - x_j\|^{2})[/Tex]
Here, [Tex]\|x_i - x_j\|^{2}[/Tex] represents the squared Euclidean distance between the feature vectors [Tex]x_i[/Tex] and [Tex]x_j[/Tex], and [Tex]\gamma[/Tex] is a parameter that controls the influence of each training example on the decision boundary. The exponential term [Tex]\exp(-\gamma \, \|x_i - x_j\|^{2})[/Tex] ensures that closer points have a higher similarity, while farther points have a lower similarity.
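To make the formula concrete, the short sketch below computes the kernel value for two illustrative vectors (the vectors and the gamma value are assumptions chosen for the example, not values from the dataset) and checks it against scikit-learn's rbf_kernel helper:
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Two illustrative feature vectors and an assumed gamma value
x_i = np.array([[1.0, 2.0]])
x_j = np.array([[2.0, 0.5]])
gamma = 0.5

# Direct computation of K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
sq_dist = np.sum((x_i - x_j) ** 2)
k_manual = np.exp(-gamma * sq_dist)

# The same similarity computed by scikit-learn
k_sklearn = rbf_kernel(x_i, x_j, gamma=gamma)[0, 0]

print(k_manual, k_sklearn)  # both print the same value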
Implementation: SVM Decision Boundary Construction using RBF Kernel
Now, we will construct the decision boundary using the RBF kernel with the following code:
from sklearn import svm

# Create an SVM classifier with the RBF kernel and train it
clf = svm.SVC(kernel='rbf', C=1.0)
clf.fit(X_train, y_train)
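The shape of the resulting boundary depends strongly on both C and gamma (scikit-learn's SVC defaults to gamma='scale'). As a minimal sketch, assuming an illustrative parameter grid that has not been tuned for this dataset, both values can be selected with cross-validation:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Illustrative grid; the ranges are assumptions, not tuned values
param_grid = {'C': [0.1, 1, 10], 'gamma': ['scale', 0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)  # best C and gamma found on the training folds
print(search.best_score_)   # mean cross-validated accuracy for that setting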
To visualize the boundary, we reuse the same plotting code used earlier for the linear-kernel SVM.
plt.figure(figsize=(12, 6))
# Create a mesh to plot
h = .02 # step size in the mesh
x_min, x_max = X_scaled[:, 0].min() - 1, X_scaled[:, 0].max() + 1
y_min, y_max = X_scaled[:, 1].min() - 1, X_scaled[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
# Predict class labels for each point in the mesh
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
# Put the result into a color plot
plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)
# Plot also the training points
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap=plt.cm.coolwarm, edgecolors='k')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title('SVM Decision Surface using Breast Cancer dataset')
plt.show()
Output:
We can observe that the decision boundary produced by the SVM with an RBF kernel is non-linear.
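As a side note, recent versions of scikit-learn (1.1 and later) can draw the same kind of plot without building the mesh by hand. A minimal sketch, assuming the fitted clf and the scaled two-feature training data from above:
from sklearn.inspection import DecisionBoundaryDisplay

# Let scikit-learn build the mesh and filled contours from the fitted classifier
disp = DecisionBoundaryDisplay.from_estimator(
    clf, X_train, response_method='predict',
    cmap=plt.cm.coolwarm, alpha=0.8
)
disp.ax_.scatter(X_train[:, 0], X_train[:, 1], c=y_train,
                 cmap=plt.cm.coolwarm, edgecolors='k')
plt.show()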
Now, let's evaluate the model.
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Evaluate the model using the test set
accuracy = clf.score(X_test, y_test)
print(f'The model accuracy on the test set is: {accuracy * 100:.2f}%')

# Confusion matrix of the predictions on the test set
cm = confusion_matrix(y_test, clf.predict(X_test))
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=cancer.target_names)
disp.plot(cmap=plt.cm.Blues)
plt.show()
Output:
The model accuracy on the test set is: 90.64%
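Accuracy alone can hide class-wise behaviour on a medical dataset such as this one. As a short, hedged addition (reusing the same clf and test split as above), a classification report gives per-class precision, recall, and F1-score:
from sklearn.metrics import classification_report

# Per-class precision, recall and F1-score on the held-out test set
print(classification_report(y_test, clf.predict(X_test),
                            target_names=cancer.target_names))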
How Does an SVM Construct Decision Boundaries?
Support Vector Machines (SVMs) are a powerful machine learning technique that excels at classifying data. Imagine a collection of red and blue marbles, where your goal is to draw a clear dividing line that separates them. SVMs achieve this not by creating just any separation, but by finding the optimal separating boundary, one that maximizes the distance between the line and the closest marbles of each color. This wide separation, known as the margin, enhances the model's ability to handle unseen data. In this tutorial, we constructed such a decision boundary for the breast cancer classification problem using the RBF kernel.