SVMs Classification Understanding by CHIRAG


What is a Support Vector Machine?

1.) It is a supervised machine learning algorithm that tries to find the hyperplane that best separates two classes.

2.) Support Vector Machines (SVMs for short) are among the more powerful machine learning algorithms and are used for classification, regression and outlier detection.

3.) An SVM classifier builds a model that assigns new data points to one of the given categories.

4.) Thus, in its basic form, it can be viewed as a non-probabilistic binary linear classifier.


Don't confuse SVM with logistic regression. Both algorithms try to find a good separating hyperplane, but the main difference is that logistic regression is a probabilistic approach that models class probabilities, whereas a support vector machine is a geometric approach that maximizes the margin between the classes.

Support Vector Machines intuition 

Now, we should be familiar with some SVM terminology.

Hyperplane

A hyperplane is a decision boundary that separates a given set of data points having different class labels. The SVM classifier separates data points using the hyperplane with the largest possible margin. This hyperplane is known as the maximum margin hyperplane, and the linear classifier it defines is known as the maximum margin classifier.

We can understand this further with an example.

Optimal Hyperplane

Suppose we have a dataset that has two tags (green and blue) and two features, x1 and x2. We want a classifier that can classify a pair (x1, x2) of coordinates as either green or blue. The SVM algorithm helps to find the best line or decision boundary. It finds the points from both classes that lie closest to the line. These points are called support vectors. The distance between the support vectors and the hyperplane is called the margin, and the goal of SVM is to maximize this margin. The hyperplane with the maximum margin is called the optimal hyperplane.

Support Vectors

Support vectors are the sample data points that are closest to the hyperplane. These data points define the separating line or hyperplane, because the margin is calculated from them.

Margin

A margin is the separation gap between the two lines that pass through the closest data points of each class. It is calculated as the perpendicular distance from the separating line to the support vectors (the closest data points). In SVMs, we try to maximize this separation gap so that we get the maximum margin.
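The perpendicular distance described above can be sketched in a few lines of Python. This is only an illustration: the weight vector w, the offset b and the sample point are hypothetical values chosen so the arithmetic is easy to follow, and the margin width of 2 / ||w|| assumes the usual convention that the support vectors satisfy w.x + b = +1 or -1.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm


def margin_width(w):
    """Gap between the lines w.x + b = +1 and w.x + b = -1 (assumed convention)."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0, so ||w|| = 5.
w = [3.0, 4.0]
b = -5.0

# Hypothetical support vector: 3*3 + 4*(-0.75) - 5 = 1, so it lies on w.x + b = +1.
support_vector = [3.0, -0.75]

print(distance_to_hyperplane(w, b, support_vector))  # 0.2
print(margin_width(w))                               # 0.4
```

The margin width is twice the distance of a support vector from the hyperplane, which is why shrinking ||w|| widens the gap.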

Types of SVMs


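There are two main types of SVMs, each suited to a different kind of data:

  • Simple SVM: Also called a linear SVM. Typically used for linear regression and classification problems, where a straight line or flat hyperplane fits the data.


  • Kernel SVM: Has more flexibility for non-linear data because a kernel function lets you work with additional features, so the hyperplane is fitted in a higher-dimensional space instead of the original two-dimensional space.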

Kernel functions

A kernel is a function that an SVM uses to measure the similarity between two data points. With the help of a kernel we can work in a higher-dimensional space without computing the coordinates in that space explicitly, which keeps the calculations manageable. Some kernels even correspond to a space with an infinite number of dimensions. Kernels play a vital role in classification and are used to find patterns in the given dataset. They are very helpful for solving a non-linear problem with a linear classifier.

Sometimes no hyperplane can separate the classes in the original feature space. In that case we move to a higher-dimensional space, where a separating hyperplane may exist. There are various SVM kernel functions that make non-linear data linearly separable in this way.
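To make the idea of moving to a higher dimension concrete, here is a small sketch. The data, the mapping x -> (x, x^2) and the threshold of 4.0 are all hypothetical values chosen for illustration. On the number line no single cut-off separates the inner points from the outer ones, but after the mapping a horizontal line does.

```python
def feature_map(x):
    """Map a 1-D value to 2-D as (x, x^2). Hypothetical illustrative mapping."""
    return (x, x * x)


# Hypothetical 1-D data: "inner" points sit between the "outer" points,
# so no single threshold on x can separate the two classes.
inner = [-1.0, 0.0, 1.0]
outer = [-3.0, -2.5, 2.5, 3.0]


def classify(x, threshold=4.0):
    """Separate the classes with the line x2 = threshold in the mapped space."""
    _, x_squared = feature_map(x)
    return "outer" if x_squared > threshold else "inner"


print([classify(x) for x in inner])  # ['inner', 'inner', 'inner']
print([classify(x) for x in outer])  # ['outer', 'outer', 'outer', 'outer']
```

A kernel SVM gets the same effect without building the mapped features by hand, because the kernel function returns the dot product in the mapped space directly.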

Linear

Linear kernels are commonly recommended for text classification because many of these classification problems are linearly separable.

The linear kernel works really well when there are a lot of features, and text classification problems have a lot of features. Linear kernel functions are faster than most of the others and you have fewer parameters to optimize.

The linear kernel itself is simply the dot product of two data points, K(X1, X2) = X1^T * X2. Here's the decision function that an SVM with a linear kernel produces:

f(X) = w^T * X + b

In this equation, w is the weight vector whose norm you want to minimize, X is the data that you're trying to classify, and b is the linear coefficient (the offset) estimated from the training data. This equation defines the decision boundary that the SVM returns.
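As a minimal sketch, the function f(X) = w^T * X + b can be written in plain Python as follows. The weight vector and offset used here are hypothetical numbers rather than values learned from data, and the rule "predict +1 when f(X) is non-negative" is the usual sign convention, stated here as an assumption.

```python
def linear_kernel(x1, x2):
    """Dot product of two vectors: X1^T * X2."""
    return sum(a * b for a, b in zip(x1, x2))


def decision_function(w, b, x):
    """f(X) = w^T * X + b."""
    return linear_kernel(w, x) + b


def predict(w, b, x):
    """Assumed convention: class +1 on or above the boundary, -1 below it."""
    return 1 if decision_function(w, b, x) >= 0 else -1


# Hypothetical parameters, not learned from real data.
w = [2.0, -1.0]
b = -1.0

print(decision_function(w, b, [3.0, 1.0]))  # 4.0
print(predict(w, b, [3.0, 1.0]))            # 1
print(predict(w, b, [0.0, 2.0]))            # -1
```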

Polynomial

The polynomial kernel isn't used in practice very often because it tends to be less computationally efficient than other kernels and its predictions are often less accurate.

Here's the function for a polynomial kernel:

f(X1, X2) = (a + X1^T * X2) ^ b

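This is one of the simpler polynomial kernel equations you can use. f(X1, X2) is the kernel value, a measure of similarity between the two data points X1 and X2, and it leads to a polynomial decision boundary that will separate your data. Here a is a constant term and b is the degree of the polynomial.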
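The following sketch evaluates the polynomial kernel for two small vectors. It also checks, for the special case a = 0 and b = 2, that the kernel value equals a dot product taken after an explicit mapping to three features. The sample vectors and the helper explicit_degree2_map are hypothetical and exist only for this illustration.

```python
import math


def polynomial_kernel(x1, x2, a, b):
    """(a + X1^T * X2) ^ b, with constant a and degree b."""
    dot = sum(p * q for p, q in zip(x1, x2))
    return (a + dot) ** b


def explicit_degree2_map(x):
    """Hypothetical helper: explicit features for the case a = 0, b = 2 in 2-D."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


x1 = [1.0, 2.0]
x2 = [3.0, 4.0]

print(polynomial_kernel(x1, x2, a=1.0, b=2))  # (1 + 11)^2 = 144.0

# Same similarity computed two ways: via the kernel, and via mapped features.
via_kernel = polynomial_kernel(x1, x2, a=0.0, b=2)  # 11^2 = 121.0
via_mapping = sum(
    p * q for p, q in zip(explicit_degree2_map(x1), explicit_degree2_map(x2))
)
print(via_kernel, math.isclose(via_kernel, via_mapping))
```

The second part shows why kernels save work: the kernel needs one dot product in the original space, while the explicit route has to build the extra features first.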

Gaussian Radial Basis Function (RBF)

The RBF kernel is one of the most powerful and commonly used kernels in SVMs. It is usually the choice for non-linear data.

Here's the equation for an RBF kernel:

f(X1, X2) = exp(-gamma * ||X1 - X2||^2)

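In this equation, gamma specifies how much influence a single training point has on the other data points around it. ||X1 - X2|| is the Euclidean distance between the two feature vectors (not a dot product), so the kernel value is close to 1 for nearby points and falls toward 0 as the points move apart.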
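A short sketch of the RBF formula in plain Python is shown below. The sample points and the gamma values are hypothetical. It illustrates two properties of the formula: identical points give a kernel value of exactly 1, and a larger gamma makes the value drop faster with distance.

```python
import math


def rbf_kernel(x1, x2, gamma):
    """exp(-gamma * ||X1 - X2||^2), using the squared Euclidean distance."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(x1, x2))
    return math.exp(-gamma * squared_distance)


origin = [0.0, 0.0]
point = [1.0, 1.0]  # squared distance from origin is 2

print(rbf_kernel(origin, origin, gamma=0.5))  # 1.0
print(rbf_kernel(origin, point, gamma=0.5))   # exp(-1), about 0.368
print(rbf_kernel(origin, point, gamma=2.0))   # exp(-4), much smaller
```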


Sigmoid

The sigmoid kernel is more useful in neural networks than in support vector machines, but there are occasional specific use cases.

Here's the function for a sigmoid kernel:

f(X, y) = tanh(alpha * X^T * y + C)

In this function, X and y are the two data points being compared, alpha is a scaling parameter applied to their dot product, and C is an offset value.
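The sigmoid formula can be sketched the same way. Here alpha is treated as a scalar that scales the dot product, which is the common reading of this formula and is stated as an assumption. The sample vectors and parameter values are hypothetical.

```python
import math


def sigmoid_kernel(x, y, alpha, c):
    """tanh(alpha * X^T * y + C), with alpha assumed to be a scalar."""
    dot = sum(p * q for p, q in zip(x, y))
    return math.tanh(alpha * dot + c)


x = [1.0, 1.0]
y = [1.0, 1.0]  # dot product is 2

print(sigmoid_kernel(x, y, alpha=0.5, c=-1.0))  # tanh(0) = 0.0
print(sigmoid_kernel(x, y, alpha=0.5, c=0.0))   # tanh(1), between 0 and 1
```

Because tanh is bounded, the kernel value always lies between -1 and 1.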

=================================================

IMPORTANT QUESTIONS & ANSWERS

Q1. What is an SVM algorithm?

A. The SVM algorithm is used for both classification and regression tasks. It finds an optimal hyperplane to separate data points of different classes in a high-dimensional space.

Q2. Why is SVM considered one of the best algorithms?

A. SVM is considered one of the best algorithms because it can handle high-dimensional data, is effective in cases with limited training samples, and can handle non-linear classification using kernel functions.

Q3. What are the steps of SVM algorithm?

A. The steps of the SVM algorithm involve: (1) selecting the appropriate kernel function, (2) defining the parameters and constraints, (3) solving the optimization problem to find the optimal hyperplane, and (4) making predictions based on the learned model.
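The four steps can be sketched end to end in plain Python. This is a teaching sketch under stated assumptions, not how production libraries solve the problem: it uses a linear kernel, and it replaces the usual quadratic-programming solver with simple sub-gradient descent on the regularized hinge loss. The function names, the toy dataset and the default values for learning_rate, reg and epochs are all hypothetical choices.

```python
def train_linear_svm(points, labels, learning_rate=0.01, reg=0.01, epochs=1000):
    """Sub-gradient descent on the regularized hinge loss (illustrative only)."""
    # Steps 1 and 2: a linear kernel is assumed; the parameters are w and b.
    w = [0.0] * len(points[0])
    b = 0.0
    # Step 3: solve the optimization problem iteratively.
    for _ in range(epochs):
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:
                # Point is misclassified or inside the margin:
                # apply the hinge-loss sub-gradient plus weight decay.
                w = [wi - learning_rate * (reg * wi - y * xi)
                     for wi, xi in zip(w, x)]
                b += learning_rate * y
            else:
                # Point is safely outside the margin: only weight decay applies.
                w = [wi - learning_rate * reg * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Step 4: classify a point by the sign of w.x + b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data with labels -1 and +1.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
print(predictions == labels)
```

The update rule only changes the boundary in response to points that violate the margin, which mirrors the idea that the support vectors determine the hyperplane.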

Q4. What does SVM do in machine learning?

A. In machine learning, SVM is used to classify data by finding the optimal decision boundary that maximally separates different classes. It aims to find the hyperplane that maximizes the margin between the classes, as defined by the support vectors. Combined with kernel functions, this enables effective classification even in complex, non-linear scenarios.

==============================================================

HOPE YOU GUYS LIKE THIS ARTICLE, STAY CONNECTED FOR MORE SUCH ARTICLES

THANK YOU 

CHIRAG GUPTA





