Uses of Support Vector Machines


a. Support Vector Machine (SVM): Over the past several years, there has been a significant amount of research on support vector machines, and today SVM applications are becoming increasingly common in text classification. In essence, support vector machines define hyperplanes that try to separate the values of a given target field. The hyperplanes are defined using kernel functions; the most popular kernel types are linear, polynomial, radial basis function and sigmoid. Support vector machines can be used for both classification and regression. Several characteristics have been observed in vector-space-based methods for text classification [15,16], including the high dimensionality of the input space, the sparsity of document vectors, the linear separability of most text classification problems, and the observation that few features are irrelevant.
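As an illustration of this use, the following is a minimal sketch of SVM-based text classification. The library (scikit-learn), the toy corpus and all names are assumptions for illustration; the original text does not specify an implementation.

```python
# Minimal sketch: linear-kernel SVM for text classification.
# scikit-learn, the toy corpus, and the labels are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# Toy training corpus: documents become sparse, high-dimensional vectors
docs = [
    "cheap meds buy now", "limited offer click here",      # spam
    "meeting agenda for monday", "project status report",  # ham
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham

# TF-IDF maps documents to sparse vectors; SVC with a linear kernel
# then searches for a separating hyperplane in that space.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear", C=1.0))
model.fit(docs, labels)

print(model.predict(["free offer click now", "status meeting on monday"]))
```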

Assume that training data $\{(x_i, y_i)\}_{i=1}^{n}$, with $x_i \in \mathbb{R}^m$ and $y_i \in \{-1, +1\}$, are given. The dual formulation of soft-margin support vector machines (SVMs) with a kernel function $K$ and control parameter $C$ is

$$\max_{\alpha} \;\; \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \qquad (1)$$

$$\text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C, \quad i = 1, \ldots, n.$$
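To make formulation (1) concrete, the sketch below evaluates the dual objective and checks the constraints for a toy dataset and a candidate $\alpha$. The data, the value of $\gamma$, and the chosen $\alpha$ are illustrative assumptions, not part of the original text.

```python
# Sketch: evaluate the dual objective of (1) on a toy problem.
# Data, alpha, C, and gamma are illustrative assumptions.
import numpy as np

X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])  # n = 4 samples
y = np.array([1.0, 1.0, -1.0, -1.0])
C, gamma = 1.0, 0.5
alpha = np.array([0.25, 0.25, 0.25, 0.25])  # a feasible candidate

# Gaussian RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
diff = X[:, None, :] - X[None, :, :]
K = np.exp(-gamma * np.sum(diff ** 2, axis=-1))

# Dual objective: sum_i alpha_i - 1/2 * sum_ij alpha_i alpha_j y_i y_j K_ij
objective = alpha.sum() - 0.5 * (alpha * y) @ K @ (alpha * y)

# Constraints: sum_i alpha_i y_i = 0 and 0 <= alpha_i <= C
print("objective:", objective)
print("equality constraint:", np.dot(alpha, y))            # should be 0
print("box constraint holds:", np.all((alpha >= 0) & (alpha <= C)))
```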

The kernel function

$$K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle,$$

where $\langle \cdot , \cdot \rangle$ denotes an inner product between two vectors, is introduced to handle nonlinearly separable cases without any explicit knowledge of the feature mapping $\phi$. The formulation (1) shows that the computational complexity of SVM training depends on the number of training samples, denoted $n$, while the cost of each kernel evaluation depends on the dimension of the input space. This becomes clear when we consider some typical kernel functions, such as the linear kernel

$$K(x_i, x_j) = \langle x_i, x_j \rangle,$$

the polynomial kernel

$$K(x_i, x_j) = \left( \langle x_i, x_j \rangle + 1 \right)^{d},$$

and the Gaussian RBF (Radial Basis Function) kernel

$$K(x_i, x_j) = \exp\!\left( -\gamma \, \lVert x_i - x_j \rVert^{2} \right),$$

where $d$ is the degree of the polynomial and $\gamma$ is a control parameter of the RBF kernel. The evaluation...
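The three kernels above can be written directly in code. The following numpy sketch (function and parameter names are my own, for illustration) shows that evaluating a kernel for one pair of vectors touches only the m input dimensions and does not depend on the number of training samples n.

```python
# Sketch: the three kernel functions for a pair of input vectors.
# Function names and the example vectors are illustrative assumptions.
import numpy as np

def linear_kernel(x_i, x_j):
    # K(x_i, x_j) = <x_i, x_j>
    return np.dot(x_i, x_j)

def polynomial_kernel(x_i, x_j, d=3):
    # K(x_i, x_j) = (<x_i, x_j> + 1)^d, where d is the polynomial degree
    return (np.dot(x_i, x_j) + 1.0) ** d

def rbf_kernel(x_i, x_j, gamma=0.5):
    # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    return np.exp(-gamma * np.sum((x_i - x_j) ** 2))

# Each call costs O(m) in the input dimension m, independent of n.
x_i, x_j = np.array([1.0, 2.0, 3.0]), np.array([0.0, 1.0, 1.0])
print(linear_kernel(x_i, x_j), polynomial_kernel(x_i, x_j), rbf_kernel(x_i, x_j))
```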


True positive (TP) is the proportion of positive cases which are correctly classified, calculated by the following formula:

$$TP = \frac{\text{number of positive cases correctly classified as positive}}{\text{total number of positive cases}}$$

False positive (FP) is the proportion of negative cases that are incorrectly classified as positive, calculated by the following formula:

$$FP = \frac{\text{number of negative cases incorrectly classified as positive}}{\text{total number of negative cases}}$$

True negative (TN) is the proportion of negative cases that are correctly classified, calculated by the following formula:

$$TN = \frac{\text{number of negative cases correctly classified as negative}}{\text{total number of negative cases}}$$

False negative (FN) is the proportion of positive cases that are incorrectly classified as negative, calculated by the following formula:

$$FN = \frac{\text{number of positive cases incorrectly classified as negative}}{\text{total number of positive cases}}$$

Precision (P) is the proportion of the predicted positive cases that are correct, calculated by the following formula:

$$P = \frac{\text{number of positive cases correctly classified as positive}}{\text{total number of cases predicted as positive}}$$

Accuracy is the proportion of the total number of predictions that are correct. It is calculated by the following formula:

$$\text{Accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}}$$
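The sketch below ties these measures together by computing them from the four confusion-matrix counts. The counts themselves are illustrative assumptions.

```python
# Sketch: computing the evaluation measures from confusion-matrix counts.
# The counts tp, fp, tn, fn are illustrative assumptions.
tp, fp, tn, fn = 40, 10, 45, 5  # outcomes of a classifier on a test set

true_positive_rate  = tp / (tp + fn)   # positives correctly classified
false_positive_rate = fp / (fp + tn)   # negatives wrongly called positive
true_negative_rate  = tn / (tn + fp)   # negatives correctly classified
false_negative_rate = fn / (fn + tp)   # positives wrongly called negative
precision           = tp / (tp + fp)   # predicted positives that are correct
accuracy = (tp + tn) / (tp + tn + fp + fn)  # all correct predictions

print(f"TP rate={true_positive_rate:.2f}  FP rate={false_positive_rate:.2f}  "
      f"precision={precision:.2f}  accuracy={accuracy:.2f}")
```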
