The working operation of the SMOTE-FSVM model is depicted in Figure 7.1. As shown, the medical data are gathered from a diverse set of sources, namely IoT sensor data, medical records, and the University of California, Irvine (UCI) repository. The patients' data are acquired and transmitted to the cloud. Then, a SMOTE-based upsampling process takes place to resolve the class imbalance problem. Finally, an FSVM-based data classification task is carried out to determine the class labels and detect the existence of diseases.


The SMOTE method was established to neutralize the imbalanced-dataset issue in classification [18]. It synthesizes instances of the minority class by operating in "feature space" rather than "data space." It is considered an upsampling method in which the minority data produce N% synthetic data. This ratio enlarges the minority data until it is comparable with the majority data. It thereby increases the number of minority samples and widens the decision regions for the classifiers.

Here, a few variables, namely T, N%, and k, are initialized first, where T defines the number of minority-class instances, N% indicates the ratio of upsampling, and k refers to the k value of the k nearest neighbors (kNN) of a specific minority-class instance. The procedure produces synthetic instances as given in the following points:

• Once the initialization is completed, a minority-class instance for which synthetic data should be produced is selected.

• After this, one of the k nearest minority-class neighbors of that instance is selected randomly.


FIGURE 7.1 Framework of SMOTE-FSVM model

• It is known that an instance is composed of many feature values, so a synthetic instance is generated by producing a synthetic value for every feature. A synthetic value is produced by adding a factor to the original feature value. This factor is estimated in two steps. Initially, the original feature value is subtracted from the chosen neighbor's feature value. Then, the resulting difference is multiplied by a random value between 0 and 1.
• This computation is carried out for every feature value of the selected minority-class instance, which produces one complete synthetic instance for that minority-class instance.
• For N% upsampling, (N/100) is rounded to the nearest integer, which gives the number of synthetic instances produced per minority-class instance, i.e., N% upsampling of an individual minority-class instance.
• This strategy is performed for all T minority-class instances and provides N% upsampling of all minority-class instances.
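The steps above can be sketched in a short NumPy implementation. This is a minimal illustration under stated assumptions, not the chapter's exact code: the function name `smote`, the toy minority set, and the Euclidean neighbor search are all choices made here for the example.

```python
import numpy as np

def smote(X_min, N, k, rng=None):
    """Minimal SMOTE sketch: for each minority instance, create round(N/100)
    synthetic instances by interpolating toward one of its k nearest
    minority-class neighbors (interpolation in feature space)."""
    rng = np.random.default_rng(rng)
    n_new = int(round(N / 100))        # synthetic instances per minority instance
    synthetic = []
    for i in range(len(X_min)):
        # k nearest minority neighbors of instance i (excluding itself)
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]
        for _ in range(n_new):
            nn = X_min[rng.choice(neighbors)]   # random neighbor
            gap = rng.random()                  # random factor in [0, 1)
            # difference scaled by the factor, added to the original instance
            synthetic.append(X_min[i] + gap * (nn - X_min[i]))
    return np.asarray(synthetic)

# Toy minority class: T = 4 instances, N = 200% upsampling, k = 2
X_min = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.9, 1.9]])
X_syn = smote(X_min, N=200, k=2, rng=0)
print(X_syn.shape)  # (8, 2): two synthetic instances per minority instance
```

Because every synthetic value is a convex combination of two real minority values, the new points always lie inside the convex hull of the minority class, which is what extends the classifier's decision region without inventing out-of-range feature values.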

FSVM-Based Classification Model

Generally, the SVM model is applied to classification problems. Assume that there is a set of labeled training points:

(y_1, x_1), (y_2, x_2), …, (y_l, x_l)    (7.1)

Every training point x_i ∈ R^N belongs to one of two classes and is given a label y_i ∈ {−1, 1} for i = 1, …, l. In most cases, searching for a suitable hyperplane in the input space is too restrictive in practice. A better solution to this problem is to map the input space into a higher-dimensional feature space and find the optimal hyperplane there. Let z = φ(x) denote the corresponding feature-space vector, with the mapping φ from R^N to the feature space Z. The hyperplane can be determined as:

w · z + b = 0    (7.2)

defined by the pair (w, b), which separates the point x_i according to the function

f(x_i) = sign(w · z_i + b)    (7.3)

where w ∈ Z and b ∈ R. More precisely, the set S is said to be linearly separable if there exists a pair (w, b) such that the inequalities

y_i (w · z_i + b) ≥ 1, i = 1, …, l    (7.4)

are valid for all elements of the set S.

For the linearly separable set S, one can find a unique optimal hyperplane for which the margin between the projections of the training points of the two classes is maximized. When the set S is not linearly separable, classification violations must be allowed in the SVM formulation. To handle such nonseparable problems, the previous analysis can be generalized by introducing nonnegative slack variables ξ_i ≥ 0, so that equation (7.4) is changed to:

y_i (w · z_i + b) ≥ 1 − ξ_i, i = 1, …, l    (7.5)

A nonzero ξ_i in equation (7.5) indicates that the point x_i does not satisfy equation (7.4). Hence, the term Σ_{i=1}^{l} ξ_i can be regarded as a measure of the amount of misclassification. The optimal hyperplane is then the solution to the problem

minimize (1/2) w · w + C Σ_{i=1}^{l} ξ_i, subject to the constraints (7.5),    (7.6)

where C is a constant. The parameter C is regarded as a regularization parameter; it is the only free parameter in the SVM formulation. Tuning this parameter sets a balance between margin maximization and classification violation. Detailed definitions can be found in Refs. [4, 6]. Finding the optimal hyperplane in equation (7.6) is a QP problem, which is solved by constructing a Lagrangian and transforming it into the dual:

maximize W(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j z_i · z_j,
subject to Σ_{i=1}^{l} y_i α_i = 0, 0 ≤ α_i ≤ C, i = 1, …, l,    (7.7)
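The effect of the regularization parameter C can be seen empirically. The following sketch uses scikit-learn's SVC as a stand-in implementation (an assumption for illustration; it is not the chapter's model) on two overlapping Gaussian blobs: a small C tolerates many margin violations and retains many support vectors, while a large C penalizes violations heavily.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping (not linearly separable) classes.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=3.0, random_state=0)

# Small C -> wide margin, many violations allowed, many support vectors.
# Large C -> narrow margin, violations penalized, fewer support vectors.
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C}: support vectors per class = {clf.n_support_}")
```

This is exactly the balance described above: C trades margin maximization against the misclassification term Σξ_i in the objective.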

where α = (α_1, …, α_l) denotes the vector of nonnegative Lagrange multipliers associated with the constraints in equation (7.5). The Kuhn–Tucker theorem plays a central role in SVM theory. According to it, a solution α_i of the problem presented in equation (7.7) satisfies

α_i [y_i (w · z_i + b) − 1 + ξ_i] = 0, i = 1, …, l,    (7.8)

(C − α_i) ξ_i = 0, i = 1, …, l.    (7.9)

From these equalities it follows that the only nonzero values α_i in equation (7.8) are those for which the constraints in equation (7.5) are satisfied with the equality sign. A point x_i with α_i > 0 is termed a support vector. In the nonseparable case, there are two types of support vectors. For 0 < α_i < C, the corresponding support vector x_i satisfies the equalities y_i (w · z_i + b) = 1 and ξ_i = 0. For α_i = C, the corresponding ξ_i is nonzero, and the corresponding support vector x_i does not satisfy equation (7.4); such support vectors are referred to as errors. A point x_i with α_i = 0 is classified correctly and lies clearly outside the decision margin. To construct the optimal hyperplane w · z + b, the following expansion is applied:

w = Σ_{i=1}^{l} α_i y_i z_i    (7.10)

and the scalar b is computed from the Kuhn–Tucker conditions in equation (7.8). The decision function generalizes equation (7.3) together with equation (7.10), so that

f(x) = sign(Σ_{i=1}^{l} α_i y_i z_i · z + b)    (7.11)

Since there is no prior knowledge of φ, computing equations (7.7) and (7.11) directly may be impossible. A remarkable property of the SVM is that it is not necessary to know φ explicitly; it suffices to have a function K(·, ·), named a kernel, that computes the dot product of the data points in the feature space Z, which is:

K(x_i, x_j) = z_i · z_j = φ(x_i) · φ(x_j)    (7.12)

Any function that satisfies Mercer's theorem can be used as such a dot product and hence applied as a kernel. An example is the polynomial kernel of degree d:

K(x, y) = (x · y + 1)^d    (7.13)
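A minimal numerical check of the kernel idea, assuming the common form K(x, y) = (x · y + 1)^d for the degree-d polynomial kernel: for d = 2 the kernel value coincides with an explicit feature-space dot product, so the mapping φ never has to be computed inside the SVM.

```python
import numpy as np

def poly_kernel(x, y, d=2):
    # Polynomial kernel of degree d: K(x, y) = (x . y + 1) ** d
    return (np.dot(x, y) + 1.0) ** d

def phi(x):
    # Explicit degree-2 feature map for 2-D inputs; phi(x) . phi(y)
    # reproduces the kernel value, showing the "dot product in Z" identity.
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([x1 * x1, x2 * x2, s * x1 * x2, s * x1, s * x2, 1.0])

x = np.array([1.0, 2.0])
y = np.array([0.5, -1.0])
print(poly_kernel(x, y))    # (1*0.5 + 2*(-1) + 1)**2 = 0.25
print(phi(x) @ phi(y))      # same value via the explicit 6-D feature map
```

The two printed values agree, which is the content of equation (7.12): the kernel evaluates the feature-space dot product without ever forming φ(x).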

Hence, the nonlinear separating hyperplane is found as the solution of

maximize W(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j K(x_i, x_j),
subject to Σ_{i=1}^{l} y_i α_i = 0, 0 ≤ α_i ≤ C, i = 1, …, l,    (7.14)

and the decision function is provided as:

f(x) = sign(Σ_{i=1}^{l} α_i y_i K(x_i, x) + b)    (7.15)

In the classical SVM, every data point is assumed to be equally important and is assigned the same penalty parameter in the objective function. However, in real-world classification domains, some sample points, such as noise, cannot be assigned exactly to one class, and not every instance carries equal meaning for the decision surface. To resolve these issues, the principle of FSVM was developed in Ref. [19]. A fuzzy membership is assigned to every sample point, so that different sample points make different contributions to the construction of the decision surface. Let the training samples be:

(x_1, y_1, s_1), …, (x_N, y_N, s_N),    (7.16)

where x_i ∈ R^n denotes an n-dimensional sample point, y_i ∈ {−1, +1} denotes the class label, and s_i (i = 1, …, N) is a fuzzy membership that satisfies σ ≤ s_i ≤ 1 for a sufficiently small σ > 0. The quadratic programming (QP) problem for classification is then formulated as:

minimize (1/2) w · w + C Σ_{i=1}^{N} s_i ξ_i,
subject to y_i (w · φ(x_i) + b) ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, …, N,    (7.17)

where w defines the normal vector of the separating hyperplane, b is a bias term, and C is a parameter that has to be set in advance to control the tradeoff between the classification margin and the cost of misclassification error. Since s_i is the attitude of the corresponding point x_i toward one class and the slack variable ξ_i is a measure of error, the term s_i ξ_i is regarded as a measure of error with different weights. Note that a bigger s_i makes the corresponding point more significant, while a smaller s_i makes it less significant; hence, different input points make different contributions to learning the decision surface. In this way, FSVM finds a more robust hyperplane by maximizing the margin while allowing some misclassification of less important points.
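One way to experiment with this weighting, sketched under the assumption that scikit-learn's per-sample `sample_weight` (which scales the penalty C per sample, so the effective bound becomes s_i·C) is an acceptable stand-in for FSVM's fuzzy membership s_i; the membership heuristic based on distance to the class mean is likewise an assumption made for this example:

```python
import numpy as np
from sklearn.svm import SVC

# Two synthetic Gaussian classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(2.0, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

# Fuzzy memberships s_i in (0, 1]: down-weight points far from their
# class mean, which are more likely to be noise or outliers.
s = np.ones(len(X))
for label in (-1, 1):
    m = y == label
    d = np.linalg.norm(X[m] - X[m].mean(axis=0), axis=1)
    s[m] = 1.0 - 0.9 * d / (d.max() + 1e-12)

# sample_weight scales C per sample: effective box constraint is s_i * C,
# mirroring the 0 <= alpha_i <= s_i * C bound of the FSVM dual.
clf = SVC(kernel="linear", C=1.0).fit(X, y, sample_weight=s)
print(clf.score(X, y))
```

Because suspected outliers receive small s_i, their slack is cheap and they barely influence the decision surface, which is precisely the FSVM behavior described above.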

To solve the FSVM optimization problem, equation (7.17) is converted into a dual problem by applying Lagrange multipliers α_i:

maximize W(α) = Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j K(x_i, x_j),
subject to Σ_{i=1}^{N} y_i α_i = 0, 0 ≤ α_i ≤ s_i C, i = 1, …, N.    (7.18)

Compared with the standard SVM, the formulation above differs only slightly: the upper bound on the values of α_i becomes s_i C. By solving the dual problem in equation (7.18) for the optimal α_i, w and b can be recovered in the same manner as in the standard SVM.
