The Proposed Model

Fig. 8.1 shows the working process of the proposed model. In the first stage, the mammograms are preprocessed to enhance the contrast between objects and to suppress irregular background noise. Intensity is one of the parameters calculated in this approach. Preprocessing is needed because mammographic images have low contrast, which makes the masses present in a mammogram difficult to interpret. In general, the intensity of the pectoral muscle differs little from the intensity of a tumor, so the pectoral muscle region is removed before feature extraction. The preprocessing phase also removes the labels and background noise present in digitized mammograms. Once the unwanted labels and noise have been removed, feature extraction is applied to the processed image using a transform similar to the Radon transform, which is mainly employed to detect straight lines as well as arbitrary shapes. The features are extracted and isolated in this way. Finally, a GA-SVM is applied for classification.

Image Acquisition

A mammogram image can be normal, benign, or malignant. The images used here, of fatty-glandular breast types, were gathered from the MIAS database. The database comprises mammograms digitized at a 50-μm pixel edge, and the images used are 1024 × 1024 pixels. About 300 mammogram images derived from 94 images are used for this task. Cancer is difficult to identify in a mammogram; hence, dense images are not used in the analysis.


The images gathered from the database contain irregular labels as well as background noise. The preprocessing stage removes these artifacts so that the mammograms are suitable for the later experiments. Each image contains the breast region, possibly a tumor, and unwanted areas. The yellow region is the breast region; the blue circle marks the irregular label and background, and the red circle shows the tumor. The unknown label is eliminated first by applying the gradient-based threshold model. Several morphological operations are then applied to produce a mask: a binary edge map of the image is obtained with the gradient-based threshold approach, and dilation and hole filling are the main operations applied to it.

In this model, GT is a gradient-based threshold determined with Otsu's adaptive method [19]. The binary image is then dilated with a structuring element, and the resulting mask is applied to the original image. The two accompanying images show the mask produced by the gradient model and the label-free image. The next major step is to remove the pectoral muscle from the mammogram, because its intensity is close to the intensity of a tumor. For effective feature extraction, such unwanted regions need to be removed using segmentation techniques.
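The label-removal steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes an 8-bit grayscale image stored as a list of rows, a central-difference gradient, and a square structuring element. The function names are hypothetical, and plain lists are used to keep the sketch free of dependencies. Choosing which connected region is the breast and which is the label is left out.

```python
def gradient_magnitude(img):
    """Central-difference gradient magnitude, clamped to 0..255."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = min(255, round((gx * gx + gy * gy) ** 0.5))
    return out


def otsu_threshold(img, levels=256):
    """Threshold that maximizes the between-class variance (Otsu)."""
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def dilate(mask, radius=1):
    """Binary dilation with a (2*radius+1) square structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def fill_holes(mask):
    """Fill background pixels that are not connected to the image border."""
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w)
             if (y in (0, h - 1) or x in (0, w - 1)) and not mask[y][x]]
    while stack:
        y, x = stack.pop()
        if outside[y][x]:
            continue
        outside[y][x] = True
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] and not outside[ny][nx]:
                stack.append((ny, nx))
    return [[0 if outside[y][x] else 1 for x in range(w)] for y in range(h)]


def build_mask(img, radius=1):
    """Edge map from the gradient threshold GT, then dilation and hole filling."""
    grad = gradient_magnitude(img)
    gt = otsu_threshold(grad)
    edges = [[1 if v > gt else 0 for v in row] for row in grad]
    return fill_holes(dilate(edges, radius))


def apply_mask(img, mask):
    """Keep only the pixels selected by the mask."""
    return [[v * m for v, m in zip(r_img, r_mask)] for r_img, r_mask in zip(img, mask)]
```

Calling `apply_mask(img, build_mask(img))` keeps the pixels inside the closed region and sets everything else to zero.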

HFE Model

In the presented approach, hybrid feature extraction (HFE) is performed on the preprocessed mammogram images. A higher-level descriptor, the Gray-Level Co-occurrence Matrix (GLCM), is applied to extract texture features from the mammogram images. Two GLCM texture features are considered: homogeneity and energy. Additionally, the histogram of oriented gradients (HOG) descriptor, which is widely used in medical image processing and computer vision, is used to extract feature values. The HOG and GLCM texture features are described in detail in the following sections.


Homogeneity measures how close the distribution of elements in the GLCM is to the GLCM diagonal. To characterize homogeneous texture quantitatively for similarity, local spatial statistics of the texture are computed using scale- and orientation-selective Gabor filtering. The mammogram image is then divided into a collection of homogeneous texture regions, and the texture features are associated with those regions of the image. In the GLCM, homogeneity is computed in four directions, θ = 0°, 45°, 90°, and 135°, which gives a feature vector of size 4. It is effective for detecting the affected regions, which are characterized by differences in gray level. A typical formula for homogeneity is given in Eq. (8.2)


Energy measures the uniformity of the normalized pixel-pair distribution and reflects the number of repeated pairs. It is a normalized value with a maximum of 1. A high energy value occurs when the gray-level distribution is regular. Energy reflects the depth as well as the smoothness of the mammogram texture. The typical expression for the energy of a mammogram image is given in Eq. (8.3)

where n denotes the number of gray levels, P(i, j) denotes the pixel value at position (i, j) of the mammogram image, and Pi,j signifies the normalized co-occurrence matrix.
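As an illustration of the two GLCM features, the sketch below builds a normalized, symmetric co-occurrence matrix for each of the four directions and computes homogeneity and energy from it. Eqs. (8.2) and (8.3) are not reproduced in this text, so the sketch assumes two commonly used forms: homogeneity as the sum of Pi,j / (1 + (i - j)^2) and energy as the sum of Pi,j^2. Some libraries use |i - j| in the denominator or take the square root of the energy sum. The function names and the one-pixel offset are assumptions.

```python
# Pixel offsets (dy, dx) at distance 1 for the four directions named in the text.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(img, offset, levels):
    """Normalized, symmetric co-occurrence matrix for one pixel offset.

    img is a list of rows of integer gray levels in range(levels).
    """
    dy, dx = offset
    h, w = len(img), len(img[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                i, j = img[y][x], img[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for this offset")
    return [[c / total for c in row] for row in counts]


def homogeneity(p):
    """Assumed form: sum of P[i][j] / (1 + (i - j)^2)."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))


def energy(p):
    """Assumed form: sum of squared matrix entries; at most 1."""
    return sum(v * v for row in p for v in row)


def glcm_features(img, levels):
    """Return (homogeneity, energy), each a list of 4 values, one per direction."""
    hom, en = [], []
    for angle in (0, 45, 90, 135):
        p = glcm(img, OFFSETS[angle], levels)
        hom.append(homogeneity(p))
        en.append(energy(p))
    return hom, en
```

A perfectly flat region gives homogeneity and energy of 1 in every direction, whereas a fine checkerboard lowers both, which matches the description of energy as a measure of regularity.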

HOG Features

The main advantage of the HOG descriptor is that it captures the local appearance of objects and is robust to object transformations and illumination changes, because the edge and gradient information is estimated using a multiple-coordinate HOG feature vector. First, a gradient operator N is applied to determine the gradient measure. The gradient of the mammogram image is denoted G, and the image frame is denoted I. A common formula for computing the gradient is given in Eq. (8.4)

The detection window is divided into spatial regions called cells. Within each cell, the gradient magnitudes of the pixels are accumulated according to edge orientation. The magnitude of the gradient at point (x, y) is given in Eq. (8.5)

The edge orientation of point (x, y) is provided in Eq. (8.6)
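The gradient, magnitude, and orientation steps can be sketched as below. The sketch assumes the simple [-1, 0, 1] difference operator for N, the usual magnitude sqrt(Gx^2 + Gy^2), and the orientation arctan(Gy / Gx) folded into [0°, 180°). The cell size of 8 pixels and the 9 orientation bins are illustrative defaults, not values stated in the text, and the function names are hypothetical.

```python
import math


def gradients(img):
    """Per-pixel gradient magnitude and unsigned orientation in degrees.

    Uses the [-1, 0, 1] operator horizontally (Gx) and vertically (Gy),
    with border pixels clamped.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


def cell_histogram(mag, ang, y0, x0, cell=8, bins=9):
    """Orientation histogram of one cell; each pixel votes with its magnitude."""
    hist = [0.0] * bins
    width = 180.0 / bins
    for y in range(y0, y0 + cell):
        for x in range(x0, x0 + cell):
            b = int(ang[y][x] // width) % bins
            hist[b] += mag[y][x]
    return hist
```

For a purely horizontal intensity ramp, all votes fall in the first bin (orientation near 0°). For a vertical ramp, they fall in the bin that contains 90°.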

where Gx represents the horizontal component of the gradient and Gy denotes the vertical component. The architecture of the HOG descriptor is shown in Fig. 8.2. To cope with varying illumination and noise, a normalization step is applied once the histograms have been computed. Normalization compensates for contrast so that local histograms can be compared. In multiple-coordinate HOG, four normalization schemes can be applied: L2-norm, L2-Hys, L1-Sqrt, and L1-norm. Of these, the L2-norm performs best for cancer prediction. The normalization in HOG is expressed in Eq. (8.7)

where e is a small positive value used for regularization, f is the normalized feature vector, h is the nonnormalized vector, and ||h||2 is the 2-norm of h.
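The four normalization schemes can be written compactly as below. Eq. (8.7) is not reproduced in this text, so the L2-norm is assumed to take the form f = h / sqrt(||h||2^2 + e^2), as in the original HOG formulation. The other three follow the same source: L1-norm divides by the 1-norm plus e, L1-Sqrt takes the square root of the L1-normalized vector, and L2-Hys clips the L2-normalized vector and renormalizes it. The clipping value of 0.2 and the default e are assumptions.

```python
import math


def l2_norm(h, eps=1e-6):
    """f = h / sqrt(||h||_2^2 + eps^2)."""
    norm = math.sqrt(sum(v * v for v in h) + eps * eps)
    return [v / norm for v in h]


def l2_hys(h, eps=1e-6, clip=0.2):
    """L2-norm, clip each component at `clip`, then renormalize."""
    clipped = [min(v, clip) for v in l2_norm(h, eps)]
    return l2_norm(clipped, eps)


def l1_norm(h, eps=1e-6):
    """f = h / (||h||_1 + eps)."""
    norm = sum(abs(v) for v in h) + eps
    return [v / norm for v in h]


def l1_sqrt(h, eps=1e-6):
    """Square root of the L1-normalized vector (histogram values are non-negative)."""
    return [math.sqrt(v) for v in l1_norm(h, eps)]
```

The regularization term keeps the division well defined for empty cells: an all-zero histogram is mapped to an all-zero vector.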

GA-SVM-Based Classification

A GA can be used effectively for feature selection as well as parameter optimization. Its main components, namely the chromosome design, the fitness function (FF), and the system process, are described in the following sections.

Chromosome Design

An SVM classifier with a radial basis function (RBF) kernel is used to classify the mammogram images; its parameters C and γ must be estimated. Hence, the chromosome consists of three parts: the selected features, C, and γ. Binary coding is used to represent the chromosome. The bits Fb1 ~ Fbnf represent the input features: Fbi = 1 if the corresponding feature is selected, and Fbi = 0 if it is not. Cb1 ~ Cbnc encode the value of C, and γb1 ~ γbnγ encode the value of γ. Here, nf is the number of bits representing the features, nc is the number of bits representing the parameter C, and nγ is the number of bits representing the parameter γ.
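A chromosome laid out this way can be decoded as in the following sketch. The text does not give the mapping from bit strings to parameter values, so the sketch assumes a linear scaling of the decoded integer into a search range. The ranges for C and gamma are hypothetical placeholders, and the function names are illustrative only. Many implementations search these parameters on a logarithmic scale instead.

```python
def bits_to_value(bits, lo, hi):
    """Linearly map a bit string to a real value in [lo, hi] (assumed mapping)."""
    if not bits:
        raise ValueError("at least one bit is required")
    d = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * d / (2 ** len(bits) - 1)


def decode(chromosome, n_f, n_c, n_g,
           c_range=(0.01, 1000.0), g_range=(1e-4, 10.0)):
    """Split a binary chromosome into (feature mask, C, gamma).

    Layout: n_f feature bits, then n_c bits for C, then n_g bits for gamma.
    c_range and g_range are hypothetical search ranges.
    """
    if len(chromosome) != n_f + n_c + n_g:
        raise ValueError("chromosome length does not match n_f + n_c + n_g")
    mask = list(chromosome[:n_f])
    c = bits_to_value(chromosome[n_f:n_f + n_c], *c_range)
    gamma = bits_to_value(chromosome[n_f + n_c:], *g_range)
    return mask, c, gamma
```

For example, `decode([1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1], 3, 4, 4)` selects the first and third features, sets C to the lower end of its range, and sets gamma to the upper end of its range.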

Fitness Function

The fitness function is central to estimating whether an individual is "fit" enough to survive and produce offspring. Two criteria are used to define the FF: the classification accuracy and the number of features in the selected subset. An individual with high classification accuracy and few features has a high fitness value and a high probability of passing to the next generation, as expressed in Eq. (8.8)

where WOA, OASVM, Wf, and fi denote the weight of the classification accuracy, the overall classification accuracy, the weight of the features, and the mask value of feature i, respectively. Here, the FF is computed with the classification accuracy weight (WOA) and a feature weight (Wf) of 0.2 for every data set.
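A fitness function of this kind might look like the sketch below. Eq. (8.8) is not reproduced in this text, so the form used here, a weighted sum of the overall accuracy and the reciprocal of the number of selected features, is an assumption based on common GA-SVM formulations. The feature weight of 0.2 comes from the text. The accuracy weight of 0.8 is a hypothetical default chosen so that the weights sum to 1.

```python
def fitness(overall_accuracy, feature_mask, w_oa=0.8, w_f=0.2):
    """Assumed form: w_oa * OA_SVM + w_f * (1 / number of selected features).

    overall_accuracy is the SVM accuracy in [0, 1]; feature_mask is the list
    of 0/1 mask values f_i. w_oa = 0.8 is an assumption, w_f = 0.2 is from the text.
    """
    n_selected = sum(feature_mask)
    if n_selected == 0:
        return 0.0  # no features selected, so no classifier can be trained
    return w_oa * overall_accuracy + w_f / n_selected
```

With equal accuracy, a chromosome that selects two features scores higher than one that selects four, which is the behaviour the text describes.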

Hybridization of the GA-SVM Algorithm

The main steps of GA-based feature selection (FS) and parameter optimization are summarized here.

  • The chromosomes of the initial population, each comprising a feature subset and the SVM kernel parameters (C, γ), are generated. The population size is chosen by the user.
  • The values encoded in each chromosome, namely C, γ, and the chosen feature subset, are computed.
  • For the SVM classifier, training and testing samples of every class are extracted from the ROI according to expert interpretation.
  • For the training set, a combined image is used to train the SVM classifier, and the testing samples are used to assess the classification accuracy.
  • The fitness value of each individual is calculated with the FF, based on the classification accuracy and the selected features.
  • Individuals with higher fitness values are selected and carried into the next generation by the genetic operations.
  • When a termination condition is met, the process ends with the best individual, and the optimized outcome comprises C, γ, and the selected features. Otherwise, the process is repeated for the next generation using the genetic operations of selection, crossover, and mutation.
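The steps above can be sketched as a small genetic loop. This is an outline under stated assumptions rather than the authors' procedure. The SVM training and testing step is replaced by a caller-supplied `evaluate` callback that returns the fitness of a chromosome. Selection is by binary tournament with one elite individual, crossover is single-point, and the termination condition is a fixed number of generations. All parameter defaults are hypothetical, and `toy_evaluate` is a made-up stand-in for the classifier.

```python
import random


def run_ga(evaluate, n_bits, pop_size=20, generations=30,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Evolve binary chromosomes; return (best chromosome, best fitness).

    evaluate(chromosome) must return a fitness value (higher is better).
    n_bits must be at least 2 so that a crossover point exists.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        fits = [evaluate(ch) for ch in pop]
        for ch, f in zip(pop, fits):
            if f > best_fit:
                best, best_fit = ch[:], f

        def pick():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if fits[a] >= fits[b] else pop[b]

        nxt = [best[:]]  # elitism: keep the best individual found so far
        while len(nxt) < pop_size:
            p1, p2 = pick()[:], pick()[:]
            if rng.random() < p_cross:  # single-point crossover
                cut = rng.randrange(1, n_bits)
                p1[cut:], p2[cut:] = p2[cut:], p1[cut:]
            for child in (p1, p2):
                for i in range(n_bits):  # bit-flip mutation
                    if rng.random() < p_mut:
                        child[i] ^= 1
                nxt.append(child)
        pop = nxt[:pop_size]
    return best, best_fit


def toy_evaluate(mask):
    """Hypothetical stand-in for 'train the SVM and measure accuracy'.

    Pretends that only features 0 and 2 are informative.
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return 0.0
    accuracy = 0.5 + 0.2 * mask[0] + 0.2 * mask[2]
    return 0.8 * accuracy + 0.2 / n_selected
```

In a real pipeline, `evaluate` would decode the chromosome into the feature subset, C, and gamma, train the SVM on the training samples, and return the fitness computed from the accuracy on the testing samples.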