For more information about Scrivener publications please visit www.scrivenerpublishing.com.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
Wiley Global Headquarters
111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read.
Library of Congress Cataloging-in-Publication Data
ISBN 978-1-119-78609-2
Cover image: Pixabay.Com
Cover design by: Russell Richardson
Set in size of 11pt and Minion Pro by Manila Typesetting Company, Makati, Philippines
6 Surface Defect Detection Using SVM-Based Machine Vision System with
Suriya, S., Balaji, M., Gowtham, T.M. and Rahul, Kumar S.
8 A Comparative Study
Ankita Tiwari, Bhawana Sahu, Jagalingam Pushaparaj and Muthukumaran Malarvel
Inderjeet Singh Sandhu, Chanchal Kaushik and Mansi Chitkara
12
M. Pavithra, R. Rajmohan, T. Ananth Kumar and R. Ramya
13
Vamsidhar Enireddy, Karthikeyan C., Rajesh Kumar T. and Ashok Bekkanti
14
Rao Nalluri, K. Kannan and Diptendu
14.2 A Brief Review of the
Formulating the Constrained Multi-Objective Optimization of Software Redundancy Allocation Problem (CMOO-SRAP)
14.4 The Novel Discrete Firefly Algorithm for Constrained Multi-Objective Software Reliability Assessment of Digital Relay
Solution Encoding for the CMOO-SRAP for Digital Relay
14.5.3 Configuration of Solution Vectors for the CMOO-SRAP
Preface
This edited book aims to bring together leading researchers, academic scientists, and research scholars to share their experiences and research results on all aspects of inspection systems for detection analysis in various machine vision applications. It also provides an interdisciplinary platform for educators, practitioners and researchers to present and discuss the most recent innovations, trends, methodologies, applications, and concerns, as well as the practical challenges encountered and solutions adopted in inspection systems that use machine learning-based approaches to machine vision for real-world and industrial applications. The book is organized into fourteen chapters.
Chapter 1 discusses how various dangerous infectious viruses affect human society, with a detailed analysis of transmission electron microscopy virus images (TEMVIs). In this chapter, several TEMVIs such as Ebola virus (EV), Enterovirus (ENV), Lassa virus (LV), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and Zika virus (ZV) are analyzed. The ML-based approach focuses on classification techniques such as Logistic Regression (LR), Neural Network (NN), k-Nearest Neighbors (kNN) and Naive Bayes (NB) for processing the TEMVIs.
Chapter 2 focuses on identifying and differentiating handwritten characters using deep neural networks. As a solution to the character recognition problem in low-resource languages, this chapter proposes a model that replicates the human cognitive ability to learn from small datasets. The proposed solution is a Siamese neural network that combines capsules and convolutional units to gain a thorough understanding of the image. Further, this chapter shows that the capsule-based Siamese network can learn abstract knowledge about different characters, which can be extended to unseen characters.
Chapter 3 presents the growth of optics alongside the development of lenses in terms of accuracy. The 4f-based optical system is used as a benchmark to develop a firm system for medical applications. Performing transforms with the optical system helps improve accuracy. The image of the patient placed in the object plane is exposed to optical rays, and the biconvex lens between the object plane and the Fourier plane performs an optical Fourier transform. This system indicates the normal or abnormal condition of the patient and helps in high-speed pattern recognition with optical signals.
Chapter 4 studies the brain tumor diagnosis process on digital images using a convolutional neural network (CNN) as part of a deep learning model. To classify brain tumors, eight different CNN models were tested on magnetic resonance imaging (MRI) data. Additionally, a detailed discussion of machine learning algorithms and deep learning techniques is presented.
Chapter 5 focuses on optical character recognition. It presents a detailed study of handwritten identification and classification techniques and their applications. It also discusses their limitations, along with an overview of the precision rate of Artificial Neural Network-based approaches.
Chapter 6 presents an automated process for detecting defects on wood or metal surfaces. Monitoring the quality of raw material plays a crucial role in the production of a quality product. Therefore, this chapter develops a classification model using a multiclass support vector machine to identify the defects present in wood.
Chapter 7 focuses on computational linguistics, covering text recognition and synthesis, speech recognition and synthesis, and conversion from text to speech and vice versa. It branches out towards a text-to-speech (TTS) system, which converts natural language text into speech, distinguishing itself from other systems that render symbolic linguistic representations such as phonetic transcriptions into speech. The chapter mainly deals with an intelligible text-to-speech program that allows a visually impaired person or a person with a reading disability to become familiar with a language.
Chapter 8 surveys breast cancer among Indian females. According to the survey, only 66.1% of women diagnosed with cancer survived. Various machine learning algorithms have been adopted in the literature to identify breast cancer tumors. In this chapter, a comparative study of existing classifiers, namely support vector clustering (SVC), decision tree classification algorithm (DTC), K-nearest neighbors (KNN), random forest (RF), and multilayer perceptron (MLP), is demonstrated on the Wisconsin Breast Cancer Dataset (WBCD) from the UCI Machine Learning Repository.
Chapter 9 focuses on communication for hearing-impaired people. Since most members of this community use sign language, it is extremely valuable to develop automated translators between this language and spoken languages. This chapter reports the recognition of the Mexican sign language static alphabet from 3D data acquired from Leap Motion and MS Kinect 1 sensors. The novelty of this research is the use of six 3D affine moment invariants for sign language recognition.
Chapter 10 presents a precise scientific design for a solar cooker. Traditional methods for thermal applications rely heavily on human intervention and cannot adapt to a variable source. This chapter discusses a novel solar cooker with adaptive control based on an Online Sequential Extreme Learning Machine (OSELM).
Chapter 11 discusses the uses and applications of X-ray images. It presents a detailed study of how radio-diagnosis, nuclear medicine, and radiotherapy remain strong pillars of inspection, diagnosis, and treatment delivery systems. It also discusses recent advances in artificial intelligence using radiography, such as computed tomography.
Chapter 12 addresses the detection and analysis of breast diseases in mammography images. It presents the use of overlay convolutional neural networks for feature extraction from mammography scans, the output of which is then fed into a recurrent neural network. The approach can also assist in tumor localization in cases of breast cancer.
Chapter 13 focuses on the compression of medical images such as MRI, ultrasound, and other medical scans. Images produced by various medical procedures contain voluminous data and need considerable storage space, which is difficult to manage. Therefore, this chapter discusses the compression of medical images, as well as techniques for classifying the compressed images, which are useful in telemedicine.
Chapter 14 presents the computer relay, a special-purpose system designed specifically for sensing anomalies in the power system. Since all modern engineered systems, including modern computer relays, contain an increasing proportion of sophisticated software, software reliability assessment has become very important. This chapter discusses a constrained multi-objective formulation of the optimal software reliability allocation problem and then develops a customized Discrete Firefly Algorithm (DFA) to solve it, using computer relay software as a case study.
Muthukumaran Malarvel
Soumya Ranjan Nayak
Prasant Kumar Pattnaik
Surya Narayan Panda
November 2020
1 Machine Learning-Based Virus Type Classification Using Transmission Electron Microscopy Virus Images
1Department of Computer Science and Engineering, Parala Maharaja Engineering College, Berhampur, India
2Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India
3Department of Mathematics, Parala Maharaja Engineering College, Berhampur, India
Abstract
Viruses are submicroscopic infectious agents capable of replicating inside the living cells of the human body. Various dangerous infectious viruses greatly affect human society, along with plants, animals and microorganisms, and pose a serious threat to its survival. In this chapter, a Machine Learning (ML)-based approach is used to analyze several transmission electron microscopy virus images (TEMVIs). In this work, several TEMVIs such as Ebola virus (EV), Enterovirus (ENV), Lassa virus (LV), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and Zika virus (ZV) are analyzed. The ML-based approach focuses on classification techniques such as Logistic Regression (LR), Neural Network (NN), k-Nearest Neighbors (kNN) and Naive Bayes (NB) for processing the TEMVIs. The performance of these techniques is analyzed using the classification accuracy (CA) parameter. The simulation is carried out using Orange3-3.24.1.
Keywords: ML, TEMVIs, Classification Techniques, LR, NN, kNN, NB
1.1 Introduction
ML [1–34] plays an important role for researchers and scientists in today's era. ML is considered one of the most important applications of artificial intelligence. With ML, systems can learn and improve from experience automatically, without explicit programming. The main focus of ML is to develop computer programs that can access data and use it for learning. ML techniques can be broadly classified into unsupervised and supervised learning techniques. Unsupervised learning includes clustering techniques, and supervised learning includes classification techniques. Hierarchical clustering, distance map, distance matrix, DBSCAN, manifold learning, k-means, Louvain clustering, etc. are some ML-based clustering techniques. Classification techniques include LR, NN, kNN, NB, decision tree, random forest, AdaBoost, etc. Clustering techniques group similar objects into sets known as clusters. Classification techniques categorize a set of data into classes: the algorithm learns from the input data provided to it and then uses what it has learned to classify new observations. These techniques are mainly used to categorize data into a desired number of distinct classes, where a label can be assigned to each class. Categorizing data into classes accurately is a challenging task, and several ML-based classification techniques can be used for it.
Viruses [57, 58] are submicroscopic infectious agents that replicate inside the living cells of the human body. Viruses can be classified as DNA or RNA viruses on the basis of nucleic acid; as cubical, spiral, radial-symmetry, or complex viruses on the basis of structure; and as bacteriophages, plant viruses, animal viruses, or insect viruses on the basis of host range. Several viruses can be transmitted through the respiratory route, the feco-oral route, sexual contact, blood transfusion, etc. Very dangerous viruses such as SARS-CoV-2, EV, ENV, LV, ZV, dengue virus, and Hepatitis C virus have adverse effects that greatly affect human society in the current scenario. This work focuses on several ML-based classification techniques, namely LR, NN, kNN and NB, to classify several TEMVIs, namely EV, ENV, LV, SARS-CoV-2 and ZV.
The main contributions of this work are as follows.
• An ML-based approach is used for processing several TEMVIs, namely EV, ENV, LV, SARS-CoV-2 and ZV.
• The ML-based approach focuses on several classification techniques, namely LR, NN, kNN and NB, for this processing.
• These techniques are compared using the CA performance metric.
• This work is carried out using Orange3-3.24.1.
The rest of the chapter is organized as follows. Section 1.2 describes related works, Section 1.3 describes methodology for the processing of TEMVIs, Section 1.4 describes results and discussion and Section 1.5 describes the conclusion.
1.2 Related Works
Researchers and scientists have introduced various works on processing virus images and other images for a wide variety of real-world applications [1–34, 35–55]. Some of these works are described as follows. Singh et al. [2] review several ML and image processing techniques for the detection and classification of paddy leaf diseases. Al-Kasassbeh et al. [5] address feature selection with an ML-based approach for the classification of malware. Yang et al. [6] consider a sequence embedding-based ML mechanism for the prediction of human-virus protein–protein interactions. Dey et al. [7] apply ML-based techniques to the sequence-based prediction of viral host interactions between human proteins and SARS-CoV-2. Karanja et al. [9] use ML-based techniques and image texture features for the analysis of internet of things malware. Muda et al. [14] combine k-means clustering with NB classification for intrusion detection. Trishan et al. [17] use ML-based classification techniques such as NB, kNN and random forest to detect Hepatitis A, B, C and E viruses. Kaur [19] applies ML-based approaches such as kNN and NB to the detection of credit card fraud. Goyal [20] considers an NB model based on an enhanced kNN classification mechanism for the prediction of breast cancer. Wahid et al. [22] analyze the performance of several ML-based techniques for the classification of microscopic bacteria images. Ito et al. [27] use a convolutional NN mechanism for the detection of virus particles in transmission electron microscopy (TEM) images. Devan et al. [28] use a transfer learning mechanism to detect herpesvirus capsids in several TEM images.
1.3 Methodology
In this work, the ML-based classification techniques [10, 11, 14–16] LR, NN, kNN and NB are used to classify several TEMVIs, namely EV, ENV, LV, SARS-CoV-2 and ZV. LR predicts the probability of a target (dependent) variable. Generally, this target variable is dichotomous: the data are coded as 1 for yes or success and 0 for no or failure. An LR model predicts a dependent variable from its relationship with one or more independent variables. NN uses a network of functions to understand and translate a data input of one form into the required output of another form. It consists of layers of neurons, where each layer receives inputs from previous layers and passes outputs to subsequent layers. This technique transforms complex data inputs into a representation that computers can process. kNN retains all the available data and classifies new data points on the basis of similarity measures. It takes the k closest training examples in the feature space as input and generates a class membership as output. NB applies Bayes' theorem under the assumption that the presence of a particular feature in a class is unrelated to any other feature, so every pair of features is treated as independent. It predicts the membership probability for each class, and the class with the highest probability is taken as the most likely class.
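The decision rules described above for LR, kNN and NB can be illustrated with a minimal sketch in plain Python. This is not the Orange implementation used in the chapter: the function names, the toy two-dimensional feature vectors, and the Gaussian likelihood in the NB function are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict

def logistic_probability(x, weights, bias):
    """LR: probability that the target is 1 (success) for feature vector x."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def knn_predict(train_X, train_y, x, k=5):
    """kNN: majority class among the k training points closest to x (Euclidean)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def gaussian_nb_predict(train_X, train_y, x, eps=1e-9):
    """NB: class with the highest posterior, treating features as independent.

    The Gaussian per-feature likelihood is an assumption of this sketch.
    """
    by_class = defaultdict(list)
    for row, label in zip(train_X, train_y):
        by_class[label].append(row)
    best_label, best_score = None, -math.inf
    for label, rows in by_class.items():
        score = math.log(len(rows) / len(train_X))  # log prior of the class
        for j, value in enumerate(x):
            column = [r[j] for r in rows]
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + eps
            # log Gaussian likelihood of this feature, independent of the others
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy feature vectors (hypothetical, not real TEMVI embeddings).
train_X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
train_y = ["EV", "EV", "EV", "ZV", "ZV", "ZV"]

print(logistic_probability((0.0, 0.0), (1.0, 1.0), 0.0))
print(knn_predict(train_X, train_y, (0.1, 0.1), k=3))
print(gaussian_nb_predict(train_X, train_y, (1.0, 1.0)))
```

In each case the classifier reduces to scoring the candidate classes and returning the best one; the three techniques differ only in how that score is computed.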
In this work, the TEMVIs are first given as input to Orange3-3.24.1 [56]. Image embedding is then carried out, taking the input TEMVIs as inputs and generating embeddings or skipped TEMVIs as outputs. Several embedders, such as Inception v3, SqueezeNet (local), VGG-16, VGG-19, Painters, DeepLoc and OpenFace, can be used for image embedding. In this work, SqueezeNet (local) is used as the embedder. Test and score calculation is then carried out on the embeddings by applying the LR, NN, kNN and NB techniques separately to compute the CA values. For LR, the regularization type and strength are Ridge (L2) and C = 1, respectively. For NN, the number of neurons in the hidden layer, activation function, solver, regularization and maximal number of iterations are 100, ReLU, Adam, α = 0.0001 and 100, respectively, with replicable training enabled. For kNN, the number of neighbors, metric and weight are 5, Euclidean and uniform, respectively. For test and score calculation, the inputs are data, test data, learner and preprocessor, and the outputs are evaluation results and predictions. Afterwards, a confusion matrix is generated to represent the classification results of each technique. The confusion matrix takes the evaluation results from test and score as input and generates data or selected data as outputs. Figure 1.1 describes the methodology. The steps involved in this work are as follows.
Steps for TEMVIs Classification
Step 1: Input several categories of TEMVIs such as EV, ENV, LV, SARS-CoV-2 and ZV.
Step 2: Perform image embedding mechanism by considering input TEMVIs.
Step 3: Perform test and score calculation on the image embedding data by applying LR, NN, kNN and NB techniques separately to compute CA values.
Step 4: Create a confusion matrix to represent the classification results of each technique.
Figure 1.1 Methodology.
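The test and score step can be sketched outside Orange as a k-fold cross-validation loop that pools the correct predictions into a CA value. The sketch below is only an illustration under stated assumptions: the embeddings are random toy vectors rather than SqueezeNet outputs, the learner is a simple nearest-neighbor classifier rather than any of the four configured techniques, the folds are not stratified, and all function names are hypothetical.

```python
import math
import random

def nearest_neighbor_learner(train_X, train_y):
    """Return a classifier that assigns the class of the closest training vector."""
    def classify(x):
        best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
        return train_y[best]
    return classify

def cross_validated_ca(X, y, learner, n_folds, seed=0):
    """Classification accuracy (CA) pooled over n_folds cross-validation folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Slicing with a stride gives folds whose sizes differ by at most one.
    folds = [indices[f::n_folds] for f in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        classify = learner([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(classify(X[i]) == y[i] for i in test_idx)
    return correct / len(X)

# Stand-in for Steps 1 and 2: 30 toy "embeddings", 6 per class (not real TEMVI data).
rng = random.Random(42)
classes = ["EV", "ENV", "LV", "SARS-CoV-2", "ZV"]
X, y = [], []
for c, name in enumerate(classes):
    for _ in range(6):
        X.append((10.0 * c + rng.uniform(-1, 1), rng.uniform(-1, 1)))
        y.append(name)

# Step 3: compute CA for each number of folds considered in this chapter.
for n_folds in (2, 3, 5, 10, 20):
    print(n_folds, cross_validated_ca(X, y, nearest_neighbor_learner, n_folds))
```

Because every image is held out exactly once, the pooled count of correct predictions divided by the number of images gives a single CA value per number of folds.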
1.4 Results and Discussion
This work uses Orange3-3.24.1 [56] for simulation. Several TEMVIs of different sizes are taken from the sources [59–88]. In this work, 30 TEMVIs, with 6 images from each of the categories EV, ENV, LV, SARS-CoV-2 and ZV, are taken for testing, as shown in Figures 1.2–1.6. The TEMVIs are processed using the ML-based classification techniques LR, NN, kNN and NB. The classification results of these techniques are shown in Figures 1.7–1.10, 1.11–1.14, 1.15–1.18, 1.19–1.22 and 1.23–1.26 for number of folds (NoF) values of 2, 3, 5, 10 and 20, respectively. These five NoF values define the five cases considered in this work, Cases I–V.
The test and score calculation is carried out using cross-validation sampling with the different NoF values. The confusion matrix generated for each classification technique shows the number of instances for each combination of actual and predicted class. Correct classifications are shown in light blue and misclassifications in light red. Table 1.1 lists the CA values obtained with the LR, NN, kNN and NB classification techniques.
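The relationship between a confusion matrix and CA can be made concrete with a short sketch. The labels and predictions below are invented for illustration and are not results from this work; the function names are likewise hypothetical.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes; cells count instances."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def classification_accuracy(matrix):
    """CA = correctly classified instances (the diagonal) / all instances."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

labels = ["EV", "ENV", "LV"]
# Hypothetical predictions, invented for illustration only.
actual = ["EV", "EV", "ENV", "ENV", "LV", "LV"]
predicted = ["EV", "ENV", "ENV", "ENV", "LV", "EV"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("CA =", classification_accuracy(matrix))
```

The diagonal cells correspond to the correct classifications and the off-diagonal cells to the misclassifications, so CA is the diagonal sum divided by the total number of instances.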