Recent Advances in Hybrid Metaheuristics for Data Clustering

Edited by

Sourav De
Cooch Behar Government Engineering College, West Bengal, India

Sandip Dey
Sukanta Mahavidyalaya, West Bengal, India

Siddhartha Bhattacharyya
CHRIST (Deemed to be University), Bangalore, India

This edition first published 2020
© 2020 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Sourav De, Sandip Dey, and Siddhartha Bhattacharyya to be identified as the authors of the editorial material in this work has been asserted in accordance with law.

Registered Offices

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office

The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions.

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: De, Sourav, 1979- editor. | Dey, Sandip, 1977- editor. | Bhattacharyya, Siddhartha, 1975- editor.

Title: Recent advances in hybrid metaheuristics for data clustering / edited by Dr. Sourav De, Dr. Sandip Dey, Dr. Siddhartha Bhattacharyya.

Description: First edition. | Hoboken, NJ : John Wiley & Sons, Inc., [2020] | Includes bibliographical references and index.

Identifiers: LCCN 2020010571 (print) | LCCN 2020010572 (ebook) | ISBN 9781119551591 (cloth) | ISBN 9781119551614 (adobe pdf) | ISBN 9781119551607 (epub)

Subjects: LCSH: Cluster analysis–Data processing. | Metaheuristics.

Classification: LCC QA278.55 .R43 2020 (print) | LCC QA278.55 (ebook) | DDC 519.5/3–dc23

LC record available at https://lccn.loc.gov/2020010571

LC ebook record available at https://lccn.loc.gov/2020010572

Cover Design: Wiley

Cover Image: © Nobi_Prizue/Getty Images

Set in 9.5/12.5pt STIXTwoText by SPi Global, Chennai, India

Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY
10 9 8 7 6 5 4 3 2 1

Dr. Sourav De dedicates this book to his respected parents, Satya Narayan De and Tapasi De; his loving wife, Debolina Ghosh; his beloved son, Aishik De; his sister, Soumi De, and his in-laws.

Dr. Sandip Dey dedicates this book to the loving memory of his father, the late Dhananjoy Dey; his beloved mother, Smt. Gita Dey; his wife, Swagata Dey Sarkar; his children, Sunishka and Shriaan; his siblings, Kakali, Tanusree, and Sanjoy; and his nephews, Shreyash and Adrishaan.

Dr. Siddhartha Bhattacharyya dedicates this book to his late father, Ajit Kumar Bhattacharyya; his late mother, Hashi Bhattacharyya; his beloved wife, Rashni, and his in-laws, Asis Mukherjee and Poly Mukherjee.

Contents

List of Contributors
Series Preface
Preface

1 Metaheuristic Algorithms in Fuzzy Clustering
Sourav De, Sandip Dey, and Siddhartha Bhattacharyya
1.1 Introduction
1.2 Fuzzy Clustering
1.2.1 Fuzzy c-means (FCM) clustering
1.3 Algorithm
1.3.1 Selection of Cluster Centers
1.4 Genetic Algorithm
1.5 Particle Swarm Optimization
1.6 Ant Colony Optimization
1.7 Artificial Bee Colony Algorithm
1.8 Local Search-Based Metaheuristic Clustering Algorithms
1.9 Population-Based Metaheuristic Clustering Algorithms
1.9.1 GA-Based Fuzzy Clustering
1.9.2 PSO-Based Fuzzy Clustering
1.9.3 Ant Colony Optimization–Based Fuzzy Clustering
1.9.4 Artificial Bee Colony Optimization–Based Fuzzy Clustering
1.9.5 Differential Evolution–Based Fuzzy Clustering
1.9.6 Firefly Algorithm–Based Fuzzy Clustering
1.10 Conclusion
References

2 Hybrid Harmony Search Algorithm to Solve the Feature Selection for Data Mining Applications
Laith Mohammad Abualigah, Mofleh Al-diabat, Mohammad Al Shinwan, Khaldoon Dhou, Bisan Alsalibi, Essam Said Hanandeh, and Mohammad Shehab
2.1 Introduction
2.2 Research Framework
2.3 Text Preprocessing
2.3.1 Tokenization
2.3.2 Stop Words Removal
2.3.3 Stemming
2.3.4 Text Document Representation
2.3.5 Term Weight (TF-IDF)
2.4 Text Feature Selection
2.4.1 Mathematical Model of the Feature Selection Problem
2.4.2 Solution Representation
2.4.3 Fitness Function
2.5 Harmony Search Algorithm
2.5.1 Parameters Initialization
2.5.2 Harmony Memory Initialization
2.5.3 Generating a New Solution
2.5.4 Update Harmony Memory
2.5.5 Check the Stopping Criterion
2.6 Text Clustering
2.6.1 Mathematical Model of the Text Clustering
2.6.2 Find Clusters Centroid
2.6.3 Similarity Measure
2.7 k-means text clustering algorithm
2.8 Experimental Results
2.8.1 Evaluation Measures
2.8.1.1 F-measure Based on Clustering Evaluation
2.8.1.2 Accuracy Based on Clustering Evaluation
2.8.2 Results and Discussions
2.9 Conclusion
References

3 Adaptive Position–Based Crossover in the Genetic Algorithm for Data Clustering
Arnab Gain and Prasenjit Dey
3.1 Introduction
3.2 Preliminaries
3.2.1 Clustering
3.2.1.1 k-means Clustering
3.2.2 Genetic Algorithm
3.3 Related Works
3.3.1 GA-Based Data Clustering by Binary Encoding
3.3.2 GA-Based Data Clustering by Real Encoding
3.3.3 GA-Based Data Clustering for Imbalanced Datasets
3.4 Proposed Model
3.5 Experimentation
3.5.1 Experimental Settings
3.5.2 DB Index
3.5.3 Experimental Results
3.6 Conclusion
References

4 Application of Machine Learning in the Social Network
Belfin R.V., E. Grace Mary Kanaga, and Suman Kundu
4.1 Introduction
4.1.1 Social Media
4.1.2 Big Data
4.1.3 Machine Learning
4.1.4 Natural Language Processing (NLP)
4.1.5 Social Network Analysis
4.2 Application of Classification Models in Social Networks
4.2.1 Spam Content Detection
4.2.2 Topic Modeling and Labeling
4.2.3 Human Behavior Analysis
4.2.4 Sentiment Analysis
4.3 Application of Clustering Models in Social Networks
4.3.1 Recommender Systems
4.3.2 Sentiment Analysis
4.3.3 Information Spreading or Promotion
4.3.4 Geolocation-Specific Applications
4.4 Application of Regression Models in Social Networks
4.4.1 Social Network and Human Behavior
4.4.2 Emotion Contagion through Social Networks
4.4.3 Recommender Systems in Social Networks
4.5 Application of Evolutionary Computing and Deep Learning in Social Networks
4.5.1 Evolutionary Computing and Social Network
4.5.2 Deep Learning and Social Networks
4.6 Summary
Acknowledgments
References

5 Predicting Students’ Grades Using CART, ID3, and Multiclass SVM Optimized by the Genetic Algorithm (GA): A Case Study
Debanjan Konar, Ruchita Pradhan, Tania Dey, Tejaswini Sapkota, and Prativa Rai
5.1 Introduction
5.2 Literature Review
5.3 Decision Tree Algorithms: ID3 and CART
5.4 Multiclass Support Vector Machines (SVMs) Optimized by the Genetic Algorithm (GA)
5.4.1 Genetic Algorithms for SVM Model Selection
5.5 Preparation of Datasets
5.6 Experimental Results and Discussions
5.7 Conclusion
References

6 Cluster Analysis of Health Care Data Using Hybrid Nature-Inspired Algorithms
Kauser Ahmed P, Rishabh Agrawal
6.1 Introduction
6.2 Related Work
6.2.1 Firefly Algorithm
6.2.2 k-means Algorithm
6.3 Proposed Methodology
6.4 Results and Discussion
6.5 Conclusion
References

7 Performance Analysis Through a Metaheuristic Knowledge Engine
Indu Chhabra and Gunmala Suri
7.1 Introduction
7.2 Data Mining and Metaheuristics
7.3 Problem Description
7.4 Association Rule Learning
7.4.1 Association Mining Issues
7.4.2 Research Initiatives and Projects
7.5 Literature Review
7.6 Methodology
7.6.1 Phase 1: Pattern Search
7.6.2 Phase 2: Rule Mining
7.6.3 Phase 3: Knowledge Derivation
7.7 Implementation
7.7.1 Test Issues
7.7.2 System Evaluation
7.7.2.1 Indicator Matrix Formulation
7.7.2.2 Phase 1: Frequent Pattern Derivation
7.7.2.3 Phase 2: Association Rule Framing
7.7.2.4 Phase 3: Knowledge Discovery Through Metaheuristic Implementation
7.8 Performance Analysis
7.9 Research Contributions and Future Work
7.10 Conclusion
References

8 Magnetic Resonance Image Segmentation Using a Quantum-Inspired Modified Genetic Algorithm (QIANA) Based on FRCM
Sunanda Das, Sourav De, Sandip Dey, and Siddhartha Bhattacharyya
8.1 Introduction
8.2 Literature Survey
8.3 Quantum Computing
8.3.1 Qubit-Quantum Bit
8.3.2 Entanglement
8.3.3 Measurement
8.3.4 Quantum Gate
8.4 Some Quality Evaluation Indices for Image Segmentation
8.4.1 F(I)
8.4.2 F’(I)
8.4.3 Q(I)
8.5 Quantum-Inspired Modified Genetic Algorithm (QIANA)–Based FRCM
8.5.1 Quantum-Inspired MEGA (QIANA)–Based FRCM
8.6 Experimental Results and Discussion
8.7 Conclusion
References

9 A Hybrid Approach Using the k-means and Genetic Algorithms for Image Color Quantization
Marcos Roberto e Souza, Anderson Carlos Sousa e Santos, and Helio Pedrini
9.1 Introduction
9.2 Background
9.3 Color Quantization Methodology
9.3.1 Crossover Operators
9.3.2 Mutation Operators
9.3.3 Fitness Function
9.4 Results and Discussions
9.5 Conclusions and Future Work
Acknowledgments
References

Index

List of Contributors

Laith Mohammad Abualigah
Amman Arab University
Jordan

Rishabh Agrawal
VIT
India

Kauser Ahmed
VIT
India

Mofleh Al-diabat
Al Albayt University
Jordan

Bisan Alsalibi
Universiti Sains Malaysia
Malaysia

Mohammad Al Shinwan
Amman Arab University
Jordan

Belfin R V
Karunya Institute of Technology and Sciences
India

Siddhartha Bhattacharyya
CHRIST (Deemed to be University)
India

Indu Chhabra
Panjab University
Chandigarh
India

Sunanda Das
National Institute of Technology
Durgapur
India

Sourav De
Cooch Behar Government Engineering College
India

Prasenjit Dey
Cooch Behar Government Engineering College
India

Sandip Dey
Sukanta Mahavidyalaya
India

Tania Dey
Sikkim Manipal Institute of Technology
India

Khaldoon Dhou
Drury University
USA

Arnab Gain
Cooch Behar Government Engineering College
India

Essam Hanandeh
Zarqa University
Jordan

Grace Mary Kanaga
Karunya Institute of Technology and Sciences
India

Ahamad Khader
Universiti Sains Malaysia
Malaysia

Debanjan Konar
Sikkim Manipal Institute of Technology
India

Suman Kundu
Wroclaw University of Science and Technology
India

Ruchita Pradhan
Sikkim Manipal Institute of Technology
India

Helio Pedrini
Institute of Computing
University of Campinas
Brazil

Prativa Rai
Sikkim Manipal Institute of Technology
India

Marcos Roberto e Souza
Institute of Computing
University of Campinas
Campinas
Brazil

Essam Said Hanandeh
Zarqa University
Jordan

Anderson Santos
Institute of Computing
University of Campinas
Brazil

Tejaswini Sapkota
Sikkim Manipal Institute of Technology
India

Mohammad Shehab
Aqaba University of Technology
Jordan

Gunmala Suri
University Business School
Panjab University
Chandigarh
India

Series Preface: Dr Siddhartha Bhattacharyya, CHRIST (Deemed to be University), Bangalore, India (Series Editor)

The Intelligent Signal and Data Processing (ISDP) book series focuses on the field of signal and data processing encompassing the theory and practice of algorithms and hardware that convert signals produced by artificial or natural means into a form useful for a specific purpose. The signals might be speech, audio, images, video, sensor data, telemetry, electrocardiograms, or seismic data, among others. The possible application areas include transmission, display, storage, interpretation, classification, segmentation, and diagnosis. The primary objective of the ISDP book series is to evolve future-generation, scalable, intelligent systems for faithful analysis of signals and data. The ISDP series is intended mainly to enrich the scholarly discourse on intelligent signal and image processing in different incarnations. The series will benefit a wide audience that includes students, researchers, and practitioners. The student community can use the books in the series as reference texts to advance their knowledge base. In addition, the constituent monographs will be handy to aspiring researchers due to recent and valuable contributions in this field. Moreover, faculty members and data practitioners are likely to gain relevant knowledge from the books in the series.

The series coverage will contain, but not be exclusive to, the following:

● Intelligent signal processing
a) Adaptive filtering
b) Learning algorithms for neural networks
c) Hybrid soft computing techniques
d) Spectrum estimation and modeling

● Image processing
a) Image thresholding
b) Image restoration
c) Image compression
d) Image segmentation
e) Image quality evaluation
f) Computer vision and medical imaging
g) Image mining
h) Pattern recognition
i) Remote sensing imagery
j) Underwater image analysis
k) Gesture analysis
l) Human mind analysis
m) Multidimensional image analysis

● Speech processing
a) Modeling
b) Compression
c) Speech recognition and analysis

● Video processing
a) Video compression
b) Analysis and processing
c) 3D video compression
d) Target tracking
e) Video surveillance
f) Automated and distributed crowd analytics
g) Stereo-to-autostereoscopic 3D video conversion
h) Virtual and augmented reality

● Data analysis
a) Intelligent data acquisition
b) Data mining
c) Exploratory data analysis
d) Modeling and algorithms
e) Big data analytics
f) Business intelligence
g) Smart cities and smart buildings
h) Multiway data analysis
i) Predictive analytics
j) Intelligent systems

Preface

Grouping or classifying real-life data into a set of clusters or categories for further processing and classification is known as clustering. The groups are organized on the basis of built-in properties or characteristics of the data in that dataset. The features of the groups are important to represent a new object or to understand a new phenomenon. Homogeneous data should be in the same cluster, whereas dissimilar or heterogeneous data is grouped into different clusters. The clustering of data can be applied in many fields, such as document retrieval, data mining, pattern classification, image segmentation, artificial intelligence, machine learning, biology, microbiology, etc.

Broadly, there are two types of data clustering algorithms: supervised and unsupervised. In supervised data clustering algorithms, the number of desired partitions and labeled datasets is supplied as the basic input at the beginning of the algorithm. Moreover, supervised clustering algorithms attempt to keep the number of segments small, and the data points are allotted to clusters using the idea of closeness by resorting to a given distance function. By contrast, unsupervised algorithms require no prior information about the labeled classes, the decision-making criterion for optimization, or the number of desired segments; they work from the raw data alone and group it on the basis of its content.

Metaheuristic algorithms have proved efficient in handling and solving different types of data clustering problems. Metaheuristics are designed to tackle complex clustering problems where classical clustering algorithms fail to be either effective or efficient. Basically, a metaheuristic is an iterative generation procedure that guides a subordinate heuristic. This is done by intelligently combining different concepts to explore and exploit the search space, so that near-optimal solutions are derived efficiently by learning strategies that are applied to the structural information of the problem. The main objective of a metaheuristic is to find good solutions by sampling a solution set that is too large to be enumerated completely. Different types of real-world problems can be handled by metaheuristic techniques because conventional algorithms can't manage many real-world problems, in spite of increasing computational power, simply due to unrealistically long running times. To solve optimization problems, these algorithms make few assumptions at the initial stages. It is not assured that metaheuristic algorithms will generate globally optimal solutions for all types of problems, since most implementations are some form of stochastic optimization and the resultant solutions may depend on the set of generated random variables. Compared with optimization algorithms, heuristics, or iterative methods, metaheuristic algorithms are the better option, as they often determine good solutions with less computational effort by exploring a large set of feasible solutions. Some well-known metaheuristic algorithms include the genetic algorithm (GA), simulated annealing (SA), tabu search (TS), and different types of swarm intelligence algorithms. Some recognized swarm intelligence algorithms are particle swarm optimization (PSO), ant colony optimization (ACO), artificial bee colony optimization (ABC), differential evolution (DE), cuckoo search algorithm, etc. In recent research, some modern swarm intelligence–based optimization algorithms such as the Egyptian vulture optimization algorithm, rats herd algorithm (RATHA), bat algorithm, crow search algorithm, glowworm swarm optimization (GSO), etc., have been found to perform well when solving some real-life problems. These algorithms also work efficiently to cluster different types of real-life datasets.

During the clustering of data, it has been observed that metaheuristic algorithms suffer from high time complexity even though they can afford optimum solutions. To get rid of these types of problems and to avoid depending on a particular type of metaheuristic algorithm to solve complex problems, researchers and scientists have not only blended different metaheuristic approaches but also hybridized different metaheuristic algorithms with other soft computing tools and techniques, such as neural networks, fuzzy sets, rough sets, etc. The hybrid metaheuristic algorithms, a combination of metaheuristic algorithms and other techniques, are more effective at handling real-life data clustering problems. Recently, quantum mechanical principles have also been applied to cut down on the time complexity of the metaheuristic approaches to a great extent.

The book will entice readers to design efficient metaheuristics for data clustering in different domains. The book will elaborate on the fundamentals of different metaheuristics and their application to data clustering. As a sequel to this, it will pave the way for designing and developing hybrid metaheuristics to be applied to data clustering. It is not easy to find books on hybrid metaheuristic algorithms that cover this topic.

The book contains nine chapters written by the leading practitioners in the field.

A brief overview of the advantages and limitations of the fuzzy clustering algorithm is presented in Chapter 1. The principle of operation and the structure of fuzzy algorithms are also elucidated with reference to the inherent limitations of cluster centroid selection. Several local-search-based and population-based metaheuristic algorithms are discussed with reference to their operating principles. Finally, different avenues for addressing the cluster centroid selection problem with recourse to the different metaheuristic algorithms are presented.

The increasing size of the data and text on electronic sites has necessitated the use of different clustering methods, including text clustering. Text clustering is a useful unsupervised analysis method for partitioning a very large collection of text documents into a set of groups. Feature selection is a well-known unsupervised method used to eliminate uninformative features in order to enhance the performance of the text clustering method. In Chapter 2, the authors propose an algorithm to solve the feature selection problem before applying the k-means text clustering technique, by improving the exploitation search ability of the basic harmony search algorithm; the resulting method is known as H-HSA. The proposed feature selection method is used in this chapter to improve the text clustering technique by providing a new subset of informative features.

In the advancement of data analytics, data clustering has become one of the most important areas in modern data science. Several works have proposed various algorithms to deal with data clustering. In Chapter 3, the objective is to improve data clustering by using metaheuristic-based algorithms. For this purpose, the authors have proposed a genetic algorithm–based data clustering approach. Here, a new adaptive position–based crossover technique has been proposed for the genetic algorithm, which introduces the new concept of a vital gene during crossover. The simulation results demonstrate that the proposed method performs better than two other genetic algorithm–based data clustering methods. Furthermore, it has also been observed that the proposed approach is time efficient compared to its counterparts.

A social network, used by the human population as a platform of interaction, generates a large volume of diverse data every day. These data and attributes of the interactions become more and more critical for researchers and businesses to identify societal and economic values. However, the generated data is vast, highly complex, and dynamic, which necessitates a real-time solution. Machine learning is a useful tool for summarizing the meaningful information from large, diverse datasets. Chapter 4 provides a survey of several applications of social network analysis where machine learning plays a critical role. These applications range from spam content detection to human behavior analysis, from topic modeling to recommender systems, and from sentiment analysis to emotion contagion in social networks.

Predicting students' performance at an earlier stage is important for improving their performance for higher education and placement opportunities. Early prediction of student grades allows an instructor to detect the students' poor performance in a course automatically and also provides enormous opportunities to the decision-makers to take remedial measures to help the students to succeed in future education. A model predicting students' grades using CART, ID3, and improved multiclass SVM optimized by the genetic algorithm (GA) is investigated in Chapter 5. The model follows a supervised learning classification by means of CART, ID3, and SVM optimized by the GA. In this study, the model is tested on a dataset that contains undergraduate student information, i.e., total marks obtained in the courses taken up in four years with the respective labeled subject name and a code at Sikkim Manipal Institute of Technology, Sikkim, India. A comparative analysis among CART, ID3, and multiclass SVM optimized by the GA indicates that the multiclass SVM optimized by the GA outperforms the ID3 and CART decision tree algorithms in the case of multiclass classification.

Significant advances in information technology result in the excessive growth of data in health care informatics. In today's world, technologies are also being developed to treat new types of diseases and illnesses, but no steps are being taken to stop the disease in its tracks in the early stages. The motivation of Chapter 6 is to help prepare people to diagnose the disease at early stages based on the symptoms of the disease. In this chapter, the authors have used various nature-inspired clustering algorithms in collaboration with k-means algorithms to actively cluster a person's health data with already available data and label it accordingly. Experimental results show that nature-inspired algorithms such as firefly combined with k-means give efficient results on the existing problems.

With the fast development of pattern discovery–oriented systems, data mining is rapidly intensifying in other disciplines of management, biomedical, and physical sciences to tackle the issues of data collection and data storage. With the advancement of data science, numerous knowledge-oriented paradigms are evaluated for automatic rule mining. Association rule mining is an active research area with numerous algorithms used for knowledge accumulation. Chapter 7 focuses on handling the various challenging issues of only demand-driven aggregation of information sources, mining and analyzing relevant patterns to preserve user concerns, and implementing the same association rule mining problem for multi-objective solutions rather than as a single-objective solution for post-purchase customer analysis.

The GA and fuzzy c-means (FRCM) algorithm are widely used in magnetic resonance image segmentation. In Chapter 8, a hybrid concept, quantum-inspired modified GA and FRCM, is used to segment MR images. The modified GA (MEGA) enhances the performance of the GA by modifying population initialization and crossover probability. To speed up this classical MEGA and also to derive more optimized class levels, some quantum computing characteristics like qubit, entanglement, orthogonality, rotational gate, etc., are incorporated into the classical MEGA. The class levels created by the quantum-inspired MEGA are supplied to the FRCM as initial input to overcome the convergence problem of the FRCM. A performance comparison using some standard evaluation metrics is delineated between quantum-inspired MEGA-based FRCM, classical MEGA-based FRCM, and conventional FRCM with the help of two grayscale MR images, which shows the excellence of the proposed quantum-inspired MEGA-based FRCM over both the classical MEGA-based FRCM and the conventional FRCM methods.

Large volumes of data have been rapidly collected due to the increasing advances in equipment and techniques for content acquisition. However, the efficient storage, indexing, retrieval, representation, and recognition of multimedia data, such as text, audio, images, and videos, are challenging tasks. To summarize the main characteristics of datasets and simplify their interpretation, exploratory data analysis is often applied to numerous problems in several fields, such as pattern recognition, computer vision, machine learning, and data mining. A common data analysis technique, usually associated with descriptive statistics and visual methods, is cluster analysis or clustering. In Chapter 9, the authors propose a hybrid method based on k-means and the genetic algorithm guided by a qualitative objective function. Experiments demonstrate good results of the proposed method.

The editors hope that this book will be helpful for students and researchers who are interested in this area. It can also prove to be a novel initiative for undergraduate students of computer science, information science, and electronics engineering as part of their curriculum.

October, 2019

Sourav De
Cooch Behar, India

Sandip Dey
Jalpaiguri, India

Siddhartha Bhattacharyya
Bangalore, India

Metaheuristic Algorithms in Fuzzy Clustering

Sourav De 1, Sandip Dey 2, and Siddhartha Bhattacharyya 3

1 Department of Computer Science and Engineering, Cooch Behar Government Engineering College, India

2 Department of Computer Science, Sukanta Mahavidyalaya, Jalpaiguri, India

3 Department of Computer Science and Engineering, CHRIST (Deemed to be University), Bangalore, India

1.1 Introduction

Fuzzy clustering refers to the process of assigning data points to different clusters based on the similarity/dissimilarity of features. This process ensures that items in the same cluster are as similar as possible, while dissimilar items belong to different clusters. The identification of the clusters and the assignment of items to clusters are decided with the help of several similarity measures, which include measures of distance, connectivity, and intensity. The choice of the similarity measures depends on the type of data or the application [1].

Both classical and new algorithms have evolved over the years to address the clustering problem. Notable among them are k-means [2] and fuzzy clustering [3, 4]. The classical algorithms primarily segregate the data points into completely different clusters while ensuring that the dissimilarity between the different clusters and the similarity of the constituent data points within any cluster are maximized in the process. Thus, these algorithms ensure that there is no overlap between the clusters. Fuzzy clustering, however, relies on a soft notion of membership, thereby enabling overlapping between clusters, with the constituent data points belonging to more than one cluster depending on a degree of belongingness.

The main limitation in any clustering algorithm lies in the initialization process, which entails an initial selection of cluster center points that are chosen randomly in most cases. Hence, an improper initialization of the cluster centers may lead to an unacceptable result, since the positions of the cluster centers with respect to the constituent data points are major concerns in the assignment of the data points to the cluster centers.

1.2 Fuzzy Clustering

Fuzzy clustering, often referred to as soft clustering or soft k-means, is a method that entails a soft distinction of the constituent data points. By contrast, in any non-fuzzy/crisp clustering, each data point is designated to belong to exactly one and only one cluster with no overlapping of clusters. In fuzzy clustering, however, the data points can belong to more than one cluster, implying that certain overlaps exist between the resultant clusters. The underlying principle behind this partitional clustering technique is the concept of fuzzy soft set theory, which holds that for a given universe of discourse, every constituent element belongs to all the sets defined in the universe with a certain degree of belongingness (also referred to as membership) [3–5]. Fuzzy clustering is often treated as preferable due to its inherent advantages: a natural affinity for incorporating larger datasets, a simple and straightforward implementation, the ability to handle large datasets as the time complexity is O(n), the ability to produce very good results for hyperspherically shaped, well-separated clusters, a robust design, and the ability to converge to a local optimal solution [1].

1.2.1Fuzzy c -means(FCM)clustering

FCMclusteringisoneofthemostwidelyused.ItwasdevelopedbyJ.C.Dunnin1973[6] andimprovedbyJ.C.Bezdekin1981[1].Theoperationofthealgorithmisquitesimilarto thewidelyknown k-meansalgorithm.Thebasicstepsareasfollows:

1) Select a number of clusters.

2) Randomly assign coefficients to each data point to label them to the clusters.

3) Repeat steps 4 and 5 until the algorithm converges, i.e., until the change in the coefficients between two consecutive iterations is no more than a predefined threshold $\varepsilon$.

4) Compute the cluster centroids for each cluster. Every data point $x$ is identified by a set of coefficients $w_k(x)$ indicating the degree of belongingness to the $k$th cluster. In FCM, the mean of all the participating points weighted by their degree of belongingness to the cluster represents the cluster centroid. It is mathematically given as

$$c_k = \frac{\sum_{x} w_k(x)^{m}\, x}{\sum_{x} w_k(x)^{m}},$$

where $m$ is a hyper-parameter that controls the fuzziness of the clusters. The higher $m$ is, the fuzzier the cluster will be in the end.

5) For each data point, compute its coefficients of belongingness in the clusters.
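The five steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `fcm`, its parameters, the use of Euclidean distance, and the standard FCM membership update used for step 5 are assumptions made for the example, and the fuzzifier is assumed to satisfy m > 1.

```python
import random


def fcm(points, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy c-means on a list of equal-length tuples (assumes m > 1).

    Returns (centroids, w), where w[i][k] is the membership of point i in cluster k.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Step 2: random membership coefficients, normalised so each row sums to 1.
    w = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        w.append([v / total for v in row])

    centroids = []
    for _ in range(max_iter):
        # Step 4: centroid k is the mean of all points weighted by w[i][k] ** m.
        centroids = []
        for k in range(c):
            wm = [w[i][k] ** m for i in range(n)]
            denom = sum(wm)
            centroids.append(tuple(
                sum(wm[i] * points[i][d] for i in range(n)) / denom
                for d in range(dim)))

        # Step 5: recompute memberships from the distances to the new centroids.
        new_w = []
        for x in points:
            dist = [max(sum((a - b) ** 2 for a, b in zip(x, ck)) ** 0.5, 1e-12)
                    for ck in centroids]
            new_w.append([
                1.0 / sum((dist[j] / dk) ** (2.0 / (m - 1.0)) for dk in dist)
                for j in range(c)])

        # Step 3: stop once no coefficient changed by more than eps.
        change = max(abs(new_w[i][j] - w[i][j])
                     for i in range(n) for j in range(c))
        w = new_w
        if change <= eps:
            break
    return centroids, w
```

Calling `fcm(points, c=2)` on a list of 2-D tuples returns the two centroids and, for every point, a pair of memberships that sum to 1; a crisp labelling can be read off by taking the larger membership in each row.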

1.3 Algorithm

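The FCM algorithm attempts to partition a finite collection of $n$ elements $X = \{x_1, \ldots, x_n\}$ into a collection of $c$ fuzzy clusters with respect to some given criterion. Given a finite set of data, the algorithm returns a list of $c$ cluster centers $C = \{c_1, \ldots, c_c\}$ and a partition matrix $W = (w_{ij})$, $w_{ij} \in [0, 1]$, $i = 1, \ldots, n$, $j = 1, \ldots, c$, where each element $w_{ij}$ tells the degree to which element $x_i$ belongs to cluster $c_j$. The aim is to minimize an objective function of the following form:

$$\underset{C}{\arg\min} \sum_{i=1}^{n} \sum_{j=1}^{c} w_{ij}^{m}\, \lVert x_i - c_j \rVert^{2}, \qquad w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{2/(m-1)}}.$$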

K-means clustering works along similar lines. However, the FCM objective differs from the k-means objective function by the presence of the membership values $w_{ij}$ and of the cluster fuzziness, which is determined by the fuzzifier $m \in \mathbb{R}$, with $m \geq 1$. A large $m$ results in smaller membership values $w_{ij}$ and, hence, fuzzier clusters. In the limit $m = 1$, the memberships $w_{ij}$ converge to 0 or 1, which implies a crisp partitioning. $m$ is commonly set to 2. Like k-means, the algorithm minimizes the intracluster variance, and it often converges to a local minimum. Moreover, the clustering results depend on the initial choice of weights. Fuzzy clustering also suffers from the fact that the number of clusters in the given dataset must be known beforehand, and it is sensitive to noise and outliers.
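The effect of the fuzzifier can be checked numerically for a single data point. The sketch below assumes the standard FCM membership formula and a hypothetical helper named `memberships`; the distances 1 and 3 are made-up values chosen only for illustration.

```python
def memberships(distances, m):
    """Memberships of one point, given its distances to every cluster center (m > 1)."""
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in distances)
            for dj in distances]


# A point at distance 1 from the first center and 3 from the second one.
for m in (1.1, 2.0, 5.0):
    print(m, [round(w, 3) for w in memberships([1.0, 3.0], m)])
```

With m close to 1 the nearer center takes almost the whole membership (a nearly crisp assignment), with m = 2 the split is 0.9 to 0.1, and with larger m the two memberships move toward each other, which is the fuzzier behavior described above.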

1.3.1 Selection of Cluster Centers

Most clustering algorithms require an initial selection of cluster centroids, which is often made in a random fashion. In fact, the selection of the initial cluster center values is considered one of the most challenging tasks in partitional clustering algorithms. An incorrect selection of initial cluster center values often leaves the search process stuck in a local optimum, yielding undesirable clustering results [7, 8]. The primary cause of this problem lies in the fact that the clustering algorithms run in a manner similar to the hill climbing algorithm [9], which, being a local search-based algorithm, moves in one direction without performing a wider scan of the search space to minimize (or maximize) the objective function. This behavior prevents the algorithm from exploring other regions of the search space that might hold a better, or even the desired, solution. Thus, a proper balance of exploitation and exploration of the search space is not achieved when these algorithms are run.

The general approach to alleviating this problem is to rerun the algorithm several times with different cluster initializations. However, this approach is not always feasible, especially when it comes to clustering a large dataset or a complex dataset (i.e., a dataset with multiple optima) [10]. Thus, this selection mechanism may be cast as a global optimization problem that calls for the help of optimization algorithms.

Several global search algorithms have been proposed to solve this local-search problem [11]. These algorithms include both local search-based metaheuristic algorithms, such as SA and TS, and population-based metaheuristic algorithms, such as EAs (including EP, ES, GAs, and DE), HS, PSO, ABC, and ACO. The following sections provide an overview of the algorithms proposed to solve the clustering problem where the number of clusters is known or set up a priori.

1.4 Genetic Algorithm

Genetic algorithms (GAs) are a popular class of optimization algorithms, generally used to search for optimal solution(s) to a particular computational problem. This is done by maximizing or minimizing a particular function, called an objective/fitness function. GAs belong to evolutionary computation [12], a field of study in which biological processes such as reproduction and natural selection are emulated to find the "fittest" solutions [13]. The various GA processes in the literature are random in nature, and the technique allows the level of randomization and control to be set for its operation [13]. GAs have been shown to be a powerful and well-regulated optimization technique in comparison with other random search algorithms and exhaustive search algorithms [12].

GAs are designed to imitate biological processes, and much of the pertinent terminology is taken from biological science. The fundamental components that are common among all GAs are

● A fitness (objective) function

● A population of a number of chromosomes

● A selection operation to produce a pool of chromosomes in the population

● A crossover operation to produce population diversity in the subsequent generations

● A mutation operation to change the chromosome's properties in the new generation

A GA optimizes with reference to an objective function [14]. The term "fitness" originated from evolutionary theory. The fitness function is used to test and quantify each individual potential solution. The chromosomes in a population are numerical values that represent candidate solutions for a given problem, which is solved using a GA [14]. Each of the candidate solutions is passed through an encoding process, which basically yields a stream of parameter values [15]. Theoretically, the encoding of each chromosome for a problem having N dimensions is accomplished as an array of N elements, given by [q1, q2, …, qN]. Here, each qk represents a specific value of the kth parameter [15]. In general, chromosomes are encoded using a bit string, i.e., a sequence of 0s and 1s. In modern-day computer systems, chromosomes can also be built from real numbers, permutations, and other objects.

A GA starts with a number of randomly chosen chromosomes, which form the initial population. Thereafter, a fitness function is introduced to evaluate each member of the population. This evaluation measures how well each member can potentially solve the problem at hand. Afterward, a selection operator is introduced to choose a number of potential chromosomes for reproduction on the basis of a user-defined probability distribution. The selection is done in accordance with the fitness of the chromosomes in the population: the probability of selecting a particular chromosome for the subsequent generations increases with its fitness. For example, let $f_n$ be the fitness function introduced for solving a particular problem. The probability of selecting $C_n$ is given by

$$P(C_n) = \frac{f_n(C_n)}{\sum_{k} f_n(C_k)},$$

where the sum runs over all chromosomes $C_k$ in the population.

It can be noted that the selection operator is used to choose chromosomes with replacement. This approach ensures that the same chromosome can be selected a number of times. The crossover operation is somewhat analogous to biological crossover and recombination. Two offspring having different features are created by swapping segments of two chosen chromosomes at a single point or at multiple points. Suppose the parent chromosomes [10010010010011] and [11010011001011] are crossed over at the fifth position. This creates two new offspring, given by [10010011001011] and [11010010010011].
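Fitness-proportional selection with replacement and single-point crossover can be sketched as follows. The function names are hypothetical, chromosomes are represented as bit strings for readability, and the fitness values are assumed to be non-negative so that they can serve directly as sampling weights.

```python
import random


def selection_probabilities(fitnesses):
    """Probability of picking each chromosome, proportional to its fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def roulette_select(population, fitnesses, k, rng=random):
    """Draw k parents with replacement, so a fit chromosome may be picked repeatedly."""
    return rng.choices(population, weights=fitnesses, k=k)


def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length chromosomes after the first `point` genes."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


# The example from the text: crossing over at the fifth position.
print(single_point_crossover("10010010010011", "11010011001011", 5))
# prints ('10010011001011', '11010010010011')
```

Multi-point crossover follows the same pattern by alternating the source parent at each cut point.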

The mutation operation flips individual bits to obtain a new chromosome: bit 0 is turned into bit 1 and vice versa wherever this operator is applied. Generally, mutation occurs with an exceedingly low probability (such as 0.001). In some implementations, the mutation operator is applied before the other two operators; the order is a matter of preference for the designer. Selection and crossover operators generally preserve the genetic information of the better (fitter) chromosomes, which can result in the quick convergence of a given algorithm. This may cause the algorithm to get stuck at a local optimum well before attaining the global optimum [16]. Mutation maintains population diversity and thereby helps protect the algorithm against this problem. It can, however, also cause the algorithm to suffer from slow convergence.
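Bit-flip mutation is short enough to show in full. In this sketch the function name and the `rate` parameter are assumptions for illustration; the default of 0.001 is the example probability quoted above.

```python
import random


def bit_flip_mutation(chromosome, rate=0.001, rng=random):
    """Flip each bit of a bit-string chromosome independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome)
```

A higher rate injects more diversity but, as noted above, can slow convergence, so the value is usually kept small.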

1.5 Particle Swarm Optimization

The concept of particle swarm optimization (PSO) was first developed by James Kennedy and Russell Eberhart in 1995 [17]. The inspiration behind its development was taken from the following concepts:

● The swarming ability of animals such as fish or birds

● The theory of evolutionary computation

The concept of PSO is described using the following points:

● The technique is capable of handling and preserving a number of potential solutions at any one time.

● Each solution is assessed with reference to a function, called an objective function, to compute its fitness during each iteration.

● Each of the potential solutions in the population is regarded as a particle in the search space (fitness landscape).

● Each particle "swarms" or "flies" through the fitness landscape to find the minimum/maximum value computed by the fitness function.

During each iteration, the particles in the population (swarm) maintain the following information:

● The position of each particle in its search space, which includes its solution and fitness

● The velocity of each particle

● The best position of each individual

● The global best position in the swarm

The technique generally comprises the following steps:

1) Evaluate the fitness value of each particle in the population.

2) Update the individual best of each particle and the global best of the swarm.

3) Update the velocity and the corresponding position of each particle.

These steps are repeated for a predefined number of generations or until a stopping criterion is met.

The velocity of each particle is updated using the following formula:

$$v_k(t+1) = \omega\, v_k(t) + c_1 r_1 \left( \hat{y}_k(t) - x_k(t) \right) + c_2 r_2 \left( g(t) - x_k(t) \right),$$

where $k$ represents the particle's index, $\omega$ is called the inertial coefficient, $c_1$ and $c_2$ are the acceleration coefficients with $0 \leq c_1, c_2 \leq 2$, and $r_1, r_2$ are two random values with $0 \leq r_1, r_2 \leq 1$. $v_k(t)$ is the particle's velocity at time $t$, and $x_k(t)$ is the position of the particle at time $t$. As of time $t$, $\hat{y}_k(t)$ and $g(t)$ are the particle's individual best and the swarm's best solution, respectively.

The position of each particle is updated by using the following equation:

$$x_k(t+1) = x_k(t) + v_k(t+1),$$

where the location of the $k$th particle, $x_k(t)$, at the $t$th generation is changed to another location, $x_k(t+1)$, at the $(t+1)$th generation using the velocity $v_k(t+1)$.
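The velocity and position updates can be combined into a small minimization loop. The sketch below is only an illustration: the function name `pso`, the default coefficient values (ω = 0.7, c1 = c2 = 1.5), the search bounds, and the zero initial velocities are assumptions for the example and are not taken from the text.

```python
import random


def pso(objective, dim, n_particles=20, iters=100, omega=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective` over `dim` dimensions with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # individual best positions
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm's best so far

    for _ in range(iters):
        for k in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            for d in range(dim):
                # Velocity update: inertia + pull to own best + pull to swarm best.
                vel[k][d] = (omega * vel[k][d]
                             + c1 * r1 * (best_pos[k][d] - pos[k][d])
                             + c2 * r2 * (g_pos[d] - pos[k][d]))
                # Position update: move by the new velocity.
                pos[k][d] += vel[k][d]
            val = objective(pos[k])
            if val < best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val < g_val:
                    g_pos, g_val = pos[k][:], val
    return g_pos, g_val
```

For example, `pso(lambda p: sum(x * x for x in p), dim=2)` searches for the minimum of the sphere function, whose optimum lies at the origin.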

1.6 Ant Colony Optimization

The Ant System was the first member of a class of algorithms called ant colony optimization (ACO) [18]. This is a recent, popular metaheuristic algorithm. It was initially introduced by Colorni, Dorigo, and Maniezzo. The inspiration behind its development was the foraging behavior of real ants. This foraging behavior was explored and exploited in artificial ant colonies to find approximate solutions to several discrete/continuous optimization problems. The technique is also applicable to various problems in telecommunications, such as load balancing and routing. Each ant in the colony initially wanders at random. Indirect communication between real ants happens with the help of chemical pheromone trails. The ants deposit this chemical on their paths from the nest to the food source. This chemical enables them to search for the shortest paths to different food sources. The probability of visiting a particular path increases with the amount of pheromone deposited on that path.

The algorithm comprises the following steps:

1) A virtual trail is accumulated on various path segments.

2) A path is randomly selected on the basis of the amount of "trail" available on the possible paths from the initial node.

3) The ant traverses to the next available node to select the next path.

4) This process continues until the ant reaches the starting node.

5) The finished tour is recognized as a solution.

6) The whole tour is analyzed to find the optimal path.

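Suppose an ant traverses from node $j$ to node $k$ in a graph $(G, E)$ with a probability of $p_{jk}$. Then the value of $p_{jk}$ is determined by

$$p_{jk} = \frac{\tau_{jk}^{\alpha}\, \eta_{jk}^{\beta}}{\sum_{l} \tau_{jl}^{\alpha}\, \eta_{jl}^{\beta}},$$

where $\tau_{jk}$ represents the amount of pheromone deposited on the edge $(j, k)$, $\alpha$ is a parameter that is used to control the influence of $\tau_{jk}$, $\eta_{jk}$ is defined as the desirability of edge $(j, k)$, and, like $\alpha$, $\beta$ is another parameter that is used to control the influence of $\eta_{jk}$. The sum in the denominator runs over the nodes $l$ that the ant may visit next from node $j$.

The amount of deposited pheromone is updated with reference to the following equation:

$$\tau_{jk} \leftarrow (1 - \rho)\, \tau_{jk} + \Delta\tau_{jk},$$

where $\tau_{jk}$ is the amount of pheromone deposited on any given edge $(j, k)$, $\rho$ is called the evaporation rate of pheromone, and $\Delta\tau_{jk}$ is the deposited amount of pheromone. The value of $\Delta\tau_{jk}$ is computed as $1/C_k$ if ant $k$ traverses the edge $(j, k)$, where $C_k$ is the cost of that travel by that particular ant. In all other cases, the value of $\Delta\tau_{jk}$ is assumed to be zero.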
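The two ACO rules, the transition probability and the pheromone update, can be sketched as below. The function names, the nested-list representation of the pheromone and desirability matrices, and the default values of alpha, beta, and rho are assumptions for illustration. The deposits of several ants are simply summed, and edges are treated as directed.

```python
def transition_probabilities(tau, eta, j, candidates, alpha=1.0, beta=2.0):
    """Probability of moving from node j to each candidate node k."""
    weights = {k: (tau[j][k] ** alpha) * (eta[j][k] ** beta) for k in candidates}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def update_pheromone(tau, tours, costs, rho=0.5):
    """Evaporate all trails, then let every ant deposit 1 / cost on the edges it used."""
    n = len(tau)
    for j in range(n):
        for k in range(n):
            tau[j][k] *= (1.0 - rho)
    for tour, cost in zip(tours, costs):
        for j, k in zip(tour, tour[1:]):
            tau[j][k] += 1.0 / cost
    return tau
```

An ant standing at node `j` would draw its next node from the returned distribution, for instance with `random.choices`, and `update_pheromone` would be called once all ants have completed their tours.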

1.7 Artificial Bee Colony Algorithm

Karaboga [19] introduced the artificial bee colony (ABC) algorithm, a recent member of the class of swarm intelligence algorithms. The inspiration behind the development of the ABC algorithm is the foraging behavior of real bee colonies. This algorithm has been applied to solve continuous optimization problems.

The ABC algorithm has three kinds of (artificial) bees, as follows:

1) Employed bees: Each employed bee is associated with a distinct solution of the optimization problem to be solved. At each iteration, this class of bee explores the neighborhood of the solution with which it is associated.

2) Onlooker bees: This class of bees also explores the neighborhood of solutions, but in a different manner. At each iteration, the onlooker bees probabilistically select the solution they explore depending on the quality of the solution. Hence, the solutions they explore vary from iteration to iteration.

3) Scout bees: For this class of bees, the neighborhood of a solution is explored a predefined number of times. If no improvement is found, the scout bee uniformly selects a new random solution in the search space, which adds an exploration property to the algorithm.

ABC is a population-based, efficient, local search algorithm that explores the neighborhood of each solution at each iteration. The original algorithm was run on several standard benchmark problems and gave encouraging results, although its results were less encouraging when compared with some state-of-the-art algorithms. In particular, on composite and nonseparable functions the ABC algorithm gives comparatively poor performance, and it also converges slowly to high-quality solutions [20].
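The roles of the three bee types can be seen in a compact sketch for continuous minimization. Everything specific here is an assumption made for illustration: the function name `abc_minimize`, the one-coordinate neighborhood move, the quality measure 1 / (1 + cost), which presumes non-negative costs, and the abandonment threshold `limit`.

```python
import random


def abc_minimize(objective, dim, n_sources=10, limit=20, iters=200,
                 bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective` with a basic artificial bee colony loop."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_source():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources                     # failed improvements per source
    b = min(range(n_sources), key=costs.__getitem__)
    best_pos, best_cost = sources[b][:], costs[b]

    def explore(i):
        """Try a neighbour of source i; keep it only if it is better (greedy)."""
        nonlocal best_pos, best_cost
        partner = rng.choice([s for s in range(n_sources) if s != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[partner][d])
        cand[d] = min(max(cand[d], lo), hi)
        cost = objective(cand)
        if cost < costs[i]:
            sources[i], costs[i], trials[i] = cand, cost, 0
            if cost < best_cost:
                best_pos, best_cost = cand[:], cost
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(n_sources):               # employed bees: one per source
            explore(i)
        quality = [1.0 / (1.0 + c) for c in costs]
        for i in rng.choices(range(n_sources), weights=quality, k=n_sources):
            explore(i)                           # onlookers favour good sources
        for i in range(n_sources):               # scouts replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
                if costs[i] < best_cost:
                    best_pos, best_cost = sources[i][:], costs[i]
    return best_pos, best_cost
```

The best solution is tracked separately from the sources, so a scout that abandons a source never discards the best position found so far.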

1.8 Local Search-Based Metaheuristic Clustering Algorithms

Local search-based algorithms have been widely used to solve several clustering problems in the literature. Some of them are presented in this section. In [21], the authors presented a simulated annealing-based approach to the clustering problem, which can be efficiently handled and solved using metaheuristic clustering algorithms. One weakness that is common in clustering techniques is that they may settle on local minimum solutions. The proposed simulated annealing-based method solves the optimization problem while taking care of this clustering weakness. The required factors are addressed in detail in the proposed technique, and it has been shown that the technique converges to optimal solutions despite this weakness. Al-Sultan [22] later presented a tabu search-based clustering technique and showed that it outperforms both the k-means technique and the simulated annealing–based clustering technique. About two years later, Al-Sultan and Fedjki [23] proposed another kind of algorithm for handling the fuzzy clustering problem. In 2000, an improved version of the tabu search-based clustering algorithm was presented by Sung and Jin Sung [24]. In their proposed work, they combined a tabu search heuristic approach with two other compatible functional approaches, known as packing and releasing. The effectiveness of their proposed technique has been numerically tested and demonstrated by comparison with other works, such as the tabu search algorithm and the simulated annealing technique, among others. The aforementioned local search metaheuristics, tabu search and simulated annealing, improve one candidate solution at a time, and the problem of getting stuck at local minimum solutions has been rectified. Two more important points regarding the efficacy of these techniques are that they are sensitive to their parameters and that their tuning is exceedingly problem-dependent [25].

1.9 Population-Based Metaheuristic Clustering Algorithms

In the literature, population-based metaheuristic clustering algorithms, designed by several authors, have been comprehensively applied in fuzzy clustering. A few efficient algorithms of this kind are presented in this section. Evolutionary algorithms (EAs) can adapt the current techniques for fuzzy clustering. The approaches taken by EAs can be principally grouped into two categories [10]. The first one is the more common and can be further divided into the following two steps.

1) Search for apposite cluster centers via an evolutionary algorithm.

2) Use the cluster centers attained as the outcome of the former step as the initial cluster centers to which the FCM algorithm is applied.

The second one is another popular approach that uses the evolutionary algorithm as a clustering algorithm by itself. Some algorithms of this kind, or their variations, also use a local search engine to support their performance and speed up their convergence. This approach is also appropriate for metaheuristic-based hard clustering techniques. A few algorithms of this kind are described next.
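In both categories, a population-based method needs a way to score a candidate set of cluster centers. One possible sketch, assuming a real-coded chromosome that stores the c centers as one flat list and using the FCM objective as the quantity to be minimized, is shown below. The function name `fcm_objective` and the flat encoding are assumptions for illustration, not the encoding of any particular cited work.

```python
def fcm_objective(chromosome, points, c, m=2.0):
    """FCM objective for centers stored as one flat list of c * dim real values (m > 1)."""
    dim = len(points[0])
    centers = [chromosome[k * dim:(k + 1) * dim] for k in range(c)]
    total = 0.0
    for x in points:
        # Distances from x to every center (clamped to avoid division by zero).
        dist = [max(sum((a - b) ** 2 for a, b in zip(x, ck)) ** 0.5, 1e-12)
                for ck in centers]
        for j in range(c):
            # Membership of x in cluster j implied by the current centers.
            w = 1.0 / sum((dist[j] / dk) ** (2.0 / (m - 1.0)) for dk in dist)
            total += (w ** m) * dist[j] ** 2
    return total
```

A GA, PSO, ABC, or DE loop can then treat `fcm_objective` as its cost function, and the best chromosome it finds can be decoded into initial centers for FCM, as in the two-step approach above.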

1.9.1 GA-Based Fuzzy Clustering

Hall et al. [26], Hall and Ozyurt [27], and Hall et al. [28] presented different works in which the authors claimed that their proposed algorithms can be used as efficient clustering tools. In their proposed algorithms, the genetic algorithm was applied both to search for near-optimal cluster centers and to accomplish the clustering. For encoding purposes, a real encoding scheme was introduced in the population of the genetic algorithm. The authors showed that the proposed algorithms gave encouraging results in comparison with random initialization. The limitation of these works is that their GA method is incapable of focusing on small improvements to the cluster centers to arrive at a final clustering. In comparison with the previous method, the GA-guided clustering algorithm has a higher sensitivity to the random solutions generated initially, which are used to create an initial population. Moreover, more experiments were needed to demonstrate the capability of the proposed algorithm to avoid premature convergence. Liu and Xie [29] presented a paper in which they showed that their proposed approach has a much higher probability of reaching the global optimal solutions than the traditional techniques. In this algorithm, a binary encoding scheme and the standard operators of genetic algorithms were used to represent cluster centers in each chromosome. Experimentally, the authors showed that their proposed algorithm outperforms others. The shortcoming of this approach is that if the population is small, it may get stuck in a local optimum. Van Le Van [30] presented two separate approaches, based on a genetic algorithm and on evolutionary programming, to deal with fuzzy clustering problems. After conducting several experiments, the author concluded that the success rate of the proposed algorithms is better than that of the FCM algorithm and that the evolutionary programming-based method produces the best results. Klawonn and Keller [31] designed an evolutionary programming model for clustering various types of cluster shapes, such as solid and shell clusters. For shell shapes, the proposed algorithm does not produce encouraging results, but for solid shapes, the results seem to be good.

Egan et al. [32] introduced a genetic algorithm–based fuzzy clustering technique for noisy data. In this technique, an additional cluster representing noise data, known as a noise cluster, is added to each chromosome. The experimental results showed that the binary representation gives better results than a real-valued representation in this particular domain. Hall et al. [28] presented another algorithm, which showed promising results for less noisy data [33] but does not perform satisfactorily for noisy data [28]. Maulik and Saha [34] developed a modified differential evolution-based fuzzy clustering algorithm. In this algorithm, a modified mutation process was introduced using the ideas of local and global best vectors, as in the PSO algorithm. These ideas were used during the modified mutation process to push the trial vector quickly toward a global optimum. The authors successfully used their proposed algorithm for image segmentation. They also applied it to a number of synthetic and real-life datasets and standard benchmark functions to demonstrate its applicability.

1.9.2 PSO-Based Fuzzy Clustering

Xiao et al. [35] used a novel self-organizing map (SOM)-based method for clustering. The authors used gene expression data for clustering purposes. In this proposed method, a conscience factor was added to increase the rate of convergence. In this approach, the concepts of PSO are utilized to evolve the weights: the weights are trained in the first phase and, in the next phase, PSO is applied to improve them. This hybrid SOM-PSO approach gives encouraging outcomes when applied to the gene expression data of Rat Hepatocytes and Yeast. Cui et al. [36] and Cui and Potok [37] introduced PSO-based hybrid methods to classify text documents. For hybridization, two-step clustering approaches are used. First, PSO-based fuzzy clustering is run for a predefined maximum number of iterations. Thereafter, the k-means algorithm is initialized with the cluster centers obtained in the previous step and carries out the final clustering process. These two steps are used together to improve the performance and to speed up convergence, mainly for large datasets. The authors examined the PSO as well as a hybrid PSO clustering algorithm on four different text document datasets. According to their observations, the hybrid PSO clustering algorithm generally generates highly compact clusterings in a shorter period of time than the k-means algorithm.

1.9.3 Ant Colony Optimization–Based Fuzzy Clustering

The ant colony optimization (ACO) [18] algorithm is also applied to overcome the shortcomings of fuzzy clustering methods. Yu et al. [38] proposed an ACO-based hybrid to segment noisy images, together with a possibilistic c-means (PCM) algorithm. In this approach, the clustering problem is solved by using pre-classified pixel information, which furnishes a near-optimal initialization of the number of clusters and their centroids [38]. Image segmentation is done using ACO-based fuzzy clustering in [39]. In this approach, each ant is considered an individual pixel in the image, and the membership function is calculated on the basis of heuristic and pheromone information on each cluster center. The performance of the image segmentation algorithm is improved by including spatial information in the membership [39]. A hybrid clustering algorithm, i.e., one that is hybridized with the PCM algorithm, is presented to segment medical images [40]. This hybridized algorithm overcomes the drawback of the image segmentation. Niknam and Amiri [41] proposed a hybrid approach based on PSO, ACO, and k-means for cluster analysis. This approach is applied to find a better cluster partition. In [42], the performance of FCM is improved with the ant colony optimization algorithm, and the min-max ant system is incorporated into the ACO algorithm. Gajjar et al. [43] presented FAMACROW, a fuzzy and ant colony optimization-based combined MAC, routing, and unequal clustering cross-layer protocol for wireless sensor networks. The network consists of several nodes and is used to send the sensed data to the master station. A clustering algorithm hybridized with an improved ant colony algorithm is applied for fault identification and fault classification [44]. The number of fuzzy clusters and the initial clustering centers are identified by the algorithm. Mary and Raja [45] proposed an ACO-based improved FCM to segment medical images. FCM and four-chain quantum bee colony optimization (QABC) are applied for image segmentation in [46]. A modified ACO-based fuzzy clustering algorithm is proposed by Supratid and Kim [47]. ACO, FCM, and GA are combined in this algorithm to overcome the problem of the frequent improvement of cluster centers.

1.9.4 Artificial Bee Colony Optimization–Based Fuzzy Clustering

Alrosan et al. [48] proposed a clustering method by coupling artificial bee colony with the fuzzy c-means (ABC-FCM) algorithm. This approach uses the search capability of ABC to find optimum initial cluster centers, which are then applied as the initial cluster centers. The proposed approach showed its superiority when applied to two sets of MRI images: simulated brain data and real MRI images [48]. The weak control over local optima in the technique is also handled by Pham et al. [49]. In this regard, they exploit the search ability of the ABC algorithm. In this method, a real-number representation is used in each bee to determine the appropriate cluster centers. The superiority of this approach is established by comparing the proposed algorithm with FCM and the GA-based clustering algorithm on some numerical benchmark data. A modified artificial bee colony algorithm–based fuzzy c-means algorithm (MoABC-FCM) is presented by Ouadfel and Meshoul [50]. Inspired by differential evolution (DE), a new mutation method was introduced in the ABC algorithm to improve the exploitation process. Ultimately, the proposed MoABC-FCM algorithm enhanced the effectiveness of the original FCM algorithm, and the new approach was reported to be better than optimization-based search algorithms such as the standard ABC, modified ABC, and PSO. A novel spatial fuzzy clustering algorithm optimized by the ABC algorithm, in short ABC-SFCM, is employed to segment synthetic and real images [51]. This algorithm is advantageous for two reasons. First, it is able to tackle noisy image segmentation efficiently by using spatial local information in the membership function. Second, the global performance is improved by taking advantage of the global search capability of ABC [51]. It is also noted that this method is more robust to noise than other known methods. An image segmentation procedure utilizing the ABC algorithm was presented by Hancer et al. [52] to identify brain tumors from MRI brain images, MRI being one of the standard valuable tools applied for diagnosing and treating medical cases. The proposed procedure includes three stages: preprocessing the input MRI image, applying the ABC-based fuzzy clustering method for segmentation, and, in the last (post-processing) stage, extracting the brain tumors. The proposed procedure was examined on MRI images of various areas of a patient's brain against different approaches such as the k-means, FCM, and GA algorithms. Shokouhifar and Abkenar [53] applied the ABC algorithm to classify all pixels into two categories: normal and noisy. They introduced a noise probability for each pixel within the image, and the pixels are grouped based on that noise probability. In this proposed method, ABC optimization is employed before the FCM clustering algorithm to segment real-life MRI images, and it was shown that this approach is better than the previous methods. Karaboga and Ozturk [54, 55] successfully applied the ABC algorithm to optimize fuzzy clustering on some benchmark medical data. The proposed method showed its superiority compared to the FCM algorithm. A combined approach of the FCM algorithm and the ABC algorithm is employed to segment MR images efficiently [56]. Two new parameters are introduced in that approach: the difference between neighboring pixels in the image and the relative location of the neighboring pixels. It was shown that these parameters improved the performance of FCM using the ABC algorithm [56]. Bose and Mali [57] presented a combined ABC and FCM algorithm, named FABC, for unsupervised classification. FABC removes the need to predefine the initial cluster centers, and the proposed method works better with respect to convergence, time complexity, robustness, and segmentation accuracy [57]. The randomization property of the ABC algorithm is applied to initialize the cluster centers in the FABC algorithm.

1.9.5 Differential Evolution–Based Fuzzy Clustering

Das and Konar [58] presented a fuzzy clustering method in combination with modified differential evolution, named automatic fuzzy clustering differential evolution (AFDE), for image segmentation. In this approach, the chromosome representation in the DE algorithm was modified and applied to determine the fuzzy clusters in the intensity space of an image automatically. The symmetry-based fuzzy clustering validity index [59] was employed with this modified DE algorithm, and the convergence property of the DE algorithm was also improved. The approach was applied to six different types of images, including natural images, MRI brain images, and satellite images. Two parameters are handled specially in AFDE [58]. First, the mutation constant factor, F, which plays a vital role in the DE algorithm, is assigned a random value between 0.5 and 1. Second, the crossover rate (Cr) is not fixed during the evolving process, as its value changes over the iteration steps. The starting value of Cr is 1, and it decreases linearly to the minimum acceptable value, which is 0.5. The authors presented a real-coded chromosome representation that can dynamically determine the appropriate number of clusters that the dataset may have [58]. Another version of AFDE is found in [60] as the automatic clustering DE (ACDE) algorithm. In this approach, the authors applied different objective functions, named the DB index [61] and the CS index [62], to evaluate the quality of the clustered data. Gong et al. [63] also proposed an automatic clustering DE technique to solve the clustering problem. This method has three main features: (i) a modified point symmetry-based cluster validity index (CVI) is presented to evaluate the validity of the corresponding partitioning, (ii) the Kd-tree nearest neighbor search is applied to decrease the complexity of finding the closest symmetric point, and (iii) a new chromosomal representation is introduced to represent individuals. After being applied to six artificial datasets of diverse complexities, the proposed approach was shown to be suited for both symmetrical intraclusters and symmetrical interclusters [63]. Maulik and Saha [64] presented a modified DE-based technique of fuzzy clustering (MoDEFC), in which the authors applied a real-coded encoding technique for the cluster centers. The global best (GBest) and local best (LBest) concepts of the PSO algorithm are incorporated into the standard mutation process of the differential evolution algorithm to push the trial vector quickly toward the global optimum. In the initial stage, LBest, i.e., the best vector in the current stage, is more important for evolving the mutant vector than in the later stage [64]. As the generations increase, the contribution of GBest, i.e., the best vector evaluated until the current generation, increases, while the contribution of LBest to the mutant vector decreases.
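The AFDE parameter handling described in this section, with F drawn at random from [0.5, 1] and Cr decreasing linearly from 1 to 0.5, can be sketched on top of a standard DE step. The sketch assumes the common DE/rand/1/bin scheme, which is swapped in here for illustration and is not claimed to be the exact operator of [58]; the function names are hypothetical.

```python
import random


def crossover_rate(t, max_iter, cr_start=1.0, cr_min=0.5):
    """Cr falls linearly from cr_start at t = 0 to cr_min at t = max_iter."""
    return cr_start - (cr_start - cr_min) * t / max_iter


def de_trial_vector(population, i, t, max_iter, rng=random):
    """Build a DE/rand/1/bin trial vector for individual i at iteration t.

    Requires at least four individuals in the population.
    """
    others = [p for idx, p in enumerate(population) if idx != i]
    a, b, c = rng.sample(others, 3)
    f = rng.uniform(0.5, 1.0)            # mutation factor F, redrawn for each vector
    cr = crossover_rate(t, max_iter)
    dim = len(population[i])
    forced = rng.randrange(dim)          # at least one gene comes from the mutant
    return [a[d] + f * (b[d] - c[d]) if (rng.random() < cr or d == forced)
            else population[i][d]
            for d in range(dim)]
```

In a full DE loop, the trial vector would replace individual `i` only if it scores better on the chosen clustering objective.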

1.9.6 Firefly Algorithm–Based Fuzzy Clustering

To overcome the shortcomings of FCM, a new fuzzy subspace clustering algorithm based on improved firefly algorithms is presented in [65]. In this approach, the global optimization capability of the firefly algorithm, the strong local search features of FCM, and the learning calculation for feature weights of reliability-based k-means are taken into consideration [65]. This algorithm is applied accurately and efficiently to different feature subspace-based clustering problems. Alomoush et al. [66] proposed a segmentation algorithm that hybridizes the firefly algorithm (FA) with the fuzzy c-means algorithm (FCM) to segment magnetic resonance imaging (MRI) brain images. MRI images are not easy to segment, as normal and abnormal tissues look very similar. The firefly algorithm is employed to determine the optimal cluster centers for FCM, which improves the efficiency of FCM in segmenting the MRI images. Sharma et al. [67] presented a k-means algorithm and firefly algorithm
