Recent Advances in Hybrid Metaheuristics for Data Clustering
Edited by
Sourav De, Cooch Behar Government Engineering College, West Bengal, India
Sandip Dey, Sukanta Mahavidyalaya, West Bengal, India
Siddhartha Bhattacharyya, CHRIST (Deemed to be University), Bangalore, India
This edition first published 2020 © 2020 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Sourav De, Sandip Dey, and Siddhartha Bhattacharyya to be identified as the authors of the editorial material in this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions.
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data
Names: De, Sourav, 1979- editor. | Dey, Sandip, 1977- editor. | Bhattacharyya, Siddhartha, 1975- editor.
Title: Recent advances in hybrid metaheuristics for data clustering / edited by Dr. Sourav De, Dr. Sandip Dey, Dr. Siddhartha Bhattacharyya.
Description: First edition. | Hoboken, NJ: John Wiley & Sons, Inc., [2020] | Includes bibliographical references and index.
Identifiers: LCCN 2020010571 (print) | LCCN 2020010572 (ebook) | ISBN 9781119551591 (cloth) | ISBN 9781119551614 (adobe pdf) | ISBN 9781119551607 (epub)
Subjects: LCSH: Cluster analysis–Data processing. | Metaheuristics.
Classification: LCC QA278.55 .R43 2020 (print) | LCC QA278.55 (ebook) | DDC 519.5/3–dc23
LC record available at https://lccn.loc.gov/2020010571
LC ebook record available at https://lccn.loc.gov/2020010572
Cover Design: Wiley
Cover Image: © Nobi_Prizue/Getty Images
Set in 9.5/12.5pt STIXTwoText by SPi Global, Chennai, India
Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY 10 9 8 7 6 5 4 3 2 1
Dr. Sourav De dedicates this book to his respected parents, Satya Narayan De and Tapasi De; his loving wife, Debolina Ghosh; his beloved son, Aishik De; his sister, Soumi De, and his in-laws.
Dr. Sandip Dey dedicates this book to the loving memory of his father, the late Dhananjoy Dey; his beloved mother, Smt. Gita Dey; his wife, Swagata Dey Sarkar; his children, Sunishka and Shriaan; his siblings, Kakali, Tanusree, and Sanjoy; and his nephews, Shreyash and Adrishaan.
Dr. Siddhartha Bhattacharyya dedicates this book to his late father, Ajit Kumar Bhattacharyya; his late mother, Hashi Bhattacharyya; his beloved wife, Rashni, and his in-laws, Asis Mukherjee and Poly Mukherjee.
Contents
List of Contributors
Series Preface
Preface
1 Metaheuristic Algorithms in Fuzzy Clustering
Sourav De, Sandip Dey, and Siddhartha Bhattacharyya
1.1 Introduction
1.2 Fuzzy Clustering
1.2.1 Fuzzy c-means (FCM) Clustering
1.3 Algorithm
1.3.1 Selection of Cluster Centers
1.4 Genetic Algorithm
1.5 Particle Swarm Optimization
1.6 Ant Colony Optimization
1.7 Artificial Bee Colony Algorithm
1.8 Local Search-Based Metaheuristic Clustering Algorithms
1.9 Population-Based Metaheuristic Clustering Algorithms
1.9.1 GA-Based Fuzzy Clustering
1.9.2 PSO-Based Fuzzy Clustering
1.9.3 Ant Colony Optimization–Based Fuzzy Clustering
1.9.4 Artificial Bee Colony Optimization–Based Fuzzy Clustering
1.9.5 Differential Evolution–Based Fuzzy Clustering
1.9.6 Firefly Algorithm–Based Fuzzy Clustering
1.10 Conclusion
References
2 Hybrid Harmony Search Algorithm to Solve the Feature Selection for Data Mining Applications
Laith Mohammad Abualigah, Mofleh Al-diabat, Mohammad Al Shinwan, Khaldoon Dhou, Bisan Alsalibi, Essam Said Hanandeh, and Mohammad Shehab
2.1 Introduction
2.2 Research Framework
2.3 Text Preprocessing
2.3.1 Tokenization
2.3.2 Stop Words Removal
2.3.3 Stemming
2.3.4 Text Document Representation
2.3.5 Term Weight (TF-IDF)
2.4 Text Feature Selection
2.4.1 Mathematical Model of the Feature Selection Problem
2.4.2 Solution Representation
2.4.3 Fitness Function
2.5 Harmony Search Algorithm
2.5.1 Parameters Initialization
2.5.2 Harmony Memory Initialization
2.5.3 Generating a New Solution
2.5.4 Update Harmony Memory
2.5.5 Check the Stopping Criterion
2.6 Text Clustering
2.6.1 Mathematical Model of the Text Clustering
2.6.2 Find Clusters Centroid
2.6.3 Similarity Measure
2.7 k-means Text Clustering Algorithm
2.8 Experimental Results
2.8.1 Evaluation Measures
2.8.1.1 F-measure Based on Clustering Evaluation
2.8.1.2 Accuracy Based on Clustering Evaluation
2.8.2 Results and Discussions
2.9 Conclusion
References
3 Adaptive Position–Based Crossover in the Genetic Algorithm for Data Clustering
Arnab Gain and Prasenjit Dey
3.1 Introduction
3.2 Preliminaries
3.2.1 Clustering
3.2.1.1 k-means Clustering
3.2.2 Genetic Algorithm
3.3 Related Works
3.3.1 GA-Based Data Clustering by Binary Encoding
3.3.2 GA-Based Data Clustering by Real Encoding
3.3.3 GA-Based Data Clustering for Imbalanced Datasets
3.4 Proposed Model
3.5 Experimentation
3.5.1 Experimental Settings
3.5.2 DB Index
3.5.3 Experimental Results
3.6 Conclusion
References
4 Application of Machine Learning in the Social Network
Belfin R. V., E. Grace Mary Kanaga, and Suman Kundu
4.1 Introduction
4.1.1 Social Media
4.1.2 Big Data
4.1.3 Machine Learning
4.1.4 Natural Language Processing (NLP)
4.1.5 Social Network Analysis
4.2 Application of Classification Models in Social Networks
4.2.1 Spam Content Detection
4.2.2 Topic Modeling and Labeling
4.2.3 Human Behavior Analysis
4.2.4 Sentiment Analysis
4.3 Application of Clustering Models in Social Networks
4.3.1 Recommender Systems
4.3.2 Sentiment Analysis
4.3.3 Information Spreading or Promotion
4.3.4 Geolocation-Specific Applications
4.4 Application of Regression Models in Social Networks
4.4.1 Social Network and Human Behavior
4.4.2 Emotion Contagion through Social Networks
4.4.3 Recommender Systems in Social Networks
4.5 Application of Evolutionary Computing and Deep Learning in Social Networks
4.5.1 Evolutionary Computing and Social Network
4.5.2 Deep Learning and Social Networks
4.6 Summary
Acknowledgments
References
5 Predicting Students’ Grades Using CART, ID3, and Multiclass SVM Optimized by the Genetic Algorithm (GA): A Case Study
Debanjan Konar, Ruchita Pradhan, Tania Dey, Tejaswini Sapkota, and Prativa Rai
5.1 Introduction
5.2 Literature Review
5.3 Decision Tree Algorithms: ID3 and CART
5.4 Multiclass Support Vector Machines (SVMs) Optimized by the Genetic Algorithm (GA)
5.4.1 Genetic Algorithms for SVM Model Selection
5.5 Preparation of Datasets
5.6 Experimental Results and Discussions
5.7 Conclusion
References
6 Cluster Analysis of Health Care Data Using Hybrid Nature-Inspired Algorithms
Kauser Ahmed P and Rishabh Agrawal
6.1 Introduction
6.2 Related Work
6.2.1 Firefly Algorithm
6.2.2 k-means Algorithm
6.3 Proposed Methodology
6.4 Results and Discussion
6.5 Conclusion
References
7 Performance Analysis Through a Metaheuristic Knowledge Engine
Indu Chhabra and Gunmala Suri
7.1 Introduction
7.2 Data Mining and Metaheuristics
7.3 Problem Description
7.4 Association Rule Learning
7.4.1 Association Mining Issues
7.4.2 Research Initiatives and Projects
7.5 Literature Review
7.6 Methodology
7.6.1 Phase 1: Pattern Search
7.6.2 Phase 2: Rule Mining
7.6.3 Phase 3: Knowledge Derivation
7.7 Implementation
7.7.1 Test Issues
7.7.2 System Evaluation
7.7.2.1 Indicator Matrix Formulation
7.7.2.2 Phase 1: Frequent Pattern Derivation
7.7.2.3 Phase 2: Association Rule Framing
7.7.2.4 Phase 3: Knowledge Discovery Through Metaheuristic Implementation
7.8 Performance Analysis
7.9 Research Contributions and Future Work
7.10 Conclusion
References
8 Magnetic Resonance Image Segmentation Using a Quantum-Inspired Modified Genetic Algorithm (QIANA) Based on FRCM
Sunanda Das, Sourav De, Sandip Dey, and Siddhartha Bhattacharyya
8.1 Introduction
8.2 Literature Survey
8.3 Quantum Computing
8.3.1 Qubit-Quantum Bit
8.3.2 Entanglement
8.3.3 Measurement
8.3.4 Quantum Gate
8.4 Some Quality Evaluation Indices for Image Segmentation
8.4.1 F(I)
8.4.2 F’(I)
8.4.3 Q(I)
8.5 Quantum-Inspired Modified Genetic Algorithm (QIANA)–Based FRCM
8.5.1 Quantum-Inspired MEGA (QIANA)–Based FRCM
8.6 Experimental Results and Discussion
8.7 Conclusion
References
9 A Hybrid Approach Using the k-means and Genetic Algorithms for Image Color Quantization
Marcos Roberto e Souza, Anderson Carlos Sousa e Santos, and Helio Pedrini
9.1 Introduction
9.2 Background
9.3 Color Quantization Methodology
9.3.1 Crossover Operators
9.3.2 Mutation Operators
9.3.3 Fitness Function
9.4 Results and Discussions
9.5 Conclusions and Future Work
Acknowledgments
References
Index
List of Contributors
Laith Mohammad Abualigah, Amman Arab University, Jordan
Rishabh Agrawal, VIT, India
Kauser Ahmed, VIT, India
Mofleh Al-diabat, Al Albayt University, Jordan
Bisan Alsalibi, Universiti Sains Malaysia, Malaysia
Mohammad Al Shinwan, Amman Arab University, Jordan
Belfin R V, Karunya Institute of Technology and Sciences, India
Siddhartha Bhattacharyya, CHRIST (Deemed to be University), India
Indu Chhabra, Panjab University, Chandigarh, India
Sunanda Das, National Institute of Technology, Durgapur, India
Sourav De, Cooch Behar Government Engineering College, India
Prasenjit Dey, Cooch Behar Government Engineering College, India
Sandip Dey, Sukanta Mahavidyalaya, India
Tania Dey, Sikkim Manipal Institute of Technology, India
Khaldoon Dhou, Drury University, USA
Arnab Gain, Cooch Behar Government Engineering College, India
Essam Hanandeh, Zarqa University, Jordan
Grace Mary Kanaga, Karunya Institute of Technology and Sciences, India
Ahamad Khader, Universiti Sains Malaysia, Malaysia
Debanjan Konar, Sikkim Manipal Institute of Technology, India
Suman Kundu, Wroclaw University of Science and Technology, India
Ruchita Pradhan, Sikkim Manipal Institute of Technology, India
Helio Pedrini, Institute of Computing, University of Campinas, Brazil
Prativa Rai, Sikkim Manipal Institute of Technology, India
Marcos Roberto e Souza, Institute of Computing, University of Campinas, Campinas, Brazil
Essam Said Hanandeh, Zarqa University, Jordan
Anderson Santos, Institute of Computing, University of Campinas, Brazil
Tejaswini Sapkota, Sikkim Manipal Institute of Technology, India
Mohammad Shehab, Aqaba University of Technology, Jordan
Gunmala Suri, University Business School, Panjab University, Chandigarh, India
Series Preface: Dr Siddhartha Bhattacharyya, Christ (Deemed To Be University), Bangalore, India (Series Editor)
The Intelligent Signal and Data Processing (ISDP) book series focuses on the field of signal and data processing encompassing the theory and practice of algorithms and hardware that convert signals produced by artificial or natural means into a form useful for a specific purpose. The signals might be speech, audio, images, video, sensor data, telemetry, electrocardiograms, or seismic data, among others. The possible application areas include transmission, display, storage, interpretation, classification, segmentation, and diagnosis. The primary objective of the ISDP book series is to evolve future-generation, scalable, intelligent systems for faithful analysis of signals and data. The ISDP series is intended mainly to enrich the scholarly discourse on intelligent signal and image processing in different incarnations. The series will benefit a wide audience that includes students, researchers, and practitioners. The student community can use the books in the series as reference texts to advance their knowledge base. In addition, the constituent monographs will be handy to aspiring researchers due to recent and valuable contributions in this field. Moreover, faculty members and data practitioners are likely to gain relevant knowledge from the books in the series.
The series coverage will contain, but not be exclusive to, the following:
● Intelligent signal processing
a) Adaptive filtering
b) Learning algorithms for neural networks
c) Hybrid soft computing techniques
d) Spectrum estimation and modeling
● Image processing
a) Image thresholding
b) Image restoration
c) Image compression
d) Image segmentation
e) Image quality evaluation
f) Computer vision and medical imaging
g) Image mining
h) Pattern recognition
i) Remote sensing imagery
j) Underwater image analysis
k) Gesture analysis
l) Human mind analysis
m) Multidimensional image analysis
● Speech processing
a) Modeling
b) Compression
c) Speech recognition and analysis
● Video processing
a) Video compression
b) Analysis and processing
c) 3D video compression
d) Target tracking
e) Video surveillance
f) Automated and distributed crowd analytics
g) Stereo-to-autostereoscopic 3D video conversion
h) Virtual and augmented reality
● Data analysis
a) Intelligent data acquisition
b) Data mining
c) Exploratory data analysis
d) Modeling and algorithms
e) Big data analytics
f) Business intelligence
g) Smart cities and smart buildings
h) Multiway data analysis
i) Predictive analytics
j) Intelligent systems
Preface
Grouping or classifying real-life data into a set of clusters or categories for further processing and classification is known as clustering. The groups are organized on the basis of built-in properties or characteristics of the data in the dataset. The features of the groups are important for representing a new object or understanding a new phenomenon. Homogeneous data should be in the same cluster, whereas dissimilar or heterogeneous data is grouped into different clusters. Data clustering can be applied in many fields, such as document retrieval, data mining, pattern classification, image segmentation, artificial intelligence, machine learning, biology, and microbiology.
Broadly, there are two types of data clustering algorithms: supervised and unsupervised. In supervised data clustering algorithms, the number of desired partitions and the labeled datasets are supplied as the basic input at the beginning of the algorithm. Moreover, supervised clustering algorithms attempt to keep the number of segments small, and the data points are allotted to clusters using the idea of closeness, by resorting to a given distance function. By contrast, unsupervised algorithms do not require prior information about the labeled classes, a decision-making criterion for optimization, the number of desired segments beyond the raw data, or grouping principle(s) based on the data content.
Metaheuristic algorithms have proved efficient in handling and solving different types of data clustering problems. Metaheuristics are designed to tackle complex clustering problems where classical clustering algorithms fail to be either effective or efficient. Basically, a metaheuristic is an iterative generation procedure that solves a problem by guiding a subordinate heuristic. It does so by intelligently combining different concepts to explore and exploit the search space, and near-optimal solutions are then derived efficiently by learning strategies applied to the structural information of the problem. The main objective of a metaheuristic is to derive a set of optimal solutions large enough to be completely sampled. Metaheuristic techniques can handle many types of real-world problems that conventional algorithms cannot manage, in spite of increasing computational power, simply because of unrealistically long running times. To solve optimization problems, these algorithms make only a few assumptions at the initial stages. Metaheuristic algorithms are not guaranteed to generate globally optimal solutions for all types of problems, since most implementations are some form of stochastic optimization and the resulting solutions may depend on the set of generated random variables. Compared with optimization algorithms, heuristics, or iterative methods, metaheuristic algorithms are often the better option, as they frequently find good solutions with less computational effort by exploring a large set of feasible solutions. Some well-known metaheuristic algorithms include the genetic algorithm (GA), simulated annealing (SA), tabu search (TS), and different types of swarm intelligence algorithms. Some recognized swarm intelligence algorithms are particle swarm optimization (PSO), ant colony optimization (ACO), artificial bee colony optimization (ABC), differential evolution (DE), and the cuckoo search algorithm. In recent research, some modern swarm intelligence–based optimization algorithms, such as the Egyptian vulture optimization algorithm, rats herd algorithm (RATHA), bat algorithm, crow search algorithm, and glowworm swarm optimization (GSO), have been found to perform well when solving some real-life problems. These algorithms also work efficiently to cluster different types of real-life datasets.
During the clustering of data, it has been observed that metaheuristic algorithms suffer from high time complexity, even though they can deliver optimum solutions. To overcome this problem and to avoid depending on one particular type of metaheuristic algorithm to solve complex problems, researchers have not only blended different metaheuristic approaches but also hybridized metaheuristic algorithms with other soft computing tools and techniques, such as neural networks, fuzzy sets, and rough sets. Hybrid metaheuristic algorithms, which combine metaheuristic algorithms with other techniques, are more effective at handling real-life data clustering problems. Recently, quantum mechanical principles have also been applied to cut down the time complexity of metaheuristic approaches to a great extent.
The book will encourage readers to design efficient metaheuristics for data clustering in different domains. It elaborates on the fundamentals of different metaheuristics and their application to data clustering. As a sequel to this, it paves the way for designing and developing hybrid metaheuristics to be applied to data clustering. It is not easy to find books on hybrid metaheuristic algorithms that cover this topic.
The book contains nine chapters written by leading practitioners in the field.
A brief overview of the advantages and limitations of the fuzzy clustering algorithm is presented in Chapter 1. The principle of operation and the structure of fuzzy algorithms are also elucidated with reference to the inherent limitations of cluster centroid selection. Several local-search-based and population-based metaheuristic algorithms are discussed with reference to their operating principles. Finally, different avenues for addressing the cluster centroid selection problem with recourse to the different metaheuristic algorithms are presented.
The increasing volume of data and text on electronic sites has necessitated the use of different clustering methods, including text clustering. Text clustering is a helpful unsupervised analysis method for partitioning an immense collection of text documents into a set of groups. Feature selection is a well-known unsupervised method used to eliminate uninformative features in order to enhance the performance of text clustering. In Chapter 2, the authors propose an algorithm that solves the feature selection problem before the k-means text clustering technique is applied, by improving the exploitation search ability of the basic harmony search algorithm; the resulting algorithm is known as H-HSA. The proposed feature selection method is used in this chapter to strengthen the text clustering technique by providing a new set of informative features.
In the advancement of data analytics, data clustering has become one of the most important areas in modern data science. Several works have proposed algorithms to deal with data clustering. In Chapter 3, the objective is to improve data clustering by using metaheuristic-based algorithms. For this purpose, the authors propose a genetic algorithm–based data clustering approach. Here, a new adaptive position–based crossover technique is proposed for the genetic algorithm, in which the new concept of a vital gene is introduced during crossover. The simulation results demonstrate that the proposed method performs better than two other genetic algorithm–based data clustering methods. Furthermore, the proposed approach is also observed to be time efficient compared with its counterparts.
A social network, used by the human population as a platform of interaction, generates a large volume of diverse data every day. These data and the attributes of the interactions are becoming more and more critical for researchers and businesses to identify societal and economic values. However, the generated data is vast, highly complex, and dynamic, which necessitates a real-time solution. Machine learning is a useful tool for summarizing the meaningful information from large, diverse datasets. Chapter 4 provides a survey of several applications of social network analysis where machine learning plays a critical role. These applications range from spam content detection to human behavior analysis, from topic modeling to recommender systems, and from sentiment analysis to emotion contagion in social networks.
Predicting students’ performance at an early stage is important for improving their performance for higher education and placement opportunities. Early prediction of student grades allows an instructor to detect students’ poor performance in a course automatically and also gives decision-makers the opportunity to take remedial measures that help students succeed in their future education. A model predicting students’ grades using CART, ID3, and an improved multiclass SVM optimized by the genetic algorithm (GA) is investigated in Chapter 5. The model follows supervised learning classification by means of CART, ID3, and SVM optimized by the GA. In this study, the model is tested on a dataset that contains undergraduate student information, i.e., the total marks obtained in the courses taken over four years, with the respective labeled subject name and code, at Sikkim Manipal Institute of Technology, Sikkim, India. A comparative analysis among CART, ID3, and multiclass SVM optimized by the GA indicates that the multiclass SVM optimized by the GA outperforms the ID3 and CART decision tree algorithms in the case of multiclass classification.
Significant advances in information technology have resulted in the excessive growth of data in health care informatics. In today’s world, technologies are also being developed to treat new types of diseases and illnesses, but no steps are being taken to stop a disease in its tracks in the early stages. The motivation of Chapter 6 is to help prepare people to diagnose a disease at an early stage based on its symptoms. In this chapter, the authors use various nature-inspired clustering algorithms in combination with k-means algorithms to actively cluster a person’s health data with already available data and label it accordingly. Experimental results show that nature-inspired algorithms such as firefly, combined with k-means, give efficient results for the existing problems.
With the fast development of pattern discovery–oriented systems, data mining is rapidly intensifying in other disciplines of management, biomedical, and physical sciences to tackle the issues of data collection and data storage. With the advancement of data science, numerous knowledge-oriented paradigms are evaluated for automatic rule mining. Association rule mining is an active research area with numerous algorithms used for knowledge accumulation. Chapter 7 focuses on handling the various challenging issues of only demand-driven aggregation of information sources, mining and analyzing relevant patterns to preserve user concerns, and implementing the same association rule mining problem for multi-objective solutions rather than as a single-objective solution for post-purchase customer analysis.
The GA and the fuzzy c-means (FRCM) algorithm are widely used in magnetic resonance image segmentation. In Chapter 8, a hybrid concept that combines a quantum-inspired modified GA with FRCM is used to segment MR images. The modified GA (MEGA) enhances the performance of the GA by modifying population initialization and crossover probability. To speed up this classical MEGA and also to derive more optimized class levels, some quantum computing characteristics, such as the qubit, entanglement, orthogonality, and the rotational gate, are incorporated into the classical MEGA. The class levels created by the quantum-inspired MEGA are supplied to the FRCM as initial input to overcome the convergence problem of the FRCM. A performance comparison using some standard evaluation metrics is made between the quantum-inspired MEGA-based FRCM, the classical MEGA-based FRCM, and the conventional FRCM with the help of two grayscale MR images. It shows the superiority of the proposed quantum-inspired MEGA-based FRCM over both the classical MEGA-based FRCM and the conventional FRCM methods.
Large volumes of data have been rapidly collected due to the increasing advances in equipment and techniques for content acquisition. However, the efficient storage, indexing, retrieval, representation, and recognition of multimedia data, such as text, audio, images, and videos, are challenging tasks. To summarize the main characteristics of datasets and simplify their interpretation, exploratory data analysis is often applied to numerous problems in several fields, such as pattern recognition, computer vision, machine learning, and data mining. A data analysis technique commonly associated with descriptive statistics and visual methods is cluster analysis, or clustering. In Chapter 9, the authors propose a hybrid method based on k-means and the genetic algorithm guided by a qualitative objective function. Experiments demonstrate good results of the proposed method.
The editors hope that this book will be helpful for students and researchers who are interested in this area. It can also prove to be a novel initiative for undergraduate students of computer science, information science, and electronics engineering as part of their curriculum.
October, 2019
Sourav De, Cooch Behar, India
Sandip Dey, Jalpaiguri, India
Siddhartha Bhattacharyya, Bangalore, India
1.4 Genetic Algorithm
Genetic algorithms (GAs) are popular optimization algorithms, generally used to search for optimal solution(s) to a particular computational problem. They do so by maximizing or minimizing a particular function, called an objective (fitness) function. GAs belong to the field of evolutionary computation [12]: they emulate biological processes such as reproduction and natural selection to find the “fittest” solutions [13]. Many GA processes described in the literature are random in nature, and the technique allows different levels of randomization and control to be set [13]. GAs have proven to be a powerful and well-regulated optimization technique in comparison with other random search algorithms and exhaustive search algorithms [12].
GAs are designed to imitate biological processes, and much of the pertinent terminology is taken from biological science. The fundamental components common to all GAs are
● A fitness (objective) function
● A population of chromosomes
● A selection operation to produce a pool of chromosomes from the population
● A crossover operation to produce population diversity in subsequent generations
● A mutation operation to change the properties of chromosomes in the new generation
A given algorithm is optimized with reference to an objective function [14]. The term “fitness” originates from evolutionary theory. The fitness function is used to test and quantify each individual potential solution. The chromosomes in a population are numerical values, each of which represents a candidate solution to the problem being solved by the GA [14]. Each candidate solution is passed through an encoding process, which basically produces a stream of parameter values [15]. Theoretically, a chromosome for a problem with $N$ dimensions is encoded as an array of $N$ elements, $[q_1, q_2, \ldots, q_N]$, where each $q_k$ is a specific value of the $k$th parameter [15]. In general, chromosomes are encoded as bit strings, i.e., sequences of 0s and 1s. In modern computer systems, chromosomes can also be built from real numbers, permutations, and other objects.
A GA starts with a number of randomly chosen chromosomes, which form the initial population. A fitness function is then used to evaluate each member of the population, i.e., to measure how well it can potentially solve the problem at hand. Afterward, a selection operator chooses a number of chromosomes for reproduction on the basis of a user-defined probability distribution. Selection is done in accordance with the fitness of the chromosomes in the population: the probability of selecting a particular chromosome for the subsequent generation increases with its fitness. For example, let $f$ be the fitness function introduced for solving a particular problem. With fitness-proportionate selection and non-negative fitness values, the probability of selecting chromosome $C_n$ is given by

$$P(C_n) = \frac{f(C_n)}{\sum_{k=1}^{M} f(C_k)}$$

where $M$ is the number of chromosomes in the population.
Note that the selection operator chooses chromosomes with replacement, so the same chromosome can be selected more than once. The crossover operation is somewhat analogous to biological crossover and recombination. Two offspring with different features are created by swapping segments of two chosen chromosomes at a single point or at multiple points. Suppose the parent chromosomes [10010010010011] and [11010011001011] are crossed over at the fifth position. This creates two new offspring, [10010011001011] and [11010010010011].
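The selection and crossover operations described above can be sketched in a few lines of Python. This is a minimal illustration rather than the chapter authors' implementation: the function names are hypothetical, chromosomes are assumed to be equal-length bit strings, and fitness values are assumed to be non-negative so that they can serve directly as selection weights.

```python
import random

def selection_probabilities(fitnesses):
    """Fitness-proportionate probabilities (assumes non-negative fitness values)."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def select_parents(population, fitnesses, n, rng):
    """Draw n chromosomes with replacement, weighted by fitness."""
    return rng.choices(population, weights=fitnesses, k=n)

def single_point_crossover(parent_a, parent_b, point):
    """Keep the first `point` bits of each parent and swap the remaining tails."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# Worked example from the text: crossing over at the fifth position.
children = single_point_crossover("10010010010011", "11010011001011", 5)
```

Because `select_parents` samples with replacement, a fit chromosome may appear several times in the mating pool, which matches the behavior described in the text.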
The mutation operation flips individual bits to obtain a new chromosome: bit 0 is turned into bit 1 and vice versa. Mutation generally occurs with an exceedingly low probability (such as 0.001). In some implementations, whether the mutation operator is applied before the other two operators is a matter of designer preference. Selection and crossover generally preserve the genetic information of better (fitter) chromosomes, which can result in quick convergence of the algorithm. This can cause the algorithm to become stuck at a local optimum well before attaining the global optimum [16]. Mutation maintains population diversity and thereby helps protect the algorithm against this problem. It can, however, also cause the algorithm to converge slowly.
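Bit-flip mutation can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `mutate` is hypothetical, chromosomes are bit strings, and each bit is flipped independently with the same probability `rate` (0.001 is the example value given in the text).

```python
import random

def mutate(chromosome, rate=0.001, rng=random):
    """Flip each bit of a bit-string chromosome independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[bit] if rng.random() < rate else bit
                   for bit in chromosome)
```

With a low rate most chromosomes pass through unchanged, so mutation adds diversity without undoing the progress made by selection and crossover.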
1.5 Particle Swarm Optimization
The concept of particle swarm optimization (PSO) was first developed by James Kennedy and Russell Eberhart in 1995 [17]. The inspiration behind its development came from the following:
● Observation of the swarming ability of animals such as fish and birds
● Ideas adopted from the theory of evolutionary computation
The concept of PSO can be described by the following points:
● The technique maintains a number of potential solutions at any one time.
● During each iteration, each solution is assessed with reference to a function, called an objective function, to compute its fitness.
● Each potential solution in the population is regarded as a particle in the search space (fitness landscape).
● Each particle “swarms” or “flies” through the fitness landscape to find the minimum or maximum value computed by the fitness function.
During each iteration, the particles in the population (swarm) maintain the following information:
● Position of each particle in the search space, which includes the solution and its fitness
● Velocity of each particle
● Best position of each individual
● Global best position in the swarm
The technique generally comprises the following steps:
1) Evaluate the fitness value of each particle in the population.
2) Update the individual best of each particle and the global best of the swarm.
3) Update the velocity and the corresponding position of each particle.
These steps are repeated for a predefined number of generations or until a stopping criterion is met.
The velocity of each particle is updated using the following formula:

$$v_k(t+1) = \omega\, v_k(t) + \phi_1 r_1 \left(\hat{y}_k(t) - x_k(t)\right) + \phi_2 r_2 \left(g(t) - x_k(t)\right)$$

where $k$ is the particle’s index, $\omega$ is the inertial coefficient, $\phi_1$ and $\phi_2$ are the acceleration coefficients with $0 \le \phi_1, \phi_2 \le 2$, and $r_1$, $r_2$ are two random values with $0 \le r_1, r_2 \le 1$. $v_k(t)$ is the particle’s velocity at time $t$, and $x_k(t)$ is the position of the particle at time $t$. As of time $t$, $\hat{y}_k(t)$ and $g(t)$ are the particle’s individual best and the swarm’s best solution, respectively.
The position of each particle is updated using the following equation:

$$x_k(t+1) = x_k(t) + v_k(t+1)$$

where the location of the $k$th particle, $x_k(t)$, at the $t$th generation is changed to another location, $x_k(t+1)$, at the $(t+1)$th generation using the velocity $v_k(t+1)$.
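The three steps and the two update rules above can be put together in a short sketch. It is an illustrative implementation under several assumptions that are not taken from the chapter: the function name `pso`, the parameter values (inertia 0.7, both acceleration coefficients 1.5), the search bounds, zero initial velocities, fresh random values drawn per dimension, and minimization of the objective.

```python
import random

def pso(fitness, dim, n_particles=20, iters=200,
        omega=0.7, phi1=1.5, phi2=1.5, lo=-5.0, hi=5.0, seed=0):
    """Minimize `fitness` over `dim` dimensions with a basic global-best PSO."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                 # individual best positions
    pbest_fit = [float("inf")] * n_particles
    gbest, gbest_fit = x[0][:], float("inf")  # swarm best position

    for _ in range(iters):
        # Steps 1 and 2: evaluate fitness, update individual and global bests.
        for k in range(n_particles):
            fit = fitness(x[k])
            if fit < pbest_fit[k]:
                pbest[k], pbest_fit[k] = x[k][:], fit
            if fit < gbest_fit:
                gbest, gbest_fit = x[k][:], fit
        # Step 3: update the velocity, then the position, of each particle.
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[k][d] = (omega * v[k][d]
                           + phi1 * r1 * (pbest[k][d] - x[k][d])
                           + phi2 * r2 * (gbest[d] - x[k][d]))
                x[k][d] += v[k][d]
    return gbest, gbest_fit

# Usage: minimize the sphere function, whose minimum is 0 at the origin.
best, best_fit = pso(lambda p: sum(c * c for c in p), dim=2)
```

Many practical variants also clamp velocities or positions to the search bounds; that is omitted here to keep the sketch close to the two equations.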
1.6 Ant Colony Optimization
The Ant System was the first member of a class of algorithms called ant colony optimization (ACO) [18], a recent and popular metaheuristic. The algorithm was initially introduced by Colorni, Dorigo, and Maniezzo. It was inspired by the foraging behavior of real ants, which is explored and exploited in artificial ant colonies to find approximate solutions to several discrete and continuous optimization problems. The approach is also applicable to various problems in telecommunications, such as load balancing and routing. Each ant in a colony initially wanders at random. Real ants communicate indirectly through chemical pheromone trails, which they deposit on their paths from the nest to the food source. This chemical enables them to find the shortest paths to different food sources. The probability that an ant follows a particular path increases with the amount of pheromone deposited on that path.
Thealgorithmcomprisesthefollowingsteps:
1)Avirtualtrailisgatheredonvariouspathsegments.
2)Apathisrandomlyselectedonthebasisoftheamountof“trail”availableonpossible pathsfromtheinitialnode.
3)Theanttraversestothenextavailablenodetoselectthenextpath.
4)Thisprocesscontinuesuntiltheantreachesthestartingnode.
5)Thefinishedtourisrecognizedasasolution.
6)Thewholetourisanalyzedtofindtheoptimalpath.
Suppose an ant traverses from node j to node k in a graph (G, E) with a probability of $p_{jk}$. Then the value of $p_{jk}$ is determined by
$$p_{jk} = \frac{\tau_{jk}^{\alpha}\,\eta_{jk}^{\beta}}{\sum_{l} \tau_{jl}^{\alpha}\,\eta_{jl}^{\beta}}$$
where $\tau_{jk}$ represents the amount of pheromone deposited on the edge (j, k), $\alpha$ is a parameter that controls the influence of $\tau_{jk}$, $\eta_{jk}$ is the desirability of edge (j, k), and, like $\alpha$, $\beta$ is a parameter that controls the influence of $\eta_{jk}$. The sum in the denominator runs over the nodes l that the ant may visit next from node j. The amount of deposited pheromone is updated according to the following equation:
$$\tau_{jk} \leftarrow (1 - \rho)\,\tau_{jk} + \Delta\tau_{jk}$$
where $\tau_{jk}$ is the amount of pheromone deposited on a given edge (j, k), $\rho$ is the evaporation rate of the pheromone, and $\Delta\tau_{jk}$ is the amount of pheromone deposited. The value of $\Delta\tau_{jk}$ is computed as $1/C_k$ if ant k traverses the edge (j, k), where $C_k$ is the cost of the tour made by that particular ant. In all other cases, the value of $\Delta\tau_{jk}$ is assumed to be zero.
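The transition probability and the pheromone update described above can be sketched as follows. This is a minimal illustration under stated assumptions: pheromone and desirability are stored as nested lists indexed by node, edges are treated as directed, and the function names and default parameter values (alpha = 1, beta = 2, rho = 0.5) are hypothetical choices for the example.

```python
def transition_probabilities(j, candidates, tau, eta, alpha=1.0, beta=2.0):
    """Return {k: p_jk} for moving from node j to each candidate node k.

    tau[j][k] is the pheromone on edge (j, k); eta[j][k] is its desirability.
    """
    weights = {k: (tau[j][k] ** alpha) * (eta[j][k] ** beta) for k in candidates}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def update_pheromone(tau, tours, costs, rho=0.5):
    """Evaporate all trails, then deposit 1/C on every edge of each ant's tour.

    tours[i] is the node sequence visited by ant i and costs[i] is its tour cost.
    Edges that no ant traversed receive no deposit (a deposit of zero).
    """
    n = len(tau)
    for j in range(n):
        for k in range(n):
            tau[j][k] *= (1.0 - rho)          # evaporation
    for tour, cost in zip(tours, costs):
        for j, k in zip(tour, tour[1:]):      # consecutive nodes form an edge
            tau[j][k] += 1.0 / cost           # deposit proportional to tour quality
```

A cheaper tour deposits more pheromone on its edges, so those edges become more likely to be chosen by `transition_probabilities` in later iterations.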
1.7 Artificial Bee Colony Algorithm
Karaboga [19] introduced the artificial bee colony (ABC) algorithm, a present-day class of swarm intelligence algorithms. The inspiration behind the development of the ABC algorithm is the foraging behavior of real bee colonies. This algorithm has been applied to solve continuous optimization problems.
The ABC algorithm has three kinds of (artificial) bees:
1) Employed bees: Each employed bee is associated with a distinct solution of the optimization problem to be solved. This class of bee explores the neighborhood of the solution with which it is associated at each iteration.
2) Onlooker bees: This class of bees also explores the neighborhood of solutions, but in a different manner. At each iteration, they probabilistically select the solution to explore depending on the quality of the solution. Hence, the solutions they explore vary.
3) Scout bees: For this class of bees, the neighborhood of a solution is explored for a predefined number of times. If no improvement is found, the scout bee uniformly selects a new random solution in the search space, which adds an exploration property to the algorithm.
This is an efficient, population-based local search algorithm that explores the neighborhood of each solution at each iteration. The first version of the algorithm was run on several standard benchmark problems and gave inspiring results, but compared with some state-of-the-art algorithms, its results were not so encouraging. In particular, on composite and nonseparable functions, the ABC algorithm gives comparatively poor performance, and it also converges slowly to high-quality solutions [20].
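The roles of the three kinds of bees can be illustrated with a compact sketch. The details below are assumptions rather than part of this chapter: the neighbor move (perturbing one coordinate relative to a randomly chosen partner solution), the quality measure 1/(1 + cost), which presumes a nonnegative cost to be minimized, and all names and default values follow a commonly used formulation of ABC and are chosen only for illustration.

```python
import random


def abc_minimize(cost, dim, bounds=(-5.0, 5.0), n_sources=10, limit=20,
                 iterations=200, seed=0):
    """Minimal artificial bee colony loop; returns (best_solution, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_source():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources            # unsuccessful explorations per source
    best_cost = min(costs)
    best = sources[costs.index(best_cost)][:]

    def explore(i):
        """Try one neighbor of source i and keep it only if it is better."""
        partner = rng.choice([m for m in range(n_sources) if m != i])
        d = rng.randrange(dim)
        candidate = sources[i][:]
        step = rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[partner][d])
        candidate[d] = min(hi, max(lo, candidate[d] + step))
        c = cost(candidate)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = candidate, c, 0
        else:
            trials[i] += 1

    for _ in range(iterations):
        # Employed bees: each explores the neighborhood of its own solution.
        for i in range(n_sources):
            explore(i)
        # Onlooker bees: choose solutions with probability increasing in quality.
        quality = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=quality)[0]
            explore(i)
        # Scout bees: abandon exhausted solutions and sample new random ones.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = cost(sources[i])
                trials[i] = 0
        # Remember the best solution seen so far.
        for i in range(n_sources):
            if costs[i] < best_cost:
                best_cost, best = costs[i], sources[i][:]
    return best, best_cost
```

The `limit` parameter plays the role of the predefined number of explorations after which a scout replaces a solution that has stopped improving.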
1.8 Local Search-Based Metaheuristic Clustering Algorithms
Local search-based algorithms have been widely used to solve clustering problems in the literature. Some of them are presented in this section. In [21], the authors presented a simulated annealing-based approach to the clustering problem, which can be efficiently handled and solved using metaheuristic clustering algorithms. One common weakness of clustering techniques is that they may converge to local minimum solutions. The proposed simulated annealing-based method takes care of this weakness. The required factors are addressed in detail, and it has been shown that the proposed technique converges to optimal solutions of the clustering problem. Al-Sultan [22] later presented a tabu search-based clustering technique and showed that it outperforms both the k-means technique and the simulated annealing-based clustering technique. About two years later, Al-Sultan and Fedjki [23] proposed another algorithm for handling the fuzzy clustering problem. In 2000, an improved version of the tabu search-based clustering algorithm was presented by Sung and Jin [24]. In their work, they combined a tabu search heuristic with two compatible functional procedures, known as packing and releasing. The effectiveness of their technique was tested numerically by comparison with other approaches, such as the tabu search algorithm and the simulated annealing technique, among others. The aforementioned local search metaheuristics, tabu search and simulated annealing, improve one candidate solution, and they alleviate the problem of getting stuck at local minimum solutions. Two further important points regarding the efficacy of these techniques are that they are parameter sensitive and that their tuning is highly problem dependent [25].
1.9 Population-Based Metaheuristic Clustering Algorithms
In the literature, population-based metaheuristic clustering algorithms designed by several authors have been applied extensively to fuzzy clustering. A few efficient algorithms of this kind are presented in this section. Evolutionary algorithms (EAs) for fuzzy clustering can adapt the current techniques. The approaches taken by EAs can be grouped principally into two categories [10]. The first one is the more common and can be divided into the following two steps.
1) Search for apposite cluster centers via an evolutionary algorithm.
2) Use the cluster centers obtained in the former step as the initial cluster centers on which the FCM algorithm is applied.
The second one is another popular approach that uses the evolutionary algorithm as a clustering algorithm by itself. A few algorithms of this kind, or their variations, use a local search engine to support their performance and speed up their convergence. This approach is also appropriate for metaheuristic-based hard clustering techniques. A few algorithms of this kind are described next.
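The second step of the first category, running FCM from cluster centers supplied by a metaheuristic, can be sketched as follows. The sketch uses the standard FCM membership and center updates with fuzzifier m. The function name `fcm`, the fixed iteration count, and the hard-coded initial centers in the usage note (a stand-in for the output of an evolutionary search) are assumptions made for illustration.

```python
import math


def fcm(points, centers, m=2.0, iterations=50, eps=1e-9):
    """Refine initial cluster centers with fuzzy c-means.

    Returns (centers, memberships); memberships[j][i] is the degree to which
    point j belongs to cluster i, computed before the final center update.
    """
    centers = [list(c) for c in centers]
    dim = len(points[0])
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership of every point in every cluster (each row sums to 1).
        memberships = []
        for x in points:
            dists = [max(math.dist(x, c), eps) for c in centers]
            memberships.append(
                [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]
            )
        # Each center becomes the membership-weighted mean of all points.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            centers[i] = [
                sum(w * x[d] for w, x in zip(weights, points)) / total
                for d in range(dim)
            ]
    return centers, memberships
```

For instance, with `points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]` and initial centers `[(1.0, 1.0), (4.0, 4.0)]`, the call `fcm(points, [(1.0, 1.0), (4.0, 4.0)])` moves each center toward one of the two groups of points. Because FCM only refines the centers it is given, a good starting point from the evolutionary search reduces the risk of ending in a poor local optimum.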
1.9.1 GA-Based Fuzzy Clustering
Hall et al. [26], Hall and Ozyurt [27], and Hall et al. [28] presented different works in which the authors claimed that their proposed algorithms can be used as efficient clustering tools. In their algorithms, the genetic algorithm was applied to search for near-optimal cluster centers and, on the other side, to accomplish the clustering. For encoding purposes, a real encoding scheme was introduced in the population of the genetic algorithm. The authors showed that the proposed algorithms gave encouraging results in comparison with random initialization. The limitation of these works is that their GA method is incapable of focusing on small improvements to the cluster centers to settle on a final clustering. In comparison with the previous method, the GA-guided clustering algorithm has a higher sensitivity to the randomly generated initial solutions that are used to create the initial population. Moreover, more experiments were needed to prove the capability of the proposed algorithm to avoid premature convergence. Liu and Xie [29] presented a paper in which they showed that their proposed approach has a much higher probability of reaching globally optimal solutions than the traditional techniques. In this algorithm, a binary encoding scheme was used to represent the cluster centers in each chromosome, together with the standard operators of genetic algorithms. Experimentally, the authors showed that their proposed algorithm outperforms others. The shortcoming of this approach is that if the population size is small, it may get stuck in a local optimum. Van Le [30] presented two separate approaches, based on a genetic algorithm and on evolutionary programming, to deal with fuzzy clustering problems. After conducting several experiments, the author concluded that the success rate of the proposed algorithms is better than that of the FCM algorithm and that the evolutionary programming-based method produces the best results. Klawonn and Keller [31] designed an evolutionary programming model for clustering various types of cluster shapes, such as solid and shell clusters. For shell shapes, the proposed algorithm does not produce encouraging results, but for solid shapes, the results seem to be good.
Egan et al. [32] introduced a genetic algorithm-based fuzzy clustering technique for noisy data. In this technique, an additional cluster representing noise data, known as a noise cluster, is added to each chromosome. The experimental results showed that the binary representation gives better results than a real-valued representation in this particular domain. Hall et al. [28] presented another algorithm, which showed promising results for less noisy data [33], but for noisy data, the algorithm does not perform satisfactorily [28]. Maulik and Saha [34] developed a modified differential evolution-based fuzzy clustering algorithm. In this algorithm, a modified mutation process was introduced using the concepts of local and global best vectors, as in the PSO algorithm. These concepts were used during the modified mutation process to push the trial vector quickly toward the global optimum. They successfully applied their proposed algorithm to image segmentation. They also applied the proposed algorithm to a number of synthetic and real-life data sets and standard benchmark functions to demonstrate its applicability.
1.9.2 PSO-Based Fuzzy Clustering
Xiao et al. [35] used a novel self-organizing map (SOM)-based method for clustering. The authors used gene expression data for clustering purposes. In the proposed method, a conscience factor was added to increase the rate of convergence. In this approach, the concepts of PSO are utilized for evolving the weights: the weights are trained in the first phase and, in the next phase, PSO is applied to improve them. This hybrid SOM-PSO approach gives encouraging results when applied to the gene expression data of rat hepatocytes and yeast. Cui et al. [36] and Cui and Potok [37] introduced PSO-based hybrid methods to classify text documents. For hybridization, a two-step clustering approach was used. First, PSO-based fuzzy clustering is run for a predefined maximum number of iterations. Thereafter, the k-means algorithm is initialized with the cluster centers obtained from the previous step and accomplishes the final clustering. These two steps are used together to improve the performance and also to speed up convergence, mainly for large data sets. The authors examined the PSO as well as a hybrid PSO clustering algorithm on four different text document data sets. According to their observations, the hybrid PSO clustering algorithm generally generates highly compact clusterings over a short period of time compared to the k-means algorithm.
1.9.3 Ant Colony Optimization–Based Fuzzy Clustering
The ant colony optimization (ACO) [18] algorithm has also been applied to overcome the shortcomings of fuzzy clustering methods. Yu et al. [38] proposed a hybrid based on ACO to segment noisy images. Their approach also uses the possibilistic c-means (PCM) algorithm. In this approach, the clustering problem is solved by using pre-classified pixel information, which furnishes a near-optimal initialization of the number of clusters and their centroids [38]. Image segmentation has also been performed using ACO-based fuzzy clustering [39]. In this approach, each ant is considered an individual pixel in the image, and the membership function is calculated on the basis of heuristic and pheromone information on each cluster center. The performance of the image segmentation algorithm is improved by including spatial information in the membership [39]. A hybrid clustering algorithm, i.e., one that is hybridized with the PCM algorithm, has been presented to segment medical images [40]. This hybridized algorithm overcomes the drawbacks of the image segmentation. Niknam and Amiri [41] proposed a hybrid approach based on PSO, ACO, and k-means for cluster analysis. This approach is applied to find better cluster partitions. In [42], the performance of FCM is improved with the ant colony optimization algorithm, and the min-max ant system is incorporated into the ACO algorithm. Gajjar et al. [43] presented FAMACROW, a fuzzy and ant colony optimization-based combined MAC, routing, and unequal clustering cross-layer protocol for wireless sensor networks. This network consists of several nodes and is used to send the sensed data to the master station. A clustering algorithm hybridized with an improved ant colony algorithm has been applied for fault identification and fault classification [44]. The number of fuzzy clusters and the initial cluster centers are identified by the algorithm. Mary and Raja [45] proposed an ACO-based improved FCM to segment medical images. FCM and four-chain quantum bee colony optimization (QABC) have been applied for image segmentation [46]. A modified ACO-based fuzzy clustering algorithm was proposed by Supratid and Kim [47]. ACO, FCM, and GA are combined in this algorithm to overcome the problem of the frequent improvement of cluster centers.
1.9.4 Artificial Bee Colony Optimization–Based Fuzzy Clustering
Alrosan et al. [48] proposed a clustering method that couples the artificial bee colony algorithm with the fuzzy c-means algorithm (ABC-FCM). This approach takes advantage of the capability of ABC to search for optimum initial cluster centers and applies them as the initial cluster centers. The proposed approach proved its superiority when applied to two sets of MRI images: simulated brain data and real MRI images [48]. The weakness of the technique in handling local optima is also addressed by Pham et al. [49]. In this regard, they exploit the search ability of the ABC algorithm. In this method, each bee uses a real-number encoding to determine the appropriate cluster centers. The superiority of this approach is established by comparing the proposed algorithm with FCM and the GA-based clustering algorithm on some numerical benchmark data. A modified artificial bee colony algorithm-based fuzzy c-means algorithm (MoABC-FCM) was presented by Ouadfel and Meshoul [50]. Inspired by differential evolution (DE), a new mutation method was introduced in the ABC algorithm to improve the exploitation process. Ultimately, the proposed MoABC-FCM algorithm enhanced the effectiveness of the original FCM algorithm, and it was shown that this new approach is better than optimization-based search algorithms such as the standard ABC, modified ABC, and PSO. A novel spatial fuzzy clustering algorithm optimized by the ABC algorithm, ABC-SFCM for short, is employed to segment synthetic and real images [51]. This algorithm is advantageous for two reasons. First, it is able to tackle noisy image segmentation efficiently by using spatial local information in the membership function. Second, the proposed method improves the global performance by taking advantage of the global search capability of ABC [51]. It is also noted that this method is more robust to noise than other known methods. An image segmentation procedure utilizing the ABC algorithm was presented by Hancer et al. [52] to identify brain tumors from MRI brain images, which are among the standard valuable tools for diagnosing and treating medical cases. The proposed procedure includes three stages: preprocessing the input MRI image, applying the ABC-based fuzzy clustering method for segmentation, and, in the post-processing stage, extracting the brain tumors. The proposed procedure was examined on MRI images of various areas of a patient's brain and compared with different approaches such as the k-means, FCM, and GA algorithms. Shokouhifar and Abkenar [53] applied the ABC algorithm to classify all pixels into two categories: normal and noisy. They introduced a noise probability for each pixel within the image, and the pixels are grouped based on that noise probability. In this proposed method, ABC optimization is employed before the FCM clustering algorithm to segment real-life MRI images, and it was shown that this approach is better than the previous methods. Karaboga and Ozturk [54, 55] successfully applied the ABC algorithm to optimize fuzzy clustering on some benchmark medical data. The proposed method proved its superiority over the FCM algorithm. A combined approach of the FCM algorithm and the ABC algorithm is employed to segment MR images efficiently [56]. Two new parameters are introduced in that approach: the difference between neighboring pixels in the image and the relative location of the neighboring pixels. It was shown that these parameters improved the performance of FCM using the ABC algorithm [56]. Bose and Mali [57] presented a combined ABC and FCM algorithm, named FABC, for unsupervised classification. The problem of predefining the initial cluster centers is solved by FABC, and the proposed method works better with respect to convergence, time complexity, robustness, and segmentation accuracy [57]. The randomization property of the ABC algorithm is applied for the initialization of the cluster centers in the FABC algorithm.
1.9.5 Differential Evolution–Based Fuzzy Clustering
Das and Konar [58] presented a fuzzy clustering method combined with a modified differential evolution, named automatic fuzzy clustering differential evolution (AFDE), for image segmentation. In this approach, the chromosome representation in the DE algorithm was modified and applied to determine the number of fuzzy clusters in the intensity space of an image automatically. The symmetry-based fuzzy clustering validity index [59] was employed with this modified DE algorithm, and the convergence property of the DE algorithm was also improved. The approach was applied to six different types of images, including natural images, MRI brain images, and satellite images. The mutation constant factor, F, plays a vital role in the DE algorithm, and in AFDE its value is assigned randomly in the range between 0.5 and 1 [58]. In addition, the crossover rate (Cr) is not fixed during the evolving process; its value changes over the iteration steps. The starting value of Cr is 1, and it decreases linearly to the minimum acceptable value, which is 0.5. The authors presented a real-coded chromosome representation that can dynamically determine the appropriate number of clusters in the data set [58]. Another version of AFDE is found in [60] as the automatic clustering DE (ACDE) algorithm. In this approach, the authors applied different objective functions, namely the DB index [61] and the CS index [62], to evaluate the quality of the clustered data. Gong et al. [63] also proposed an automatic clustering DE technique to solve the clustering problem. This method is characterized by three features: (i) a modified point symmetry-based cluster validity index (CVI) is presented to evaluate the validity of the corresponding partitioning, (ii) the Kd-tree nearest neighbor search is applied to decrease the complexity of finding the closest symmetric point, and (iii) a new chromosomal representation is introduced to represent individuals. After being applied to six artificial data sets of diverse complexities, the proposed approach was shown to be suited for both symmetrical intraclusters and symmetrical interclusters [63]. Maulik and Saha [64] presented a modified DE-based technique of fuzzy clustering (MoDEFC), in which the authors applied a real-coded encoding technique for the cluster centers. The global best (GBest) and local best (LBest) concepts of the PSO algorithm are incorporated into the standard mutation process of the differential evolution algorithm to push the trial vector quickly toward the global optimum. In the initial stage, LBest, i.e., the best vector in the current generation, is more important for evolving the mutant vector than in the later stage [64]. As the generations increase, the contribution of GBest, i.e., the best vector evaluated until the current generation, increases, and the contribution of LBest to the mutant vector decreases.
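The AFDE parameter settings described in this subsection, a scale factor F drawn at random between 0.5 and 1 and a crossover rate Cr that decreases linearly from 1 to 0.5, can be sketched as follows. The surrounding trial-vector construction is the classic DE/rand/1/bin scheme, used here only as an assumed setting to show where the two parameters enter; the function names are hypothetical, and the sketch does not reproduce the chromosome encoding of [58] or the GBest/LBest mutation of [64].

```python
import random


def crossover_rate(t, max_iter, cr_start=1.0, cr_min=0.5):
    """Linearly decrease Cr from cr_start at t = 0 to cr_min at t = max_iter."""
    return cr_start - (cr_start - cr_min) * t / max_iter


def scale_factor(rng=random):
    """Draw the mutation factor F uniformly between 0.5 and 1."""
    return rng.uniform(0.5, 1.0)


def trial_vector(population, i, t, max_iter, rng=random):
    """Build a DE/rand/1/bin trial vector for individual i at iteration t.

    Requires at least four individuals so that three distinct partners exist.
    """
    others = [idx for idx in range(len(population)) if idx != i]
    a, b, c = rng.sample(others, 3)
    f = scale_factor(rng)
    cr = crossover_rate(t, max_iter)
    dim = len(population[i])
    forced = rng.randrange(dim)   # at least one coordinate comes from the mutant
    trial = []
    for d in range(dim):
        if d == forced or rng.random() < cr:
            mutant = population[a][d] + f * (population[b][d] - population[c][d])
            trial.append(mutant)
        else:
            trial.append(population[i][d])
    return trial
```

With this schedule, early iterations take almost every coordinate from the mutant vector, which favors exploration, while later iterations keep about half of the parent's coordinates on average.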
1.9.6 Firefly Algorithm–Based Fuzzy Clustering
To overcome the shortcomings of FCM, a new fuzzy subspace clustering algorithm based on an improved firefly algorithm is presented in [65]. In this approach, the global optimization capability of the firefly algorithm, the strong local search features of FCM, and the learning calculation for feature weights of reliability-based k-means are taken into consideration [65]. This algorithm is applied accurately and efficiently to different feature subspace-based clustering problems. Alomoush et al. [66] proposed a segmentation algorithm that hybridizes the firefly algorithm (FA) and the fuzzy c-means algorithm (FCM) to segment magnetic resonance imaging (MRI) brain images. MRI images are not easy to segment, as normal and abnormal tissues are very similar in appearance. The firefly algorithm is employed to determine the optimal cluster centers for FCM, which improves the efficiency of FCM in segmenting the MRI images. Sharma et al. [67] presented a k-means algorithm and firefly algorithm