






IEEE Press
445 Hoes Lane
Piscataway, NJ 08854
IEEE Press Editorial Board
Sarah Spurgeon, Editor in Chief
Jón Atli Benediktsson
Anjan Bose
James Duncan
Amin Moeness
Desineni Subbaram Naidu
Behzad Razavi
Jim Lyke
Hai Li
Brian Johnson
Jeffrey Reed
Diomidis Spinellis
Adam Drobot
Tom Robertazzi
Ahmet Murat Tekalp
Copyright © 2023 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http//www.wiley.com/go/permission.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our website at www.wiley.com.
Library of Congress Cataloging-in-Publication Data applied for:
Hardback ISBN: 9781119873679
Cover Design: Wiley
Cover Image: © Liuzishan/Shutterstock
Set in 9.5/12.5pt STIXTwoText by Straive, Chennai, India
To my family – Dinh Thai Hoang
To my family – Nguyen Van Huynh
To Veronica Hai Binh, Paul Son Nam, and Thuy – Diep N. Nguyen
To my parents – Ekram Hossain
To my family – Dusit Niyato
Contents
Notes on Contributors
Foreword
Preface
Acknowledgments
Acronyms
Introduction
Part I Fundamentals of Deep Reinforcement Learning
1 Deep Reinforcement Learning and Its Applications
1.1 Wireless Networks and Emerging Challenges
1.2 Machine Learning Techniques and Development of DRL
1.2.1 Machine Learning
1.2.2 Artificial Neural Network
1.2.3 Convolutional Neural Network
1.2.4 Recurrent Neural Network
1.2.5 Development of Deep Reinforcement Learning
1.3 Potentials and Applications of DRL
1.3.1 Benefits of DRL in Human Lives
1.3.2 Features and Advantages of DRL Techniques
1.3.3 Academic Research Activities
1.3.4 Applications of DRL Techniques
1.3.5 Applications of DRL Techniques in Wireless Networks
1.4 Structure of this Book and Target Readership
1.4.1 Motivations and Structure of this Book
1.4.2 Target Readership
1.5 Chapter Summary
References
3.4.3 Proximal Policy Optimization (PPO)
3.5 Model-Based RL
3.5.1 Vanilla Model-Based RL
3.5.2 Robust Model-Based RL: Model-Ensemble TRPO (ME-TRPO)
3.5.3 Adaptive Model-Based RL: Model-Based Meta-Policy Optimization (MB-MPO)
3.6 Chapter Summary
References
4 A Case Study and Detailed Implementation
4.1 System Model and Problem Formulation
4.1.1 System Model and Assumptions
4.1.1.1 Jamming Model
4.1.1.2 System Operation
4.1.2 Problem Formulation
4.1.2.1 State Space
4.1.2.2 Action Space
4.1.2.3 Immediate Reward
4.1.2.4 Optimization Formulation
4.2 Implementation and Environment Settings
4.2.1 Install TensorFlow with Anaconda
4.2.2 Q-Learning
4.2.2.1 Codes for the Environment
4.2.2.2 Codes for the Agent
4.2.3 Deep Q-Learning
4.3 Simulation Results and Performance Analysis
4.4 Chapter Summary
References
Part II Applications of DRL in Wireless Communications and Networking
5 DRL at the Physical Layer
5.1 Beamforming, Signal Detection, and Decoding
5.1.1 Beamforming
5.1.1.1 Beamforming Optimization Problem
5.1.1.2 DRL-Based Beamforming
5.1.2 Signal Detection and Channel Estimation
5.1.2.1 Signal Detection and Channel Estimation Problem
5.1.2.2 RL-Based Approaches
5.1.3 Channel Decoding
5.2 Power and Rate Control
5.2.1 Power and Rate Control Problem
5.2.2 DRL-Based Power and Rate Control
5.3 Physical-Layer Security
5.4 Chapter Summary
References
6 DRL at the MAC Layer
6.1 Resource Management and Optimization
6.2 Channel Access Control
6.2.1 DRL in the IEEE 802.11 MAC
6.2.2 MAC for Massive Access in IoT
6.2.3 MAC for 5G and B5G Cellular Systems
6.3 Heterogeneous MAC Protocols
6.4 Chapter Summary
References
7 DRL at the Network Layer
7.1 Traffic Routing
7.2 Network Slicing
7.2.1 Network Slicing-Based Architecture
7.2.2 Applications of DRL in Network Slicing
7.3 Network Intrusion Detection
7.3.1 Host-Based IDS
7.3.2 Network-Based IDS
7.4 Chapter Summary
References
8 DRL at the Application and Service Layer
8.1 Content Caching
8.1.1 QoS-Aware Caching
8.1.2 Joint Caching and Transmission Control
8.1.3 Joint Caching, Networking, and Computation
8.2 Data and Computation Offloading
8.3 Data Processing and Analytics
8.3.1 Data Organization
8.3.1.1 Data Partitioning
8.3.1.2 Data Compression
8.3.2 Data Scheduling
8.3.3 Tuning of Data Processing Systems
8.3.4 Data Indexing
8.3.4.1 Database Index Selection
8.3.4.2 Index Structure Construction
8.3.5 Query Optimization
8.4 Chapter Summary
References
Part III Challenges, Approaches, Open Issues, and Emerging Research Topics
9 DRL Challenges in Wireless Networks
9.1 Adversarial Attacks on DRL
9.1.1 Attacks Perturbing the State Space
9.1.1.1 Manipulation of Observations
9.1.1.2 Manipulation of Training Data
9.1.2 Attacks Perturbing the Reward Function
9.1.3 Attacks Perturbing the Action Space
9.2 Multiagent DRL in Dynamic Environments
9.2.1 Motivations
9.2.2 Multiagent Reinforcement Learning Models
9.2.2.1 Markov/Stochastic Games
9.2.2.2 Decentralized Partially Observable Markov Decision Process (DPOMDP)
9.2.3 Applications of Multiagent DRL in Wireless Networks
9.2.4 Challenges of Using Multiagent DRL in Wireless Networks
9.2.4.1 Nonstationarity Issue
9.2.4.2 Partial Observability Issue
9.3 Other Challenges
9.3.1 Inherent Problems of Using RL in Real-World Systems
9.3.1.1 Limited Learning Samples
9.3.1.2 System Delays
9.3.1.3 High-Dimensional State and Action Spaces
9.3.1.4 System and Environment Constraints
9.3.1.5 Partial Observability and Nonstationarity
9.3.1.6 Multiobjective Reward Functions
9.3.2 Inherent Problems of DL and Beyond
9.3.2.1 Inherent Problems of DL
9.3.2.2 Challenges of DRL Beyond Deep Learning
9.3.3 Implementation of DL Models in Wireless Devices
9.4 Chapter Summary
References
Notes on Contributors
Dinh Thai Hoang
School of Electrical and Data Engineering
University of Technology Sydney
Australia

Nguyen Van Huynh
School of Computing, Engineering and the Built Environment
Edinburgh Napier University
UK

Diep N. Nguyen
School of Electrical and Data Engineering
University of Technology Sydney
Australia

Ekram Hossain
Department of Electrical and Computer Engineering
University of Manitoba
Canada

Dusit Niyato
School of Computer Science and Engineering
Nanyang Technological University
Singapore
Foreword
Prof. Merouane Debbah: Integrating deep reinforcement learning (DRL) techniques in wireless communications and networking has paved the way for achieving efficient and optimized wireless systems. This ground-breaking book provides excellent material for researchers who want to study applications of deep reinforcement learning in wireless networks, with many practical examples and implementation details for the readers to practice. It also covers various topics at different network layers, such as channel access, network slicing, and content caching. This book is essential for anyone looking to stay ahead of the curve in this exciting field.
Prof. Vincent Poor: Many aspects of wireless communications and networking are being transformed through the application of deep reinforcement learning (DRL) techniques. This book represents an important contribution to this field, providing a comprehensive treatment of the theory, applications, and implementation of DRL in wireless communications and networking. An important aspect of this book is its focus on practical implementation issues, such as system design, algorithm implementation, and real-world deployment challenges. By bridging the gap between theory and practice, the authors provide readers with the tools to build and deploy DRL-based wireless communication and networking systems. This book is a useful resource for those interested in learning about the potential of DRL to improve wireless communications and networking systems. Its breadth and depth of coverage, practical focus, and expert insights make it a singular contribution to the field.
Preface
Reinforcement learning is one of the most important research directions in machine learning (ML) and has had a significant impact on the development of artificial intelligence (AI) over the last 20 years. Reinforcement learning is a learning process in which an agent periodically makes decisions, observes the results, and then automatically adjusts its strategy to achieve an optimal policy. However, this learning process, even with proven convergence, often takes a significant amount of time to reach the best policy because it has to explore and gain knowledge of an entire system, making it unsuitable for and inapplicable to large-scale systems and networks. Consequently, applications of reinforcement learning are very limited in practice. Recently, deep learning has been introduced as a new breakthrough ML technique. It can overcome the limitations of reinforcement learning and thus open a new era for the development of reinforcement learning, namely deep reinforcement learning (DRL). DRL exploits the advantages of deep neural networks (DNNs) to train the learning process, thereby improving the learning rate and the performance of reinforcement learning algorithms. As a result, DRL has been adopted in numerous practical applications of reinforcement learning, such as robotics, computer vision, speech recognition, and natural language processing.
In the areas of communications and networking, DRL has recently been used as an effective tool to address various problems and challenges. In particular, modern networks such as the Internet-of-Things (IoT), heterogeneous networks (HetNets), and unmanned aerial vehicle (UAV) networks are becoming more decentralized, ad hoc, and autonomous in nature. Network entities such as IoT devices, mobile users, and UAVs need to make local and independent decisions, e.g. spectrum access, data rate adaptation, transmit power control, and base station association, to achieve the goals of different networks, e.g. throughput maximization and energy consumption minimization. In uncertain and stochastic environments, most decision-making problems can be modeled as a so-called Markov decision process (MDP). Dynamic programming and other
[...]
advanced modeling techniques to motivate and provide fundamental knowledge for the readers. We then provide case studies together with implementation details to help the readers better understand how to practice and apply DRL to their problems. After that, we review DRL approaches that address emerging issues in communications and networking. The issues include dynamic network access, data rate control, wireless caching, data offloading, network security, and connectivity preservation, which are all important to next-generation networks such as 5G and beyond. Finally, we highlight important challenges, open issues, and future research directions for applying DRL to wireless networks.
Acknowledgments
The authors would like to acknowledge grant-awarding agencies that supported parts of this book. This research was supported in part by the Australian Research Council under the DECRA project DE210100651 and the Natural Sciences and Engineering Research Council of Canada (NSERC).
The authors would like to thank Mr. Cong Thanh Nguyen, Mr. Hieu Chi Nguyen, Mr. Nam Hoai Chu, and Mr. Khoa Viet Tran for their technical assistance and discussions during the writing of this book.
No  Acronyms  Terms
24  ITS       intelligent transportation system
25  LTE       Long-Term Evolution
26  M2M       machine-to-machine
27  MAC       medium access control
28  MARL      multi-agent RL
29  MDP       Markov decision process
30  MEC       mobile edge computing
31  MIMO      multiple-input multiple-output
32  MISO      multiple-input single-output
33  ML        machine learning
34  mMTC      massive machine-type communications
35  mmWave    millimeter wave
36  MU        mobile user
37  NFV       network function virtualization
38  OFDMA     orthogonal frequency division multiple access
39  POMDP     partially observable Markov decision process
40  PPO       proximal policy optimization
41  PSR       predictive state representation
42  QoE       Quality of Experience
43  QoS       Quality of Service
44  RAN       radio access network
45  RB        resource block
46  RF        radio frequency
47  RIS       reconfigurable intelligent surface
48  RL        reinforcement learning
49  RNN       recurrent neural network
50  SARSA     state-action-reward-state-action
51  SDN       software-defined networking
52  SGD       stochastic gradient descent
53  SINR      signal-to-interference-plus-noise ratio
54  SMDP      semi-Markov decision process
55  TD        temporal difference
56  TDMA      time-division multiple access
57  TRPO      trust region policy optimization
58  UAV       unmanned aerial vehicle
59  UE        user equipment
60  UL        uplink
61  URLLC     ultra-reliable and low-latency communications
62  VANET     vehicular ad hoc network
63  VNF       virtual network function
64  WLAN      wireless local area network
65  WSN       wireless sensor network
1.1 Wireless Networks and Emerging Challenges
Over the past few years, communication technologies have been developing rapidly to support various aspects of our daily lives, from smart cities and healthcare to logistics and transportation. These technologies will be the backbone of the future's data-centric society. Nevertheless, these new applications generate a tremendous amount of workload and require high-reliability and ultrahigh-capacity wireless communications. In its latest report [1], Cisco projected that the number of connected devices will be around 29.3 billion by 2023, with more than 45% equipped with mobile connections. The fastest-growing mobile connection type is likely to be machine-to-machine (M2M), as Internet-of-Things (IoT) services play a significant role in consumer and business environments. This poses several challenges for future wireless communication systems:
● Emerging services (e.g. augmented reality [AR] and virtual reality [VR]) require high-reliability and ultrahigh-capacity wireless communications. However, existing communication systems, designed and optimized based on conventional communication theories, significantly limit further performance improvements for these services.
● Wireless networks are becoming increasingly ad hoc and decentralized, with mobile devices and sensors required to take independent actions such as channel selection and base station association to meet the system's requirements, e.g. energy efficiency and throughput maximization. Nonetheless, the dynamics and uncertainty of these systems prevent the devices from making optimal decisions.
● Another crucial component of future network systems is network traffic control. Network control can dramatically improve resource usage and the efficiency of information transmission through monitoring, checking, and controlling data flows. Unfortunately, the proliferation of smart IoT devices and ultradense radio networks has greatly expanded the network size, with extremely dynamic topologies. In addition, the explosively growing data traffic imposes considerable pressure on Internet management. As a result, existing network control approaches may not effectively handle these complex and dynamic networks.
● Mobile edge computing (MEC) has recently been proposed to provide computing and caching capabilities at the edge of cellular networks. In this way, popular content can be cached at the network edge, such as base stations, end-user devices, and gateways, to avoid duplicate transmissions of the same content, resulting in better energy and spectrum usage [2,3]. One major challenge in future communication systems is the straggling problem at both edge nodes and wireless links, which can significantly increase the computation delay of the system. Additionally, the huge data demands of mobile users and the limited storage and processing capacities are critical issues that need to be addressed.
Deep Reinforcement Learning for Wireless Communications and Networking: Theory, Applications, and Implementation, First Edition. Dinh Thai Hoang, Nguyen Van Huynh, Diep N. Nguyen, Ekram Hossain, and Dusit Niyato. © 2023 The Institute of Electrical and Electronics Engineers, Inc. Published 2023 by John Wiley & Sons, Inc.
Conventional approaches to addressing the new challenges and demands of modern communication systems have several limitations. First, the rapid growth in the number of devices, the expansion of network scale, and the diversity of services in the new era of communications are expected to significantly increase the amount of data generated by applications, users, and networks [1]. However, traditional solutions may be unable to process and utilize this data effectively to improve system performance. Second, existing algorithms are not well suited to handling the dynamic and uncertain nature of network environments, resulting in poor performance [4]. Finally, traditional optimization solutions often require complete information about the system to be effective, but this information may not be readily available in practice, limiting the applicability of these approaches. Deep reinforcement learning (DRL) has the potential to overcome these limitations and provide promising solutions to these challenges.
DRL leverages the benefits of deep neural networks (DNNs), which have proven effective in tackling complex, large-scale engines, speech recognition, medical diagnosis, and computer vision. This makes DRL well suited for managing the increasing complexity and scale of future communication networks. Additionally, DRL's online deployment allows it to effectively handle the dynamic and unpredictable nature of wireless communication environments.
1.2 Machine Learning Techniques and Development of DRL
1.2.1 Machine Learning
Machine learning (ML) is a problem-solving paradigm where a machine learns a particular task (e.g. image classification, document text classification, speech recognition, medical diagnosis, robot control, and resource allocation in communication networks), as measured by a performance metric (e.g. classification accuracy and performance loss), using experiences or data [5]. The task generally involves a function that maps well-defined inputs to well-defined outputs. The essence of data-driven ML is that there is a pattern in the task inputs and the outcome which cannot be pinned down mathematically. Thus, the solution to the task, which may involve making a decision or predicting an output, cannot be programmed explicitly. If the set of rules connecting the task inputs and output(s) were known, a program could be written based on those rules (e.g. if-then-else codes) to solve the problem. Instead, an ML algorithm learns from the input dataset, which specifies the correct output for a given input; that is, an ML method will result in a program that uses the data samples to solve the problem. A data-driven ML architecture for the classification problem is shown in Figure 1.1. The training module is responsible for optimizing the classifier from the training data samples and providing the classification module with a trained classifier. The classification module determines the output based on the input data. The training and classification modules can work independently. The training procedure generally takes a long time. However, the training module is activated only periodically. Also, the training procedure can be performed in the background, while the classification module operates as usual.
Figure 1.1 A data-driven ML architecture.
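The separation between the training module and the classification module can be illustrated with a small sketch. The nearest-centroid classifier used here is only an assumed stand-in for "the classifier"; the function names `train` and `classify`, the labels, and the toy data are hypothetical and do not come from the book.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (input vector, label)


def train(samples: List[Sample]) -> Dict[str, List[float]]:
    """Training module: compute one centroid per label from labeled samples."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for x, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    # The returned dictionary plays the role of the trained classifier.
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(model: Dict[str, List[float]], x: Sequence[float]) -> str:
    """Classification module: return the label of the closest centroid."""
    def squared_distance(centroid: Sequence[float]) -> float:
        return sum((c - v) ** 2 for c, v in zip(centroid, x))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical toy dataset: the correct output is given for each input.
training_data: List[Sample] = [
    ([0.0, 0.1], "low"),
    ([0.2, 0.0], "low"),
    ([1.0, 0.9], "high"),
    ([0.8, 1.0], "high"),
]

model = train(training_data)          # run occasionally, possibly in the background
print(classify(model, [0.1, 0.2]))    # classification only needs the trained model
print(classify(model, [0.9, 0.7]))
```

In this sketch no if-then-else rule relating inputs to labels is written by hand: the mapping is derived entirely from the labeled samples, and `classify` keeps working with the current model until `train` is run again and the model is replaced.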
ML techniques fall into three categories: supervised, unsupervised, and reinforcement learning.
● Supervised learning: Given a dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} \subseteq \mathbb{R}^n \times \mathcal{Y}$, a supervised learning algorithm predicts $y$ that generalizes the input–output mapping in $D$ to inputs $x$ outside $D$. Here, $\mathbb{R}^n$ is the $n$-dimensional feature space, $x_i$ is the input vector of the $i$th sample, $y_i$ is the label of the $i$th sample, and $\mathcal{Y}$ is the label space. For binary classification problems (e.g. spam filtering), $\mathcal{Y} = \{0, 1\}$ or $\mathcal{Y} = \{-1, 1\}$. For multiclass classification (e.g. face classification), $\mathcal{Y} = \{1, 2, \ldots, K\}$ ($K \geq 2$). On the other hand, for regression problems (e.g. predicting temperature), $\mathcal{Y} = \mathbb{R}$. The data points $(x_i, y_i)$ are drawn from an (unknown) distribution $\mathcal{P}(X, Y)$. The learning process involves learning a function $h$ such that for a new pair $(x, y) \sim \mathcal{P}$, we have $h(x) = y$ with high probability (or $h(x) \approx y$). A loss function (or risk function), such as the mean squared