


Cybersecurity in Intelligent Networking Systems
Shengjie Xu, San Diego State University, USA
Yi Qian, University of Nebraska-Lincoln, USA
Rose Qingyang Hu, Utah State University, USA
This edition first published 2023 © 2023 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Shengjie Xu, Yi Qian, and Rose Qingyang Hu to be identified as the authors of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data
Names: Xu, Shengjie (Professor), author. | Qian, Yi, 1962- author. | Hu, Rose Qingyang, author.
Title: Cybersecurity in intelligent networking systems / Shengjie Xu, Yi Qian, Rose Qingyang Hu.
Description: Chichester, West Sussex, UK: Wiley, [2023] | Includes bibliographical references and index.
Identifiers: LCCN 2022033498 (print) | LCCN 2022033499 (ebook) | ISBN 9781119783916 (hardback) | ISBN 9781119784104 (adobe pdf) | ISBN 9781119784128 (epub)
Subjects: LCSH: Computer networks–Security measures.
Classification: LCC TK5105.59 .X87 2023 (print) | LCC TK5105.59 (ebook) | DDC 005.8–dc23/eng/20220826
LC record available at https://lccn.loc.gov/2022033498
LC ebook record available at https://lccn.loc.gov/2022033499
Cover Design: Wiley
Cover Image: © jijomathaidesigners/Shutterstock
Set in 9.5/12.5pt STIXTwoText by Straive, Chennai, India
Contents
About the Authors
Preface
Acknowledgments
Acronyms
1 Cybersecurity in the Era of Artificial Intelligence
1.1 Artificial Intelligence for Cybersecurity
1.1.1 Artificial Intelligence
1.1.2 Machine Learning
1.1.2.1 Supervised Learning
1.1.2.2 Unsupervised Learning
1.1.2.3 Semi-supervised Learning
1.1.2.4 Reinforcement Learning
1.1.3 Data-Driven Workflow for Cybersecurity
1.2 Key Areas and Challenges
1.2.1 Anomaly Detection
1.2.2 Trustworthy Artificial Intelligence
1.2.3 Privacy Preservation
1.3 Toolbox to Build Secure and Intelligent Systems
1.3.1 Machine Learning and Deep Learning
1.3.1.1 NumPy
1.3.1.2 SciPy
1.3.1.3 Scikit-learn
1.3.1.4 PyTorch
1.3.1.5 TensorFlow
1.3.2 Privacy-Preserving Machine Learning
1.3.2.1 Syft
1.3.2.2 TensorFlow Federated
1.3.2.3 TensorFlow Privacy
1.3.3 Adversarial Machine Learning
1.3.3.1 SecML and SecML Malware
1.3.3.2 Foolbox
1.3.3.3 CleverHans
1.3.3.4 Counterfit
1.3.3.5 MintNV
1.4 Data Repositories for Cybersecurity Research
1.4.1 NSL-KDD
1.4.2 UNSW-NB15
1.4.3 EMBER
1.5 Summary
Notes
References
2 Cyber Threats and Gateway Defense
2.1 Cyber Threats
2.1.1 Cyber Intrusions
2.1.2 Distributed Denial of Services Attack
2.1.3 Malware and Shellcode
2.2 Gateway Defense Approaches
2.2.1 Network Access Control
2.2.2 Anomaly Isolation
2.2.3 Collaborative Learning
2.2.4 Secure Local Data Learning
2.3 Emerging Data-driven Methods for Gateway Defense
2.3.1 Semi-supervised Learning for Intrusion Detection
2.3.2 Transfer Learning for Intrusion Detection
2.3.3 Federated Learning for Privacy Preservation
2.3.4 Reinforcement Learning for Penetration Test
2.4 Case Study: Reinforcement Learning for Automated Post-breach Penetration Test
2.4.1 Literature Review
2.4.2 Research Idea
2.4.3 Training Agent Using Deep Q-Learning
2.5 Summary
References
3 Edge Computing and Secure Edge Intelligence
3.1 Edge Computing
3.2 Key Advances in Edge Computing
3.2.1 Security
3.2.2 Reliability
3.2.3 Survivability
3.3 Secure Edge Intelligence
3.3.1 Background and Motivation
3.3.2 Design of Detection Module
3.3.2.1 Data Pre-processing
3.3.2.2 Model Learning
3.3.2.3 Model Updating
3.3.3 Challenges Against Poisoning Attacks
3.4 Summary
References
4 Edge Intelligence for Intrusion Detection
4.1 Edge Cyberinfrastructure
4.2 Edge AI Engine
4.2.1 Feature Engineering
4.2.2 Model Learning
4.2.3 Model Update
4.2.4 Predictive Analytics
4.3 Threat Intelligence
4.4 Preliminary Study
4.4.1 Dataset
4.4.2 Environmental Setup
4.4.3 Performance Evaluation
4.4.3.1 Computational Efficiency
4.4.3.2 Prediction Accuracy
4.5 Summary
References
5 Robust Intrusion Detection
5.1 Preliminaries
5.1.1 Median Absolute Deviation
5.1.2 Mahalanobis Distance
5.2 Robust Intrusion Detection
5.2.1 Problem Formulation
5.2.2 Step 1: Robust Data Pre-processing
5.2.3 Step 2: Bagging for Labeled Anomalies
5.2.4 Step 3: One-class SVM for Unlabeled Samples
5.2.4.1 One-class Classification
5.2.4.2 Algorithm of Optimal Sampling Ratio Section
5.2.5 Step 4: The Final Classifier
5.3 Experimental and Evaluation
5.3.1 Experiment Setup
5.3.1.1 Datasets
5.3.1.2 Environmental Setup
5.3.1.3 Evaluation Metrics
5.3.2 Performance Evaluation
5.3.2.1 Step 1
5.3.2.2 Step 2
5.3.2.3 Step 3
5.3.2.4 Step 4
5.4 Summary
References
6 Efficient Pre-processing Scheme for Anomaly Detection
6.1 Efficient Anomaly Detection
6.1.1 Related Work
6.1.2 Principal Component Analysis
6.2 Proposed Pre-processing Scheme for Anomaly Detection
6.2.1 Robust Pre-processing Scheme
6.2.2 Real-Time Processing
6.2.3 Discussion
6.3 Case Study
6.3.1 Description of the Raw Data
6.3.1.1 Dimension
6.3.1.2 Predictors
6.3.1.3 Response Variables
6.3.2 Experiment
6.3.3 Results
6.4 Summary
References
7 Privacy Preservation in the Era of Big Data
7.1 Privacy Preservation Approaches
7.1.1 Anonymization
7.1.2 Differential Privacy
7.1.3 Federated Learning
7.1.4 Homomorphic Encryption
7.1.5 Secure Multi-party Computation
7.1.6 Discussion
7.2 Privacy-Preserving Anomaly Detection
7.2.1 Literature Review
7.2.2 Preliminaries
7.2.2.1 Bilinear Groups
7.2.2.2 Asymmetric Predicate Encryption
7.2.3 System Model and Security Model
7.2.3.1 System Model
7.2.3.2 Security Model
7.3 Objectives and Workflow
7.3.1 Objectives
7.3.2 Workflow
7.4 Predicate Encryption-Based Anomaly Detection
7.4.1 Procedures
7.4.2 Development of Predicate
7.4.3 Deployment of Anomaly Detection
7.5 Case Study and Evaluation
7.5.1 Overhead
7.5.2 Detection
7.6 Summary
References
8 Adversarial Examples: Challenges and Solutions
8.1 Adversarial Examples
8.1.1 Problem Formulation in Machine Learning
8.1.2 Creation of Adversarial Examples
8.1.3 Targeted and Non-targeted Attacks
8.1.4 Black-box and White-box Attacks
8.1.5 Defenses Against Adversarial Examples
8.2 Adversarial Attacks in Security Applications
8.2.1 Malware
8.2.2 Cyber Intrusions
8.3 Case Study: Improving Adversarial Attacks Against Malware Detectors
8.3.1 Background
8.3.2 Adversarial Attacks on Malware Detectors
8.3.3 MalConv Architecture
8.3.4 Research Idea
8.4 Case Study: A Metric for Machine Learning Vulnerability to Adversarial Examples
8.4.1 Background
8.4.2 Research Idea
8.5 Case Study: Protecting Smart Speakers from Adversarial Voice Commands
8.5.1 Background
8.5.2 Challenges
8.5.3 Directions and Tasks
8.6 Summary
References
Index
About the Authors
Shengjie Xu, PhD, is an assistant professor in the Management Information Systems Department at San Diego State University, USA. He is a recipient of the IET Journals Premium Award for Best Paper in 2020, the Milton E. Mohr Graduate Fellowship Award from the University of Nebraska–Lincoln in 2017, and the Best Poster Award from the International Conference on Design of Reliable Communication Networks in 2015. He serves as a Technical Editor for IEEE Wireless Communications Magazine. He holds multiple professional certifications in cybersecurity and computer networking.
Yi Qian, PhD, is a professor in the Department of Electrical and Computer Engineering at the University of Nebraska–Lincoln, USA. He is a recipient of the Henry Y. Kleinkauf Family Distinguished New Faculty Teaching Award in 2011, the Holling Family Distinguished Teaching Award in 2012, the Holling Family Distinguished Teaching/Advising/Mentoring Award in 2018, and the Holling Family Distinguished Teaching Award for Innovative Use of Instructional Technology in 2018, all from the University of Nebraska–Lincoln, USA.
Rose Qingyang Hu, PhD, is a professor in the Department of Electrical and Computer Engineering and Associate Dean of Research in the College of Engineering at Utah State University, USA. She is a recipient of outstanding faculty researcher of the year in 2014 and 2016 and outstanding graduate mentor of the year in 2022, all from Utah State University, USA. She is a Fellow of IEEE, IEEE ComSoc Distinguished Lecturer 2015–2018, IEEE VTS Distinguished Lecturer 2020–2022.
Preface
Nowadays, malicious attacks and emerging cyber threats have been inducing catastrophic damage to critical infrastructure and causing widespread outages. There are three major types of cyberattacks that are compromising modern networking systems: (i) attacks targeting Confidentiality intend to acquire unauthorized information from network resources; (ii) attacks targeting Integrity aim at deliberately and illegally modifying or disrupting data exchange; and (iii) attacks targeting Availability attempt to delay, block, or corrupt service delivery. Confidentiality, integrity, and availability are the three pillars of cybersecurity. It is urgent to defend critical networking systems against any form of cyber threat from adversaries.
The rapid and successful advances of intelligent discoveries offer security researchers and practitioners new platforms to investigate challenging issues emerging in several networking systems. Those intelligent solutions will boost the efficiency and effectiveness of multiple critical security applications. Motivated by the current technological advances, this book presents the current research challenges in the field of cybersecurity, as well as some novel security solutions that make critical networking systems secure, robust, and intelligent. Specifically, the book focuses on cybersecurity and its intersections with artificial intelligence, machine learning, edge computing, and privacy preservation. There are eight chapters in the book.
Chapter 1 deals with cybersecurity in the era of artificial intelligence and machine learning. The chapter first introduces the concepts of artificial intelligence and machine learning. It then illustrates some key advances and challenges in cybersecurity, including anomaly detection, trustworthy artificial intelligence, and privacy preservation. A toolbox for building secure and intelligent systems is then presented. The chapter then introduces a few data repositories for cybersecurity research.
Chapter 2 deals with cyber threats and defense mechanisms. The chapter first illustrates multiple effective gateway defense methods against cyber threats. It then presents a research study that applies reinforcement learning to penetration testing.
Chapter 3 deals with edge computing. Edge computing is presented to highlight its key advances and unique capabilities in communication networks. The chapter then illustrates the concept of secure edge intelligence.
Chapter 4 deals with edge intelligence for intrusion detection. The systematic design of edge intelligence is first presented. Three main modules in edge intelligence are illustrated. The chapter then presents a case study including experiment and evaluation.
Chapter 5 deals with a robust intrusion detection scheme. The preliminaries of robust statistics are first introduced. The chapter then presents the details of the proposed scheme. An experimental study and evaluation are then presented.
Chapter 6 deals with an efficient processing scheme for anomaly detection. A few related studies and the background of principal component analysis are first introduced. The chapter then presents the proposed efficient preprocessing scheme for anomaly detection, whose objective is to achieve high detection accuracy while learning from the preprocessed data. The chapter then presents a case study including experiment and evaluation.
Chapter 7 deals with privacy preservation in the era of big data. A few modern privacy-preserving approaches are first illustrated. The chapter then presents a proposed scheme that focuses on detecting anomalous behaviors in a privacy-preserving way. The chapter offers an experimental study and evaluation.
Chapter 8 deals with adversarial examples and adversarial machine learning. The concept of adversarial examples and its challenges are first introduced. Three research studies in adversarial examples are then presented from both offensive and defensive perspectives.
We hope that our readers will enjoy this book.
Shengjie Xu, San Diego State University
Yi Qian, University of Nebraska–Lincoln
Rose Qingyang Hu, Utah State University
Acknowledgments
First, we would like to thank our families for their love and support.
We would like to thank our colleagues and students at Dakota State University, University of Nebraska-Lincoln, Utah State University, and San Diego State University for their support and enthusiasm in this book project and this topic.
We express our thanks to the staff at Wiley for their support. We would like to thank Sandra Grayson, Juliet Booker, and Becky Cowan for their patience in handling publication issues.
This book project was partially supported by the U.S. National Science Foundation under grants CNS-1423348, CNS-1423408, EARS-1547312, and EARS-1547330.
Acronyms
ABE     attribute-based encryption
AE      adversarial examples
AES     Advanced Encryption Standard
AI      artificial intelligence
AML     adversarial machine learning
API     application programming interface
APT     advanced persistent threats
ASR     automatic speech recognition
CDN     content delivery network
CPS     cyber physical system
CPU     central processing unit
CSV     comma-separated values
DBSCAN  density-based spatial clustering of applications with noise
DDOS    distributed denial of service
DL      deep learning
DNN     deep neural network
DOS     denial of service
DP      differential privacy
FGSM    fast gradient sign method
FL      federated learning
GAN     generative adversarial networks
GDPR    General Data Protection Regulation
GPU     graphics processing unit
HE      homomorphic encryption
ICT     information and communication technology
IDS     intrusion detection system
IOT     Internet of Things
IP      Internet Protocol
IQR     interquartile range
JSON    JavaScript object notation
LAN     local area network
LDA     linear discriminant analysis
MAD     median absolute deviation
MD      Mahalanobis distance
MER     mean error rate
ML      machine learning
NIDS    network intrusion detection system
NIST    National Institute of Standards and Technology
ODE     ordinary differential equations
PC      principal component
PCA     principal component analysis
PE      portable executable
POMDP   partially observable Markov decision process
PVE     proportion of variance explained
QOE     quality of experience
RAM     random access memory
SMPC    secure multi-party computation
TA      trusted authority
TCP     transmission control protocol
TPU     tensor processing unit
1 Cybersecurity in the Era of Artificial Intelligence
The rapid and successful advances of artificial intelligence (AI) and machine learning (ML) offer security researchers and practitioners new approaches and platforms to explore and investigate challenging issues emerging in many safety-critical systems. Those AI/ML-enabled solutions have boosted the efficiency and effectiveness of multiple important security applications. For example, recent advances in AI and ML have been widely applied in intrusion detection systems (IDS) (Xu et al., 2017, 2019a,b, 2020), malware detection systems (Bradley and Xu, 2021; Bradley, 2022; Ahmed and Xu, 2022), and penetration testing (Chaudhary et al., 2020).
However, the rise of AI and ML is often considered a "double-edged sword." While AI and ML can be adopted to identify threats more accurately and prevent cyberattacks more efficiently, cybersecurity professionals must respond to increasingly sophisticated adversaries. Modern intelligent networking systems have been maliciously manipulated, evaded, and misled, causing significant security incidents in financial systems, cyber-physical systems, and many other critical domains. Threat actors and adversarial attackers have been applying techniques to carry out adversarial attacks targeting various AI/ML-enabled networking systems (Burr and Xu, 2021; Burr, 2022). For instance, an adversary can inject well-designed audio signals that confuse the voice recognition systems in smart speakers into delivering random noises, or compromise self-driving vehicles by visually altering a stop sign so that the ML model erroneously identifies it as a 70 miles per hour (mph) speed limit sign (Yuan et al., 2019). Those adversarial attacks could lead to unauthorized disclosure of sensitive information, affect the safety and wellness of users, and thwart Internet freedom. Therefore, cybersecurity professionals must evolve rapidly as technology advances and new cyber threats emerge.
Cybersecurity in Intelligent Networking Systems, First Edition. Shengjie Xu, Yi Qian, and Rose Qingyang Hu. © 2023 John Wiley & Sons Ltd. Published 2023 by John Wiley & Sons Ltd.
1.1 Artificial Intelligence for Cybersecurity
The concepts of AI and ML are introduced first, followed by the data-driven workflow for cybersecurity tasks.
1.1.1 Artificial Intelligence
The term AI is widely discussed worldwide. Today, AI generally refers to the simulation of intelligent human behavior by computational models that make decisions. It is a rapidly evolving field of study, research, and application that is being used to improve economic development, modern human lifestyle, and national security. Along with recent technological advances, AI is used for innovation in various critical domains, such as robotics, manufacturing, business, finance, and many others.
AI applications are primarily enabled by ML, which is considered the pillar of AI's success. Many organizations treat ML as the main approach to implementing AI applications. It is an exciting field involving multiple subjects, including statistics, computer science, business management, linguistics, and more. Traditionally, ML refers to the process of learning from historical data: mining and extracting valuable information by recognizing patterns and relationships, making decisions, and forecasting outcomes, trends, and behaviors. It involves a vast set of statistical models and tools, including generalized linear models, tree-based methods, neural networks, support vector machines, and nearest neighbors. Nowadays, ML is boosted by Big Data, massive computing power, and advanced learning models. In a technical article (Copeland, 2018), the author uses a Venn diagram to describe AI, ML, deep learning (DL), and their relationship. Figure 1.1 displays the broad concept of AI, which includes ML and DL. Currently, DL is leading the field of AI and ML, and it has made significant progress in a variety of ML domains, such as image classification, speech recognition, and object recognition.
Figure 1.1 Artificial intelligence, machine learning, and deep learning.
Table 1.1 Example of a house price dataset. Columns: features (x0), (x1), (x2), (x3), …, and target (y).
1.1.2 Machine Learning
ML enables computers to learn by mining massive datasets. Here, four broad categories of ML are described: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
1.1.2.1 Supervised Learning
Most ML problems are either supervised or unsupervised. For instance, consider a house pricing dataset (Table 1.1), in which each row (observation) represents a house and each column (feature) represents an attribute (e.g. number of bedrooms). For each observation, an associated target value is shown. Here, the objective is to build a model that captures the relationship between the target value y (price) and the attributes (x0, …, x3) so that accurate predictions for future observations can be achieved.
Supervised learning addresses this type of problem by training the model with features and labeled data (y). A supervised learning model takes a set of known input data (features) and known output data (response/target) and trains a model to make reasonable predictions for the response to new data. Regression and classification are the main categories of supervised learning problems. For regression problems, many classical models are available for training, including linear regression, ordinal regression, and neural network regression. For classification problems, many classical models are also available, including logistic regression, tree-based methods, support vector machine, random forest, and boosting methods.
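The regression setting described above can be sketched in a few lines with scikit-learn. This is only an illustration under stated assumptions: the data are synthetic, the meanings given to x0 to x3 (bedrooms, bathrooms, floor area, age) and the coefficients are hypothetical, and linear regression is just one of the models the text lists.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical house data: each row is an observation, columns are x0..x3.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.integers(1, 6, size=n),      # x0: number of bedrooms (assumed meaning)
    rng.integers(1, 4, size=n),      # x1: number of bathrooms (assumed meaning)
    rng.uniform(50, 250, size=n),    # x2: floor area (assumed meaning)
    rng.uniform(0, 60, size=n),      # x3: age in years (assumed meaning)
])

# Synthetic target y (price): a linear function of the features plus noise.
assumed_coef = np.array([15000.0, 10000.0, 1200.0, -500.0])
y = X @ assumed_coef + 50000.0 + rng.normal(0.0, 5000.0, size=n)

# Train on the first 150 observations, evaluate on the remaining 50.
model = LinearRegression().fit(X[:150], y[:150])
r2 = model.score(X[150:], y[150:])   # coefficient of determination on held-out data
predicted_price = model.predict(X[150:151])[0]
```

The held-out score `r2` plays the role of the "accurate predictions for future observations" mentioned in the text: it is measured on observations the model did not see during training.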
1.1.2.2 Unsupervised Learning
Unsupervised learning trains the model with unlabeled data. Its goal is to unveil the patterns in the data. Unsupervised learning serves as a good approach to simplify the data by reducing the dimensionality, finding similar groups, and perceiving intrinsic structures. Clustering and dimensionality reduction are the main categories of unsupervised learning problems. For clustering problems, many classical models are available for training, including K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and hierarchical clustering. For dimensionality reduction problems, many classical models are also available, including principal component analysis (PCA) and linear discriminant analysis (LDA).
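As a minimal sketch of the two categories named above, the following example runs K-means clustering and PCA on unlabeled synthetic data using scikit-learn. The two groups, their means, and the five-dimensional feature space are assumptions made only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Unlabeled synthetic data: two well-separated groups in five dimensions.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(100, 5))
group_b = rng.normal(loc=5.0, scale=0.5, size=(100, 5))
X = np.vstack([group_a, group_b])

# Clustering: find similar groups without using any labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_

# Dimensionality reduction: project the data onto two principal components.
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)
variance_kept = pca.explained_variance_ratio_.sum()
```

Here `labels` recovers the grouping and `X_2d` is a simplified two-dimensional view of the same data; `variance_kept` reports how much of the original variance the projection retains.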
1.1.2.3 Semi-supervised Learning
Semi-supervised learning deals with partially labeled data, which typically consist of a small amount of labeled and a large amount of unlabeled data. It falls between supervised learning, where completely labeled data are needed, and unsupervised learning, where no labeled data are needed. The trained model from semi-supervised learning can be highly accurate. Semi-supervised learning is also widely applied in the field of cybersecurity, especially in anomaly detection.
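One way to picture partially labeled data is the self-training sketch below, which uses scikit-learn's `SelfTrainingClassifier` around a logistic regression. Self-training is only one of several semi-supervised techniques and is not necessarily the method used later in the book; the synthetic "normal" and "anomalous" samples and the choice of 10 labels per class are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Synthetic samples: class 0 ("normal") and class 1 ("anomalous").
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(300, 4))
anomalous = rng.normal(4.0, 1.0, size=(300, 4))
X = np.vstack([normal, anomalous])
y_true = np.array([0] * 300 + [1] * 300)

# Keep labels for only 10 samples per class; scikit-learn marks
# unlabeled samples with the label -1.
y_partial = np.full(600, -1)
labeled_idx = np.concatenate([np.arange(10), np.arange(300, 310)])
y_partial[labeled_idx] = y_true[labeled_idx]

# The wrapper repeatedly trains the base model and adds confidently
# predicted unlabeled samples to the training set as pseudo-labels.
clf = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
clf.fit(X, y_partial)
accuracy = float((clf.predict(X) == y_true).mean())
```

Only 20 of the 600 samples carry labels, which mirrors the "small amount of labeled and large amount of unlabeled data" setting described above.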
1.1.2.4 Reinforcement Learning
Reinforcement learning is a unique ML paradigm. The model learns a series of actions by maximizing a reward function f. The function f can be maximized by penalizing "bad actions" and/or rewarding "good actions." In the reinforcement learning setting, an agent takes actions in an environment; the outcome of each action is interpreted as a reward and a representation of the state, both of which are fed back to the agent. There are many popular examples enabled by reinforcement learning, such as self-driving vehicles (Gyawali et al., 2020) and AlphaGo (Silver et al., 2017).
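The agent, environment, state, and reward loop described above can be sketched with tabular Q-learning on a toy problem. Everything here is a hypothetical illustration: the five-state corridor, the reward values, and the hyperparameters are assumptions, and tabular Q-learning is chosen only because it is compact.

```python
import numpy as np

# Toy environment (assumed): states 0..4 on a line, state 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)                  # action 0 moves left, action 1 moves right
rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, len(ACTIONS)))
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration

def step(state, action_idx):
    """Environment feedback: next state, reward, and whether the episode ends."""
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01     # reward the goal, penalize wasted moves
    return next_state, reward, done

for episode in range(500):
    state = 0
    for _ in range(100):                # cap on episode length
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(Q[state]))
        nxt, r, done = step(state, a)
        # Q-learning update toward reward plus discounted best future value.
        target = r + (0.0 if done else gamma * np.max(Q[nxt]))
        Q[state, a] += alpha * (target - Q[state, a])
        state = nxt
        if done:
            break

# Greedy action learned for each non-goal state.
greedy_policy = [ACTIONS[int(np.argmax(Q[s]))] for s in range(N_STATES - 1)]
```

After training, `greedy_policy` lists the move the agent prefers in each non-goal state; in this corridor the rewarded behavior is to keep moving right toward the goal.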
1.1.3 Data-Driven Workflow for Cybersecurity
In the field of cybersecurity, data-driven methods play a crucial role in tasks such as threat intelligence, risk analysis, vulnerability testing, and defense against adversarial behaviors. Figure 1.2 presents the general data-driven workflow to solve cybersecurity problems. The first step is to formulate a concrete security problem and justify the need to apply data-driven methods. For example, security practitioners can start by defining a problem about intrusion detection, analyze the possible outputs given the inputs, and then argue whether data-driven methods are appropriate to automate the task. In the second step, data collection and preprocessing are conducted. Data acquisition is essential, and it is important to ensure not only that sufficient data are collected but also that labeling and data sampling are correct and unbiased. In the third step, ML and other statistical models are designed and trained. It is important to ensure that the trained model generalizes well to future and even unseen data. In the fourth step, performance evaluation is conducted to assess the quality of the trained model. Model assessment should be carried out using suitable baseline data and appropriate metrics. In the fifth step, model deployment is performed. It is crucial that the deployed model performs well outside of a lab environment and works well under different settings in various threat models. Lastly, a mature and robust security solution for learning-based IDS is released.
Figure 1.2 Data-driven workflow for cybersecurity.
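The five workflow steps can be mirrored in a short scikit-learn sketch. It assumes a binary intrusion detection problem, uses synthetic "benign" and "attack" feature vectors in place of collected traffic data, and picks logistic regression and the F1 score purely as examples of a model and a metric.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 1 (problem formulation): binary intrusion detection,
# label 0 = benign flow, label 1 = attack flow.

# Step 2 (data collection and preprocessing): synthetic stand-in data,
# with a stratified split so both classes appear in train and test sets.
rng = np.random.default_rng(42)
benign = rng.normal(0.0, 1.0, size=(500, 6))
attack = rng.normal(2.5, 1.0, size=(500, 6))
X = np.vstack([benign, attack])
y = np.array([0] * 500 + [1] * 500)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 3 (model design and learning): feature scaling plus a classifier.
detector = make_pipeline(StandardScaler(), LogisticRegression())
detector.fit(X_train, y_train)

# Step 4 (performance evaluation): score on data held out from training.
test_f1 = f1_score(y_test, detector.predict(X_test))

# Step 5 (deployment): apply the trained pipeline to a newly observed flow.
new_flow = rng.normal(2.5, 1.0, size=(1, 6))
verdict = int(detector.predict(new_flow)[0])
```

Wrapping preprocessing and the model in one pipeline means the same scaling learned during training is applied at deployment time, which helps avoid a mismatch between lab and production inputs.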
1.2 Key Areas and Challenges
The research communities have been actively exploring both the offense and defense sides of data-driven cybersecurity. As more achievements are accomplished, several rapidly emerging research challenges remain to be solved. Here, three aspects are described: anomaly detection, trustworthy AI, and privacy preservation.
1.2.1 Anomaly Detection
In the context of cybersecurity, anomalies refer to those abnormal behaviors that harm the information systems. To defend against them, anomaly detection focuses on identifying anomalous behaviors during a period of operations. This can be extended to a few use cases in security, such as intrusion detection, malware detection, phishing detection, spam detection, and defense against zero-day attack.
The goal of intrusion detection is to examine traffic data and distinguish normal activities from attack behaviors (Xu et al., 2019). More topics about cyber intrusions and intrusion detection will be covered in detail in Chapters 2, 4, 5, and 6.
Malware has been near the forefront of modern cybersecurity issues (Ahmed and Xu, 2022). The detection and prevention of malware has become a major challenge. In many cases, antivirus tools such as Windows Defender utilize ML to scan files and detect malicious patterns. With the recent trend of ransomware cases around the world, it has become more important than ever to have effective anti-malware tools to keep users and organizations safe. Modern malware can take many forms. It can be embedded in document macros, run as shellcode, and much more. Malware is also versatile in the tasks it can complete. It can implant a command-and-control server (C2) beacon, install a keylogger, or ransom a computer by encrypting all of the files. In addition to completing malicious tasks, malware has a secondary objective of avoiding detection by disguising itself as a valid process on a computer and confusing the detector.
Phishing is a tactic that is used by threat actors to achieve their goals, such as obtaining credentials of employees or delivering malware (Khonji et al., 2013). With the prevalence of these types of attacks, it is important to detect and stop the attacks at any point in their life cycle. Being able to detect that a website is likely a phishing site could be a useful tool in mitigating the success of the attacks.
In recent years, the poisoning attack has become a new form of attack to compromise networking systems and the learning models they integrate. A data poisoning attack is the intentional act of polluting the data that the algorithms need for training (Huang et al., 2021). It can impact organizations as well as individuals in a negative manner. One real-life example of this type of attack is an attacker changing what an email spam filter marks as spam. This would allow the attacker to send any kind of email they want without being flagged. Similarly, there are firewall tools that use ML to monitor network traffic and look for malware entering the network. An attacker could use a similar method to evade detection.
1.2.2 Trustworthy Artificial Intelligence
Most AI/ML models face trustworthiness issues because they are vulnerable to various kinds of attacks (e.g. adversarial examples) (Li et al., 2021a,b). This is because they are not yet explainable owing to the black-box nature of many AI/ML models, especially DL models (Zou et al., 2021), and their uncertainty has not been quantified (Li et al., 2021c). This highlights the importance of additional research activities in the trustworthiness of AI/ML models. Moreover, more studies are needed to systematically understand the trustworthiness of AI/ML models, as well as to advance the state of the art in the trustworthiness, robustness, and interpretability of AI/ML.
Currently, research challenges of trustworthy AI and adversarial ML include, but are not limited to, the following aspects: quantifying the trustworthiness of AI/ML methods against sophisticated attacks, such as adversarial examples, poisoning attacks, or evasive attacks; determining the conditions under which AI/ML models can be trusted; explaining and interpreting recommendations made by AI/ML models; detecting and mitigating adversarial examples against AI/ML models; and enhancing the trustworthiness of AI/ML models by incorporating countermeasures. Trustworthy AI and adversarial ML will be covered in detail in Chapter 8.
1.2.3 Privacy Preservation
The protection of sensitive data is critical. In the field of health study, the data privacy problem is growing because of the challenges in data privacy regulations, privacy leakage by attackers, and pervasive data mining operations. In one study (Na et al., 2018), the authors reported that, using large national physical activity datasets, children and adults were reidentified by ML when 20 minutes of data combined with several pieces of demographic information were used. Unfortunately, more incidents of privacy leakage are being reported worldwide. With the enforcement of the General Data Protection Regulation (GDPR) (Voigt and Von dem Bussche, 2017), companies and organizations need to comply with specific terms and conditions to protect the data privacy of European Union citizens.
The increasing attention on security and privacy has motivated the rapid design and implementation of multiple privacy-preserving methods. For instance, federated learning (FL) was designed to train ML models across multiple end nodes without sharing the local data with a centralized server. One article reported that Google has implemented a mobile application that offers privacy-preserving word prediction based on FL (Hard et al., 2018).
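The idea of training across end nodes without sharing local data can be sketched with a simplified federated averaging loop in NumPy. This is an illustrative assumption rather than the system described in the cited article: the three clients, their synthetic linear regression data, and all hyperparameters are hypothetical. The point to notice is that only model weights travel between clients and server.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])     # assumed ground-truth weights

def make_client_data(n):
    """Create one client's private dataset; it never leaves the client."""
    X = rng.normal(size=(n, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    return X, y

clients = [make_client_data(n) for n in (50, 80, 120)]

def local_update(w, X, y, lr=0.05, epochs=5):
    """Run a few gradient descent steps on local data, starting from w."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

global_w = np.zeros(3)
for round_idx in range(30):
    # Each client trains locally and sends back only its updated weights.
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    # The server averages the weights, weighted by local dataset size.
    global_w = np.average(local_ws, axis=0, weights=sizes)
```

After the communication rounds, `global_w` approximates the weights a centralized model would learn, even though the server never sees any client's `X` or `y`.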
There are a few modern approaches for privacy preservation, including anonymization, differential privacy (DP), homomorphic encryption (HE), and secure multi-party computation (SMPC). These approaches will be covered in detail in Chapter 7.
1.3 Toolbox to Build Secure and Intelligent Systems
A few scientific computing tools are commonly utilized as the toolbox to build intelligent systems. Here, a few Python libraries are described. They are for ML and DL, privacy-preserving ML, and adversarial ML.
1.3.1 Machine Learning and Deep Learning
Five Python libraries are widely used for ML and scientific computing. They are NumPy,1 SciPy,2 scikit-learn,3 PyTorch,4 and TensorFlow.5
1.3.1.1 NumPy
NumPy is the fundamental library for scientific computing in Python. It is a Python library that provides a multi-dimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and much more (Harris et al., 2020).
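A few of the array operations listed above are shown in the short sketch below. The packet-size values are hypothetical and chosen only to make the results easy to check by hand.

```python
import numpy as np

# Hypothetical packet sizes in bytes: 3 flows (rows) x 4 packets (columns).
packets = np.array([[60, 1500, 1500, 40],
                    [64, 64, 64, 64],
                    [1500, 1500, 1500, 1500]])

bytes_per_flow = packets.sum(axis=1)        # basic statistics along an axis
large_mask = packets > 1000                 # logical operation, elementwise
reshaped = packets.reshape(4, 3)            # shape manipulation
sorted_sizes = np.sort(packets, axis=None)  # sorting the flattened array

# Masked array: ignore small packets (under 100 bytes) when averaging.
masked = np.ma.masked_where(packets < 100, packets)
masked_mean = masked.mean()
```

All of these operations run on whole arrays at once, which is what makes NumPy routines fast compared with explicit Python loops.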
1.3.1.2 SciPy
SciPy is a Python library for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ordinary differential equations (ODE) solvers, and more (Virtanen et al., 2020). SciPy is built to work with NumPy arrays and provides many user-friendly and efficient numerical routines, such as routines for numerical integration and optimization.
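The two routines singled out in the text, numerical integration and optimization, can be tried with `scipy.integrate.quad` and `scipy.optimize.minimize`. The integrand and the objective function below are arbitrary examples chosen because their exact answers are known.

```python
import numpy as np
from scipy import integrate, optimize

# Numerical integration: the Gaussian integral over the real line,
# whose exact value is sqrt(pi).
area, abs_err = integrate.quad(lambda x: np.exp(-x**2), -np.inf, np.inf)

# Optimization: minimize a simple quadratic bowl whose minimum is at (3, -1).
def objective(p):
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2

result = optimize.minimize(objective, x0=np.zeros(2))
minimizer = result.x
```

`quad` returns both the estimate and an error bound, and `minimize` returns a result object whose `x` attribute holds the located minimizer as a NumPy array.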
1.3.1.3 Scikit-learn
Scikit-learn is a Python library for ML built on top of SciPy (Pedregosa et al., 2011). It offers multiple modules for ML tasks, including classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.
1.3.1.4 PyTorch
PyTorch is an optimized tensor library for DL. It provides deep neural networks built on an automatic differentiation system (autograd), as well as tensor computation with strong graphics processing unit (GPU) acceleration.