Cybersecurity in Intelligent Networking Systems

Shengjie Xu
San Diego State University, USA

Yi Qian
University of Nebraska-Lincoln, USA

Rose Qingyang Hu
Utah State University, USA

This edition first published 2023
© 2023 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Shengjie Xu, Yi Qian, and Rose Qingyang Hu to be identified as the authors of this work has been asserted in accordance with law.

Registered Offices

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: Xu, Shengjie (Professor), author. | Qian, Yi, 1962- author. | Hu, Rose Qingyang, author.

Title: Cybersecurity in intelligent networking systems / Shengjie Xu, Yi Qian, Rose Qingyang Hu.

Description: Chichester, West Sussex, UK : Wiley, [2023] | Includes bibliographical references and index.

Identifiers: LCCN 2022033498 (print) | LCCN 2022033499 (ebook) | ISBN 9781119783916 (hardback) | ISBN 9781119784104 (adobe pdf) | ISBN 9781119784128 (epub)

Subjects: LCSH: Computer networks–Security measures.

Classification: LCC TK5105.59 .X87 2023 (print) | LCC TK5105.59 (ebook) | DDC 005.8–dc23/eng/20220826

LC record available at https://lccn.loc.gov/2022033498

LC ebook record available at https://lccn.loc.gov/2022033499

Cover Design: Wiley

Cover Image: © jijomathaidesigners/Shutterstock

Set in 9.5/12.5pt STIXTwoText by Straive, Chennai, India

Contents

About the Authors
Preface
Acknowledgments
Acronyms

1 Cybersecurity in the Era of Artificial Intelligence
1.1 Artificial Intelligence for Cybersecurity
1.1.1 Artificial Intelligence
1.1.2 Machine Learning
1.1.2.1 Supervised Learning
1.1.2.2 Unsupervised Learning
1.1.2.3 Semi-supervised Learning
1.1.2.4 Reinforcement Learning
1.1.3 Data-Driven Workflow for Cybersecurity
1.2 Key Areas and Challenges
1.2.1 Anomaly Detection
1.2.2 Trustworthy Artificial Intelligence
1.2.3 Privacy Preservation
1.3 Toolbox to Build Secure and Intelligent Systems
1.3.1 Machine Learning and Deep Learning
1.3.1.1 NumPy
1.3.1.2 SciPy
1.3.1.3 Scikit-learn
1.3.1.4 PyTorch
1.3.1.5 TensorFlow
1.3.2 Privacy-Preserving Machine Learning
1.3.2.1 Syft
1.3.2.2 TensorFlow Federated
1.3.2.3 TensorFlow Privacy
1.3.3 Adversarial Machine Learning
1.3.3.1 SecML and SecML Malware
1.3.3.2 Foolbox
1.3.3.3 CleverHans
1.3.3.4 Counterfit
1.3.3.5 MintNV
1.4 Data Repositories for Cybersecurity Research
1.4.1 NSL-KDD
1.4.2 UNSW-NB15
1.4.3 EMBER
1.5 Summary
Notes
References

2 Cyber Threats and Gateway Defense
2.1 Cyber Threats
2.1.1 Cyber Intrusions
2.1.2 Distributed Denial of Services Attack
2.1.3 Malware and Shellcode
2.2 Gateway Defense Approaches
2.2.1 Network Access Control
2.2.2 Anomaly Isolation
2.2.3 Collaborative Learning
2.2.4 Secure Local Data Learning
2.3 Emerging Data-driven Methods for Gateway Defense
2.3.1 Semi-supervised Learning for Intrusion Detection
2.3.2 Transfer Learning for Intrusion Detection
2.3.3 Federated Learning for Privacy Preservation
2.3.4 Reinforcement Learning for Penetration Test
2.4 Case Study: Reinforcement Learning for Automated Post-breach Penetration Test
2.4.1 Literature Review
2.4.2 Research Idea
2.4.3 Training Agent Using Deep Q-Learning
2.5 Summary
References

3 Edge Computing and Secure Edge Intelligence
3.1 Edge Computing
3.2 Key Advances in Edge Computing
3.2.1 Security
3.2.2 Reliability
3.2.3 Survivability
3.3 Secure Edge Intelligence
3.3.1 Background and Motivation
3.3.2 Design of Detection Module
3.3.2.1 Data Pre-processing
3.3.2.2 Model Learning
3.3.2.3 Model Updating
3.3.3 Challenges Against Poisoning Attacks
3.4 Summary
References

4 Edge Intelligence for Intrusion Detection
4.1 Edge Cyberinfrastructure
4.2 Edge AI Engine
4.2.1 Feature Engineering
4.2.2 Model Learning
4.2.3 Model Update
4.2.4 Predictive Analytics
4.3 Threat Intelligence
4.4 Preliminary Study
4.4.1 Dataset
4.4.2 Environmental Setup
4.4.3 Performance Evaluation
4.4.3.1 Computational Efficiency
4.4.3.2 Prediction Accuracy
4.5 Summary
References

5 Robust Intrusion Detection
5.1 Preliminaries
5.1.1 Median Absolute Deviation
5.1.2 Mahalanobis Distance
5.2 Robust Intrusion Detection
5.2.1 Problem Formulation
5.2.2 Step 1: Robust Data Pre-processing
5.2.3 Step 2: Bagging for Labeled Anomalies
5.2.4 Step 3: One-class SVM for Unlabeled Samples
5.2.4.1 One-class Classification
5.2.4.2 Algorithm of Optimal Sampling Ratio Section
5.2.5 Step 4: The Final Classifier
5.3 Experimental and Evaluation
5.3.1 Experiment Setup
5.3.1.1 Datasets
5.3.1.2 Environmental Setup
5.3.1.3 Evaluation Metrics
5.3.2 Performance Evaluation
5.3.2.1 Step 1
5.3.2.2 Step 2
5.3.2.3 Step 3
5.3.2.4 Step 4
5.4 Summary
References

6 Efficient Pre-processing Scheme for Anomaly Detection
6.1 Efficient Anomaly Detection
6.1.1 Related Work
6.1.2 Principal Component Analysis
6.2 Proposed Pre-processing Scheme for Anomaly Detection
6.2.1 Robust Pre-processing Scheme
6.2.2 Real-Time Processing
6.2.3 Discussion
6.3 Case Study
6.3.1 Description of the Raw Data
6.3.1.1 Dimension
6.3.1.2 Predictors
6.3.1.3 Response Variables
6.3.2 Experiment
6.3.3 Results
6.4 Summary
References

7 Privacy Preservation in the Era of Big Data
7.1 Privacy Preservation Approaches
7.1.1 Anonymization
7.1.2 Differential Privacy
7.1.3 Federated Learning
7.1.4 Homomorphic Encryption
7.1.5 Secure Multi-party Computation
7.1.6 Discussion
7.2 Privacy-Preserving Anomaly Detection
7.2.1 Literature Review
7.2.2 Preliminaries
7.2.2.1 Bilinear Groups
7.2.2.2 Asymmetric Predicate Encryption
7.2.3 System Model and Security Model
7.2.3.1 System Model
7.2.3.2 Security Model
7.3 Objectives and Workflow
7.3.1 Objectives
7.3.2 Workflow
7.4 Predicate Encryption-Based Anomaly Detection
7.4.1 Procedures
7.4.2 Development of Predicate
7.4.3 Deployment of Anomaly Detection
7.5 Case Study and Evaluation
7.5.1 Overhead
7.5.2 Detection
7.6 Summary
References

8 Adversarial Examples: Challenges and Solutions
8.1 Adversarial Examples
8.1.1 Problem Formulation in Machine Learning
8.1.2 Creation of Adversarial Examples
8.1.3 Targeted and Non-targeted Attacks
8.1.4 Black-box and White-box Attacks
8.1.5 Defenses Against Adversarial Examples
8.2 Adversarial Attacks in Security Applications
8.2.1 Malware
8.2.2 Cyber Intrusions
8.3 Case Study: Improving Adversarial Attacks Against Malware Detectors
8.3.1 Background
8.3.2 Adversarial Attacks on Malware Detectors
8.3.3 MalConv Architecture
8.3.4 Research Idea
8.4 Case Study: A Metric for Machine Learning Vulnerability to Adversarial Examples
8.4.1 Background
8.4.2 Research Idea
8.5 Case Study: Protecting Smart Speakers from Adversarial Voice Commands
8.5.1 Background
8.5.2 Challenges
8.5.3 Directions and Tasks
8.6 Summary
References

Index

About the Authors

Shengjie Xu, PhD, is an assistant professor in the Management Information Systems Department at San Diego State University, USA. He is a recipient of the IET Journals Premium Award for Best Paper in 2020, the Milton E. Mohr Graduate Fellowship Award from the University of Nebraska–Lincoln in 2017, and the Best Poster Award from the International Conference on Design of Reliable Communication Networks in 2015. He serves as a Technical Editor for IEEE Wireless Communications Magazine. He holds multiple professional certifications in cybersecurity and computer networking.

Yi Qian, PhD, is a professor in the Department of Electrical and Computer Engineering at the University of Nebraska–Lincoln, USA. He is a recipient of the Henry Y. Kleinkauf Family Distinguished New Faculty Teaching Award in 2011, the Holling Family Distinguished Teaching Award in 2012, the Holling Family Distinguished Teaching/Advising/Mentoring Award in 2018, and the Holling Family Distinguished Teaching Award for Innovative Use of Instructional Technology in 2018, all from the University of Nebraska–Lincoln, USA.

Rose Qingyang Hu, PhD, is a professor in the Department of Electrical and Computer Engineering and Associate Dean of Research in the College of Engineering at Utah State University, USA. She is a recipient of outstanding faculty researcher of the year in 2014 and 2016 and outstanding graduate mentor of the year in 2022, all from Utah State University, USA. She is a Fellow of IEEE, IEEE ComSoc Distinguished Lecturer 2015–2018, IEEE VTS Distinguished Lecturer 2020–2022.

Preface

Nowadays, malicious attacks and emerging cyber threats have been inducing catastrophic damage to critical infrastructure and causing widespread outages. There are three major types of cyberattacks that are compromising modern networking systems: (i) Attacks targeting Confidentiality intend to acquire unauthorized information from network resources; (ii) Attacks targeting Integrity aim at deliberately and illegally modifying or disrupting data exchange; and (iii) Attacks targeting Availability attempt to delay, block or corrupt service delivery. Confidentiality, integrity, and availability are the three pillars of cybersecurity. It is urgent to defend critical networking systems against any form of cyber threat from adversaries.

The rapid and successful advances of intelligent discoveries offer security researchers and practitioners new platforms to investigate challenging issues emerging in several networking systems. Those intelligent solutions will boost the efficiency and effectiveness of multiple critical security applications. Motivated by the current technological advances, this book intends to offer the current research challenges in the field of cybersecurity, as well as some novel security solutions that make critical networking systems secure, robust, and intelligent. Specifically, the book focuses on cybersecurity and its intersections with artificial intelligence, machine learning, edge computing, and privacy preservation. There are eight chapters in the book.

Chapter 1 deals with cybersecurity in the era of artificial intelligence and machine learning. The chapter first introduces the concepts of artificial intelligence and machine learning. It then illustrates some key advances and challenges in cybersecurity, including anomaly detection, trustworthy artificial intelligence, and privacy preservation. A toolbox to build secure and intelligent systems is then presented. The chapter then demonstrates a few data repositories for cybersecurity research.

Chapter 2 deals with cyber threats and defense mechanisms. The chapter first illustrates multiple effective gateway defense methods against cyber threats. It then presents a research study that applies reinforcement learning to penetration testing.

Chapter 3 deals with edge computing. Edge computing is presented to highlight its key advances and unique capabilities in communication networks. The chapter then illustrates the concept of secure edge intelligence.

Chapter 4 deals with edge intelligence for intrusion detection. The systematic design of edge intelligence is first presented. Three main modules in edge intelligence are illustrated. The chapter then demonstrates a case study including experiment and evaluation.

Chapter 5 deals with a robust intrusion detection scheme. The preliminaries of robust statistics are first introduced. The chapter then presents the details of the proposed scheme. An experimental study and evaluation are then demonstrated.

Chapter 6 deals with an efficient preprocessing scheme for anomaly detection. A few related studies and the background of principal component analysis are first introduced. The chapter then presents the proposed efficient preprocessing scheme for anomaly detection, whose objective is to achieve high detection accuracy while learning from the preprocessed data. The chapter then demonstrates a case study including experiment and evaluation.

Chapter 7 deals with privacy preservation in the era of big data. A few modern privacy-preserving approaches are first illustrated. The chapter then presents a proposed scheme that focuses on detecting anomalous behaviors in a privacy-preserving way. The chapter offers an experimental study and evaluation.

Chapter 8 deals with adversarial examples and adversarial machine learning. The concept of adversarial examples and its challenges are first introduced. Three research studies in adversarial examples are then presented from both offensive and defensive perspectives.

We hope that our readers will enjoy this book.

Shengjie Xu, San Diego State University
Yi Qian, University of Nebraska–Lincoln
Rose Qingyang Hu, Utah State University

Acknowledgments

First, we would like to thank our families for their love and support.

We would like to thank our colleagues and students at Dakota State University, University of Nebraska-Lincoln, Utah State University, and San Diego State University for their support and enthusiasm in this book project and this topic.

We express our thanks to the staff at Wiley for their support. We would like to thank Sandra Grayson, Juliet Booker, and Becky Cowan for their patience in handling publication issues.

This book project was partially supported by the U.S. National Science Foundation under grants CNS-1423348, CNS-1423408, EARS-1547312, and EARS-1547330.

Acronyms

ABE attribute-based encryption

AE adversarial examples
AES Advanced Encryption Standard
AI artificial intelligence
AML adversarial machine learning
API application programming interface
APT advanced persistent threats
ASR automatic speech recognition
CDN content delivery network
CPS cyber physical system
CPU central processing unit
CSV comma-separated values
DBSCAN density-based spatial clustering of applications with noise
DDOS distributed denial of service
DL deep learning
DNN deep neural network
DOS denial of service
DP differential privacy
FGSM fast gradient sign method
FL federated learning
GAN generative adversarial networks
GDPR General Data Protection Regulation
GPU graphics processing unit
HE homomorphic encryption
ICT information and communication technology
IDS intrusion detection system
IOT Internet of Things
IP Internet Protocol

IQR interquartile range
JSON JavaScript object notation
LAN local area network
LDA linear discriminant analysis
MAD median absolute deviation
MD Mahalanobis distance
MER mean error rate
ML machine learning
NIDS network intrusion detection system
NIST National Institute of Standards and Technology
ODE ordinary differential equations
PC principal component
PCA principal component analysis
PE portable executable
POMDP partially observable Markov decision process
PVE proportion of variance explained
QOE quality of experience
RAM random access memory
SMPC secure multi-party computation
TA trusted authority
TCP transmission control protocol
TPU tensor processing unit

1 Cybersecurity in the Era of Artificial Intelligence

The rapid and successful advances of artificial intelligence (AI) and machine learning (ML) offer security researchers and practitioners new approaches and platforms to explore and investigate challenging issues emerging in many safety-critical systems. Those AI/ML-enabled solutions have boosted the efficiency and effectiveness of multiple important security applications. For example, recent advances in AI and ML have been widely applied in intrusion detection systems (IDS) (Xu et al., 2017, 2019a,b, 2020), malware detection systems (Bradley and Xu, 2021; Bradley, 2022; Ahmed and Xu, 2022), and penetration testing (Chaudhary et al., 2020).

However, the rise of AI and ML is often considered a “double-edged sword.” While AI and ML can be adopted to identify threats more accurately and prevent cyberattacks more efficiently, cybersecurity professionals must respond to the increasingly sophisticated motivations of adversaries. Modern intelligent networking systems have been maliciously manipulated, evaded, and misled, causing significant security incidents in financial systems, cyber-physical systems, and many other critical domains. Threat actors and adversarial attackers have been applying techniques to carry out adversarial attacks targeting various AI/ML-enabled networking systems (Burr and Xu, 2021; Burr, 2022). For instance, an adversary can inject well-designed audio signals to confuse the voice recognition systems in smart speakers into delivering random noises, or compromise self-driving vehicles by creating visual alterations of a stop sign, leading the ML model to erroneously identify the stop sign as a speed limit sign of 70 miles per hour (mph) (Yuan et al., 2019). Those adversarial attacks could lead to unauthorized disclosure of sensitive information, affect the safety and wellness of users, and thwart Internet freedom. Therefore, cybersecurity professionals must evolve rapidly as technology advances and new cyber threats emerge.

Cybersecurity in Intelligent Networking Systems, First Edition. Shengjie Xu, Yi Qian, and Rose Qingyang Hu. © 2023 John Wiley & Sons Ltd. Published 2023 by John Wiley & Sons Ltd.

1.1 Artificial Intelligence for Cybersecurity

The concepts of AI and ML are first introduced, followed by the data-driven workflow for cybersecurity tasks.

1.1.1 Artificial Intelligence

The phrase AI is popularly discussed worldwide. Nowadays, AI generally refers to the simulation of human intelligent behavior by computational models to make decisions, and it is a rapidly evolving field of study, research, and application that is being used to improve economic development, modern human lifestyle, and national security. Along with recent technological advances, AI is used for innovation in various critical domains, such as robotics, manufacturing, business, finance, and many others.

AI applications are primarily enabled by ML, which is considered the pillar of AI’s success. Many organizations treat ML as the main approach to implementing AI applications. It is an exciting field involving multiple subjects, including statistics, computer science, business management, linguistics, and more. Traditionally speaking, ML refers to the process of learning and understanding from historical data, mining and extracting valuable information by recognizing patterns and relationships, making decisions, and forecasting outcomes, trends, and behaviors. It involves a vast set of statistical models and tools, including generalized linear models, tree-based methods, neural networks, support vector machines, and nearest neighbors. Nowadays, ML is boosted by Big Data, massive computing power, and advanced learning models. In a technical article (Copeland, 2018), the author uses a Venn diagram to describe AI, ML, deep learning (DL), and their relationship. Figure 1.1 displays the broad concept of AI, which includes ML and DL. Currently, DL is leading the field of AI and ML, and it has made significant progress in a variety of ML domains, such as image classification, speech recognition, and object recognition.

Figure 1.1 Artificial intelligence, machine learning, and deep learning.

Table 1.1 Example of a house price dataset. Columns: (x0), (x1), (x2), (x3), …, (y).

1.1.2 Machine Learning

ML enables computers to learn by mining massive datasets. Here, four broad categories of ML are described. They are supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

1.1.2.1 Supervised Learning

Most ML problems are either supervised or unsupervised. For instance, consider a house pricing dataset (Table 1.1), in which each row (observation) represents a house and each column (feature) represents an attribute (e.g. number of bedrooms). For each observation, an associated target value is shown. Here, the objective is to build a model that captures the relationship between the target value y (price) and the attributes (x0, …, x3) so that accurate predictions for future observations can be achieved. Supervised learning addresses this type of problem by training the model with features and labeled data (y). A supervised learning model takes a set of known input data (features) and known output data (response/target) and trains a model to make reasonable predictions for the response to new data. Regression and classification are the main categories of supervised learning problems. In regression problems, there are many classical models available for training, including linear regression, ordinal regression, and neural network regression. In classification problems, there are also many classical models available for training, including logistic regression, tree-based methods, support vector machine, random forest, and boosting methods.
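As a minimal sketch of the supervised setting described above, the following Python snippet fits a linear regression model with scikit-learn. The feature meanings, the six observations, and the pricing rule are hypothetical values invented for illustration; they are not taken from Table 1.1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical observations: each row is a house, each column a feature
# (x0 = bedrooms, x1 = bathrooms, x2 = area in square feet, x3 = age in years).
X = np.array([
    [3, 2, 1500, 10],
    [4, 3, 2200, 5],
    [2, 1, 900, 30],
    [5, 4, 3000, 2],
    [3, 1, 1200, 20],
    [4, 2, 1800, 15],
], dtype=float)

# Hypothetical target values (y), generated from an assumed linear rule
# so that the quality of the fit is easy to check.
assumed_weights = np.array([10_000.0, 5_000.0, 100.0, -500.0])
assumed_base_price = 50_000.0
y = X @ assumed_weights + assumed_base_price

# Train on known inputs (features) and known outputs (prices).
model = LinearRegression().fit(X, y)

# Predict the response for a new, unseen observation.
new_house = np.array([[3, 2, 1600, 8]], dtype=float)
predicted_price = float(model.predict(new_house)[0])
print(f"Predicted price: {predicted_price:,.0f}")
```

Because the illustrative prices follow an exact linear rule, the model recovers it and the prediction lands close to 246,000. Real pricing data would be noisy and would call for a held-out test set to judge the predictions.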

1.1.2.2 Unsupervised Learning

Unsupervised learning trains the model with unlabeled data. Its goal is to unveil the patterns in the data. Unsupervised learning serves as a good approach to simplify the data by reducing the dimensionality, finding similar groups, and perceiving intrinsic structures. Clustering and dimensionality reduction are the main categories for unsupervised learning problems. In clustering problems, there are many classical models available for training, including K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and hierarchical clustering. In dimensionality reduction problems, there are also many classical models available for training, including principal component analysis (PCA) and linear discriminant analysis (LDA).
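The two unsupervised tasks named above can be sketched with scikit-learn as follows. The data are synthetic (two well-separated groups drawn from assumed Gaussian distributions), and the number of clusters and components are illustrative choices rather than recommendations.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic, unlabeled data: two groups of 50 points in 5 dimensions.
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 5))
group_b = rng.normal(loc=10.0, scale=0.5, size=(50, 5))
X = np.vstack([group_a, group_b])

# Clustering: K-means assigns each observation to one of two groups
# without ever seeing a label.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Dimensionality reduction: PCA projects the 5 features onto 2 components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
explained = pca.explained_variance_ratio_

print("Cluster sizes:", np.bincount(cluster_ids))
print("Variance explained by each component:", explained)
```

Since the two synthetic groups differ mainly along one direction, most of the variance should be captured by the first principal component.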

1.1.2.3 Semi-supervised Learning

Semi-supervised learning deals with partially labeled data, which typically consist of a small amount of labeled and a large amount of unlabeled data. It falls between supervised learning, where completely labeled data are needed, and unsupervised learning, where no labeled data are needed. The trained model from semi-supervised learning can be highly accurate. Semi-supervised learning is also widely applied in the field of cybersecurity, especially in anomaly detection.
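One way to illustrate learning from partially labeled data is self-training, sketched below with scikit-learn's `SelfTrainingClassifier`. Self-training is only one of several semi-supervised techniques and is chosen here for brevity. The "normal" and "attack" samples are synthetic, and the choice of ten labeled points is an assumption made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(1)

# Synthetic two-feature samples: class 0 = "normal", class 1 = "attack".
normal = rng.normal(loc=0.0, scale=1.0, size=(100, 2))
attack = rng.normal(loc=6.0, scale=1.0, size=(100, 2))
X = np.vstack([normal, attack])
y_true = np.array([0] * 100 + [1] * 100)

# Keep labels for only 5 samples per class; scikit-learn marks
# unlabeled samples with -1.
y_partial = np.full(200, -1)
labeled_idx = np.concatenate([np.arange(5), np.arange(100, 105)])
y_partial[labeled_idx] = y_true[labeled_idx]

# The wrapper repeatedly fits the base classifier and adds its most
# confident predictions on unlabeled samples as pseudo-labels.
model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y_partial)

accuracy = float((model.predict(X) == y_true).mean())
print(f"Accuracy against the full ground truth: {accuracy:.3f}")
```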

1.1.2.4 Reinforcement Learning

Reinforcement learning is a unique ML paradigm. This model learns a series of actions by maximizing a Reward Function f. The function f can be maximized by penalizing “bad action” and/or rewarding “good action.” In the reinforcement learning setting, an agent takes actions in an environment, and the environment's response is interpreted as a reward and a representation of the state, both of which are fed back to the agent. There are many popular examples that are enabled by reinforcement learning, such as self-driving vehicles (Gyawali et al., 2020) and AlphaGo (Silver et al., 2017).
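The agent, environment, reward, and state loop can be sketched with tabular Q-learning on a toy problem. The five-state corridor, the reward values, and the hyperparameters below are all hypothetical choices for illustration; they do not come from the studies cited above.

```python
import numpy as np

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right


def step(state, action):
    """Environment: return (next_state, reward, done) for one action."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    # Reward the "good" outcome (reaching the goal) and slightly
    # penalize every other move.
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values with epsilon-greedy Q-learning."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = int(rng.integers(len(ACTIONS)))  # explore
            else:
                action = int(np.argmax(q[state]))         # exploit
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best next action.
            target = reward if done else reward + gamma * np.max(q[next_state])
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
            if done:
                break
    return q


q_table = train()
greedy_policy = [int(np.argmax(q_table[s])) for s in range(N_STATES - 1)]
print("Greedy action per non-goal state (1 = right):", greedy_policy)
```

After training, the greedy policy should choose "right" in every non-goal state, since that is the shortest path to the rewarded state.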

1.1.3 Data-Driven Workflow for Cybersecurity

In the field of cybersecurity, data-driven methods are playing a crucial role in cybersecurity tasks, such as threat intelligence, risk analysis, vulnerability testing, and defense against adversarial behaviors. Figure 1.2 presents the general data-driven workflow to solve cybersecurity problems. The first step starts from formulating a concrete security problem and justifying the need to apply data-driven methods. For example, security practitioners can start defining a problem about intrusion detection, analyze the possible outputs given the inputs, and then argue whether data-driven methods are appropriate to automate the task. In the second step, data collection and preprocessing are conducted. Data acquisition is essential, and it is important to assure that not only sufficient data are collected but also labeling and data sampling are correct and unbiased. In the third step, ML and other statistical models are designed and trained. It is important to assure that the trained model can be generalized well to future data and even unseen data. In the fourth step, performance evaluation is conducted to assess the quality of the trained model. Model assessment should be carried out by using suitable baseline data and appropriate metrics. In the fifth step, model deployment is performed. It is crucial to note that the deployment should perform well outside of a lab environment, and it should also work well under different settings in various threat models. Lastly, a mature and robust security solution for learning-based IDS is released.

Figure 1.2 Data-driven workflow for cybersecurity.
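Steps two to four of this workflow (data preparation, model training, and evaluation) might look as follows for a toy intrusion detection task. This is a sketch under stated assumptions: the traffic features are synthetic, the 90/10 class balance is invented, and logistic regression stands in for whatever model a real study would justify.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Step 2: data collection and preprocessing (synthetic stand-in data).
# 900 "normal" and 100 "attack" records with 5 numeric features each.
normal = rng.normal(loc=0.0, scale=1.0, size=(900, 5))
attack = rng.normal(loc=3.0, scale=1.0, size=(100, 5))
X = np.vstack([normal, attack])
y = np.array([0] * 900 + [1] * 100)

# A stratified split keeps the attack ratio equal in both partitions,
# which helps avoid a biased sample.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Step 3: model design and learning. Scaling is fitted on training data only.
detector = make_pipeline(StandardScaler(), LogisticRegression())
detector.fit(X_train, y_train)

# Step 4: performance evaluation on data the model has not seen.
y_pred = detector.predict(X_test)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Precision and recall are reported instead of plain accuracy because attacks are the minority class here, so a detector that labels everything as normal would still score 90% accuracy.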

1.2 Key Areas and Challenges

The research communities have been actively exploring both the offense and defense sides of data-driven cybersecurity. As more achievements are accomplished, several rapidly emerging research challenges remain to be solved. Here, three aspects are described. They are anomaly detection, trustworthy AI, and privacy preservation.

1.2.1 Anomaly Detection

In the context of cybersecurity, anomalies refer to those abnormal behaviors that harm the information systems. To defend against them, anomaly detection focuses on identifying anomalous behaviors during a period of operations. This can be extended to a few use cases in security, such as intrusion detection, malware detection, phishing detection, spam detection, and defense against zero-day attack.

The goal of intrusion detection is to examine traffic data and classify normal activities and attack behaviors (Xu et al., 2019). More topics about cyber intrusions and intrusion detection will be covered in detail in Chapters 2, 4, 5, and 6.

Malware has been near the forefront of modern cybersecurity issues (Ahmed and Xu, 2022). The detection and prevention of malware has become a major challenge. In many cases, antivirus tools such as Windows Defender utilize ML to scan files and detect malicious patterns. With the recent trend of ransomware cases around the world, it has become more important than ever to have effective anti-malware tools to keep users and organizations safe. Modern malware can take many forms. It can be embedded in document macros, run as shellcode, and much more. Malware is also versatile in the tasks it can complete. It can implant a command-and-control (C2) server beacon, install a keylogger, or ransom a computer by encrypting all of its files. In addition to completing malicious tasks, malware has a secondary objective of avoiding detection by disguising itself as a valid process on a computer and obfuscating itself to confuse the detector.

Phishing is a tactic that is used by threat actors to achieve their goals, such as obtaining credentials of employees or delivering malware (Khonji et al., 2013). With the prevalence of these types of attacks, it is important to detect and stop the attacks at any point in their life cycle. Being able to detect that a website is likely a phishing site could be a useful tool in mitigating the success of the attacks.

In recent years, the poisoning attack has become a new form of attack to compromise networking systems and the learning models they integrate. A data poisoning attack is the intentional act of polluting the data that the algorithms need for training (Huang et al., 2021). It can impact organizations as well as individuals in a negative manner. One real-life example of this type of attack is an attacker changing what an email spam filter might mark as spam. This would allow the attacker to send any kind of email they want without being flagged. Similarly, there are firewall tools that use ML to monitor network traffic and look for malware entering the network. An attacker could use a similar method to evade detection.

1.2.2 Trustworthy Artificial Intelligence

Most of the AI/ML models face the issue of being trustworthy because they are vulnerable to various kinds of attacks (e.g. adversarial examples) (Li et al., 2021a,b). This is because they are not yet explainable owing to the black-box nature of many AI/ML models, especially DL models (Zou et al., 2021), and their uncertainty has not been quantified (Li et al., 2021c). This highlights the importance of additional research activities in the trustworthiness of AI/ML models. Moreover, more studies are needed to systematically understand the trustworthiness of AI/ML models, as well as to advance the state-of-the-art trustworthiness, robustness, and interpretability of AI/ML.

Currently, research challenges of trustworthy AI and adversarial ML include, but are not limited to, the following aspects: quantifying trustworthiness of AI/ML methods against sophisticated attacks, such as adversarial examples, poisoning attacks, or evasive attacks; determining the conditions under which to trust AI/ML models; explaining and interpreting recommendations made by AI/ML models; detecting and mitigating adversarial examples against AI/ML models; and enhancing trustworthiness of AI/ML models by incorporating countermeasures. Trustworthy AI and adversarial ML will be covered in detail in Chapter 8.

1.2.3 Privacy Preservation

The protection of sensitive data is critical. In the field of health study, the data privacy problem is growing because of the challenges in data privacy regulations, privacy leakage by attackers, and pervasive data mining operations. In one study (Na et al., 2018), the authors reported that, in large national physical activity datasets, children and adults could be reidentified by ML when 20 minutes of data with several pieces of demographic information were used. Unfortunately, more incidents of privacy leakage are being reported worldwide. With the enforcement of the General Data Protection Regulation (GDPR) (Voigt and Von dem Bussche, 2017), companies and organizations need to comply with specific terms and conditions to protect the data privacy of European Union citizens.

The increasing attention on security and privacy has motivated the rapid design and implementation of multiple privacy-preserving methods. For instance, federated learning (FL) was designed to train ML models across multiple end nodes without sharing the local data with a centralized server. One article shared that Google has implemented a mobile application that offers privacy-preserving word prediction based on FL (Hard et al., 2018).
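The core idea of FL, training on each node and sharing only model parameters, can be sketched with a small federated averaging loop in NumPy. This is an illustrative simplification and not the system described by Hard et al.; the three clients, their synthetic data, the linear model, and all hyperparameters are assumptions made for the example.

```python
import numpy as np


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Client step: a few gradient-descent passes on private local data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local sample counts."""
    sizes = np.asarray(client_sizes, dtype=float)
    return np.average(np.stack(client_weights), axis=0, weights=sizes)


rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])  # assumed ground truth for the toy task

# Each client holds its own data; only weight vectors leave the client.
clients = []
for n_samples in (40, 60, 100):
    X = rng.normal(size=(n_samples, 3))
    y = X @ true_w + rng.normal(scale=0.01, size=n_samples)
    clients.append((X, y))

global_w = np.zeros(3)
for _ in range(50):  # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])

print("Learned global weights:", np.round(global_w, 3))
```

Note that sharing model updates alone does not guarantee privacy. In practice FL is often combined with techniques such as differential privacy or secure aggregation.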

There are a few modern approaches for privacy preservation, including anonymization, differential privacy (DP), homomorphic encryption (HE), and secure multi-party computation (SMPC). These approaches will be covered in detail in Chapter 7.

1.3 Toolbox to Build Secure and Intelligent Systems

A few scientific computing tools are commonly utilized as the toolbox to build intelligent systems. Here, a few Python libraries are described. They are for ML and DL, privacy-preserving ML, and adversarial ML.

1.3.1 Machine Learning and Deep Learning

Five Python libraries are widely used for ML and scientific computing. They are NumPy,1 SciPy,2 scikit-learn,3 PyTorch,4 and TensorFlow.5

1.3.1.1 NumPy

NumPy is the fundamental library for scientific computing in Python. It is a Python library that provides a multi-dimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and much more (Harris et al., 2020).
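A few of the capabilities listed above are shown in the short sketch below. The traffic matrix and its interpretation as bytes per flow are hypothetical values chosen for the example.

```python
import numpy as np

# Hypothetical measurements: rows are hosts, columns are flows (bytes).
traffic = np.array([
    [120.0, 80.0, 95.0],
    [300.0, 310.0, 290.0],
    [15.0, 20.0, 5000.0],
])

# Basic statistics along an axis.
column_means = traffic.mean(axis=0)
row_maxima = traffic.max(axis=1)

# Masked array: hide suspiciously large values before averaging.
masked = np.ma.masked_array(traffic, mask=traffic > 1000)
masked_mean = float(masked.mean())

# Basic linear algebra: solve the system A @ x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Discrete Fourier transform: find the dominant frequency bin of a signal
# that completes exactly 4 cycles over 64 samples.
t = np.arange(64)
signal = np.sin(2 * np.pi * 4 * t / 64)
dominant_bin = int(np.argmax(np.abs(np.fft.rfft(signal))))

print(masked_mean, x, dominant_bin)
```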

1.3.1.2 SciPy

SciPy is a Python library for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ordinary differential equations (ODE) solvers, and more (Virtanen et al., 2020). SciPy is built to work with NumPy arrays and provides many user-friendly and efficient numerical routines, such as routines for numerical integration and optimization.
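The integration, optimization, and ODE routines mentioned above can be tried on problems with known answers, as in the sketch below. The three toy problems are chosen only because their exact solutions are easy to verify.

```python
import numpy as np
from scipy import integrate, optimize

# Numerical integration: the Gaussian integral over the real line,
# whose exact value is sqrt(pi).
area, abs_error = integrate.quad(lambda x: np.exp(-x ** 2), -np.inf, np.inf)

# Optimization: minimize a simple quadratic whose minimum is at (1, -2).
result = optimize.minimize(
    lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
    x0=np.zeros(2),
)

# ODE solver: dy/dt = -y with y(0) = 1, so y(1) should equal exp(-1).
solution = integrate.solve_ivp(
    lambda t, y: -y, t_span=(0.0, 1.0), y0=[1.0], rtol=1e-8, atol=1e-10
)
y_at_one = float(solution.y[0, -1])

print(area, result.x, y_at_one)
```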

1.3.1.3 Scikit-learn

Scikit-learn is a Python library for ML built on top of SciPy (Pedregosa et al., 2011). It offers multiple modules for ML tasks, including classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.

1.3.1.4 PyTorch

PyTorch is an optimized tensor library for DL. It provides deep neural networks built on an automatic differentiation system (autograd), as well as tensor computation with strong graphics processing unit (GPU) acceleration.
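The two features named above, autograd and device-aware tensor computation, can be sketched as follows. The tiny network and its layer sizes are arbitrary illustrative choices, and the code falls back to the CPU when no GPU is available.

```python
import torch

# Autograd: track operations on x and differentiate y = sum(x^2),
# whose gradient with respect to x is 2 * x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()
gradient = x.grad

# Tensor computation on an accelerator when present, else on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A small feed-forward network (layer sizes chosen arbitrarily).
torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(3, 4),
    torch.nn.ReLU(),
    torch.nn.Linear(4, 1),
).to(device)

batch = torch.randn(8, 3, device=device)
output = model(batch)

print(gradient, tuple(output.shape), device)
```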
