SCADA SECURITY: MACHINE LEARNING CONCEPTS FOR INTRUSION DETECTION AND PREVENTION

Wiley Series On Parallel and Distributed Computing

Series Editor: Albert Y. Zomaya

A complete list of titles in this series appears at the end of this volume.

SCADA-BASED IDs SECURITY

Abdulmohsen Almalawi

King Abdulaziz University

Zahir Tari

RMIT University

Adil Fahad

Al Baha University

Xun Yi

RMIT University

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Abdulmohsen Almalawi, Zahir Tari, Adil Fahad, Xun Yi to be identified as the authors of this work has been asserted in accordance with law.

Registered Office

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office

111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data:

Names: Almalawi, Abdulmohsen, author. | Tari, Zahir, author. | Fahad, Adil, author. | Yi, Xun, author.

Title: SCADA security: machine learning concepts for intrusion detection and prevention / Abdulmohsen Almalawi, King Abdulaziz University, Zahir Tari, RMIT University, Adil Fahad, Al Baha University, Xun Yi, Royal Melbourne Institute of Technology.

Description: Hoboken, NJ, USA: Wiley, 2021. | Series: Wiley series on parallel and distributed computing | Includes bibliographical references and index.

Identifiers: LCCN 2020027876 (print) | LCCN 2020027877 (ebook) | ISBN 9781119606031 (cloth) | ISBN 9781119606079 (adobe pdf) | ISBN 9781119606352 (epub)

Subjects: LCSH: Supervisory control systems. | Automatic control–Security measures. | Intrusion detection systems (Computer security) | Machine learning.

Classification: LCC TJ222 .A46 2021 (print) | LCC TJ222 (ebook) | DDC 629.8/95583–dc23

LC record available at https://lccn.loc.gov/2020027876

LC ebook record available at https://lccn.loc.gov/2020027877

Cover Design: Wiley

Cover Image: © Nostal6ie/Getty Images

Set in 9.5/12.5pt STIXTwoText by SPi Global, Pondicherry, India

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

To our dear parents

Foreword

Preface

Acronyms

1. Introduction

2. Background

3. SCADA-Based Security Testbed

4. Efficient k-Nearest Neighbour Approach Based on Various-Widths Clustering

5. SCADA Data-Driven Anomaly Detection

6. A Global Anomaly Threshold to Unsupervised Detection

7. Threshold Password-Authenticated Secret Sharing Protocols

8. Conclusion

References

Index

FOREWORD

In recent years, SCADA systems have been interfaced with enterprise systems, which has exposed them to the vulnerabilities of the Internet and to security threats. Consequently, there has been an increase in cyber intrusions targeting these systems, and they are becoming an increasingly global and urgent problem. This is because compromising a SCADA system can lead to large financial losses and serious impact on public safety and the environment. As a countermeasure, Intrusion Detection Systems (IDSs) tailored for SCADA are designed to identify intrusions by comparing observable behavior against suspicious patterns, and to notify administrators by raising intrusion alarms. In the existing literature, there are three types of learning methods that are often adopted by IDSs for learning system behavior and building the detection models, namely supervised, semi-supervised, and unsupervised. In supervised learning, an anomaly-based IDS requires class labels for both normal and abnormal behavior in order to build normal/abnormal profiles. However, this type of learning is costly and time-consuming when identifying the class labels for a large amount of data. Hence, semi-supervised learning is introduced as an alternative solution, where an anomaly-based IDS builds only normal profiles from the normal data that is collected over a period of “normal” operations. However, the main drawback of this learning method is that comprehensive and “purely” normal data are not easy to obtain. This is because the collection of normal data requires that a given system operates under normal conditions for a long time, and intrusive activities may occur during this period of the data collection process. On the other hand, relying only on abnormal data for building abnormal profiles is infeasible, since the possible abnormal behavior that may occur in the future cannot be known in advance. Alternatively, and for preventing threats that are new or unknown, an anomaly-based IDS uses unsupervised learning methods to build normal/abnormal profiles from unlabeled data, where prior knowledge about normal/abnormal data is not known. Indeed, this is a cost-efficient method since it can learn from unlabeled data: human expertise is not required to identify the behavior (whether normal or abnormal) for each observation in a large training data set. However, it suffers from low efficiency and poor accuracy.

This book provides the latest research and best practices of unsupervised intrusion detection methods tailored for SCADA systems. In Chapter 3, a framework for a SCADA security testbed based on virtualisation technology is described for evaluating and testing the practicality and efficacy of any proposed SCADA security solution. Undoubtedly, the proposed testbed is a salient part of evaluating and testing, because actual SCADA systems cannot be used for such purposes: availability and performance, which are the most important issues, are most likely to be affected when analysing vulnerabilities, threats, and the impact of attacks. In the literature, the k-Nearest Neighbour (k-NN) algorithm was found to be one of the top ten most interesting and best algorithms for data mining in general, and in particular it has demonstrated promising results in anomaly detection. However, the traditional k-NN algorithm suffers from high computational cost and the “curse of dimensionality”, since it needs a large amount of distance calculations. Chapter 4 describes a novel k-NN algorithm that works efficiently on high-dimensional data of various distributions. In addition, an extensive experimental study and comparison with several algorithms using benchmark data sets were conducted. Chapters 5 and 6 introduce the practicality and possibility of unsupervised intrusion detection methods tailored for SCADA systems, and demonstrate the accuracy of unsupervised anomaly detection methods that build normal/abnormal profiles from unlabeled data. Finally, Chapter 7 describes two authentication protocols to efficiently protect SCADA systems, and Chapter 8 concludes with the various solutions/methods described in this book, with the aim of outlining possible future extensions of these methods.

PREFACE

Supervisory Control and Data Acquisition (SCADA) systems have been integrated to control and monitor industrial processes and our daily critical infrastructures, such as electric power generation, water distribution, and waste water collection systems. This integration adds valuable input to improve the safety of the process and the personnel, as well as to reduce operation costs. However, any disruption to SCADA systems could result in financial disasters or may lead to loss of life in a worst case scenario. Therefore, in the past, such systems were secure by virtue of their isolation and only proprietary hardware and software were used to operate these systems. In other words, these systems were self-contained and totally isolated from the public network (e.g., the Internet). This isolation created the myth that malicious intrusions and attacks from the outside world were not a big concern, and such attacks were expected to come from the inside. Therefore, when developing SCADA protocols, the security of the information system was given no consideration.

In recent years, SCADA systems have begun to shift away from using proprietary and customized hardware and software to using Commercial-Off-The-Shelf (COTS) solutions. This shift has increased their connectivity to the public networks using standard protocols (e.g., TCP/IP). In addition, there is decreased reliance on specific vendors. Undoubtedly, this increases productivity and profitability but will, however, expose these systems to cyber threats. A low percentage of companies carry out security reviews of COTS applications that are being used. A high percentage of other companies do not perform security assessments, and thus rely only on the vendor reputation or the legal liability agreements, while some may have no policies at all regarding the use of COTS solutions.

The adoption of COTS solutions is a time- and cost-efficient means of building SCADA systems. In addition, COTS-based devices are intended to operate on traditional Ethernet networks and the TCP/IP stack. This feature allows devices from various vendors to communicate with each other and it also helps to remotely supervise and control critical industrial systems from any place and at any time using the Internet. Moreover, wireless technologies can efficiently be used to provide mobility and local control for multivendor devices at a low cost for installation and maintenance. However, the convergence of state-of-the-art communication technologies exposes SCADA systems to all the inherent vulnerabilities of these technologies.

An awareness of the potential threats to SCADA systems and the need to reduce risk and mitigate vulnerabilities has recently become a hot research topic in the security area. Indeed, the increase of SCADA network traffic makes the manual monitoring and analysis of traffic data by experts time-consuming, infeasible, and very expensive. For this reason, researchers have begun to employ Machine Learning (ML)-based methods to develop Intrusion Detection Systems (IDSs) by which normal and abnormal behaviors of network traffic are automatically learned with no or limited domain expert interference. In addition to the acceptance of IDSs as a fundamental piece of security infrastructure in detecting new attacks, they are cost-efficient solutions for monitoring network behaviors with high-accuracy performance. Therefore, IDSs have been adopted in SCADA systems. The type of information source and detection methods are the salient components that play a major role in developing an IDS. The network traffic and events at system and application levels are examples of information sources. The detection methods are broadly categorized into two types in terms of detection: signature-based and anomaly-based. The former can detect only an attack whose signature is already known, while the latter can detect unknown attacks by looking for activities that deviate from an expected pattern (or behavior). The differences between the nature and characteristics of traditional IT and SCADA systems have motivated security researchers to develop SCADA-specific IDSs. Recent research on this topic found that the modelling of measurement and control data, called SCADA data, is promising as a means of detecting malicious attacks intended to jeopardize SCADA systems. However, the development of efficient and accurate detection models/methods is still an open research area.

Anomaly-based detection methods can be built by using three modes, namely supervised, semi-supervised, or unsupervised. The class labels must be available for the first mode; however, this type of learning is costly and time-consuming because domain experts are required to label hundreds of thousands of data observations. The second mode is based on the assumption that the training data set represents only one behavior, either normal or abnormal. There are a number of issues pertaining to this mode. The system has to operate for a long time under normal conditions in order to obtain purely normal data that comprehensively represent normal behaviors. However, there is no guarantee that no anomalous activity will occur during the data collection period. On the other hand, it is challenging to obtain a training data set that covers all possible anomalous behaviors that can occur in the future. Alternatively, the unsupervised mode can be the most popular form of anomaly-based detection models that addresses the aforementioned issues, where these models can be built from unlabeled data without prior knowledge about normal/abnormal behaviors. However, low efficiency and accuracy are challenging issues of this type of learning.

There are books in the market that describe the various SCADA-based unsupervised intrusion detection methods; they are, however, relatively unfocused and lack detail on the methods for SCADA systems in terms of detection approaches, implementation, data collection, evaluation, and intrusion response. Briefly, this book provides the reader with the tools that are intended to provide practical development and implementation of SCADA security in general. Moreover, this book introduces solutions to practical problems that SCADA intrusion detection systems experience when building unsupervised intrusion detection methods from unlabeled data. The major challenge was to bring various aspects of SCADA intrusion detection systems, such as building unsupervised anomaly detection methods and evaluating their respective performance, under a single umbrella.

The target audience of this book is composed of professionals and researchers working in the field of SCADA security. At the same time, it can be used by researchers who could be interested in SCADA security in general and building SCADA unsupervised intrusion detection systems in particular. Moreover, this book may aid them to gain an overview of a field that is still largely dominated by conference publications and a disparate body of literature.

The book has seven main chapters that are organized as follows. In Chapter 3, the book deals with the establishment of a SCADA security testbed that is a salient part of evaluating and testing the practicality and efficacy of any proposed SCADA security solution. This is because evaluation and testing using actual SCADA systems are not feasible, since their availability and performance are most likely to be affected. Chapter 4 looks in much more detail at the novel efficient k-Nearest Neighbour approach based on Various-Widths Clustering, named kNNVWC, to efficiently address the infeasibility of the use of the k-nearest neighbour approach with large and high-dimensional data. In Chapter 5, a novel SCADA Data-Driven Anomaly Detection (SDAD) approach is described in detail. This chapter demonstrates the practicality of the clustering-based method to extract proximity-based detection rules that comprise a tiny portion compared to the training data, while maintaining the representative nature of the original data. Chapter 6 looks in detail at a novel promising approach, called GATUD (Global Anomaly Threshold to Unsupervised Detection), that can improve the accuracy of unsupervised anomaly detection approaches that are compliant with the following assumptions: (i) the number of normal observations in the data set vastly outnumbers the abnormal observations and (ii) the abnormal observations must be statistically different from normal ones. Finally, Chapter 7 looks at the authentication protocols in SCADA systems, which enable secure communication between all the components of such systems. This chapter describes two efficient TPASS protocols for SCADA systems: one is built on two-phase commitment and has lower computation complexity and the other is based on zero-knowledge proof and has fewer communication rounds. Both protocols are particularly efficient for the client, who only needs to send a request and receive a response.

ACRONYMS

AGA American Gas Association

ASCII American Standard Code for Information Interchange

COTS Commercial-Off-The-Shelf

CORE Common Open Research Emulator

CRC Cyclic Redundancy Check

DLL Dynamic Link Library

DNP Distributed Network Protocol

DOS Denial Of Service

EDMM Ensemble-based Decision-Making Model

Ek-NN Exhaustive k-Nearest Neighbor

EMANE Extendable Mobile Ad-hoc Network Emulator

EPANET Environmental Protection Agency Network

FEP Front End Processor

GATUD Global Anomaly Threshold to Unsupervised Detection

HMI Human Machine Interface

k-NN k-Nearest Neighbor

kNNVWC k-NN based on Various-Widths Clustering

IDS Intrusion Detection System

IED Intelligent Electronic Device

IP Internet Protocol

IT Information Technology

LAN Local Area Network

NISCC National Infrastructure Security Coordination Center

NS2 Network Simulator 2

NS3 Network Simulator 3

OMNET Objective Modular Network Testbed

OPNET Optimized Network Engineering Tool

OST Orthogonal Structure Tree

OSVDB Open Source Vulnerability DataBase

PCA Principal Component Analysis

PLC Programmable Logic Controller

PLS Partial Least Squares

RTU Remote Terminal Unit

SCADA Supervisory Control And Data Acquisition

SCADAVT SCADA security testbed based on Virtualization Technology

SDAD SCADA Data-driven Anomaly Detection

TCP Transmission Control Protocol

TPASS Threshold Password-Authenticated Secret Sharing

UDP User Datagram Protocol

USB Universal Serial Bus

1 INTRODUCTION

The aim of this introductory chapter is to motivate the extensive research work carried out in this book, highlighting the existing solutions and their limitations, and putting in context the innovative work and ideas described in this book.

1.1 OVERVIEW

Supervisory Control and Data Acquisition (SCADA) systems have been integrated to control and monitor industrial processes and our daily critical infrastructures such as electric power generation, water distribution and waste water collection systems. This integration adds valuable input to improve the safety of the process and the personnel and to reduce operation costs (Boyer, 2009). However, any disruption to SCADA systems can result in financial disasters or may lead to loss of life in a worst case scenario. Therefore, in the past, such systems were secure by virtue of their isolation and only proprietary hardware and software were used to operate these systems. In other words, these systems were self-contained and totally isolated from the public network (e.g., the Internet). This isolation created the myth that malicious intrusions and attacks from the outside world were not a big concern and that such attacks were expected to come from the inside. Therefore, when developing SCADA protocols, the security of the information system was given no consideration.

In recent years, SCADA systems have begun to shift away from using proprietary and customized hardware and software to using Commercial-Off-The-Shelf (COTS) solutions. This shift has increased their connectivity to the public networks using standard protocols (e.g., TCP/IP). In addition, there is decreased reliance on a single vendor. Undoubtedly, this increases productivity and profitability but will, however, expose these systems to cyber threats (Oman et al., 2000). According to a survey published by the SANS Institute (Bird and Kim, 2012), only 14% of organizations carry out security reviews of COTS applications that are being used, while over 50% of other organizations do not perform security assessments and rely only on vendor reputation or the legal liability agreements, or they have no policies at all regarding the use of COTS solutions.

The adoption of COTS solutions is a time- and cost-efficient means of building SCADA systems. In addition, COTS-based devices are intended to operate on traditional Ethernet networks and the TCP/IP stack. This feature allows devices from various vendors to communicate with each other, and also helps to remotely supervise and control critical industrial systems from any place and at any time using the Internet. Moreover, wireless technologies can efficiently be used to provide mobility and local control for multivendor devices at a low cost for installation and maintenance. However, the convergence of state-of-the-art communication technologies exposes SCADA systems to all the inherent vulnerabilities of these technologies. In what follows, we discuss how the potential cyber-attacks against traditional IT can also be possible against SCADA systems.

• Denial of Service (DoS) attacks. This is a potential attack on any Internet-connected device where a large number of spurious packets are sent to a victim in order to consume excessive amounts of endpoint network bandwidth. A packet flooding attack (Houle et al., 2001) is often used as another term for a DoS attack. This type of attack delays or totally prevents the victim from receiving the legitimate packets (Householder et al., 2001). SCADA networking devices that are exposed to the Internet, such as routers, gateways and firewalls, are susceptible to this type of attack. Long et al. (2005) proposed two models of DoS attacks on a SCADA network using reliable simulation. The first model is a direct attack launched against an endpoint (e.g., a controller or a customer-edge router connecting to the Internet), while the second model is an indirect attack, where the DoS attack is launched on a router (on the Internet) that is located in the path between the plant and the endpoint. In this study, it was found that DoS attacks that were launched directly (or indirectly) cause excessive packet losses. Consequently, a controller that receives the measurement and control data late or not at all from the devices deployed in the field will make a decision based on old data.

• Propagation of malicious code. Such types of attack can occur in various forms such as viruses, Trojan horses, and worms. They are potential threats to SCADA systems that are directly (or indirectly) connected to the Internet. Unlike worms, viruses and Trojans require a human action to be initiated. However, all these threats are highly likely as long as the personnel are connected to the Internet through the corporate network, which is directly connected to the SCADA system, or if they are allowed to plug their personal USB devices into the corporate workstations. Therefore, a user can be deceived into downloading a contaminated file containing a virus or installing software that appears to be useful. Shamoon (Bronk and Tikk-Ringas, 2013), Stuxnet (Falliere et al., 2011), Duqu (Bencsáth et al., 2012), and Flame (Munro, 2012) are examples of such threats targeting SCADA systems and the oil and energy sectors.

• Inside threats. Employees who are disgruntled or intend to divulge valuable information for malicious reasons can pose real threats and risks that should be taken seriously. This is because employees usually have unrestricted access to the SCADA systems and also know the configuration settings of these systems. For instance, the attack on the sewage treatment system in Maroochy Shire, South-East Queensland (Australia) in 2001 (Slay and Miller, 2007) is an example of an attack that was launched by a disgruntled employee, where the attacker took over the control devices of a SCADA system and caused 800,000 litres of raw sewage to spill out into local parks and rivers.

Figure 1.1 SCADA vulnerabilities revealed since 2001 in OSVDB.

• Unpatched vulnerabilities. The existence of vulnerabilities is highly expected in any system, and it is known that hackers routinely exploit unpatched vulnerabilities to obtain access to and control of the targeted system. Even though the vendors immediately release the patches for the identified vulnerabilities, it is challenging to install these patches on SCADA systems that run twenty-four-by-seven. Therefore, such systems will remain vulnerable for weeks or months. As depicted in Figure 1.1, and according to the independent and Open Source Vulnerability DataBase (OSVDB)¹ for the security community, vulnerabilities targeting SCADA systems have substantially increased over the past three years since 2011.

• Nontechnical (social engineering) attacks. This type of attack can bypass state-of-the-art security technologies that cost millions of dollars. In general, the attackers initially try to obtain sensitive information such as the design, operations, or security controls of the targeted SCADA system. There are a number of ways to gather such information. If the network access credentials of ex-employees are not immediately disabled, they can be revealed to another party in order to profit from the information, or out of a desire for revenge. Alternatively, such critical information can be easily obtained from current employees, either by building a trust relationship with them or by learning some information about a naive employee who is allowed to remotely control and monitor the systems via the Internet. All of this can help the attacker to answer the expected questions when calling up the central office to tell them that he or she forgot the network access credentials and that assistance is needed to connect to the field network.

¹ http://osvdb.org/

The security concepts that have been extensively used in traditional IT systems (e.g., management, filtering, encryption, and intrusion detection) can be adapted to mitigate the risk of the aforementioned potential threats against SCADA systems. However, these concepts cannot be directly applied without considering the nature of SCADA systems. For instance, the resource constraints of SCADA devices, such as low bandwidth, processing power, and memory, complicate the integration of complex cryptography, especially with legacy devices. All the SCADA protocols were developed without any consideration given to information security and, therefore, they lack authentication and integrity. Two solutions to secure SCADA communications are: placing the cryptographic technologies at each end of the communication medium (American Gas Association (AGA), 2006; Tsang and Smith, 2008), or directly integrating them into the protocol, such as a secure DNP3 that protects the communication between master stations and outstations such as PLCs, RTUs, and IEDs (Majdalawieh et al., 2006).

Apart from the efforts to authenticate and encrypt SCADA communication links, it is still an open research challenge to secure the tens of SCADA protocols that are being used or to develop security modules to protect the communication link between two parties. AGA (American Gas Association (AGA), 2006) highlighted the challenges in building security modules, which can be broadly summarized into two points: (i) additional latency can be introduced by a secure protocol and (ii) a sophisticated key management system requires high bandwidth and additional communication channels that SCADA communication links lack.

Similarly, the traffic filtering process between a SCADA network and a corporate network using firewalls is a considerable countermeasure to mitigate the potential threats. However, although modern firewalls are efficient for analysing traditional IT traffic, they are incapable of in-depth analysis of the SCADA protocols. To design firewalls tailored to SCADA systems, the UK government's National Infrastructure Security Co-ordination Center (NISCC) published its guidelines for the appropriate use of firewalls in SCADA networks (Byres et al., 2005). It was proposed that a microfirewall should be embedded within each SCADA device to allow only the traffic relevant to the host devices. However, the limited computational power of SCADA devices can make it challenging to support this type of firewall.

Firewalls can be configured using restrictive rules to control traffic in and out of the SCADA network; however, this will conflict with the feature allowing remote maintenance and operation by vendors and operators. Additionally, firewalls are assumed to be physically placed between the communication endpoints to examine each packet prior to passing it to the receiver. This may cause a latency that is not acceptable in real-time networks. Since firewalls do not know the “normal” operational behavior of the targeted system, they cannot stop malicious control messages, which may drive the targeted system from its expected and normal behavior, when they are sent from a compromised unit that is often used to remotely control and monitor SCADA networks. Moreover, firewalls are unable to stop attacks that are initiated internally using already-implanted malicious code or directly by an employee. Stuxnet (Falliere et al., 2011), Duqu (Bencsáth et al., 2012), and Flame (Munro, 2012) are recent cyber-attacks that were initiated from inside automation systems. Therefore, relying only on firewalls is not sufficient to mitigate the potential threats to SCADA systems. Hence, an additional defense needs to be installed to monitor already predefined (or unexpected) patterns for either network traffic or system behavior in order to detect any intrusion attempt. The system using such a method is known in the information security area as an Intrusion Detection System (IDS).

There is no security countermeasure that can completely protect the target systems from potential threats, although a number of countermeasures can be used in conjunction with each other in order to build a robust security system. An IDS (Intrusion Detection System) is one of the security methods that has demonstrated promising results in detecting malicious activities in traditional IT systems. The source of audit data and the detection methods are the main, salient parts in the development of an IDS. The network traffic, system-level events and application-level activities are the most usual sources of audit data. The detection methods are categorized into two strategies: signature-based and anomaly-based. The former searches for an attack whose signature is already known, while the latter searches for activities that deviate from an expected pattern or from the predefined normal behavior.
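The contrast between the two strategies can be sketched in a few lines of Python. This is a minimal illustration and not a method from this book: the byte patterns in `KNOWN_SIGNATURES`, the function names, and the three-standard-deviation rule used as the "expected pattern" are all assumptions made for the example.

```python
from statistics import mean, stdev

# Hypothetical signatures: byte patterns of already-known attacks (illustrative only).
KNOWN_SIGNATURES = [b"\x90\x90\x90\x90", b"DROP TABLE"]


def signature_based(payload: bytes) -> bool:
    """Flag a payload only if it contains an already-known attack pattern."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def fit_normal_profile(values):
    """Summarise normal behaviour as (mean, standard deviation) of past readings."""
    return mean(values), stdev(values)


def anomaly_based(value, profile, k=3.0):
    """Flag a reading that deviates more than k standard deviations from normal."""
    mu, sigma = profile
    return abs(value - mu) > k * sigma


# Assumed sensor readings collected during normal operation.
profile = fit_normal_profile([50.0, 51.0, 49.5, 50.5, 50.0])
print(signature_based(b"read coil 3"))   # unknown pattern, not flagged
print(anomaly_based(75.0, profile))      # far from the normal profile, flagged
```

The signature check can never flag a payload whose pattern is not in its list, whereas the anomaly check needs no list of attacks but depends entirely on how well the profile captures normal behavior.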

Due to the differences between the nature and characteristics of traditional IT and SCADA systems, there has been a need for the development of SCADA-specific IDSs, and in recent years this has become an interesting research area. In the literature, they vary in terms of the information source being used and in the analysis strategy. Some of them use SCADA network traffic (Linda et al., 2009; Cheung et al., 2007; Valdes and Cheung, 2009), system-level events (Yang et al., 2006), or measurement and control data (values of sensors and actuators) (Rrushi et al., 2009b; Fovino et al., 2010a, 2012; Carcano et al., 2011) as the information source to detect malicious, uncommon or inappropriate actions of the monitored system using various analysis strategies which can be signature-based, anomaly-based or a combination of both.

It is believed that modeling of measurement and control data is a promising means of detecting malicious attacks intended to jeopardize a targeted SCADA system. For instance, the Stuxnet worm is a sophisticated attack that targets a control system and initially could not be detected by the antivirus software that was installed on the victim system (Falliere et al., 2011). This is because it used zero-day vulnerabilities and validated its drivers with trusted stolen certificates. Moreover, it could hide its modifications using sophisticated PLC rootkits. However, the final goal of this attack cannot be hidden, since the manipulation of measurement and control data will make the behavior of the targeted system deviate from previously seen ones. This is the main motivation of this book, namely to explain in detail how to design SCADA-specific IDSs using SCADA data (measurement and control data), thus enabling the reader to build/implement an information source that monitors the internal behavior of a given system and protects it from malicious actions that are intended to sabotage or disturb the proper functionality of the targeted system.


As previously indicated, the analysis/modeling method, which will be used to build the detection model using SCADA data, is the second most important part after the selection of the information source when designing an Intrusion Detection System (IDS). It is difficult to build the “normal” behavior of a given system using observations of the raw SCADA data because, firstly, it cannot be guaranteed that all observations represent one behavior as either “normal” or “abnormal”, and therefore domain experts are required for the labeling of each observation, and this process is prohibitively expensive; secondly, in order to obtain purely “normal” observations that comprehensively represent “normal” behavior, a given system must be run for a long period under normal conditions, and this is not practical; and, finally, it is challenging to obtain observations that will cover all possible abnormal behavior that can occur in the future. Therefore, we strongly argue that the design of a SCADA-specific IDS that uses SCADA data and operates in unsupervised mode, where labeled data is not available, has great potential as a means of addressing the aforementioned issues. The unsupervised IDS can be a time- and cost-efficient means of building detection models from unlabeled data; however, this requires an efficient and accurate method to differentiate between the normal and abnormal observations without the involvement of experts, which is costly and prone to human error. Then, from observations of each behavior, either normal or abnormal, the detection models can be built.
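To make the idea of separating normal from abnormal observations without labels concrete, the following Python sketch scores each observation by its mean distance to its nearest neighbours and marks the highest-scoring fraction as abnormal. It is a generic illustration, not the approach developed later in this book: the function names, the `contamination` parameter, and the sample readings are hypothetical, and the sketch assumes that abnormal observations are rare and lie far from the normal ones.

```python
import math


def knn_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def label_unsupervised(points, k=2, contamination=0.1):
    """Label the highest-scoring fraction of points as abnormal.

    `contamination` is an assumed share of abnormal observations, not a
    learned value; points tied with the cutoff score are also marked abnormal.
    """
    scores = knn_scores(points, k)
    n_abnormal = max(1, round(contamination * len(points)))
    cutoff = sorted(scores, reverse=True)[n_abnormal - 1]
    return ["abnormal" if s >= cutoff else "normal" for s in scores]


# Hypothetical (pressure, flow) readings; the last one is far from the rest.
readings = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0),
            (0.8, 0.9), (1.1, 1.1), (0.9, 0.8), (1.0, 0.9), (8.0, 8.0)]
print(label_unsupervised(readings))
```

The exhaustive neighbour search compares every pair of observations, so its cost grows quadratically with the size of the data set, which is one reason efficiency matters for this family of methods.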

1.2 EXISTING SOLUTIONS

A layered defense could be the best security mechanism, where each layer in the computer and network system is provided with a particular security countermeasure. For instance, organizations deploy firewalls between their private networks and others to prevent unauthorized users from entering. However, firewalls cannot address all risks and vulnerabilities. Therefore, an additional security layer is required. The last component at the security level is the IDS, which is used to monitor intrusive activities (Pathan, 2014). The concept of an IDS is based on the assumption that the behavior of intrusive activities is noticeably distinguishable from normal behavior (Denning, 1987). Over the last decade, compared to other security countermeasures, the deployment of IDS technology has attracted great interest from the traditional IT systems domain (Pathan, 2014). The promising functionalities of this technology have encouraged researchers and practitioners concerned with the security of SCADA systems to adopt this technology while taking into account the nature and characteristics of SCADA systems.

To design an IDS, two main processes are often considered: first, the selection of the information source (e.g., network-based, application-based) to be used, through which anomalies can be detected; second, the building of the detection models using the specified information source. SCADA-specific IDSs can be broadly grouped into three categories in terms of the latter process: signature-based detection (Digitalbond, 2013), anomaly detection (Linda et al., 2009; Kumar et al., 2007; Valdes and Cheung, 2009; Yang et al., 2006; Ning et al., 2002; Gross et al., 2004), and specification-based detection (Cheung et al., 2007; Carcano et al., 2011; Fovino et al., 2010a; Fernandez et al., 2009). Recently, several signature-based rules (Digitalbond, 2013) have been designed to specifically detect particular attacks on SCADA protocols. The rules can perfectly detect known attacks at the SCADA network level. To detect unknown attacks at the SCADA network level, a number of methods have been proposed. Linda et al. (2009) suggested a window-based feature extraction method to extract important features of SCADA network traffic and then used a feed-forward neural network with the backpropagation training algorithm for modeling the boundaries of normal behavior. However, this method suffers from the large amount of execution time required in the training phase, in addition to the need for relearning the boundaries of normal behavior upon receiving new behavior.

The model-based detection method proposed in Valdes and Cheung (2009) illustrates communication patterns. This is based on the assumption that the communication patterns of control systems are regular and predictable because SCADA has specific services as well as interconnected and communicating devices that are already predefined. This method is useful in providing border monitoring of the requested services and devices. Similarly, Gross et al. (2004) proposed a collaborative method, named “selecticast”, which uses a centralized server to disperse among ID sensors any information about activities coming from suspicious IPs. Ning et al. (2002) identify causal relationships between alerts using prerequisites and consequences. In essence, these methods fail to detect high-level control attacks, which are the most difficult threats to combat successfully (Wei et al., 2011). Furthermore, SCADA network level methods are not concerned with the operational meaning of the process parameter values, which are carried by SCADA protocols, as long as they are not violating the specifications of the protocol being used, nor with a broader picture of the monitored system.

Thus, analytical models based on the full system’s specifications have been suggested in the literature. Fovino et al. (2010a) proposed an analytical method to identify critical states for specific correlated process parameters. The developed detection models are then used to detect malicious actions (such as high-level control attacks) that try to drive the targeted system into a critical state. In the same direction, Carcano et al. (2011) and Fovino et al. (2012) extended this idea by identifying critical states for specific correlated process parameters. Then, each critical state is represented by a multivariate vector, each vector being a reference point to measure the degree of criticality of the current system. For example, when the current system state is close to any critical state, it shows that the system is approaching a critical state. However, the critical state-based methods require full specifications of all correlated process parameters in addition to their respective acceptable values. Moreover, the analytical identification of critical states for a relatively large number of correlated process parameters is time-consuming and difficult. This is because the complexity of the interrelationship among these parameters is proportional to their number. Furthermore, any change in the system brought about by adding or removing process parameters will require the same effort again. Human errors are also highly likely in the process of identifying critical system states.
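The idea of using critical-state vectors as reference points can be illustrated with a minimal sketch. The two process parameters, the critical-state values, and the choice of Euclidean distance are assumptions made purely for illustration; they are not the metric or the states defined in the cited works.

```python
import math

# Hypothetical critical states for two correlated process parameters
# (tank level in %, pump speed in %). The values are invented for illustration.
CRITICAL_STATES = [
    (95.0, 100.0),  # tank almost full while the pump runs at full speed
    (5.0, 0.0),     # tank almost empty while the pump is stopped
]

def criticality_distance(state, critical_states=CRITICAL_STATES):
    """Return the Euclidean distance from `state` to the nearest critical state."""
    return min(math.dist(state, c) for c in critical_states)

safe = criticality_distance((50.0, 50.0))
risky = criticality_distance((93.0, 98.0))
print(f"distance of mid-range state to nearest critical state: {safe:.2f}")
print(f"distance of near-critical state to nearest critical state: {risky:.2f}")
```

A smaller distance indicates that the system is approaching a critical state. The sketch also shows the limitation noted above: every critical state has to be written down by hand, and the list must be revisited whenever a process parameter is added or removed.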


Due to the aforementioned issues relating to analytical methods, SCADA data-driven methods have been proposed to capture the mechanistic behavior of SCADA systems without a knowledge of the physical behavior of the systems. It was experimentally found by Wenxian and Jiesheng (2011) that operational SCADA data for wind turbine systems are useful if they are properly analyzed to indicate the condition of the system that is being supervised. A number of SCADA data-driven methods for anomaly detection have appeared in the literature. Jin et al. (2006) extended the set of invariant models by a value range model to detect anomalous values for a particular process parameter. A predetermined threshold is proposed for each parameter and any value exceeding this threshold is considered as anomalous. This method can detect the anomalous values of an individual process parameter. However, the value of an individual process parameter may not be abnormal on its own but, in combination with other process parameters, may produce an abnormal observation, which very rarely occurs. These types of parameter are called multivariate parameters and are assumed to be directly (or indirectly) correlated. Rrushi et al. (2009b) applied probabilistic models to estimate the normalcy of the evolution of values of multivariate process parameters. Similarly, Marton et al. (2013) proposed a data-driven method to detect abnormal behavior in industrial equipment, where two multivariate analysis methods, namely principal component analysis (PCA) and partial least squares (PLS), are combined to build the detection models. Neural network-based methods have been proposed to model the normal behavior for various SCADA applications. For instance, Gao et al. (2010) proposed a neural-network-based intrusion detection system for water tank control systems. In a different application, this method has been adapted by Zaher et al. (2009) to build the normal behavior for a wind turbine to identify faults or unexpected behavior (anomalies).
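The difference between a per-parameter value range check and a multivariate check can be shown with a small sketch. The data (pump speed and flow rate with flow roughly equal to half the speed), the function names, and the nearest-observation distance used as the multivariate check are all assumptions for illustration; none of them is taken from the cited methods.

```python
import math

# Hypothetical training observations: (pump_speed, flow_rate), with flow = speed / 2.
normal = [(float(s), s / 2) for s in range(0, 101, 5)]

def in_value_range(obs, data):
    """Univariate check: every parameter lies within its own observed min/max."""
    return all(
        min(row[i] for row in data) <= v <= max(row[i] for row in data)
        for i, v in enumerate(obs)
    )

def nearest_distance(obs, data):
    """Multivariate check: distance to the closest training observation."""
    return min(math.dist(obs, row) for row in data)

suspicious = (100.0, 0.0)  # full pump speed but no flow
print("passes per-parameter range check:", in_value_range(suspicious, normal))
print(f"distance to nearest normal observation: {nearest_distance(suspicious, normal):.1f}")
```

Each value of the suspicious observation is individually within its range, so a value range model accepts it, while its distance from every previously seen combination is large.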

Although the results for the aforementioned SCADA data-driven methods are promising, they work only in supervised or semisupervised modes. The former mode is applicable when the labels for both normal and abnormal behavior are available. Domain experts need to be involved in the labeling process, but it is costly and time-consuming to label hundreds of thousands of data observations (instances). In addition, it is difficult to obtain abnormal observations that comprehensively represent anomalous behavior. In the latter mode, data from only one class (either normal or abnormal) is required to train the model. Obtaining a normal training data set can be done by running a target system under normal conditions, and the collected data is assumed to be normal. To obtain purely normal data that comprehensively represent normal behavior, the system has to operate for a long time under normal conditions. However, this cannot be guaranteed, and therefore any anomalous activity occurring during this period will be learned as normal. On the other hand, it is challenging to obtain a training data set that covers all possible anomalous behavior that can occur in the future.

Unlike supervised, semisupervised, and analytical solutions, this book is about designing unsupervised anomaly detection methods, where experts are not required to prepare a labeled training data set or analytically define the boundaries of normal/abnormal behavior of a given system. In other words, this book is interested in developing a robust unsupervised intrusion detection system that automatically identifies, from unlabeled SCADA data, both normal and abnormal behavior, and then extracts the proximity-detection rules for each behavior.

1.3 SIGNIFICANT RESEARCH PROBLEMS

In recent years, many researchers and practitioners have turned their attention to SCADA data to build data-driven methods that are able to learn the mechanistic behavior of SCADA systems without a knowledge of the physical behavior of these systems. Such methods have shown a promising ability to detect anomalies, malfunctions, or faults in SCADA components. Nonetheless, it remains a relatively open research area to develop unsupervised SCADA data-driven detection methods that can be time- and cost-efficient for learning detection methods from unlabeled data. However, such methods often have a low detection accuracy. The focus of this book is the design of an efficient and accurate unsupervised SCADA data-driven IDS, and four main research problems are formulated here for this purpose. Three of these pertain to the development of methods that are used to build a robust unsupervised SCADA data-driven IDS. The fourth research problem relates to the design of a framework for a SCADA security testbed that is intended to be an evaluation and testing environment for SCADA security in general and for the proposed unsupervised IDS in particular.

1. How to design a SCADA-based testbed that is a realistic alternative for real SCADA systems so that it can be used for proper SCADA security evaluation and testing purposes. An evaluation of the security solutions of SCADA systems is important. However, actual SCADA systems cannot be used for such a purpose because availability and performance, which are the most important issues, are most likely to be affected when analyzing vulnerabilities, threats, and the impact of attacks. To address this problem, “real SCADA testbeds” have been set up for evaluation purposes, but they are costly and beyond the reach of most researchers. Similarly, small real SCADA testbeds have also been set up; however, they are still proprietary and location-constrained. Unfortunately, such labs are not available to researchers and practitioners interested in working on SCADA security. Hence, the design of a SCADA-based testbed will be very useful for evaluation and testing purposes. Two essential parts could be considered here: SCADA system components and a controlled environment. In the former, both high-level and field-level components will be considered and the integration of a real SCADA protocol will be devised to realistically produce SCADA network traffic. In the latter, it is important to model a controlled environment such as smart grid power or water distribution systems so that we can produce realistic SCADA data.

2. How to make an existing suitable data mining method deal with large high-dimensional data. Due to the specific nature of the unsupervised SCADA systems, an IDS will be designed here based on SCADA data-driven methods from the unlabeled SCADA data which, it is highly expected, will contain anomalous data; the task is intended to give an anomaly score for each observation. The k-Nearest Neighbour (k-NN) algorithm was found, from an extensive literature review, to be one of the top ten most interesting and best algorithms for data mining in general (Wu et al., 2008), and, in particular, it has demonstrated promising results in anomaly detection (Chandola et al., 2009). This is because an anomalous observation is assumed to have a neighborhood in which it will stand out, while a normal observation will have a neighborhood where all its neighbors will be exactly like it. However, having to examine all observations in a data set in order to find the k-NN for an observation x is the main drawback of this method, especially with a vast amount of high-dimensional data. To utilize this method efficiently, the reduction of computation time in finding the k-NN is the aim of this research problem that this book endeavors to address.

3. How to learn clustering-based proximity rules from unlabeled SCADA data for SCADA anomaly detection methods. To build efficient SCADA data-driven detection methods, the efficient k-NN algorithm proposed in problem 2 is used to assign an anomaly score to each observation in the training data set. However, it is impractical to use all the training data in the anomaly detection phase. This is because a large memory capacity is needed to store all scored observations and it is computationally infeasible to compute the similarity between these observations and each current new observation. Therefore, it would be ideal to efficiently separate the observations that are highly expected to be consistent (normal) from those expected to be inconsistent (abnormal). Then, a few proximity detection rules for each behavior, whether consistent or inconsistent, are automatically extracted from the observations that belong to that behavior.

4. How to compute a global and efficient anomaly threshold for unsupervised detection methods. Anomaly-scoring-based and clustering-based methods are among the best-known ones that are often used to identify the anomalies in unlabeled data. With anomaly-scoring-based methods (Eskin et al., 2002; Angiulli and Pizzuti, 2002; Zhang and Wang, 2006), all observations in a data set are given an anomaly score and therefore actual anomalies are assumed to have the highest scores. The key problem is how to find the near-optimal cut-off threshold that minimizes the false positive rate while maximizing the detection rate. On the other hand, clustering-based methods (Portnoy et al., 2001; Mahoney and Chan, 2003a; Jianliang et al., 2009; Münz et al., 2007) group similar observations together into a number of clusters, and anomalies are identified by making use of the fact that those anomalous observations will be considered as outliers, and therefore will not be assigned to any cluster, or they will be grouped in small clusters that have some characteristics that are different from those of normal clusters. However, the detection of anomalies is controlled through several parameter choices within each detection method used. For instance, suppose the top 50% of the observations that have the highest anomaly scores are assumed to be anomalies. In this case, both detection and false positive rates will be much higher. Similarly, labeling a low percentage of the largest clusters as normal in clustering-based intrusion detection methods will result in higher detection and false positive rates. Therefore, the effectiveness of unsupervised intrusion detection methods is sensitive to parameter choices, especially when the boundaries between normal and abnormal behavior are not clearly distinguishable. Thus, it would be interesting to identify the observations whose anomaly scores are extreme and significantly deviate from others, and then such observations are assumed to be “abnormal”. On the other hand, the observations whose anomaly scores are significantly distant from “abnormal” ones will be assumed to be “normal”. Then, ensemble-based supervised learning is proposed to find a global and efficient anomaly threshold using the information of both “normal” and “abnormal” behavior.
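Problems 2 and 4 can be made concrete with a small sketch: a brute-force k-NN anomaly score (every observation is compared with every other one, which is the cost problem 2 aims to reduce) followed by a "top fraction" cut-off whose setting drives both the detection and the false positive counts (the sensitivity problem 4 describes). The toy data, the use of the mean distance to the k nearest neighbours as the score, and the function names are assumptions for illustration only.

```python
import math

def knn_scores(data, k):
    """Brute-force anomaly score: mean distance to the k nearest neighbours."""
    scores = []
    for i, x in enumerate(data):
        # Every other observation is examined, so the cost is quadratic in len(data).
        dists = sorted(math.dist(x, y) for j, y in enumerate(data) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def flag_top_fraction(scores, fraction):
    """Flag the given fraction of observations with the highest scores."""
    n_flag = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_flag])

# Toy data: a 4 x 5 grid of 20 "normal" points plus two injected outliers.
data = [(float(x), float(y)) for x in range(4) for y in range(5)]
data += [(20.0, 20.0), (-15.0, 8.0)]
outliers = {20, 21}  # indices of the injected outliers

scores = knn_scores(data, k=3)
for fraction in (0.1, 0.5):
    flagged = flag_top_fraction(scores, fraction)
    detected = len(flagged & outliers)
    false_pos = len(flagged - outliers)
    print(f"top {fraction:.0%}: detected={detected}, false positives={false_pos}")
```

On this toy data the outliers receive the highest scores, so a small cut-off isolates them, while a 50% cut-off still catches them but also flags many normal points. In real unlabeled data the appropriate fraction is not known in advance, which is why a fixed cut-off is fragile.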

1.4 BOOK FOCUS

This section summarizes the important lessons learned from the development of robust unsupervised SCADA data-driven Intrusion Detection Systems (IDSs), which are detailed in the various chapters of this book. The first lesson relates to the design of a SCADA security testbed through which the practicality and efficiency of SCADA security solutions are evaluated and tested, while the remaining three lessons focus on the details of the various elements of a robust unsupervised SCADA data-driven IDS.

• The evaluation and testing of security solutions tailored to SCADA systems is a challenging issue facing researchers and practitioners working on such systems. The reasons for this include privacy, security, and legal constraints that prevent organizations from publishing their respective SCADA data. In addition, it is not feasible to conduct experiments on actual live systems, as this is highly likely to affect their availability and performance. Moreover, the establishment of a real SCADA lab can be costly and place-constrained, and therefore not available to all researchers and practitioners. In this book, a framework for a SCADA security testbed is described to build a full SCADA system based on a hybrid of emulation and simulation methods. A real SCADA protocol is implemented and therefore realistic SCADA network traffic is generated. Moreover, a key benefit of this framework is that it is a realistic alternative to real-world SCADA systems and, in particular, it can be used to evaluate the accuracy and efficiency of unsupervised SCADA data-driven Intrusion Detection Systems (IDSs).

• Unsupervised learning methods for anomaly detection are time- and cost-efficient since they can learn from unlabeled data. This is because human expertise is not required to identify the behavior (whether normal or abnormal) for each observation in a large training data set. Anomaly scoring methods are believed to be promising automatic methods for assigning an anomaly degree to each observation (Chandola et al., 2009). The k-NN method is one of the most interesting and best methods for computing the degree of anomaly based on the neighborhood density of a particular observation (Wu et al., 2008). However, this method requires a high computational cost, especially with large and high-dimensional data that we expect to have in the development of an unsupervised SCADA data-driven IDS. Therefore, this book describes an efficient k-nearest neighbor-based method, called kNNVWC (k-Nearest Neighbor approach based on Various-Widths Clustering), which utilizes a novel various-width clustering algorithm and triangle inequality.

• It is not feasible to retain all the training data in SCADA data-driven anomaly detection methods, especially when these are built from a large training data set. This is because such detection methods will be used for on-line monitoring, and therefore the more information retained in the detection methods, the larger the memory capacity and the higher the computation cost required. To address this issue, this book describes a clustering-based method, called SDAD (SCADA Data-Driven Anomaly Detection), to extract proximity-based detection rules for each behavior (normal and abnormal); these rules are assumed to be a tiny portion compared to the training data. Each rule comprehensively represents a subset of observations that represent only one behavior.

• Unsupervised learning methods for anomaly detection rely mainly on assumptions to find the near-optimal anomaly detection threshold. Therefore, the accuracy of the detection methods depends on the validity of the assumptions. This book, however, describes an efficient method, called GATUD (Global Anomaly Threshold to Unsupervised Detection), which first identifies observations whose anomaly scores significantly deviate from others to represent “abnormal” behavior. In addition, a tiny portion of observations whose anomaly scores are the smallest are considered to represent “normal” behavior. Then an ensemble-based decision-making method is described, which aims to find a global and efficient anomaly threshold using the information of both “normal” and “abnormal” behavior.

1.5 BOOK ORGANIZATION

The remainder of the book is structured as follows. Chapter 2 gives an introduction for readers who do not have an understanding of SCADA systems, their architectures, and their main components. This includes a description of the relationship between the main components and three generations of SCADA systems. The classification of a SCADA IDS based on its architecture and implementation is described.

Chapter 3 describes in detail SCADAVT, a framework for a SCADA security testbed based on virtualization technology. This framework is used to create a simulation of the main SCADA system components and a controlled environment. The main SCADA components and a real SCADA protocol (e.g., Modbus/TCP) are integrated. In addition, a server, which acts as a surrogate for water distribution systems, is introduced. This framework is used throughout the book to simulate a realistic SCADA system for supervising and controlling a water distribution system. This simulation is used in the other chapters to evaluate and test anomaly detection models for SCADA systems.

Chapter 4 describes in detail kNNVWC, an efficient method that finds the k-nearest neighbors in large and high-dimensional data. In kNNVWC, a new various-widths clustering algorithm is introduced, where the data is partitioned into a number of clusters using various widths. Triangle inequality is adapted to prune unlikely clusters in the search process of k-nearest neighbors for an observation. Experimental results show that kNNVWC performs well in finding k-nearest neighbors compared to a number of k-nearest neighbor-based algorithms, especially for a data set with high dimensions, various distributions, and large size.
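The general idea of pruning clusters with the triangle inequality can be sketched as follows. This is not the kNNVWC algorithm: the sketch uses a simple fixed-width leader clustering as a stand-in for various-widths clustering, and the function names, the width of 0.1, and the random test data are assumptions for illustration. The pruning rule itself is standard: if a cluster has centre c and radius r, no member can be closer to a query q than d(q, c) - r, so the cluster can be skipped when that bound is no better than the current k-th best distance.

```python
import heapq
import math
import random

def build_clusters(data, width):
    """Simple leader clustering: a point joins the first centre within `width`."""
    clusters = []  # each cluster: {"centre", "radius", "members"}
    for p in data:
        for c in clusters:
            d = math.dist(p, c["centre"])
            if d <= width:
                c["members"].append(p)
                c["radius"] = max(c["radius"], d)
                break
        else:
            # No existing centre is close enough, so p starts a new cluster.
            clusters.append({"centre": p, "radius": 0.0, "members": [p]})
    return clusters

def knn_pruned(query, clusters, k):
    """Return (sorted k nearest distances, number of clusters actually scanned)."""
    best = []  # max-heap via negated distances; holds the k smallest found so far
    scanned = 0
    ordered = sorted(clusters, key=lambda c: math.dist(query, c["centre"]))
    for c in ordered:
        lower_bound = math.dist(query, c["centre"]) - c["radius"]
        # Triangle inequality: no member of c can be closer than lower_bound.
        if len(best) == k and lower_bound >= -best[0]:
            continue
        scanned += 1
        for p in c["members"]:
            d = math.dist(query, p)
            if len(best) < k:
                heapq.heappush(best, -d)
            elif d < -best[0]:
                heapq.heapreplace(best, -d)
    return sorted(-d for d in best), scanned

random.seed(0)
data = [(random.random(), random.random()) for _ in range(500)]
clusters = build_clusters(data, width=0.1)
query = (0.5, 0.5)
pruned, scanned = knn_pruned(query, clusters, k=5)
brute = sorted(math.dist(query, p) for p in data)[:5]
print("same result as brute force:", pruned == brute)
print(f"clusters scanned: {scanned} of {len(clusters)}")
```

The pruned search returns the same distances as the exhaustive search, because a cluster is skipped only when its lower bound proves it cannot contain a closer point.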

Chapter 5 describes SDAD, a method that extracts proximity-based detection rules from unlabeled SCADA data, based on a clustering-based method. The evaluation of SDAD is carried out using real and simulated data sets. The extracted proximity-based detection rules show a significant detection accuracy rate compared with an existing clustering-based intrusion detection algorithm.

Chapter 6 describes GATUD, a method that finds a global and efficient anomaly threshold. GATUD is proposed as an add-on component that can be attached to any unsupervised anomaly detection method in order to define the near-optimal anomaly threshold. GATUD shows significant and promising results with two unsupervised anomaly detection methods.
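The first step of such an add-on, picking out the most confidently "abnormal" and "normal" observations from a list of anomaly scores, might look like the sketch below. The specific criteria used here (a z-score cut-off of 3 for "abnormal" and the lowest-scoring 20% for "normal"), the function name, and the toy scores are assumptions for illustration; the criteria GATUD actually uses may differ, and the ensemble-based learning step that would follow is not shown.

```python
import statistics

def pseudo_label(scores, z=3.0, normal_fraction=0.2):
    """Split scores into confident 'normal' and 'abnormal' index sets.

    Assumed criteria (not from the book): a score more than z standard
    deviations above the mean is 'abnormal'; the lowest-scoring fraction
    of observations is 'normal'. All other observations stay unlabeled.
    """
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    abnormal = {i for i, s in enumerate(scores) if s > mean + z * std}
    n_normal = max(1, int(len(scores) * normal_fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    normal = set(ranked[:n_normal]) - abnormal
    return normal, abnormal

# Toy anomaly scores: 30 similar values and one extreme value.
scores = [1.0 + 0.01 * i for i in range(30)] + [50.0]
normal, abnormal = pseudo_label(scores)
print("confident normal indices:", sorted(normal))
print("confident abnormal indices:", sorted(abnormal))
```

The two index sets could then serve as a labeled training set for a supervised learner that generalizes a threshold to the remaining, unlabeled observations.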

Chapter 7 looks at the authentication aspects related to SCADA environments. It describes two innovative protocols which are based on TPASS (Threshold Password-Authenticated Secret Sharing) protocols; one is built on two-phase commitment and has lower computation complexity and the other is based on zero-knowledge proof and has fewer communication rounds. Both protocols are particularly efficient for the client, who only needs to send a request and receive a response. Additionally, this chapter provides rigorous proofs of security for the protocols in the standard model.

Finally, Chapter 8 concludes with a summary of the contribution of the various tools and methods described in this book to the extant body of research and suggests possible directions for future research.
