

Working towards an AI-based clustering of airports, in the effort of improving humanitarian disaster preparedness

Maria Browarska1 and Karla Saldaña Ochoa2

1 Delft University of Technology, Delft, Netherlands, maria.browarska@gmail.com
2 University of Florida, School of Architecture, College of Design, Construction and Planning, Gainesville, USA, ksaldanaochoa@ufl.edu

Abstract. In recent years, natural disasters have increased in frequency, causing significant damage to communities and infrastructure worldwide. When a natural disaster strikes, airports in the affected region have to adapt quickly from serving regular passengers to becoming a humanitarian hub handling a massive increase in passengers and cargo. Several countries are particularly vulnerable and prone to such devastating events. Although existing initiatives aim to raise awareness and improve airport preparedness, authorities are often isolated in their resilience efforts, as they tend to act individually, and their response is often bound by local experience. Consequently, this research aims to broaden the field of view from a local to a global one by compiling a database of 971 airports worldwide with corresponding socio-technical characteristics in various data modalities. In addition, through a data science approach, the different data modalities were transformed into numerical feature vectors, so that future studies can find correlations between airports and identify similar airports from which different approaches to disaster preparedness and response can be learned.

Keywords: airports database, disaster preparedness, AI-based clustering

1 Introduction

When a natural disaster strikes, the nearest airport becomes the critical link for delivering and organizing relief aid while trying to stay efficient in evacuating citizens and receiving emergency personnel [5]. However, the existing infrastructure often cannot handle the sudden spike in the volume of incoming goods [4]. When airports become nonoperational, the only way to receive valuable aid is via road, rail, and water, which is often much less efficient and more time-consuming [16].

Even though disasters and humanitarian aid are not new challenges, there is still much room for improvement. Airports are a complex socio-technical challenge, as they are set in an environment of technical and operational challenges, laws and regulations, and international and regional cooperation of stakeholders from various fields improving humanitarian logistics. To characterize an airport, we need to consider various features that describe its complexity: a) geospatial and airport-specific data: surrounding area, reachability, number of runways, taxiways; b) demographic data: urban indexes and population around the airport; and c) geographic and urban data: seaport data and built environment information. Creating such a database can help experts turn those data points into valuable insights.

Thus, this research explores how data science could help establish a base for forming collaborations between airports that might face similar challenges in disaster preparedness efforts. The goal is to build a comprehensive database describing airports from the perspective of their disaster preparedness that will help future researchers find similarities between them, based on their intrinsic socio-technical features, so that perhaps an airport in Indonesia could be matched with its sibling airport in the Caribbean. The research involved several programming operations, from collecting the data to processing it. The database can be found in the following repository.

https://gitlab.com/maria.browarska/OSM-SOM

The proposed database of airports and their numerical features is the first step of a process that will conclude with creating group-specific policy advice for similar airports. With this article, we want to describe the steps from collection, normalization, and pre-processing of the data to transforming the multimodal data into a numerical feature vector. In future research, this vector can be used by Unsupervised Machine Learning algorithms to cluster airports with similar numerical features. This provides a relevant scenario for applying ML in a way that benefits society at large.

2 Knowledge gap and research goal

In order to define key concepts, narrow down the scope of the research and precisely define the knowledge gap, a literature review was conducted, followed by 5 semi-structured interviews with industry experts.

2.1 Literature review

Most of the reviewed articles focused on a case study as the research approach, often looking at individual airports and assessing historical events. Researchers analysed the behaviour of airports in specific disastrous events, mainly focusing on organisational processes and stakeholders’ cooperation [17, 22, 16]. While all the considered features, without a doubt, influence logistical operations, they are also unique for each airport. Hence, it is challenging to draw general conclusions that could apply to other airports, since their organisational structure may differ due to international and regional regulations, resources and needs.

AI-based clustering of airports

Some of the authors pointed out the importance of the geographical location of an airport, its structural features, as well as its reachability [23, 3, 21]. Pandey et al. [14] showed that utilising geospatial data is beneficial for airport humanitarian response planning and that airport authorities are interested in tools that can help to plan logistical procedures. Choi and Hanaoka [3] developed a model that visualises a layout of a humanitarian base based on structural features of an airport and demonstrated its potential applicability with a case study, suggesting that more research is needed to generalise their results.

While some of the authors suggested that cooperation between airports that struggle with similar challenges would have a positive outcome [9, 17], none of them explored the possible backbone of such cooperation. That finding, combined with the idea of structural features of airports having an impact on their humanitarian logistical procedures, led to defining the knowledge gap.

The specific methods applied in this research have been used in humanitarian aid-related research before, but on a local or national scale, as shown by Saldaña Ochoa and Comes [12] and Chen et al. [2]. The global approach is a challenge due to the limited availability of reliable data, but if successful, it paves the way for more detailed research on a global scale. This approach could significantly benefit less developed countries, which often do not have resources for local advanced research and preparedness strategies.

Until now, practitioners in the field, such as Get Airports Ready for Disaster (GARD), have used straightforward methods for assessing the vulnerability of airports and had to prepare different strategies for each client. GARD’s capacity is limited, and this research could lead to new ways for authorities to prepare, thanks to establishing collaborations directly with other airports facing similar challenges.

2.2 Research goal

The goal of this research is to (1) better understand the challenges that airports face when a natural disaster strikes, as well as their preparedness activities. This understanding shall then be (2) translated into a list of socio-technical features influencing the level of preparedness and airport capabilities in facing a disaster. Finding the key features is relevant for (3) building a database containing valuable humanitarian aid-related information about several airports worldwide, composed solely from publicly available sources. The focus on publicly available data is dictated by the large number of airports being analyzed, which makes it impossible to conduct surveys and obtain information directly within the resources and time frame of this research.

3 Methodology

In order to find specific qualities and features that influence airports’ preparedness for a disaster, a thorough understanding of activities and the environment in which they take place is needed. This information was derived from a desk study accompanied by semi-structured interviews (Table 3 in the Appendix lists the organisations contacted for interviews) with experts on airports’ disaster preparedness and performance, summarized in Table 1. The next step was to translate the identified challenges influencing the performance of an airport in a post-disaster scenario into socio-technical features, to achieve a good starting point for the data mining process.

Table 1. Socio-technical features

Structural and capacity features | Accessibility features | Organisational features | Risk related features
Runways and their characteristics | Airport connection | How much staff is available | Risk of occurrence of a natural disaster
Aircraft parking and its characteristics | Geographical surroundings | How well the staff is trained | Regional capacity for handling disasters
Terminals and their characteristics | Alternative airports and seaports | Who owns the airport | What is the airport’s main purpose (civil/military)
Storage facilities, both open-air and covered warehouses | | | Whether the airport was part of any preparedness programs

The data mining process was composed of two main iterative phases. First, the identified socio-technical features of airports had to be translated into measurable data points: numerical, categorical, or descriptive. The second phase was retrieving data from publicly available sources, as described in more detail in Fig. 2. When building a database from publicly available sources, it is crucial to have a strong understanding of what we want to describe, to allow for flexibility and easy replacement or adjustment of originally planned measures. For future research, the data mining process could be replaced by conducting detailed surveys with airports. With such surveys, it would be possible to obtain the exact measures to account for all planned features straight from the source, allowing for better accuracy and trustworthiness.

To start building the database, we chose vulnerable countries and airports, using the INFORM Risk Index as the qualification criterion. First, a list of all airports located within these countries was exported. Next, the airports.csv file from OurAirports was used to select only airports currently operating, i.e., those with scheduled services. An additional criterion was the airport type: heliports, seaplane bases, and closed airports were excluded, while small, medium, and large airports were kept. These operations resulted in a list of 971 airports, with their names, coordinates, International Air Transport Association (IATA) codes, and International Civil Aviation Organization (ICAO) codes. This list formed the base for all mass queries applied via APIs to collect data for each airport. Figure 1 presents the 971 airports on a world map.
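The selection step can be illustrated with a short sketch. It assumes the airports.csv columns `type`, `scheduled_service` and `iso_country` as published by OurAirports; the sample rows, the country code `XX` and the function name `select_airports` are made up for illustration and are not taken from the project's repository.

```python
import csv
import io

# Airport types kept in the study; heliports, seaplane bases and closed airports are dropped.
KEPT_TYPES = {"small_airport", "medium_airport", "large_airport"}


def select_airports(csv_text, country_codes):
    """Return rows for operating airports located in the given countries."""
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row["iso_country"] in country_codes
                and row["scheduled_service"] == "yes"
                and row["type"] in KEPT_TYPES):
            selected.append(row)
    return selected


# Tiny made-up sample in the assumed airports.csv layout (illustrative values only).
SAMPLE = """ident,type,name,latitude_deg,longitude_deg,iso_country,scheduled_service,iata_code
AAA1,medium_airport,Example Field,1.0,2.0,XX,yes,EXF
AAA2,heliport,Example Heliport,1.1,2.1,XX,no,
AAA3,small_airport,Quiet Strip,1.2,2.2,XX,no,
AAA4,large_airport,Other Country Intl,5.0,6.0,YY,yes,OCI
"""

vulnerable = {"XX"}  # hypothetical country codes chosen via the INFORM Risk Index
airports = select_airports(SAMPLE, vulnerable)
print([a["ident"] for a in airports])
```

Only the first sample row passes all three filters; the others fail on type, scheduled service, or country.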

4 Building the database

Data used in this research came from a multiplicity of sources in various data modalities and formats. In order to translate socio-technical features into comparable sets of numerical features, various circumstances need to be taken into account, such as availability of data, methods of measuring and quantifying specific characteristics, their correlations, and level of importance. In order to keep track of changes and make the database easy to navigate, an SQLite database was built with the use of the DB Browser software. The OSM queries and the GeoDB-cities API were connected to the database through Python queries, as seen in the attached GitLab repository. To add records and features to the database, outputs from various sources were converted into the .csv format. Results of OpenStreetMap (OSM) and API queries were written into the database automatically.

Fig. 1. 971 airports chosen to be analyzed, placed on a world map
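As a rough illustration of how query outputs could be written into an SQLite database from Python, the sketch below uses the standard `sqlite3` module. The table layout, the helper `store_feature` and all values are hypothetical; the actual schema is documented in the GitLab repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path would be used for a persistent database
conn.execute(
    "CREATE TABLE airports ("
    "iata TEXT PRIMARY KEY, airport_name TEXT, "
    "latitude_deg REAL, longitude_deg REAL, build_count INTEGER)"
)

# Base list of airports (illustrative values, not taken from the real database).
base_rows = [
    ("EXF", "Example Field", 1.0, 2.0),
    ("OCI", "Other Country Intl", 5.0, 6.0),
]
conn.executemany(
    "INSERT INTO airports (iata, airport_name, latitude_deg, longitude_deg) "
    "VALUES (?, ?, ?, ?)",
    base_rows,
)


def store_feature(connection, column, values_by_iata):
    """Write one feature column for many airports; `column` must be a trusted name."""
    connection.executemany(
        f"UPDATE airports SET {column} = ? WHERE iata = ?",
        [(value, iata) for iata, value in values_by_iata.items()],
    )
    connection.commit()


# Pretend these counts came back from an OSM query.
store_feature(conn, "build_count", {"EXF": 42, "OCI": 0})
rows = conn.execute("SELECT iata, build_count FROM airports ORDER BY iata").fetchall()
print(rows)
```

Keying every feature update on the IATA code mirrors the idea of one base list of airports that all mass queries refer back to.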

4.1 Data sources

OSM. In order to extract data from OSM, Overpass turbo was used, a web-based data mining tool designed to run OSM API queries and present them on a map. Since data needed to be extracted for over 900 airports, multiple scripts were written with the use of the OverPy API, published under the MIT license [10]. Detailed documentation of the scripts and queries can be found in the attached GitLab repository.

OurAirports. OurAirports is a free and public service that maintains data about airports around the world. Similarly to OSM, it is run by volunteers, and members create records individually, but at the same time much of the information comes from official governmental institutions such as the U.S. Federal Aviation Administration [13]. In addition to exploring an online interactive map-based tool, users can also download daily updated files with data records of all airports that are part of the service. For this research, the dataset of all airports and runways was used.

Global airports. The most comprehensive publicly available dataset aimed at providing information on disaster logistics is called Global airports and was published by the Humanitarian Data Exchange [7]. Officially coordinated by the World Food Programme and based on openly available data from sources such as OSM and OurAirports, it also contains inputs from partners through the Logistics Cluster and Logistics Capacity Assessments [7]. Even though the dataset is updated, according to a WFP representative interviewed, for many places the data has not been checked since the original upload in 2013. Furthermore, the dataset contains fairly basic information on airports. Data points presented in the table are not available for every airport in the set.

The Logistics Performance Index. The Logistics Performance Index (LPI) provides information on how easy or difficult it is to transport goods in the analysed countries. The World Bank, together with various logistics-related partner organisations, conducts the survey every two years [1]. While aimed at assessing the logistical capacity in the context of trade and merchandise, some of the indicators are relevant for humanitarian logistics, such as the ones chosen to be included in this research: the assessment of customs procedures and the assessment of the general quality of trade and transport related infrastructure.

The INFORM Risk Index. Led by the European Commission, INFORM is a global, open-source risk index for humanitarian disasters and crises that describes three dimensions: hazard exposure, vulnerability and lack of coping capacities. In addition to being the qualification criterion for the final airport database, parts of the INFORM Risk Index were also used to characterize airports.

4.2 Extracting data

Airport surroundings. Two strategies in OSM were tested in order to assess the surroundings of each airport. First, the "landuse" tag was explored: all the nodes containing information on the land use within a 5 km radius from each airport were extracted. However, this led to inconsistent results. Visual validation of multiple query outputs led to the conclusion that building-related nodes are highly overrepresented compared to fields or other unused spaces. Therefore, for many airports, the result only showed a number of buildings within that radius, and no information describing the empty fields that were the true dominant surrounding.

The second strategy, which led to more representative results, was based purely on the number of nodes with the tag "building". The assumption was that if the buildings are well tagged in OSM, simply the number of those nodes within the radius would describe how densely built the surrounding of the airport is. The lower the number of buildings around an airport, the more useful space there is for organising humanitarian aid. A visual validation of multiple records was conducted, with a special focus on the outliers: airports with a very low or very high number of buildings around. The surroundings of some remote airports were underrepresented, resulting in 0 buildings reported. While a count of zero was not accurate, the actual number of buildings was very small, so the result was still useful.
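The building-count strategy can be sketched as an Overpass QL query assembled in Python. This is a hedged illustration rather than the query used in the study: the function name `count_query`, the timeout value and the sample coordinates are assumptions, and the sketch counts nodes, ways and relations, whereas the text speaks of nodes. It only builds the query text; submitting it (for example through Overpass turbo or an Overpass client) is left out.

```python
def count_query(tag_filter, lat, lon, radius_m):
    """Build an Overpass QL query counting elements matching `tag_filter` near a point."""
    around = f"(around:{radius_m},{lat},{lon})"
    return (
        "[out:json][timeout:60];\n"
        "(\n"
        f"  node{tag_filter}{around};\n"
        f"  way{tag_filter}{around};\n"
        f"  relation{tag_filter}{around};\n"
        ");\n"
        "out count;"
    )


# Buildings within 5 km of a made-up airport location.
building_query = count_query('["building"]', 1.0, 2.0, 5000)
print(building_query)
```

The same helper could be reused with a different tag filter and radius for the other OSM-based features, which keeps the per-airport queries consistent.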

Alternative airports. To find an alternative airport, we focused on the surroundings within a 100 km radius. Unlike the selection of airports for the main database, there was no exclusion of alternative airports that are smaller or do not have an IATA code. The assumption was that any kind of airport in close vicinity to the main one might work as a supporting space, if not for landing the same size of airplanes, then perhaps for storage and other humanitarian operations. Since airports are well tagged in OSM, the validation of results was positive: no overlooked airports were found. However, depending on the quality and density of roads, an airport within a 100 km radius might in fact be many hours away, which would not make it a useful alternative. In future research it is worth considering a more accurate qualifying feature than the radius.
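The radius criterion can be made concrete with a great-circle distance check. The sketch below is not the OSM query used in the study; it is a stand-in that applies the haversine formula to a list of candidate coordinates. The function names, the spherical Earth radius and the sample points are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def alternatives_within(main, candidates, radius_km=100.0):
    """Return (name, distance_km) for candidates inside the radius, nearest first."""
    found = []
    for name, lat, lon in candidates:
        distance = haversine_km(main[0], main[1], lat, lon)
        if 0 < distance <= radius_km:  # distance 0 is the main airport itself
            found.append((name, round(distance, 1)))
    return sorted(found, key=lambda item: item[1])


main_airport = (0.0, 0.0)
candidates = [("self", 0.0, 0.0), ("near", 0.0, 0.5), ("far", 0.0, 2.0)]
nearby = alternatives_within(main_airport, candidates)
print(nearby)
```

As the text notes, a straight-line radius ignores road quality, so a travel-time measure would be a natural replacement for `haversine_km` in later work.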

Alternative seaports. Similarly to alternative airports, alternative seaports were inspected within a radius of 100 km. The vast majority of results showed 0 seaports, which was validated thoroughly and proved to be true. Validation was also conducted for records with a high number of seaports counted. For some, the counted result was higher than the actual number of ports because of multiple tags within the same seaport. It did, however, indicate the size of the seaport, as the nodes often indicated additional seaport terminals or storage facilities. Given the small number of records that indicated seaports at all, all results higher than 0 were validated and manually corrected if needed.

Tourism vs. industry. In order to assess how well an airport is equipped to handle a sudden influx of cargo, and not only a growth in passenger turnaround, it was decided to use the surroundings of the airport as an indicator. Based on the insights from the interview with Chris Weeks of GARD, it was determined that airports situated in mainly touristic destinations are less likely to have a good capacity for handling cargo. Therefore, for each airport the number of nodes tagged as "industrial" and "tourism amenities" was calculated. In order to account for over- or underrepresentation of certain regions, a ratio of tourism and industry related facilities was calculated, based on the assumption that if a region is under- or overrepresented in OSM, this will be the case for both types of amenities.
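The text does not give the exact formula for the tourism and industry ratio, so the following is only one possible form, stated as an assumption: the share of tourism-tagged features among all tourism and industrial features. A share was chosen over a plain quotient because it stays defined when one of the counts is zero. The function name is hypothetical.

```python
def tourism_share(tourism_count, industrial_count):
    """Share of tourism-tagged features among tourism + industrial features.

    Returns None when neither type is mapped, so the gap stays visible
    instead of being silently turned into a number.
    """
    total = tourism_count + industrial_count
    if total == 0:
        return None
    return tourism_count / total


print(tourism_share(30, 10))  # mostly touristic surroundings
print(tourism_share(0, 0))    # nothing mapped around the airport
```

Because both counts come from the same region, a region that is sparsely mapped in OSM scales numerator and denominator alike, which is the reasoning the text gives for using a ratio.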

Runways. The number of runways was calculated for each airport by counting the number of nodes/ways/relations with a "runway" tag. All outliers were manually validated. Records that resulted in 0 runways were corrected, since a functioning airport cannot have 0 runways. The same was done for all records that showed more than two runways, since it is not very common for airports to have multiple runways, especially in remote places, which is where most of the airports in the database are located.
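The outlier rule described for runway counts (zero, or more than two) is simple enough to express directly. The sketch below only flags records for manual review; the airport codes and counts are invented, and the function name is an assumption.

```python
def needs_manual_check(runway_count):
    """Flag counts the text treats as outliers: 0 runways or more than two."""
    return runway_count == 0 or runway_count > 2


# Invented runway counts keyed by made-up airport codes.
counts = {"EXF": 1, "OCI": 0, "BIG": 4, "TWO": 2}
to_review = sorted(code for code, n in counts.items() if needs_manual_check(n))
print(to_review)
```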

Cities and distances. In order to assess how distant an airport is from the population it might be serving when a disaster strikes, the three closest cities for each record were found, together with the direct distance (not by road) and the population of each city. For this purpose, the GeoDB-cities API was used [11]. Based on the coordinates of each airport, the three closest cities within 100 km that contained population information were chosen. Validation was performed for a number of randomly chosen records and outliers, which were manually corrected if needed. The API works with GeoNames and WikiData, which, similarly to OSM, are considered trustworthy sources thanks to the user community input and validation scheme.

Population. Data gathered to describe surrounding cities was used to calculate the general population around each airport, as the sum of the population in the three closest cities found by the GeoDB cities API.
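The selection of the three closest cities and the population sum can be sketched without calling the API. The input below is assumed to be a list of city records that have already been retrieved; the dictionary keys, the function name and all values are hypothetical.

```python
def population_around(cities, max_distance_km=100.0, n_closest=3):
    """Sum the population of the n closest cities that report a population."""
    usable = [
        c for c in cities
        if c["population"] is not None and c["distance_km"] <= max_distance_km
    ]
    closest = sorted(usable, key=lambda c: c["distance_km"])[:n_closest]
    return sum(c["population"] for c in closest), [c["name"] for c in closest]


# Invented city records around one airport.
cities = [
    {"name": "A", "distance_km": 10.0, "population": 1000},
    {"name": "B", "distance_km": 20.0, "population": None},   # no population data
    {"name": "C", "distance_km": 30.0, "population": 5000},
    {"name": "D", "distance_km": 50.0, "population": 200},
    {"name": "E", "distance_km": 80.0, "population": 9999},
    {"name": "F", "distance_km": 150.0, "population": 1000000},  # outside the radius
]
total, chosen = population_around(cities)
print(total, chosen)
```

City B is skipped because it carries no population value and F because it lies beyond 100 km, so the three closest usable cities are A, C and D.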

Airport area. In order to assess the storage capacity as well as the area available for setting up a humanitarian hub, the area of each airport was calculated. In OSM, each airport is not only indicated by a single node, but also by a relation that indicates its borders. This geodata was exported and analysed with the QGIS software [18]. Thanks to built-in features, the area of each airport was calculated. Validation was conducted on a random sample of results and the method proved to be effective.
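For readers without QGIS at hand, the idea of deriving an area from an airport boundary can be approximated in a few lines. This is a stand-in under stated assumptions, not the QGIS computation used in the study: the boundary is treated as a small polygon of (lat, lon) vertices, projected onto a local plane around its mean latitude on a spherical Earth, and measured with the shoelace formula. The function name and the sample square are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # spherical approximation


def polygon_area_km2(ring):
    """Approximate area of a small (lat, lon) polygon in square kilometres."""
    lat0 = math.radians(sum(lat for lat, _ in ring) / len(ring))
    # Local equirectangular projection: x scales with cos(mean latitude).
    pts = [
        (EARTH_RADIUS_M * math.radians(lon) * math.cos(lat0),
         EARTH_RADIUS_M * math.radians(lat))
        for lat, lon in ring
    ]
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x1 * y2 - x2 * y1  # shoelace term for one edge
    return abs(twice_area) / 2 / 1e6


# A made-up square of 0.01 x 0.01 degrees at the equator (about 1.11 km per side).
square = [(0.0, 0.0), (0.0, 0.01), (0.01, 0.01), (0.01, 0.0)]
area = polygon_area_km2(square)
print(round(area, 3))
```

The approximation is adequate for airport-sized polygons; for large or high-latitude areas a proper geodesic area routine would be the safer choice.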

4.3 The database

As the plan is to compare airports based on numerical features, each data modality was turned into a form suitable for mathematical processing. Depending on the modality of the data, various preprocessing methods were applied, based on several scientific sources [20, 19, 8, 6]; they can be seen in Appendix 2. The final list of all airports and corresponding features was built in the DB Browser and made available through the GitLab repository, both as a .csv file and as an SQLite database. Features selected for each airport, together with the corresponding source, preprocessing methods, and a description of their relevance for assessing disaster preparedness, are presented in Table 2.

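5 Limitations

The quality of the data sources used in the research can sometimes be contested, as the level of detail available for various airports and their surroundings was not always equal, which may lead to inaccurate results. This is also a problem with official sources widely used by the humanitarian community, such as the Logistics Capacity Assessment. Interviewees mentioned (Appendix 1) the importance of access to dynamic data that describes the state of each airport and its surroundings at a precise moment in time after a disaster strikes, because the static information gathered in earlier assessments can be inaccurate the moment a disaster strikes. However, interviewees involved in preparedness programs rather than immediate response operations underlined the importance of building comprehensive data sets with static information to better assess what can be done ahead of a tragic event.

Table 2. Description of the database.

Source | Feature | Type of data | Data handling | Relevance
OurAirports | iata | text | no additional handling needed | airport identification and location
OurAirports | airport_name | text | no additional handling needed | airport identification and location
OurAirports | latitude_deg | numerical | no additional handling needed | airport identification and location
OurAirports | longitude_deg | numerical | no additional handling needed | airport identification and location
OurAirports | country | text | no additional handling needed | airport identification and location
OurAirports | elevation_ft | numerical | empty fields imputed with mean value |
OurAirports | lighted | categorical | empty fields imputed with ‘0’ | runway description for assessing airport’s capacity and accessibility
OurAirports | max_length_ft | numerical | empty fields imputed with mean value | runway description for assessing airport’s capacity and accessibility
OurAirports | width_ft | numerical | empty fields imputed with mean value | runway description for assessing airport’s capacity and accessibility
OurAirports | airport_type | categorical | text values converted into categorical values ‘0’, ‘1’ | general assessment of the airport traffic size
OSM | seaport_count | numerical | manual verification | identifying potential alternative seaports within 100 km radius
OSM | airport_count | numerical | manual verification | identifying potential alternative airports within 100 km radius
OSM | build_count | numerical | manual verification | describing the surrounding within 5 km
OSM | industrial_count | numerical | manual verification | assessing airport’s cargo handling preparedness
OSM | tourism_count | numerical | manual verification | assessing airport’s cargo handling preparedness
OSM | terminal_count | numerical | manual verification | assessing airport’s capacity
OSM | runways_count | numerical | manual verification | assessing airport’s capacity
GeoDB | name_city_n | text | obtaining data about three closest cities | assessing the distance between the airport and potential casualties
GeoDB | dist_city_n | numerical | obtaining data about three closest cities | assessing the distance between the airport and potential casualties
GeoDB | population_city_n | numerical | obtaining data about three closest cities | assessing the number of potential casualties in the area
Global Airports | aptclass | categorical | text values converted into categorical values ‘0’, ‘1’ | assessing airport’s capacity: international/domestic
Global Airports | apttype | categorical | text values converted into categorical values ‘0’, ‘1’ | assessing airport’s capacity: Airport/Airstrip/Airfield
Global Airports | authority | categorical | text values converted into categorical values ‘0’, ‘1’ | assessing airport’s organisational structure: civil/military
Global Airports | humuse | categorical | text values converted into categorical values ‘0’, ‘1’ | assessing airport’s humanitarian operation preparedness
INFORM Index | natural_dis_risk | numerical | empty fields imputed with mean value | assessing regional disaster risk
INFORM Index | informrisk | numerical | empty fields imputed with mean value | assessing regional disaster preparedness
Logistics Performance Index | lpi_customs | numerical | empty fields imputed with mean value | assessing regional logistical capacity and preparedness
Logistics Performance Index | lpi_infrastructure | numerical | empty fields imputed with mean value | assessing regional logistical capacity and preparedness
GARD | gard | categorical | text values converted into categorical values ‘0’, ‘1’ | assessing airport’s humanitarian operation preparedness
Self calculated | airport_area | numerical | calculated based on OSM data | assessing airport’s capacity
Self calculated | population_around | numerical | calculated based on GeoDB data | assessing the number of potential casualties in the area
Self calculated | iso_country | text | no additional handling needed | identification purposes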

Another challenging factor is the accuracy of the assumptions made, especially for assessing airport connectivity. As shown by historical disasters, the inability to distribute humanitarian relief from the airport to the population in need can undermine the airport’s operations and preparedness. A more sophisticated and accurate way of quantifying the level of connectivity could be used in future research.

6 Discussion and Conclusion

The database built in this research is a valuable resource for future clustering analysis or future research related to airports’ preparedness for humanitarian disasters. It can be further analyzed in more detailed research, updated accordingly, and used to assess airports’ vulnerability and preparedness. From the scientific perspective, this research shows that there are now ways of analyzing complex, specific challenges with a global overview based on numerous publicly available datasets. It also shows that scientists need to be very careful when using sources that are not strictly scientific and that building a specific, tailored database is a lengthy, challenging process. Nevertheless, it can be achieved not only by IT professionals but also by multidisciplinary researchers.

This research provided a valuable framework for approaching the complex socio-technical environments of airports and their disaster preparedness, through building a database with relevant features, based on interviews and a literature review, using only publicly available data, followed by a comprehensive data selection, collection and pre-processing. The challenges and problems encountered along the way, both solved and unsolved, can form a valuable tool for other professionals and scientists willing to conduct similar research, not only in the domain of aviation and disaster preparedness.

An additional finding is that we identified the need for a common, reliable database with all relevant information about airports in vulnerable locations. The one designed during this research could form a base for one built with official data sources that are otherwise unavailable to the public. With that, however, comes the challenge of security: since detailed information about airports can be viewed as sensitive data, access to such a database should be regulated.

6.1 Future research

The ideas for future research can be divided into three sections: (1) data mining and the process of building the database, (2) data pre-processing and applying an unsupervised clustering algorithm, and (3) using the results in various ways in order to improve airports’ disaster preparedness.

Building a database solely from publicly available sources has some drawbacks, as discussed in section 5, such as limited trustworthiness and the inability to retrieve the exact types of information that are needed in order to describe specific features. In the future, it is worth considering building a similar database with direct involvement of the airports that are being described, with the use of surveys and possible involvement of international humanitarian and aviation related organisations such as ACI or OCHA. This would allow for retrieving more specific and up-to-date information. Moreover, if regularly updated and maintained, it could become a useful resource for airports that themselves would like to know more about the capabilities of alternative ports in the region, not only for research purposes, but for operations once a disaster strikes and help from neighbouring ports is needed. Other scientists could also use such a database for various additional analyses, saving time on gathering the data and focusing on what can be derived from it.

However, the database that was built in this research is itself a valuable resource for performing other research related to airports’ preparedness for humanitarian disasters. With additional iterations of the data pre-processing, there is room for gathering insightful knowledge on similarities between airports that would form a solid base for establishing cooperation. In order to achieve that, future research should focus on identifying the dominating features and adjusting the algorithm accordingly. This could require more sophisticated methods of data pre-processing and automating the process of analysing results, in order to quickly pick up combinations of features that cannot offer trustworthy results.

Building policy advice based on the database could be achieved by identifying airports that are especially vulnerable due to their intrinsic features and capabilities. This process would have to be accompanied by a thorough analysis of historical events that took place at similar airports, and the lessons learned could be used for improving the preparedness of those that might face similar challenges in the future, leading to achieving the full potential of this research.

References

[1] Jean-François Arvis et al. Connecting to Compete 2018. Tech. rep. The World Bank, 2018. doi: 10.1596/29971. url: https://openknowledge.worldbank.org/bitstream/handle/10986/29971/LPI2018.pdf.

[2] Ning Chen et al. “Regional disaster risk assessment of China based on self-organizing map: Clustering, visualization and ranking”. In: International Journal of Disaster Risk Reduction 33.October 2018 (2019), pp. 196–206. issn: 2212-4209. doi: 10.1016/j.ijdrr.2018.10.005. url: https://doi.org/10.1016/j.ijdrr.2018.10.005.

[3] Sunkyung Choi and Shinya Hanaoka. “Diagramming development for a base camp and staging area in a humanitarian logistics base airport”. In: Journal of Humanitarian Logistics and Supply Chain Management 7 (June 2017), pp. 00–00. doi: 10.1108/JHLSCM-12-2016-0044.

[4] Deutsche Post DHL Group. Disaster Preparedness - Get Airports Ready for Disaster. 2021. url: https://www.dpdhl.com/en/sustainability/socialimpact-programs/disaster-management/disaster-preparedness.html.

[5] Deutsche Post DHL Group. GoHelp Program - Disaster Preparedness and Response. Tech. rep. 2019. url: https://www.dpdhl.com/en/responsibility/society-and-engagement/disaster-management.html.

[6] Purva Huilgol. Feature Transformation and Scaling Techniques to Boost Your Model Performance. Aug. 2020. url: https://www.analyticsvidhya.com/blog/2020/07/types-of-feature-transformation-and-scaling/.

[7] Humanitarian Data Exchange. Global airports - Humanitarian Data Exchange. Jan. 2019. url: https://data.humdata.org/dataset/global-airports.

[8] Gota Kikugawa et al. “Data analysis of multi-dimensional thermophysical properties of liquid substances based on clustering approach of machine learning”. In: Chemical Physics Letters 728 (2019), pp. 109–114. issn: 0009-2614. doi: https://doi.org/10.1016/j.cplett.2019.04.075. url: https://www.sciencedirect.com/science/article/pii/S0009261419303598.

[9] Jakub Kraus, Vladimír Plos, and Peter Vittek. “The New Approach to Airport Emergency Plans”. In: International Journal of Aerospace and Mechanical Engineering 8.8 (2014), pp. 2406–2409. issn: eISSN: 1307-6892. url: https://publications.waset.org/vol/92.

[10] MIT. Python Wrapper to access the Overpass API. Apr. 2021. url: https://github.com/DinoTools/python-overpy.

[11] M. Mogley. GeoDB Cities API Documentation. 2017. url: https://rapidapi.com/wirefreethought/api/geodb-cities.

[12] Karla Saldana Ochoa and Tina Comes. A Machine learning approach for rapid disaster response based on multi-modal data. The case of housing shelter needs. 2021. arXiv: 2108.00887 [cs.LG].

[13] OurAirports. About OurAirports. 2007. url: https://ourairports.com/about.html#overview.

[14] B. H. Pandey et al. “Development of response plan of airport for mega earthquakes in Nepal”. In: NCEE 2014 - 10th U.S. National Conference on Earthquake Engineering: Frontiers of Earthquake Engineering (Jan. 2014). doi: 10.4231/D3TH8BN7T.

[15] F. Pedregosa et al. “Scikit-learn: Machine Learning in Python”. In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.

[16] Abdüssamet Polater. “Airports’ role as logistics centers in humanitarian supply chains: A surge capacity management perspective”. In: Journal of Air Transport Management 83 (2020), p. 101765. issn: 0969-6997. doi: https://doi.org/10.1016/j.jairtraman.2020.101765. url: https://www.sciencedirect.com/science/article/pii/S0969699719303898.

[17] Abdussamet Polater. “Managing airports in non-aviation related disasters: A systematic literature review”. In: International Journal of Disaster Risk Reduction 31 (2018), pp. 367–380. issn: 2212-4209. doi: https://doi.org/10.1016/j.ijdrr.2018.05.026. url: https://www.sciencedirect.com/science/article/pii/S2212420918302127.

[18] QGIS Development Team. QGIS Geographic Information System. Open Source Geospatial Foundation. 2009. url: http://qgis.org.

[19] Jimin Qian et al. “Introducing self-organized maps (SOM) as a visualization tool for materials research and education”. In: Results in Materials 4 (2019), p. 100020. issn: 2590-048X. doi: https://doi.org/10.1016/j.rinma.2019.100020. url: https://www.sciencedirect.com/science/article/pii/S2590048X19300202.

[20] H. Ritter and T. Kohonen. “Self-organizing semantic maps”. In: Biological Cybernetics 61.4 (1989), pp. 241–254. issn: 1432-0770. doi: 10.1007/BF00203171. url: https://doi.org/10.1007/BF00203171.

[21] Michael Veatch and Jarrod Goentzel. “Feeding the bottleneck: airport congestion during relief operations”. In: Journal of Humanitarian Logistics and Supply Chain Management 8.4 (Jan. 2018), pp. 430–446. issn: 2042-6747. doi: 10.1108/JHLSCM-01-2018-0006. url: https://doi.org/10.1108/JHLSCM-01-2018-0006.

[22] Bartel Walle and Julie Dugdale. “Information management and humanitarian relief coordination: findings from the Haiti earthquake response”. In: Int. J. of Business Continuity and Risk Management 3 (Jan. 2012), pp. 278–305. doi: 10.1504/IJBCRM.2012.051866.

[23] Martijn Warnier et al. “Humanitarian access, interrupted: dynamic near real-time network analytics and mapping for reaching communities in disaster-affected countries”. In: OR Spectrum 42 (Sept. 2020). doi: 10.1007/s00291-020-00582-0.

[24] Sanford Weisberg. “Yeo-Johnson Power Transformations”. In: Department of Applied Statistics, University of Minnesota 2 (2001), pp. 1–4. url: http://stat.umn.edu/arc/yjpower.pdf.

A Appendix 1

M. Browarska, K. Saldaña Ochoa

Fig. 2. Process flow of data mining.

B Appendix 2

Table 3. Affiliation of interviewees

Interviewee | Organisation
Chris Weeks | GARD
Virginie Bohl | OCHA, IMPACCT Working Group
Thomas Romig | ACI

C Appendix 3

D Data pre-processing

In order for airports to be comparable for the unsupervised machine learning algorithms, the features that describe them need to be turned into a form suitable for mathematical processing.

In this section, the pre-processing of text, categorical and numerical features is described.

D.1 Empty fields

Because various data sources were used, a number of fields were empty for some features. Depending on the feature, these empty fields were filled either with zeroes or with the mean value of all existing records. Missing fields in the features describing whether the runway is lighted and whether a GARD training was conducted before were filled with zeroes, as it was decided that if there is no information available, it is safer to assume the negative outcome. Missing values for the elevation, the length of the runway, the width of the runway, and the INFORM and LPI indicators were replaced with the mean values.
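The two imputation rules can be sketched in plain Python. The column names follow Table 2, but the function name, the record layout (dictionaries with `None` for empty fields) and the sample values are assumptions made for illustration.

```python
from statistics import mean


def impute(records, zero_cols, mean_cols):
    """Return copies of the records with empty (None) fields filled in.

    Columns in `zero_cols` get 0 (assume the negative outcome);
    columns in `mean_cols` get the mean of the known values.
    """
    means = {
        col: mean(r[col] for r in records if r[col] is not None)
        for col in mean_cols
    }
    filled = []
    for record in records:
        new = dict(record)
        for col in zero_cols:
            if new[col] is None:
                new[col] = 0
        for col in mean_cols:
            if new[col] is None:
                new[col] = means[col]
        filled.append(new)
    return filled


# Invented records; None marks an empty field.
records = [
    {"lighted": 1, "elevation_ft": 100.0},
    {"lighted": None, "elevation_ft": None},
    {"lighted": 0, "elevation_ft": 300.0},
]
clean = impute(records, zero_cols=["lighted"], mean_cols=["elevation_ft"])
print(clean[1])
```

The sketch would raise an error for a mean-filled column with no known values at all, which seems preferable to inventing a number for a feature that was never observed.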

D.2 Categorical data

A number of features in the final dataset describe each airport as a member of a certain category. For example, the airport type feature categorises airports into small airport, medium airport, and large airport. While it is a clear and understandable distinction for the human eye, mathematical algorithms require a numerical expression [15]. As proposed in the original publication on Self-Organising Maps [20], the categorical feature with three values was transformed into three binary features, with one equal to 1 and all others equal to 0 for each airport. An example result can be seen in Table 4. To achieve that for each categorical feature, the LabelBinarizer class from scikit-learn [15] was used.
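The encoding described above can be shown with a dependency-free sketch. It is not the scikit-learn call used in the study; it reproduces the same idea by hand, with columns ordered alphabetically by category. The function name and the sample values are illustrative.

```python
def one_hot(values):
    """Encode categories as binary columns; columns follow sorted category order."""
    classes = sorted(set(values))
    rows = [[1 if value == c else 0 for c in classes] for value in values]
    return classes, rows


airport_types = ["small_airport", "large_airport", "medium_airport", "small_airport"]
classes, encoded = one_hot(airport_types)
print(classes)
for airport_type, row in zip(airport_types, encoded):
    print(airport_type, row)
```

Each airport ends up with exactly one 1 across the three new columns, so no artificial ordering between small, medium and large is introduced into the distance calculations of a clustering algorithm.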

D.3 Numerical data

It is common for many machine learning algorithms to require standardised data inputs in order to perform well [15]. This is also the case with the unsupervised learning algorithm used in this research, the SOM. There are various mathematical transformations that can help to achieve normally distributed data, and it is important to choose the one that best fits the type of data. Again, the SciKit documentation, supported by various scientific sources [19, 8, 6] and experiments, was used to choose the right approach.

The Yeo-Johnson transform [24] was used to change the distribution of numerical data, since it is one of the few transformations that can be applied to negative and zero values, which the dataset contained. The effect of the transformation can be seen in Figures 3 and 4. While it was not possible to successfully transform all features, especially the ones consisting of 0/1 values, for most features the improvement is visible.
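To make the transform concrete, the sketch below implements the Yeo-Johnson formula for a fixed parameter `lam` and then standardises the result to zero mean and unit variance. It is a simplified illustration: library implementations typically estimate the parameter from the data, whereas here it is supplied by hand. The function names and the sample values are assumptions.

```python
import math
from statistics import mean, pstdev


def yeo_johnson(y, lam):
    """Yeo-Johnson power transform of a single value for a given lambda."""
    if y >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(y)                     # lambda = 0, y >= 0
        return ((y + 1) ** lam - 1) / lam            # lambda != 0, y >= 0
    if abs(lam - 2) < 1e-12:
        return -math.log1p(-y)                       # lambda = 2, y < 0
    return -((1 - y) ** (2 - lam) - 1) / (2 - lam)   # lambda != 2, y < 0


def transform_and_standardise(values, lam):
    """Apply the transform, then rescale to zero mean and unit variance."""
    transformed = [yeo_johnson(v, lam) for v in values]
    mu, sigma = mean(transformed), pstdev(transformed)
    return [(t - mu) / sigma for t in transformed]


# Invented right-skewed counts, similar in shape to building counts around airports.
skewed = [0, 1, 2, 3, 50, 400]
result = transform_and_standardise(skewed, lam=0.0)
print([round(r, 2) for r in result])
```

Unlike a plain logarithm, the transform is defined at zero and for negative inputs, which is the property the text cites as the reason for choosing it. A feature with identical values everywhere would make the standard deviation zero and needs to be handled separately.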

Table 4. An example of encoding categorical features

Fig. 3. An example of data distribution before the Yeo-Johnson transform. Most of the data points are concentrated around the lower values. Applying SOM directly to non-normally distributed data could lead to specific features being overrepresented, therefore the transformation is needed.

Fig. 4. Example of data distribution after the Yeo-Johnson transform. The range of values has changed; however, the relations between specific values are kept and the distribution is now closer to normal.
