
Original papers

Computers and Electronics in Agriculture

journal homepage: www.elsevier.com/locate/compag

A framework for the management of agricultural resources with automated aerial imagery detection

Karla Saldana Ochoa⁎,1, Zifeng Guo1

ETH Zürich, Institute of Technology in Architecture, Chair for Computer Aided Architectural Design, Switzerland

ARTICLE INFO

Keywords: Trees detection, Street segmentation, Agriculture, Machine learning, CNN, UAV

ABSTRACT

The acquisition of data through remote sensing represents a significant advantage in agriculture, as it allows researchers to perform faster and cheaper inspections over large areas. Currently, extensive research has been done on technical solutions that can benefit simultaneously from both vast amounts of raw data (big data) extracted from satellite images and Unmanned Aerial Vehicles (UAVs), and novel algorithms in Machine Learning for image processing. In this experiment, we provide an approach that fulfills the necessities of rapid food security, assessment, planning, exploitation, and management of agricultural resources by introducing a pipeline for the automatic localization and classification of four types of fruit trees (coconut, banana, mango, and papaya) and the segmentation of roads in the Kingdom of Tonga, using high-resolution aerial imagery (0.04 m).

We used two supervised deep convolutional neural networks (CNNs): the first to localize and classify trees (localization) and the second to mask the streets from the aerial imagery for transportation purposes (semantic segmentation). Additionally, we propose auxiliary methods to determine the density of groupings of each of these tree species, based on the detection results from the localization task, and render it in Density Maps that allow the condition of the agricultural site to be comprehended quickly. Ultimately, we introduce a method to optimize the harvesting of fruits, based on specific scenarios, such as maximum time, path length, and location of warehouses and security points.

1. Introduction

Located in the Pacific Ocean, the Kingdom of Tonga extends over an area of 362,000 km². With a population of 107,122 inhabitants in 2016, 58.4% of its population depends on agriculture and forestry as a primary source of income and a key driver for economic growth. Its most prominent agricultural products are bananas, coconuts, coffee beans, vanilla beans, and roots such as cassava, sweet potato, and taro2 (Halavatau and Halavatau, 2001).

Most of the countries in the Pacific region are exposed to high-risk disasters including cyclones, earthquakes, tsunamis, storm surges, volcanic eruptions, landslides, and droughts; e.g., Tonga is affected by more than one tropical cyclone every four years. These recurrent disasters cause damage and losses to agriculture, food security and the local economy. In recent years, according to the 2015 Report of the Secretary-General on the Implementation of the International Strategy for Disaster Reduction, disasters worldwide cost around USD 1.5 trillion in economic damage. The frequency and severity of natural disasters are increasing, revealing an urgent need to strengthen the resilience of food assessments and security (FAO, 2015).

To understand how local agriculture and food security were affected by a natural disaster, aerial imagery from the site and the subsequent mapping and classification of data are required. Over the past decades, the field of Remote Sensing has robustly investigated faster methods to collect, produce, classify, and map earth observation data. In recent years, the use of Unmanned Aerial Vehicles (UAVs) to collect data has increased rapidly, mainly because of their inexpensive hardware and rapid deployment for the collection of imagery. In parallel, new techniques to detect objects in optical remote sensing imagery were actively explored3 by several scholars. In 1991, automatic tree detection and delineation from digital imagery was performed by Pinz (1991), who proposed a Vision Expert System using aerial imagery. He was able to locate the centers of tree crowns and estimate their radii using local brightness maxima. Gougeon (1995) introduced a rule-based algorithm that followed the valleys of shadow between tree crowns in digital aerial imagery of a given ground sample distance. Hung et al. (2006) proposed a vision-based shadow algorithm to detect and classify tree crowns in imagery from UAVs, using color and texture information to segment regions of interest. Hassaan et al. (2016) presented an algorithm to count trees in urban environments using image processing techniques for vegetation segmentation and tree counting. By applying a k-means clustering algorithm and setting threshold values on the green cluster centers, the algorithm was able to segment the green portion out of any image without any noise.

⁎ Corresponding author: Building HIB, Floor E15, Stefano-Franscini-Platz 1, CH-8093 Zurich, Switzerland. E-mail address: saldana@arch.ethz.ch (K. Saldana Ochoa).

1 The two authors contributed equally to this work.

2 The processing of coconuts into copra and dried coconut was once the only significant industry and only commercial export.

3 This process determines whether a given aerial or satellite image contains one or more objects belonging to the class of interest and locates the position of each predicted object in the image (Cheng and Han, 2016).

https://doi.org/10.1016/j.compag.2019.03.028

Received 12 October 2018; Received in revised form 18 February 2019; Accepted 27 March 2019

Today, the development of machine learning approaches provides researchers with a conceptual alternative to solve problems in the mentioned domains without predefining the rules for a specific task. Instead, models can learn the underlying features emerging from a large amount of data. One of the most prominent approaches, the Convolutional Neural Network (CNN), comes from the field of image processing and computer vision. The algorithm is based on an end-to-end learning process, from raw data to semantic labels, which is an essential advantage in comparison with previous state-of-the-art methods (Nogueira et al., 2017). This model outperforms all the other approaches in tasks like image classification, object recognition and localization, and pixel-wise semantic labeling. The early implementation of CNN by LeCun et al. (1998) achieved 99.2% accuracy in handwritten digit recognition and led to the boom of CNN-based image processing in the following 20 years. In recent years, large online image repositories such as ImageNet (Deng et al., 2009) and high-performance computing platforms like GPU acceleration have contributed significantly to the success of using CNNs in large-scale image and video recognition. Competitions and challenges like the ImageNet Challenge (Russakovsky et al., 2015) and the Visual Object Classes Challenge (Everingham et al., 2015) attract many researchers and have resulted in state-of-the-art CNN models such as AlexNet (Krizhevsky et al., 2012) and VGG-Net (Simonyan & Zisserman, 2014), respectively, both available online.

Moreover, researchers can directly use or train these models on their own datasets with no need to design the architecture; e.g., the YOLO model (Redmon et al., 2016) achieved excellent performance on recognition and made real-time object localization possible. In the meantime, Long et al. (2015), with their novel model FCN, achieved a 20% relative improvement in pixel-wise semantic segmentation in the PASCAL VOC challenge. SegNet, proposed by Badrinarayanan et al. (2015), also achieved competitive performance, as it is designed to be efficient both in terms of memory and computational time during prediction. It is also significantly smaller in the number of trainable parameters than other competing architectures.

The use of deep learning4 in Remote Sensing has grown exponentially since it can effectively encode spectral and spatial information based on the data itself. During the last years, considerable efforts have been made to develop various methods for the detection of different types of objects in satellite and aerial images with CNNs, such as roads, vegetation, trees, water, buildings, cars, etc. In the Conclusion section we address quantitative measures to support the effectiveness of the proposed approach compared to existing approaches: Chen et al., 2014; Luus et al., 2015; Lu et al., 2017; Kussul et al., 2017; Mortensen et al., 2016; Sørensen et al., 2017; Milioto et al., 2017.

4 Deep learning is a branch of machine learning that refers to multi-layered interconnected neural networks that can learn features and classifiers at once, i.e., a unique network may be able to learn features and classifiers (in different layers) and adjust the parameters, at running time, based on accuracy, giving more importance to one layer than another depending on the problem. End-to-end feature learning (e.g., from image pixels to semantic labels) is the significant advantage of deep learning when compared to previous state-of-the-art methods (Nogueira et al., 2017).

In this paper, we aim to provide an approach that fulfills the necessities of rapid food security, assessment, planning, exploitation, and management of agricultural resources; we propose a framework to efficiently localize and classify four types of tropical fruit trees (coconut, banana, mango, and papaya). We complement this with a method to automatically identify and segment roads, so that the fastest and safest ways to transport crops to adjacent warehouses or security points can be detected.

To do so, we used two supervised deep CNNs. The first CNN model performs the task of object localization, to localize and classify the type of trees. The locations of the trees are not only used to control agricultural resources; in scenarios of natural disasters they can also be compared with the previous state to gain a better understanding of how local agriculture and food security were affected. This information can directly inform and accelerate subsequent relief efforts. Additionally, we propose a method to determine the density of each of these trees to improve productivity, based on the detection results of the first CNN and presented as Density Maps to quickly comprehend the condition of the agricultural site.

The second CNN model performs semantic segmentation, which masks the streets from the aerial imagery to help identify local transportation infrastructure and, in the scenario of natural disasters, evaluate the damage and propose a proper plan to distribute aid across affected areas. Ultimately, we introduce a method to optimize the harvesting process, based on specific scenarios, such as maximum time, path length, and location of the warehouses and security points.

2. Data

2.1. Data for the first CNN: Object Localization model

For this experiment, we used high-resolution UAV imagery rather than satellite images, since the latter are easily affected by cloudy environments. Also, freely available satellite images have lower resolution than UAV imagery. The imagery was captured in October 2017 and was made available in early 2018 as part of an Open AI Challenge coordinated by WeRobotics, Pacific Flying Labs, OpenAerialMap and the World Bank UAVs for Disaster Resilience Program. We participated in this challenge, which aims to crowdsource the development of automated solutions for the analysis of aerial imagery, with a specific focus on humanitarian, development and environmental projects.

A total of 80 km² of high-resolution (under 10 cm) aerial imagery was obtained from the Kingdom of Tonga, covering four areas of interest (with a combination of rural and urban areas). The first three covered 10 km² each, and the last covered 50 km². The spatial resolution of the optical imagery is 4 cm or 8 cm depending on the Area of Interest.

We created the training data by selecting the imagery from the 50 km² area with 8 cm resolution and further used it in the first supervised CNN. We obtained labeled imagery through the Humanitarian OpenStreetMap community, where experts labeled every tree in this aerial imagery with one of these tree classes: coconut, banana, mango, and papaya.

To prepare the training data for the first CNN model, we split the original full-size aerial imagery into square patches with a predefined resolution (256×256×3). In order to increase the number of training samples, we used data augmentation techniques, including random horizontal and vertical flipping and random rotations, resulting in 27,293 labeled images. The patches intentionally overlap by up to half of the subdivision resolution, because some trees may otherwise be split and not be recognized correctly; this ensures that at least one patch entirely covers each tree. The patches are labeled by vectors that contain the position, size and type of each tree (Fig. 1); this is further explained in Section 3.1.
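The overlapping subdivision can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `patch_origins`, the fixed stride of 128 pixels (exactly half a patch) and the extra border patch are assumptions made for the example.

```python
def patch_origins(width, height, patch=256, stride=128):
    """Return the (x, y) top-left corners of overlapping square patches.

    Assumption: a stride of half the patch size, so neighbouring patches
    overlap by 50% in each direction.
    """
    def axis(size):
        if size <= patch:
            return [0]
        starts = list(range(0, size - patch + 1, stride))
        # Add a final patch flush with the border so no pixels are skipped.
        if starts[-1] != size - patch:
            starts.append(size - patch)
        return starts

    return [(x, y) for y in axis(height) for x in axis(width)]


origins = patch_origins(512, 512)
```

With a half-patch stride, any object smaller than half a patch is fully contained in at least one patch, which is the property the overlap is meant to secure.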

Fig. 1. One patch exemplifying how the training data was prepared to be fed to the first CNN.

2.2. Data for the second CNN: Semantic segmentation model

The training data used in the second CNN is from the ISPRS commission II/4 benchmark on Urban Classification and 3D Building Reconstruction and Semantic Labeling. These data correspond to the urban area of Potsdam, Germany, and consist of high-resolution True Ortho Photos and their respective Digital Surface Models. This data has been classified manually into six land cover classes: impervious surfaces, buildings, low vegetation, trees, cars, and background (Fig. 2). We split the imagery into square patches of 256×256×3 without overlap, obtaining a training set of 20,102 images.

3. Procedure

3.1. Classification and location of trees

This CNN model is trained with the training data described in Section 2.1 (Data for the first CNN: Object Localization model). This model is able to classify and locate different tree species. The CNN takes one square RGB image of 256×256×3 as input and provides the corresponding prediction. At the end of the prediction process, the localization results are assembled. If the distance between two or more recognized trees of the same species is less than a predefined threshold, they are considered as one tree, and their locations are averaged.
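The merging of nearby detections can be sketched as below. The paper does not state how groups are formed, so this example assumes a simple greedy pass in which each not-yet-merged detection absorbs all later detections of the same species within the threshold; `merge_detections` and the tuple layout are hypothetical.

```python
import math


def merge_detections(detections, threshold):
    """Merge same-species detections closer than `threshold`.

    detections: list of (x, y, species) tuples in a common coordinate system.
    Returns a list of (x, y, species) with merged positions averaged.
    Assumption: greedy grouping around the first detection of each group.
    """
    merged = []
    used = [False] * len(detections)
    for i, (xi, yi, species_i) in enumerate(detections):
        if used[i]:
            continue
        used[i] = True
        group = [(xi, yi)]
        for j in range(i + 1, len(detections)):
            xj, yj, species_j = detections[j]
            close = math.hypot(xi - xj, yi - yj) < threshold
            if not used[j] and species_j == species_i and close:
                used[j] = True
                group.append((xj, yj))
        # Average the positions of all detections in the group.
        cx = sum(p[0] for p in group) / len(group)
        cy = sum(p[1] for p in group) / len(group)
        merged.append((cx, cy, species_i))
    return merged


trees = merge_detections(
    [(10, 10, "coconut"), (12, 10, "coconut"), (11, 10, "banana"), (100, 100, "coconut")],
    threshold=5,
)
```

Here the two coconut detections two pixels apart collapse into one tree, while the banana detection between them is kept because it belongs to a different species.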

The architecture of the CNN is based on a modified YOLO model. As introduced by Redmon et al., YOLO works with a prediction grid, and each cell of the grid is responsible for recognizing one object. Objects are predicted as one or more bounding boxes with a confidence value and a one-hot vector that represents the type of the object; in this experiment, the species of the tree. The confidence value reflects the probability of the cell containing an object and how accurate the bounding box is. Bounding boxes with confidence values larger than a user-defined threshold are kept and rendered as a result. In our case, the prediction grid is 5×5, where each cell predicts one bounding box and four classes. A bounding box is represented by four values: x, y, the radius of the object and the confidence value. Since trees seen from above are mainly circular, the width and height of the bounding box are replaced by a single radius. Therefore, the output is a tensor (a three-dimensional matrix) of 5×5×8. We set the threshold for the confidence values to 0.8. We overlap the patches in order to avoid missing a tree localization when several trees are found in one cell. The process of cell activation is illustrated in Fig. 3.
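A minimal sketch of how such a 5×5×8 output could be decoded into tree positions is given below. The channel order (x, y, radius, confidence, then four class scores), the cell-relative x and y in [0, 1], the radius normalized by the patch size, and the class order are all assumptions made for illustration; the paper does not specify them.

```python
# Assumed class order, following the type numbering used in the results.
SPECIES = ("coconut", "banana", "papaya", "mango")


def decode_grid(output, patch_size=256, grid=5, threshold=0.8):
    """Turn a grid x grid x 8 prediction into a list of detected trees.

    output[row][col] is assumed to hold
    [x, y, radius, confidence, score_0, score_1, score_2, score_3].
    Returns (x_px, y_px, radius_px, species, confidence) tuples.
    """
    cell = patch_size / grid
    trees = []
    for row in range(grid):
        for col in range(grid):
            x, y, radius, confidence, *scores = output[row][col]
            if confidence <= threshold:
                continue  # cell not activated
            species = SPECIES[scores.index(max(scores))]
            trees.append(
                ((col + x) * cell, (row + y) * cell, radius * patch_size, species, confidence)
            )
    return trees


empty_cell = [0.0] * 8
prediction = [[list(empty_cell) for _ in range(5)] for _ in range(5)]
prediction[1][2] = [0.5, 0.5, 0.1, 0.9, 0.0, 1.0, 0.0, 0.0]
detected = decode_grid(prediction)
```

Only the single cell whose confidence exceeds 0.8 yields a detection; all other cells are discarded.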

The overall architecture of the model is illustrated in Fig. 4. The initial convolutional layers of the network extract features from the image while the fully connected layers predict output probabilities and coordinates. The network has 24 convolutional layers followed by two fully connected layers (Redmon et al., 2016).

The model adopts sum-squared error as the basis of the loss function; however, as Redmon et al. mentioned, sum-squared loss weights the localization, classification, and confidence errors equally and destabilizes the model, which is not ideal for our task. In order to overcome this issue, two modifications of the loss function are introduced. First, the confidence error is multiplied by an additional coefficient; second, the ground truth confidence value (which is either 0 or 1) is used as the coefficient of the localization and classification errors. Therefore, the confidence value gains higher priority in training, which increases the accuracy of the model in detecting the existence of trees. Moreover, the localization error is only penalized when the ground truth tree exists. Let C and C′, B and B′, and T and T′ be the confidence values, the bounding boxes (x, y coordinates, and size) and the species of trees of the ground truth and the prediction respectively, λ be the coefficient for the confidence error (set to 5 in our practice) and N be the number of grid cells (25 in our case). Then the loss function can be written as follows:
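The equation itself did not survive in this copy of the text. One form consistent with the description above is sketched here; the per-cell index i, the normalization by N and the exact grouping of terms are assumptions, not the authors' published formula.

```latex
L = \frac{1}{N} \sum_{i=1}^{N} \Big[ \lambda \, (C_i - C'_i)^2
    + C_i \big( \lVert B_i - B'_i \rVert^2 + \lVert T_i - T'_i \rVert^2 \big) \Big]
```

In this reading, the first term penalizes confidence errors with weight λ = 5 in every cell, while the factor C_i switches the localization and classification terms off for cells whose ground truth contains no tree.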

Before training the model, we split the dataset in a 60–5–35 ratio for training, validation, and testing respectively. The output of the first CNN model is the location and class of trees in pixel space. In Section 4 we further discuss the performance and results of the model.

3.1.1. Density and heat maps

Fig. 2. An example of how the training data was prepared to be fed to the second CNN.

Density maps can provide valuable insight into natural scenarios such as agriculture because they can communicate the characteristics of geo-data, e.g., the concentration of trees in space. In order to determine the density of the detected trees, we map their locations back into their geo-coordinates. We employ the Gaussian kernel to determine the density of each class of trees: let p_i be the positions of the retrieved trees and N be the number of trees; the density at a given location p′ can be calculated as:

$$D(p') = \sum_{i=1}^{N} \exp\left(-\lVert p_i - p' \rVert_2^2\right)$$
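A Gaussian-kernel density of this kind can be sketched in a few lines. The `bandwidth` parameter is a hypothetical addition for illustration (with the default of 1 it has no effect); the paper does not mention a kernel width.

```python
import math


def density(query, trees, bandwidth=1.0):
    """Sum of Gaussian kernels centred on each tree, evaluated at `query`.

    query: (x, y) location; trees: iterable of (x, y) tree positions.
    bandwidth: assumed kernel width, in the same units as the coordinates.
    """
    qx, qy = query
    return sum(
        math.exp(-((px - qx) ** 2 + (py - qy) ** 2) / bandwidth ** 2)
        for px, py in trees
    )


value_at_tree = density((0.0, 0.0), [(0.0, 0.0)])
value_near_two = density((0.0, 0.0), [(0.0, 0.0), (1.0, 0.0)])
```

Evaluating this function on a regular grid of geo-coordinates, once per species and once for all trees together, would give the values rendered as a density map.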

We rendered these results as Density Maps for each species, and later an additional one for all species (Fig. 5). Out of 12,945 true positive trees located, 10,136 were coconut trees, representing 78% of the detected trees, 2340 banana trees, 55 papaya trees, and 173 mango trees.

3.2. Street detection

The street detection uses pixel-wise semantic segmentation to extract the streets from the original aerial imagery. Semantic segmentation is a process that takes an RGB image as input and produces an equal-size image that is colored pixel-wise based on semantic labels as output. All the pixels that have the same label are colored identically. For this task, we trained another CNN model with the ISPRS commission II/4 dataset. The subdivision process and treatment of the input imagery were described in Section 2.2 (Data for the second CNN: Semantic segmentation model). We highlight that, unlike the tree localization and classification training data, the patches for the street recognition have no overlap. After all patches are processed, the outputs are assembled. From this output, we extracted only the street layer and highlighted it on a black and white image.

Fig. 3. The first image shows one patch being subdivided into a grid of 5×5, the second image exemplifies the process of the cells being activated, and the third image shows the local coordinates of the found trees.

Fig. 4. The architecture of the YOLO model, where the disposition of the convolutional, pooling and reshape layers is explained.

Fig. 5. The original UAV imagery and the corresponding Density Map displaying the localization of all species.

Fig. 6. The architecture of the SegNet model, where the disposition of the convolutional, pooling, up-sampling, softmax and normalization layers is explained.

Fig. 7. The first image shows the layer extracted from the SegNet model corresponding to the street network; the second image, the process from pixels to scattered points followed by the Delaunay triangulation; and the third image, the spatial pattern of the streets.

The segmentation model is based on a modified SegNet model (Badrinarayanan et al., 2015) with an input size of 256×256×3; the input image is processed by a set of hierarchical convolutional modules to reduce its size and gain many more channels. Each module consists of three to five convolutional layers, each followed by one batch normalization layer. At the end of each module, there are one pooling layer and one activation layer. Then, the compressed images are fed into a set of hierarchical up-sampling modules. Each module starts with an up-sampling layer followed by several convolutional layers, and the last one is followed by an activation layer as well. The pooling layer and the up-sampling layer in the modules of the same hierarchy share their pooling indices. The overall architecture of the model is illustrated in Fig. 6.

3.2.1. Path optimization

The detection results are irregular and, for some streets, disconnected. The leading causes are: first, the original image may be affected by the distortion caused by merging many UAV images into one; second, the street may be covered by tree crowns and other objects, which makes it difficult to keep consistency; and third, the instability of the detection model.

Instead of trying to extract precise lines that represent the street network from the image, an alternative method is proposed. We hypothesize that the probability that two disconnected street segments belong to one street depends on the distance between them: the shorter the distance, the higher the probability of belonging to the same street. Following this assumption, a random subsampling process is applied to the resulting image, where each pixel labeled as street has a certain probability of becoming a node and being taken into account in the next stage of the process. The subsampling phase converts the street system from an image to scattered points, where their density represents the hierarchy (importance) of the street. The scattered points are stored in a list, and their positions inside this list are used as their indices.
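The random subsampling step can be sketched as follows. The function name, the binary-mask representation and the single `keep_probability` parameter are assumptions for illustration; the paper does not give the sampling probability it used.

```python
import random


def subsample_street_pixels(mask, keep_probability, seed=0):
    """Convert a binary street mask into a list of scattered node positions.

    mask: 2D list where truthy entries mark pixels labeled as street.
    keep_probability: chance that a street pixel becomes a graph node.
    The index of each node in the returned list serves as its identifier.
    """
    rng = random.Random(seed)
    nodes = []
    for y, row in enumerate(mask):
        for x, is_street in enumerate(row):
            if is_street and rng.random() < keep_probability:
                nodes.append((x, y))
    return nodes


street_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
all_nodes = subsample_street_pixels(street_mask, keep_probability=1.0)
no_nodes = subsample_street_pixels(street_mask, keep_probability=0.0)
```

Because wider streets cover more pixels, they yield more nodes per unit length under uniform sampling, which is how point density comes to reflect street importance.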

A Delaunay triangulation is computed on the scattered points, representing the whole street network. Its edges are weighted by the inverse square of their lengths, meaning shorter edges have higher priority. Then, the shortest path between two points of the network can be calculated by the Dijkstra algorithm5 (Dijkstra, 1959), obtaining results that match the spatial pattern of the streets (Fig. 7).
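The path search over the triangulated graph can be sketched as below. Two assumptions are made for the example: the triangulation edges are supplied as index pairs (they could come from any Delaunay routine), and the traversal cost of an edge is its squared length, which is one way to give shorter edges higher priority in a shortest-path search. The function name is hypothetical.

```python
import heapq
import math


def shortest_path(points, edges, source, target):
    """Dijkstra search with early stop; returns node indices or None.

    points: list of (x, y); edges: list of (a, b) index pairs.
    Assumption: edge cost = squared Euclidean length, so a chain of short
    edges is preferred over one long edge bridging a gap.
    """
    graph = {i: [] for i in range(len(points))}
    for a, b in edges:
        cost = math.dist(points[a], points[b]) ** 2
        graph[a].append((b, cost))
        graph[b].append((a, cost))

    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break  # early stop once the target is settled
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, cost in graph[node]:
            candidate = dist + cost
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))

    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


pts = [(0, 0), (1, 0), (2, 0), (1, 5), (9, 9)]
tri_edges = [(0, 1), (1, 2), (0, 3), (3, 2), (0, 2)]
route = shortest_path(pts, tri_edges, 0, 2)
```

In this toy graph the direct edge from node 0 to node 2 has length 2 and cost 4, while the two unit-length hops through node 1 cost 2 in total, so the search follows the denser chain of points.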

5 Dijkstra's algorithm, or Dijkstra's shortest path algorithm, proposed by the computer scientist Edsger W. Dijkstra, is an algorithm that finds the shortest path between nodes in a weighted graph. The algorithm has many variants: the most common one fixes a node as the source, iterates over all the other nodes and produces a shortest-path tree. The original one, however, stops early when the target node is reached and therefore only returns the shortest path between the source and the target nodes.

By overlapping the density map and the street network, the nodes of the network can be weighted by the number of trees they reach, as seen in Fig. 8. The overlap between localization data and segmentation data allows us to make queries about the network. For instance, we could ask for the optimal path to harvest as many crops as possible within 10 min of traveling, or vice versa. The starting point, ending point, number of trees and time of the query are user-specified, and the search process is based on optimization algorithms. Therefore, this process adapts to the necessities of a specific scenario.

Fig. 8. Nodes weighted by the number of reachable trees within a predefined maximal distance threshold.

We use the Genetic Algorithm (Fraser, 1957) for path optimization. It searches for optimal solutions starting from randomly selected parent solutions by repeatedly applying mutation and crossover operations and selecting the offspring that gain higher scores on the objective function. The algorithm modifies the solution in its genotype rather than the phenotype. More specifically, we represent each path by a series of key points instead of all of its points. The key points are in arbitrary sequences and locations, and they are assumed to be visited one after the other until the last point is reached. The path between every two consecutive key points is calculated using the Dijkstra algorithm on top of the triangulated graph, and the final path is the union of these results. Therefore, the genotype of the path is the list of key points and the phenotype is the result of the Dijkstra algorithm. By modifying the sequences and the locations of the key points, new paths can be generated from the original one. During the optimization, the mutation of the path is done by randomly modifying the key point sequence, and the crossover is done by taking two input paths and swapping part of their key points. The number of key points of the path is flexible, and therefore crossover can be applied to paths that have a different number of key points, producing in-between offspring.
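The genotype operations can be sketched as follows, with each genotype stored as a list of node indices (the key points). The specific mutation moves (swap two key points or relocate one) and the one-cut crossover are assumptions chosen for illustration; the paper only states that key points are randomly modified and partially swapped.

```python
import random


def mutate(key_points, node_count, rng):
    """Return a copy of the genotype with one random modification.

    Either two key points exchange places (sequence change) or one key
    point is moved to a random node of the graph (location change).
    """
    child = list(key_points)
    i = rng.randrange(len(child))
    if len(child) > 1 and rng.random() < 0.5:
        j = rng.randrange(len(child))
        child[i], child[j] = child[j], child[i]
    else:
        child[i] = rng.randrange(node_count)
    return child


def crossover(parent_a, parent_b, rng):
    """Join a head of parent_a with a tail of parent_b.

    The cut positions are independent, so parents of different lengths can
    be combined and the child length may differ from both.
    """
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    child = parent_a[:cut_a] + parent_b[cut_b:]
    return child if child else list(parent_a)


rng = random.Random(1)
parent_a = [1, 2, 3]
parent_b = [7, 8]
offspring = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, node_count=10, rng=rng)
```

To score an offspring, its phenotype would be built by running a shortest-path search between each pair of consecutive key points and joining the resulting segments.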

We proposed two types of search as examples of the query process. One is to maximize the number of crops to harvest with a limited travel distance, and the other is to minimize the travel distance with a limited number of crops to harvest. The objective functions for each are:

$$\frac{n}{\max\left(1, \frac{l}{l'}\right)^{d}}$$

and

$$\frac{\min\left(1, \frac{n}{n'}\right)^{d}}{1 + l}$$

where n is the number of crops, l is the length of the path, d is a constant and l′, n′ are the two bounds respectively.
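One possible reading of the two objective functions is sketched below, assuming that the first score is the crop count divided by a penalty that grows once the path exceeds the length bound, and that the second score is an inverse path length scaled down while the crop count is below its bound. The function names and the default exponent d = 2 are hypothetical; the paper does not report the value of d.

```python
def score_max_crops(n, l, l_bound, d=2.0):
    """Reward the number of crops n; penalize path lengths above l_bound."""
    return n / max(1.0, l / l_bound) ** d


def score_min_distance(n, l, n_bound, d=2.0):
    """Reward short paths; penalize crop counts below n_bound."""
    return min(1.0, n / n_bound) ** d / (1.0 + l)


within_bound = score_max_crops(n=10, l=50, l_bound=100)
over_bound = score_max_crops(n=10, l=200, l_bound=100)
enough_crops = score_min_distance(n=20, l=9, n_bound=10)
too_few_crops = score_min_distance(n=5, l=9, n_bound=10)
```

Under this reading both scores are maximized by the genetic search, and the bounds act as soft constraints: a path that violates its bound is not rejected, only scored lower.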

4. Results and discussion

The performance of the Tree Localization and Classification model was measured by evaluating how precisely the classifier localized trees. The average Euclidean distance between the center points of the original trees and the predicted trees is 8.86406 pixels (less than one meter). The classifier counted 16,457 trees, of which 12,945 were correctly located. Considering the original 13,393 trees, the overall Localization accuracy of the model is 80%.

We draw a confusion matrix for each type of tree (Fig. 9) in order to evaluate the classification accuracy of our model, arranged as follows: Coconut trees: type 1, Banana trees: type 2, Papaya trees: type 3 and Mango trees: type 4. We achieve a Classification accuracy of 98%.

Having for each class:

$$(\text{Mean TP} + \text{Mean TN}) / \text{total} = 0.987691$$

Our Misclassification Rate is 1%, based on how often the classifier was wrong.

$$(\text{Mean FP} + \text{Mean FN}) / \text{total} = 0.00990861$$

And its Precision is 97%, according to how often a predicted positive is a TP.

$$\text{Mean TP} / \text{predicted yes} = 0.97$$

The F1 Score of the model is 0.89, according to the average of the Localization Accuracy (80%) and the Classification Accuracy (98%).

Fig. 9. Confusion matrix for the 4 types of trees.

Fig. 10. Area of interest on Tonga, with the classification results for the four types of trees; green: coconut, red: banana, blue: mango, and yellow: papaya. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 11. Density Maps of all species, followed by specific density maps of Coconut trees, Banana trees, Mango trees, and Papaya trees.
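The accuracy, misclassification rate and precision reported above follow the usual confusion-matrix definitions, which can be sketched as below; the counts in the example are made up for illustration and are not the paper's data. The combined score is computed the way the text describes it, as the plain average of localization and classification accuracy; note that an F1 score is more commonly defined as the harmonic mean of precision and recall.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, misclassification rate and precision from confusion counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "misclassification": (fp + fn) / total,
        "precision": tp / (tp + fp),  # TP over all predicted positives
    }


def combined_score(localization_accuracy, classification_accuracy):
    """Average of the two accuracies, as the text defines its overall score."""
    return (localization_accuracy + classification_accuracy) / 2


example = classification_metrics(tp=90, tn=5, fp=3, fn=2)  # illustrative counts
overall = combined_score(0.80, 0.98)
```

With the reported 80% and 98%, the average evaluates to 0.89, matching the score stated in the text.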

Fig. 10 illustrates the localization results. All trees are localized with a bounding box representing the class they belong to; green: coconut, red: banana, blue: mango, and yellow: papaya.

The corresponding density map of the results shows that high-density areas are punctual, whereas mid-density areas appear as large surfaces connecting high-density areas. Null-density areas are displayed in a scattered manner between high- and mid-density areas (Fig. 11). In the coconut heat map, the same pattern as the global heat map reappears. However, in the banana, mango and papaya heat maps, only separate high-density areas appear.

Dos Santos et al. (2017) demonstrated that the estimation of tree density is an essential first step toward large-scale monitoring. Besides this, it provides a broad view of resource distribution that enables the identification of areas with higher, mid and low densities. These results can inform actions for the planning, harvesting, and management of these tropical fruits in the interests of landowners, producer associations, and humanitarian organizations. Improving these actions is essential not only to Tonga's industry and economy but also to the thousands of families who depend on their extraction for subsistence.

The second model was able to discriminate different urban classes (Fig. 12, middle): building footprints in blue, vegetation in green, open spaces in red, and road network in white. By applying a filter, we could mask the streets and see their structure. Since the training data was from an urban scenario, and the site is rural, the accuracy of the model in detecting streets was low. For example, some parts of the buildings were mistakenly labeled as streets. Therefore, these results required further post-processing as described in Section 3.2.1 (Path optimization). Resampling and graph-making processes provide a systematic approach to overcome this issue. The precise extraction of the street can be bypassed, making it possible to use these results in applications like path finding and path recommendation. The two proposed tasks for path finding show that this bypassing is possible and useful. Some path query results are illustrated in Fig. 13 and more are shown in Appendix A.

In order to validate our approach, we applied the tree localization model to a different dataset, one of the three areas of interest that covered 10 km² with a resolution of 8 cm (explained in Section 2.1, Data for the first CNN: Object Localization model) (Fig. 14). After successfully obtaining the location of the trees from the imagery, we conclude that the tree recognition model can work in different scenarios and is robust to differences in occlusion, variation, illumination and scale among the retrieved trees.

Besides, we applied the whole pipeline of the experiment to another area of interest in Tonga, successfully retrieving trees and streets. The site and the results are shown in Appendix B. The processing time of this approach is proportional to the size of the site of interest.

5. Conclusion

This paper has investigated the use of Convolutional Neural Networks to efficiently localize four types of tropical fruit trees and plan the transport of their crops using aerial imagery. This new approach reduces the cost and time of inventory, mapping, harvesting, and management of agricultural resources, and helps assess the impact of disasters on food security.

We introduce a specific case where this method fulfills the necessities of rapid assessment after natural disasters. Together with two Convolutional Neural Network models, we have also proposed a method to determine the density of trees and a method to optimize the harvesting process based on specific scenarios.

This experiment provides a framework from which we can draw some conclusions about the advantages and disadvantages of Convolutional Neural Networks. The advantages of the models are that they reduce the need for feature engineering,6 outperform other approaches implemented for comparison purposes (feature extraction,7 area-based techniques, statistics, texture, color, and shape-based algorithms) and have the capacity to learn the underlying patterns of the data. The disadvantages of the models are that they take longer to train than other traditional approaches and need large datasets; publicly available datasets for researchers to work with are lacking, and in many cases researchers need to develop their own sets of images.

6 Hand-engineered components require considerable time, an effort that takes place automatically in deep learning.

7 Feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps.

Fig. 12. First image, original aerial imagery; second image, the output of the SegNet model; and third image, the layout showing the masking of streets.

Fig. 13. The first row shows the longest paths accessing a maximal amount of trees; the second row shows the shortest paths with the number of accessible trees above a given threshold.

Fig. 14. Aerial imagery used as validation data.

State-of-the-art approaches in this field of study (agricultural object detection from aerial imagery) involve different datasets, pre-processing techniques, metrics, models, and parameters, so it is difficult to compare current research (Kamilaris and Prenafeta-Boldú, 2018); thus, our comparisons have been strictly limited to the techniques used and the score of each paper. Table 1 displays the task, data, labels, model, pre-processing, performance metric and score of each approach.

Table 1. Comparison among existing approaches. CA: Classification Accuracy; LA: Localization Accuracy; F1: F1 Score; IoU: Intersection over Union.

| Reference | Task | Data | Labels | Model | Pre-processing | Performance | Score |
|---|---|---|---|---|---|---|---|
| Chen et al. (2014) | Classification | Hyperspectral imagery on Kennedy Space Center, USA and Pavia city, Italy | 13 classes for KSC data and 9 classes for Pavia data | Hybrid of PCA, autoencoder, and logistic regression | Band removal for denoising | CA, F1 | KSC: 75.34% (CA), 0.7463 (F1); Pavia: 84.61% (CA), 0.8441 (F1) |
| Luus et al. (2015) | Classification | Aerial ortho-imagery with a 0.3048-m pixel resolution | 21 classes | Author defined CNN | From RGB to HSV, resized to 96×96 pixels, creation of multiscale views | CA | 93.48% (CA) |
| Lu et al. (2017) | Classification | UAV imagery | 2 classes | Author defined CNN | Correct distortion, resized to 40×40 pixels | CA | 89.5% (CA) |
| Kussul et al. (2017) | Classification | Satellite imagery | 11 classes | Author defined CNN | Calibration, filtering and restoration of missing data | CA | 94.6% (CA) |
| Mortensen et al. (2016) | Segmentation | Photograph by Sony a7 with 35 mm lens | 7 classes | Adapted version of VGG16 | Resized to 1600×1600 pixels and then divided into 400×400 pixel patches | CA, IoU | 79% (CA), 0.66 (IoU) |
| Sørensen et al. (2017) | Classification | Photograph by Canon PowerShot G15 | 2 classes | DenseNet | Image cropping, random horizontal and vertical flip, random transposing | CA | 97% (CA) |
| Milioto et al. (2017) | Classification | UAV imagery | 2 classes | Author defined CNN | Background separation, making vegetation blobs, resized to 64×64 pixels | CA | Dataset A: 97.5% (CA); Dataset B: 94% (CA) |
| Ours | Localization, Segmentation | Aerial ortho-imagery | 4 classes | Adapted version of YOLO and SegNet | Divided into 256×256 patches | LA, CA, F1 | 80% (LA); 97.5% (CA), 0.89 (F1) |

As explained above, the score of the model varies depending on the experiment. Therefore, we compare our results with the ones that had the same validation metric. Chen et al. achieved a 0.79 F1 Score, and our model achieved a 0.89 F1 Score. Milioto et al., Sørensen et al., Kussul et al., and Luus et al. scored above 90% in Classification Accuracy; our model can be added to this list since it achieved 97% in Classification Accuracy. It is worth mentioning that all the above experiments dealt only with the task of Classification.

In this experiment, we proposed a model that not only classifies but also locates different classes of trees; hence, we have an additional performance score: a Localization Accuracy of 80%, which shows how accurately the model located any class of tree. The F1 Score, the average value of classification accuracy and localization accuracy, demonstrates that our model is suitable to perform the task of classification and localization of 4 types of trees.

Outlooks

This approach can be used in localization, classification or transportation of resources; for instance, in the assessment of damage in buildings after a natural disaster, food supply chains, urban and regional planning, etc. Other potential uses could be informal settlement detection and, more specifically, the monitoring of rooftop materials as a means to determine localized socio-economic conditions.

Data and code

The data and code of this pipeline are open source and can be accessed via this link: https://github.com/guozifeng91/south-pacificaerial-image

Declarations of interest

None.

Appendix A

Path optimization

Appendix B

Complete framework applied in an area of interest in Tonga.

Appendix C. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.compag.2019.03.028

References

Badrinarayanan, V., Kendall, A., Cipolla, R., 2015. Segnet: a deep convolutional encoder-decoder architecture for image segmentation. arXiv preprint arXiv:1511.00561.

Chen, Y., Lin, Z., Zhao, X., Wang, G., Gu, Y., 2014. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 7 (6), 2094–2107.

Cheng, G., Han, J., 2016. A survey on object detection in optical remote sensing images. ISPRS J. Photogramm. Remote Sens. 117, 11–28.

Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L., 2009. Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, pp. 248–255.

Dijkstra, E.W., 1959. A note on two problems in connexion with graphs. Numer. Math. 1 (1), 269–271.

dos Santos, A.M., Mitja, D., Delaître, E., Demagistri, L., de Souza Miranda, I., Libourel, T., Petit, M., 2017. Estimating babassu palm density using automatic palm tree detection with very high spatial resolution satellite images. J. Environ. Manage. 193, 40–51.

Everingham, M., Eslami, S.A., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A., 2015. The pascal visual object classes challenge: A retrospective. Int. J. Comput. Vision 111 (1), 98–136.

FAO, 2015. The impact of disasters on agriculture and food security, 76. https://doi.org/ F0134/EN

Fraser, A.S., 1957. Simulation of genetic systems by automatic digital computers I. Introduction. Aust. J. Biol. Sci. 10 (4), 484–491.

Gougeon, 1995. A Crown-following approach to the automatic delineation of individual tree crowns in high spatial resolution aerial images. Can. J. Remote Sens. 21, 274–284.

Halavatau, S.M., Halavatau, N.V., 2001. Food Security Strategies for the Kingdom of Tonga (PDF), Working Paper number 57, United Nations Centre for Alleviation of Poverty Through Secondary Crops' Development in Asia and the Pacific (CAPSA), archived (PDF) from the original on 10 September 2015.

Hassaan, O., Nasir, A.K., Roth, H., Khan, M.F., 2016. Precision forestry: trees counting in urban areas using visible imagery based on an unmanned aerial vehicle. IFAC-PapersOnLine 49 (16), 16–21. https://doi.org/10.1016/j.ifacol.2016.10.004.

Hung, C., Bryson, M., Sukkarieh, S., 2006. Vision based shadow aided tree crown detection and classification algorithm using imagery from an unmanned airborne vehicle. In: International Symposium for Remote Sensing of the Environment.

Kamilaris, A., Prenafeta-Boldú, F., 2018. Deep learning in agriculture: a survey. Comput. Electron. Agric. 147, 70–90. https://doi.org/10.1016/j.compag.2018.02.016. ISSN: 0168-1699.

Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp. 1097–1105.

Kussul, N., Lavreniuk, M., Skakun, S., Shelestov, A., 2017. Deep learning classification of land cover and crop types using remote sensing data. IEEE Geosci. Remote Sens. Lett. 14 (5), 778–782.

LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86 (11), 2278–2324.

Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440.

Lu, H., Fu, X., Liu, C., Li, L.G., He, Y.X., Li, N.W., 2017. Cultivated land information extraction in UAV imagery based on deep convolutional neural network and transfer learning. J. Mountain Sci. 14 (4), 731–741.

Luus, F.P., Salmon, B.P., van den Bergh, F., Maharaj, B.T., 2015. Multiview deep learning for land-use classification. IEEE Geosci. Remote Sens. Lett. 12 (12), 2448–2452.

Milioto, A., Lottes, P., Stachniss, C., 2017. Real-time blob-wise sugar beets vs weeds classification for monitoring fields using convolutional neural networks. Proceedings of the International Conference on Unmanned Aerial Vehicles in Geomatics. Bonn, Germany.

Mortensen, A.K., Dyrmann, M., Karstoft, H., Jørgensen, R.N., Gislum, R., 2016. Semantic segmentation of mixed crops using deep convolutional neural network. International Conference on Agricultural Engineering. Aarhus, Denmark.

Nogueira, K., Penatti, O.A.B., dos Santos, J.A., 2017. Towards better exploiting convolutional neural networks for remote sensing scene classification. Pattern Recogn. 61, 539–556. https://doi.org/10.1016/j.patcog.2016.07.001.

Pinz, 1991. A computer vision system for recognition of trees in aerial photographs. In: International Association of Pattern Recognition Workshop, pp. 111–124.

Redmon, J., Divvala, S., Girshick, R., Farhadi, A., 2016. You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Berg, A.C., 2015. Imagenet large scale visual recognition challenge. Int. J. Comput. Vision 115 (3), 211–252.

Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

Sørensen, R.A., Rasmussen, J., Nielsen, J., Jørgensen, R., 2017. Thistle Detection using Convolutional Neural Networks. EFITA Congress, Montpellier, France.
