Scott Krig, Krig Research, USA
ISBN 978-3-319-33761-6    ISBN 978-3-319-33762-3 (eBook)
DOI 10.1007/978-3-319-33762-3
Library of Congress Control Number: 2016938637
© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland
Foreword to the Second Edition
The goal of this second edition is to add new material on deep learning, neuroscience applied to computer vision, historical developments in neural networks, and feature learning architectures, particularly neural network methods. In addition, this second edition cleans up some typos and other items from the first edition. In total, three new chapters are added to survey the latest feature learning and hierarchical deep learning methods and architectures. Overall, this book provides a wide survey of computer vision methods including local feature descriptors, regional and global features, and feature learning methods, with a taxonomy for organizational purposes. Analysis is distributed throughout the book to provide intuition behind the various approaches, encouraging the reader to think for themselves about the motivations for each approach, why different methods are created, how each method is designed and architected, and why it works. Nearly 1000 references to the literature and other materials are provided, making computer vision and imaging resources accessible at many levels.
My expectation for the reader is this: if you want to learn about 90% of computer vision, read this book. To learn the other 10%, read the references provided and spend at least 20 years creating real systems. Reading this book will take a matter of hours, and reading the references and creating real systems will take a lifetime to only scratch the surface. We follow the axiom of the eminent Dr. Jack Sparrow, who has no time for extraneous details, and here we endeavor to present computer vision materials in a fashion that makes the fundamentals accessible to many outside the inner circles of academia:
“I like it. Simple, easy to remember.”
—Jack Sparrow, Pirates of the Caribbean
This book is suitable for independent study, reference, or coursework at the university level and beyond for experienced engineers and scientists. The chapters are divided in such a way that various courses can be devised to incorporate a subset of chapters to accommodate course requirements. For example, typical course titles include “Image Sensors and Image Processing,” “Computer Vision and Image Processing,” “Applied Computer Vision and Imaging Optimizations,” “Feature Learning, Deep Learning, and Neural Network Architectures,” “Computer Vision Architectures,” and “Computer Vision Survey.” Questions are available for coursework at the end of each chapter. It is recommended that this book be used as a complement to other fine books, open source code, and hands-on materials for study in computer vision and related scientific disciplines, or possibly used by itself for a higher-level survey course.
This book may be used as required reading to provide a survey component to academic coursework for science and engineering disciplines, to complement other texts that contain hands-on and how-to materials.
This book DOES NOT PROVIDE extensive how-to coding examples, worked-out examples, mathematical proofs, experimental results and comparisons, or detailed performance data, which are already very well covered in the bibliography references. The goal is to provide an analysis across a representative survey of methods, rather than repeating what is already found in the references. This is not a workbook with open source code (only a little source code is provided), since there are many fine open source materials available already, which are referenced for the interested reader.
Instead, this book DOES PROVIDE an extensive survey, taxonomy, and analysis of computer vision methods. The goal is to find the intuition behind the methods surveyed. The book is meant to be read, rather than worked through. This is not a workbook, but is intended to provide sufficient background for the reader to find pathways forward into basic or applied research for a variety of scientific and engineering applications. In some respects, this work is a museum of computer vision, containing concepts, observations, oddments, and relics which fascinate me.
The book is designed to complement existing texts and fill a niche in the literature. The book takes a complete path through computer vision, beginning with image sensors, image processing, global-regional-local feature descriptor methods, feature learning and deep learning, neural networks for computer vision, ground truth data and training, and applied engineering optimizations across CPU, GPU, and software optimization methods. I could not find a similar book; otherwise I would not have begun this work.
This book aims at a survey, taxonomy, and analysis of computer vision methods from the perspective of the features used—the feature descriptors themselves, how they are designed and how they are organized. Learning methods and architectures are necessary and supporting factors, and are included here for completeness. However, I am personally fascinated by the feature descriptor methods themselves, and I regard them as an art form for mathematically arranging pixel patterns, shapes, and spectra to reveal how images are created. I regard each feature descriptor as a work of art, like a painting or mathematical sculpture prepared by an artist, and thus the perspective of this work is to survey feature descriptor and feature learning methods and appreciate each one.
As shown in this book over and over again, researchers are finding that a wide range of feature descriptors are effective, and that one of the keys to best results seems to be the sheer number of features used in feature hierarchies, rather than the choice of SIFT vs. pixel patches vs. CNN features. In the surveys herein we see that many methods for learning and training are used, many architectures are used, and the consensus seems to be that hierarchical feature learning is now the mainstay of computer vision, following on from the pioneering work in convolutional neural networks and deep learning methods applied to computer vision, which has accelerated since the new millennium. The older computer vision methods are being combined with the newer ones, and applications are now beginning to appear in consumer devices, rather than the exotic military and intelligence systems of the past.
Special thanks to Courtney Clarke at Springer for commissioning this second edition, and for providing support and guidance to make it better.
Special thanks for all the wonderful feedback on the first edition, which helped to shape this second edition. Vin Ratford and Jeff Bier of the Embedded Vision Alliance (EVA) arranged to provide copies of the first edition to all EVA members, both hard copy and e-book versions, and maintained a feedback web page for review comments—much appreciated. Thanks to Mike Schmidt and Vadim Pisarevsky for excellent review comments over the entire book. Juergen Schmidhuber provided links to historical information on neural networks and other useful information, Kunihiko Fukushima provided copies of some of his early neural network research papers, Rahul Suthankar provided updates on key trends in computer vision, Hugo LaRochelle provided information and references on CNN topics, and Patrick Cox on HMAX topics. Interesting information was also provided by Robert Gens and Andrej Karpathy. And I would like to re-thank those who contributed to the first edition, including Paul Rosin regarding synthetic interest points, Yann LeCun for providing key references into deep learning and convolutional networks, Shree Nayar for permission to use a few images, Luciano Oviedo for blue-sky discussions, and many others who have influenced my thinking, including Alexandre Alahi, Steve Seitz, Bryan Russel, Liefeng Bo, Xiaofeng Ren, Gutemberg Guerra-filho, Harsha Viswana, Dale Hitt, Joshua Gleason, Noah Snavely, Daniel Scharstein, Thomas Salmon, Richard Baraniuk, Carl Vodrick, Hervé Jégou, Andrew Richardson, Ofri Weschler, Hong Jiang, Andy Kuzma, Michael Jeronimo, Eli Turiel, and many others whom I have failed to mention.
As usual, thanks to my wife for patience with my research, and also for providing the “governor” switch to pace my work, without which I would likely burn out more completely. And most of all, special thanks to the great inventor who inspires us all, Anno Domini 2016.
Scott Krig
Contents

1  Image Capture and Representation
   Image Sensor Technology
      Sensor Materials
      Sensor Photodiode Cells
      Sensor Configurations: Mosaic, Foveon, BSI
      Dynamic Range, Noise, Super Resolution
      Sensor Processing
      De-Mosaicking
      Dead Pixel Correction
      Color and Lighting Corrections
      Geometric Corrections
   Cameras and Computational Imaging
      Overview of Computational Imaging
      Single-Pixel Computational Cameras
      2D Computational Cameras
      3D Depth Camera Systems
   3D Depth Processing
      Overview of Methods
      Problems in Depth Sensing and Processing
      Monocular Depth Processing
   3D Representations: Voxels, Depth Maps, Meshes, and Point Clouds
   Summary
   Chapter 1: Learning Assignments

2  Image Pre-Processing
   Perspectives on Image Processing
   Problems to Solve During Image Preprocessing
      Vision Pipelines and Image Preprocessing
      Corrections
      Enhancements
      Preparing Images for Feature Extraction
   The Taxonomy of Image Processing Methods
      Point
      Line
      Area
      Algorithmic
      Data Conversions
   Colorimetry
      Overview of Color Management Systems
      Illuminants, White Point, Black Point, and Neutral Axis
      Device Color Models
      Color Spaces and Color Perception
      Gamut Mapping and Rendering Intent
      Practical Considerations for Color Enhancements
      Color Accuracy and Precision
   Spatial Filtering
      Convolutional Filtering and Detection
      Kernel Filtering and Shape Selection
      Point Filtering
      Noise and Artifact Filtering
      Integral Images and Box Filters
   Edge Detectors
      Kernel Sets: Sobel, Scharr, Prewitt, Roberts, Kirsch, Robinson, and Frei–Chen
      Canny Detector
   Transform Filtering, Fourier, and Others
      Fourier Transform Family
   Morphology and Segmentation
      Binary Morphology
      Gray Scale and Color Morphology
      Morphology Optimizations and Refinements
      Euclidean Distance Maps
      Super-pixel Segmentation
      Depth Segmentation
      Color Segmentation
   Thresholding
      Global Thresholding
      Local Thresholding
   Summary
   Chapter 2: Learning Assignments

3  Global and Regional Features
   Historical Survey of Features
      Key Ideas: Global, Regional, and Local Metrics
      Textural Analysis
      Statistical Methods
   Texture Region Metrics
      Edge Metrics
      Cross-Correlation and Auto-correlation
      Fourier Spectrum, Wavelets, and Basis Signatures
      Co-occurrence Matrix, Haralick Features
      Laws Texture Metrics
      LBP Local Binary Patterns
      Dynamic Textures
   Statistical Region Metrics
      Image Moment Features
      Point Metric Features
      Global Histograms
      Local Region Histograms
      Scatter Diagrams, 3D Histograms
      Multi-resolution, Multi-scale Histograms
      Radial Histograms
      Contour or Edge Histograms
   Basis Space Metrics
      Fourier Description
      Walsh–Hadamard Transform
      HAAR Transform
      Slant Transform
      Zernike Polynomials
      Steerable Filters
      Karhunen–Loeve Transform and Hotelling Transform
      Wavelet Transform and Gabor Filters
      Hough Transform and Radon Transform
   Summary
   Chapter 3: Learning Assignments

4  Local Feature Design Concepts
   Local Features
      Detectors, Interest Points, Keypoints, Anchor Points, Landmarks
      Descriptors, Feature Description, Feature Extraction
      Sparse Local Pattern Methods
   Local Feature Attributes
      Choosing Feature Descriptors and Interest Points
      Feature Descriptors and Feature Matching
      Criteria for Goodness
      Repeatability, Easy vs. Hard to Find
      Distinctive vs. Indistinctive
      Relative and Absolute Position
      Matching Cost and Correspondence
   Distance Functions
      Early Work on Distance Functions
      Euclidean or Cartesian Distance Metrics
      Grid Distance Metrics
      Statistical Difference Metrics
      Binary or Boolean Distance Metrics
   Descriptor Representation
      Coordinate Spaces, Complex Spaces
      Cartesian Coordinates
      Polar and Log Polar Coordinates
      Radial Coordinates
      Spherical Coordinates
      Gauge Coordinates
      Multivariate Spaces, Multimodal Data
      Feature Pyramids
   Descriptor Density
      Interest Point and Descriptor Culling
      Dense vs. Sparse Feature Description
   Descriptor Shape Topologies
      Correlation Templates
      Patches and Shape
      Object Polygon Shapes
   Local Binary Descriptor Point-Pair Patterns
      FREAK Retinal Patterns
      Brisk Patterns
      ORB and BRIEF Patterns
   Descriptor Discrimination
      Spectra Discrimination
      Region, Shapes, and Pattern Discrimination
      Geometric Discrimination Factors
      Feature Visualization to Evaluate Discrimination
      Accuracy, Trackability
      Accuracy Optimizations, Subregion Overlap, Gaussian Weighting, and Pooling
      Sub-pixel Accuracy
   Search Strategies and Optimizations
      Dense Search
      Grid Search
      Multi-scale Pyramid Search
      Scale Space and Image Pyramids
      Feature Pyramids
      Sparse Predictive Search and Tracking
      Tracking Region-Limited Search
      Segmentation Limited Search
      Depth or Z Limited Search
   Computer Vision, Models, Organization
      Feature Space
      Object Models
      Constraints
      Selection of Detectors and Features
      Overview of Training
      Classification of Features and Objects
      Feature Learning, Sparse Coding, Convolutional Networks
   Summary
   Chapter 4: Learning Assignments

5  Taxonomy of Feature Description Attributes
   General Robustness Taxonomy
   General Vision Metrics Taxonomy
   Feature Metric Evaluation
      SIFT Example
      LBP Example
      Shape Factors Example
   Summary
   Chapter 5: Learning Assignments

6  Interest Point Detector and Feature Descriptor Survey
   Interest Point Tuning
   Interest Point Concepts
   Interest Point Method Survey
      Laplacian and Laplacian of Gaussian
      Moravac Corner Detector
      Harris Methods, Harris–Stephens, Shi–Tomasi, and Hessian Type Detectors
      Hessian Matrix Detector and Hessian–Laplace
      Difference of Gaussians
      Salient Regions
      SUSAN, and Trajkovic and Hedly
      Fast, Faster, AGHAST
      Local Curvature Methods
      Morphological Interest Regions
   Feature Descriptor Survey
      Local Binary Descriptors
      Census
      Modified Census Transform
      BRIEF
      ORB
      BRISK
      FREAK
   Spectra Descriptors
      SIFT
      SIFT-PCA
      SIFT-GLOH
      SIFT-SIFER Retrofit
      SIFT CS-LBP Retrofit
      RootSIFT Retrofit
      CenSurE and STAR
      Correlation Templates
      HAAR Features
      Viola–Jones with HAAR-Like Features
      SURF
      Variations on SURF
      Histogram of Gradients (HOG) and Variants
      PHOG and Related Methods
      Daisy and O-Daisy
      CARD
      Robust Fast Feature Matching
      RIFF, CHOG
      Chain Code Histograms
      D-NETS
      Local Gradient Pattern
      Local Phase Quantization
   Basis Space Descriptors
      Fourier Descriptors
      Other Basis Functions for Descriptor Building
      Sparse Coding Methods
   Polygon Shape Descriptors
      MSER Method
      Object Shape Metrics for Blobs and Polygons
      Shape Context
   3D, 4D, Volumetric and Multimodal Descriptors
      3D HOG
      HON 4D
      3D SIFT
   Summary
   Chapter 6: Learning Assignments

7  Ground Truth Data, Content, Metrics, and Analysis
   What Is Ground Truth Data?
   Previous Work on Ground Truth Data: Art vs. Science
      General Measures of Quality Performance
      Measures of Algorithm Performance
      Rosin's Work on Corners
   Key Questions for Constructing Ground Truth Data
      Content: Adopt, Modify, or Create
      Survey of Available Ground Truth Data
      Fitting Ground Truth Data to Algorithms
      Scene Composition and Labeling
   Defining the Goals and Expectations
      Mikolajczyk and Schmid Methodology
      Open Rating Systems
      Corner Cases and Limits
      Interest Points and Features
   Robustness Criteria for Ground Truth Data
      Illustrated Robustness Criteria
      Using Robustness Criteria for Real Applications
   Pairing Metrics with Ground Truth
      Pairing and Tuning Interest Points, Features, and Ground Truth
      Examples Using the General Vision Taxonomy
   Synthetic Feature Alphabets
      Goals for the Synthetic Dataset
      Synthetic Interest Point Alphabet
      Hybrid Synthetic Overlays on Real Images
   Summary
   Chapter 7: Learning Assignments

8  Vision Pipelines and Optimizations
   Stages, Operations, and Resources
   Compute Resource Budgets
      Compute Units, ALUs, and Accelerators
      Power Use
      Memory Use
      I/O Performance
   The Vision Pipeline Examples
      Automobile Recognition
      Face, Emotion, and Age Recognition
      Image Classification
      Augmented Reality
   Acceleration Alternatives
      Memory Optimizations
      Coarse-Grain Parallelism
      Fine-Grain Data Parallelism
      Advanced Instruction Sets and Accelerators
   Vision Algorithm Optimizations and Tuning
      Compiler and Manual Optimizations
      Tuning
      Feature Descriptor Retrofit, Detectors, Distance Functions
      Boxlets and Convolution Acceleration
      Data-Type Optimizations, Integer vs. Float
   Optimization Resources
   Summary
   Chapter 8: Learning Assignments

9  Feature Learning Architecture Taxonomy and Neuroscience Background
   Neuroscience Inspirations for Computer Vision
   Feature Generation vs. Feature Learning
   Terminology of Neuroscience Applied to Computer Vision
   Classes of Feature Learning
      Convolutional Feature Weight Learning
      Local Feature Descriptor Learning
      Basis Feature Composition and Dictionary Learning
      Summary Perspective on Feature Learning Methods
   Machine Learning Models for Computer Vision
      Expert Systems
      Statistical and Mathematical Analysis Methods
      Neural Science Inspired Methods
      Deep Learning
         DNN Hacking and Misclassification
   History of Machine Learning (ML) and Feature Learning
      Historical Survey, 1940s–2010s
      Artificial Neural Network (ANN) Taxonomy Overview
   Feature Learning Overview
      Learned Feature Descriptor Types
      Hierarchical Feature Learning
      How Many Features to Learn?
      The Power of DNNs
      Encoding Efficiency
      Handcrafted Features vs. Handcrafted Deep Learning
      Invariance and Robustness Attributes for Feature Learning
      What Are the Best Features and Learning Architectures?
      Merger of Big Data, Analytics, and Computer Vision
      Key Technology Enablers
   Neuroscience Concepts
      Biology and Blueprint
      The Elusive Unified Learning Theory
      Human Visual System Architecture
   Taxonomy of Feature Learning Architectures
      Architecture Topologies
         ANNs (Artificial Neural Networks)
         FNN (Feed Forward Neural Network)
         RNN (Recurrent Neural Network)
         BFN (Basis Function Network)
         Ensembles, Hybrids
      Architecture Components and Layers
         Layer Totals
         Layer Connection Topology
         Memory Model
         Training Protocols
         Input Sampling Methods
         Dropout, Reconfiguration, Regularization
         Preprocessing, Numeric Conditioning
         Feature Set Dimensions
         Feature Initialization
         Features, Filters
         Activation, Transfer Functions
         Post-processing, Numeric Conditioning
         Pooling, Subsampling, Downsampling, Upsampling
         Classifiers
   Summary
   Chapter 9: Learning Assignments

10 Feature Learning and Deep Learning Architecture Survey
   Architecture Survey
      FNN Architecture Survey
         P—Perceptron
         MLP, Multilayer Perceptron, Cognitron, Neocognitron
         Concepts for CNNs, Convnets, Deep MLPs
         LeNet
         AlexNet, ZFNet
         VGGNet and Variants MSRA-22, Baidu Deep Image, Deep Residual Learning
         Half-CNN
         NiN, Maxout
         GoogLeNet, InceptionNet
         MSRA-22, SPP-Net, R-CNN, MSSNN, Fast-R-CNN
         Baidu, Deep Image, MINWA
         SYMNETS—Deep Symmetry Networks
      RNN Architecture Survey
         Concepts for Recurrent Neural Networks
         LSTM, GRU
         NTM, RNN-NTM, RL-NTM
         Multidimensional RNNs, MDRNN
         C-RNN, QDRNN
         RCL-RCNN
         dasNET
         NAP—Neural Abstraction Pyramid
         Deep Reinforcement Learning (DRL)
      BFN Architecture Survey
         Concepts for Machine Learning and Basis Feature Networks
         PNN—Polynomial Neural Network, GMDH
         HKD—Kernel Descriptor Learning
         HMP—Sparse Feature Learning
         HMAX and Neurological Models
         HMO—Hierarchical Model Optimization
      Ensemble Methods
         Deep Neural Network Futures
         Increasing Depth to the Max—Deep Residual Networks
         Approximating Complex Models Using a Simpler MLP (Model Compression)
         Classifier Decomposition and Recombination
   Summary
   Chapter 10: Learning Assignments

Appendix A: Synthetic Feature Analysis
Appendix B: Survey of Ground Truth Datasets
Appendix C: Imaging and Computer Vision Resources
Appendix D: Extended SDM Metrics
Appendix E: The Visual Genome Model (VGM)
References
Index
1  Image Capture and Representation
“The changing of bodies into light, and light into bodies, is very conformable to the course of Nature, which seems delighted with transmutations.”
—Isaac Newton
Computer vision starts with images. This chapter surveys a range of topics dealing with capturing, processing, and representing images, including computational imaging, 2D imaging, and 3D depth imaging methods, sensor processing, depth-field processing for stereo and monocular multi-view stereo, and surface reconstruction. A high-level overview of selected topics is provided, with references for the interested reader to dig deeper. Readers with a strong background in the area of 2D and 3D imaging may benefit from a light reading of this chapter.
Image Sensor Technology
This section provides a basic overview of image sensor technology as a basis for understanding how images are formed and for developing effective strategies for image sensor processing to optimize the image quality for computer vision.
Typical image sensors are created from either CCD cells (charge-coupled device) or standard CMOS cells (complementary metal-oxide semiconductor). The CCD and CMOS sensors share similar characteristics and both are widely used in commercial cameras. The majority of sensors today use CMOS cells, though, mostly due to manufacturing considerations. Sensors and optics are often integrated to create wafer-scale cameras for applications like biology or microscopy, as shown in Fig. 1.1.
Image sensors are designed to reach specific design goals with different applications in mind, providing varying levels of sensitivity and quality. Consult the manufacturer's information to get familiar with each sensor. For example, the size and material composition of each photodiode sensor cell element is optimized for a given semiconductor manufacturing process so as to achieve the best trade-off between silicon die area and dynamic response for light intensity and color detection.
For computer vision, the effects of sampling theory are relevant—for example, the Nyquist frequency applied to pixel coverage of the target scene. The sensor resolution and optics together must provide adequate resolution for each pixel to image the features of interest, so it follows that a feature of interest should be imaged or sampled at least two times greater than the minimum size of the smallest pixels of importance to the feature. Of course, 2× oversampling is just a minimum target for accuracy; in practice, single pixel wide features are not easily resolved.
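As a rough illustration of the 2× oversampling rule (a sketch with invented numbers, not a calibration recipe from the text), the sensor resolution needed along one axis can be estimated from the scene width and the smallest feature that must be resolved:

```python
import math

def min_sensor_pixels(fov_mm: float, smallest_feature_mm: float,
                      oversampling: float = 2.0) -> int:
    """Pixels needed across one axis so the smallest feature of
    interest is sampled at least `oversampling` times."""
    return math.ceil(oversampling * fov_mm / smallest_feature_mm)

# Example: a 100 mm wide scene in which 0.5 mm features must be resolved.
pixels = min_sensor_pixels(fov_mm=100.0, smallest_feature_mm=0.5)
print(pixels)  # 400 pixels across, so each feature covers >= 2 pixels
```

As the text notes, this is only a lower bound; real systems usually need more than 2× oversampling to resolve features reliably.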
© Springer International Publishing Switzerland 2016
S. Krig, Computer Vision Metrics, DOI 10.1007/978-3-319-33762-3_1
Figure 1.1 Common integrated image sensor arrangement with optics and color filters (micro-lenses, RGB color filters, CMOS imager)
For best results, the camera system should be calibrated for a given application to determine the sensor noise and dynamic range for pixel bit depth under different lighting and distance situations. Appropriate sensor processing methods should be developed to deal with the noise and nonlinear response of the sensor for any color channel, to detect and correct dead pixels, and to handle modeling of geometric distortion. A simple calibration method using a test pattern with fine and coarse gradations of gray scale, color, and different scales of pixel features makes it possible to devise the appropriate sensor processing methods. In Chap. 2, we survey a range of image processing methods applicable to sensor processing. But let us begin by surveying the sensor materials.
Sensor Materials
Silicon-based image sensors are most common, although other materials such as gallium (Ga) are used in industrial and military applications to cover longer IR wavelengths than silicon can reach. Image sensors range in resolution, depending upon the camera used, from a single-pixel phototransistor camera, through 1D line scan arrays for industrial applications, to 2D rectangular arrays for common cameras, all the way to spherical arrays for high-resolution imaging. (Sensor configurations and camera configurations are covered later in this chapter.)
Common imaging sensors are made using silicon as CCD, CMOS, BSI, and Foveon methods, as discussed a bit later in this chapter. Silicon image sensors have a nonlinear spectral response curve; the near-infrared part of the spectrum is sensed well, while blue, violet, and near UV are sensed less well, as shown in Fig. 1.2. Note that the silicon spectral response must be accounted for when reading the raw sensor data and quantizing the data into a digital pixel. Sensor manufacturers make design compensations in this area; however, sensor color response should also be considered when calibrating your camera system and devising the sensor processing methods for your application.
Sensor Photodiode Cells
One key consideration for image sensors is the photodiode size or cell size. A sensor cell using small photodiodes will not be able to capture as many photons as a large photodiode. If the cell size is near the wavelength of the visible light to be captured, such as blue light at 400 nm, then additional problems must be overcome in the sensor design to correct the image color. Sensor manufacturers take great care to design cells at the optimal size to image all colors equally well (Fig. 1.3). In the extreme, small sensors may be more sensitive to noise, owing to a lack of accumulated photons and sensor readout noise. If the photodiode sensor cells are too large, there is no benefit either, and the die size and cost for silicon go up, providing no advantage. Common commercial sensor devices may have sensor cell sizes of around 1 square micron and larger; each manufacturer is different, however, and trade-offs are made to reach specific requirements.
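The noise sensitivity of small cells follows directly from photon counting statistics: shot noise grows as the square root of the photon count, so the shot-noise-limited SNR also scales as the square root. A minimal sketch, assuming a purely hypothetical photon flux figure:

```python
import math

# Sketch: photon shot noise follows Poisson statistics, so SNR ~ sqrt(N).
# A cell with 4x the area collects ~4x the photons and gains only ~2x in
# shot-noise-limited SNR. The flux figure below is an assumed example.

def shot_noise_snr(photons):
    """SNR limited by photon shot noise alone: mean / sqrt(mean)."""
    return math.sqrt(photons)

flux = 10_000                  # photons per square micron per exposure (assumed)
small_cell = 1.0 * flux        # ~1 um^2 cell
large_cell = 4.0 * flux        # ~4 um^2 cell

snr_small = shot_noise_snr(small_cell)   # 100.0
snr_large = shot_noise_snr(large_cell)   # 200.0
```

This is one reason shrinking cell size yields diminishing returns: quadrupling the collection area only doubles the shot-noise SNR, and readout noise erodes the advantage further.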
Figure 1.2 Typical spectral response of a few types of silicon photodiodes. Note the highest sensitivity in the near-infrared range around 900 nm and nonlinear sensitivity across the visible spectrum of 400–700 nm. Removing the IR filter from a camera increases the near-infrared sensitivity due to the normal silicon response. (Spectral data image © OSI Optoelectronics Inc. and used by permission)
Figure 1.3 Primary color assignment to wavelengths (RGB sensitivity vs. wavelength, 390–740 nm, showing the color spectral overlap). Note that the primary color regions overlap, with green being a good monochrome proxy for all colors
Sensor Configurations: Mosaic, Foveon, BSI
There are various on-chip configurations for multi-spectral sensor design, including mosaics and stacked methods, as shown in Fig. 1.4. In a mosaic method, the color filters are arranged in a mosaic pattern above each cell. The Foveon¹ sensor stacking method relies on the physics of depth penetration of the color wavelengths into the semiconductor material, where each color penetrates the silicon to a different depth, thereby imaging the separate colors. The overall cell size accommodates all colors, and so separate cells are not needed for each color.
Figure 1.4 (Left) The Foveon method of stacking RGB cells to absorb different wavelengths at different depths, with all RGB colors at each cell location. (Right) A standard mosaic cell placement with RGB filters above each photodiode, with filters only allowing the specific wavelengths to pass into each photodiode
Back-side-illuminated (BSI) sensor configurations rearrange the sensor wiring on the die to allow for a larger cell area and more photons to be accumulated in each cell. See the Aptina [392] white paper for a comparison of front-side and back-side die circuit arrangement.
The arrangement of sensor cells also affects the color response. For example, Fig. 1.5 shows various arrangements of primary color (R, G, B) sensors as well as white (W) sensors together, where W sensors have a clear or neutral color filter. The sensor cell arrangements allow for a range of pixel processing options, for example combining selected pixels in various configurations of neighboring cells during sensor processing for a pixel formation that optimizes color response or spatial color resolution. In fact, some applications just use the raw sensor data and perform custom processing to increase the resolution or develop alternative color mixes.
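One of the simplest forms of combining neighboring cells is 2×2 binning, which trades spatial resolution for improved signal. A minimal sketch (binning logic only; real sensor pipelines bin per color plane and often in analog, before readout):

```python
import numpy as np

# Sketch: 2x2 binning averages each non-overlapping 2x2 block of a
# single-channel sensor array, halving resolution in each dimension.

def bin2x2(raw):
    """Average each non-overlapping 2x2 block; raw must have even dims."""
    h, w = raw.shape
    assert h % 2 == 0 and w % 2 == 0
    return raw.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

raw = np.arange(16, dtype=np.float64).reshape(4, 4)
binned = bin2x2(raw)   # shape (2, 2)
```

Each output value is the mean of four input cells, so uncorrelated per-cell noise is roughly halved while the pixel pitch doubles.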
The overall sensor size and format determines the lens size as well. In general, a larger lens lets in more light, so larger sensors are typically better suited to digital cameras for photography applications. In addition, the cell placement aspect ratio on the die determines pixel geometry; for example, a 4:3 aspect ratio is common for digital cameras while 3:2 is standard for 35 mm film. The sensor configuration details are worth understanding in order to devise the best sensor processing and image preprocessing pipelines.
¹ Foveon is a registered trademark of Foveon Inc.
Figure 1.5 Several different mosaic configurations of cell colors, including white, primary RGB colors, and secondary CYM cells. Each configuration provides different options for sensor processing to optimize for color or spatial resolution. (Image used by permission, © Intel Press, from Building Intelligent Systems)
Dynamic Range, Noise, Super Resolution
Current state-of-the-art sensors provide at least 8 bits per color cell, and usually are 12–14 bits. Sensor cells require area and time to accumulate photons, so smaller cells must be designed carefully to avoid problems. Noise may come from optics, color filters, sensor cells, gain and A/D converters, post-processing, or the compression methods, if used. Sensor readout noise also affects effective resolution, as each pixel cell is read out of the sensor, sent to an A/D converter, and formed into digital lines and columns for conversion into pixels. Better sensors will provide less noise and higher effective bit resolution. However, effective resolution can also be increased using super resolution methods: several images taken in rapid succession are averaged together to reduce noise [886], or alternatively the sensor position can be micro-MEMS-dithered to create image sequences to average together to increase resolution. A good survey of de-noising is found in the work by Ibenthal [391].
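The noise-reduction step of the frame-averaging approach can be sketched in a few lines: averaging N frames reduces zero-mean, uncorrelated noise by roughly a factor of √N. The flat synthetic scene and noise level below are assumed for illustration.

```python
import numpy as np

# Sketch: averaging several rapid-succession frames reduces zero-mean sensor
# noise by roughly sqrt(number of frames) -- one ingredient of super
# resolution. The scene and noise sigma here are synthetic examples.

rng = np.random.default_rng(0)
clean = np.full((64, 64), 128.0)                      # hypothetical flat scene
frames = [clean + rng.normal(0.0, 8.0, clean.shape)   # sigma = 8 noise
          for _ in range(16)]

averaged = np.mean(frames, axis=0)

noise_single = np.std(frames[0] - clean)   # close to 8
noise_avg = np.std(averaged - clean)       # close to 8 / sqrt(16) = 2
```

Full super resolution methods add sub-pixel registration of the frames before combining them; plain averaging as above only addresses the noise component.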
In addition, sensor photon absorption is different for each color, and may be problematic for blue, which can be the hardest color for smaller sensors to image. In some cases, the manufacturer may attempt to provide a simple gamma-curve correction method built into the sensor for each color, which is not recommended. For demanding color applications, consider colorimetric device models and color management (as will be discussed in Chap. 2), or even characterizing the nonlinearity for each color channel of the sensor and developing a set of simple corrective LUT transforms. (Noise-filtering methods applicable to depth sensing are also covered in Chap. 2.)
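A per-channel corrective LUT of the kind mentioned above can be sketched as follows. The gamma-like response curve here is an assumed stand-in for a measured channel characterization; a real LUT would be built from calibration data.

```python
import numpy as np

# Sketch: build a per-channel lookup table (LUT) that inverts a measured,
# monotonic nonlinear sensor response, so lut[response[i]] ~= i.
# The response model below is assumed for illustration only.

def build_correction_lut(response, bits=8):
    """response[i] = measured output code for linear input code i.
    Returns the inverse LUT as uint8."""
    levels = 2 ** bits
    x = np.arange(levels)
    # np.interp inverts the monotonic response curve by swapping axes.
    return np.interp(x, response, x).astype(np.uint8)

levels = 256
linear = np.arange(levels)
measured = 255.0 * (linear / 255.0) ** 0.6   # assumed channel nonlinearity
lut = build_correction_lut(measured)

# Applying the LUT to a measured code recovers (approximately) the linear code:
corrected = lut[np.uint8(measured[100])]     # lands near the linear value 100
```

One LUT per color channel, applied after de-mosaicking (or per raw plane before it), is a cheap way to linearize the sensor without full colorimetric modeling.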
Sensor Processing
Sensor processing is required to de-mosaic and assemble the pixels from the sensor array, and also to correct sensing defects. We discuss the basics of sensor processing in this section.
Typically, a dedicated sensor processor is provided in each imaging system, including a fast HW sensor interface, optimized VLIW and SIMD instructions, and dedicated fixed-function hardware blocks to deal with the massively parallel pixel-processing workloads for sensor processing. Usually, sensor processing is transparent, automatic, and set up by the manufacturer of the imaging system, and all images from the sensor are processed the same way. A bypass may exist to provide the raw data that can allow custom sensor processing for applications like digital photography.
De-Mosaicking
Depending on the sensor cell configuration, as shown in Fig. 1.5, various de-mosaicking algorithms are employed to create a final RGB pixel from the raw sensor data. A good survey by Losson et al. [388] and another by Li et al. [389] provide some background on the challenges involved and the various methods employed.
One of the central challenges of de-mosaicking is pixel interpolation to combine the color channels from nearby cells into a single pixel. Given the geometry of sensor cell placement and the aspect ratio of the cell layout, this is not a trivial problem. A related issue is color cell weighting: for example, how much of each color should be integrated into each RGB pixel. Since the spatial cell resolution in a mosaicked sensor is greater than the final combined RGB pixel resolution, some applications require the raw sensor data to take advantage of all the accuracy and resolution possible, or to perform special processing to either increase the effective pixel resolution or do a better job of spatially accurate color processing and de-mosaicking.
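To make the interpolation problem concrete, here is a minimal bilinear de-mosaic sketch for an RGGB Bayer mosaic, using normalized convolution (interpolating each color plane from its known sample sites). This is the simplest possible baseline; production pipelines use the far more sophisticated, edge-aware methods surveyed by Losson et al. [388].

```python
import numpy as np

# Sketch: naive bilinear de-mosaicking of an RGGB Bayer mosaic via
# normalized convolution. Illustrative baseline only, not edge-aware.

def convolve3x3(img, k):
    """3x3 cross-correlation with edge-replicated borders."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return sum(k[i, j] * p[i:i + h, j:j + w]
               for i in range(3) for j in range(3))

def bilinear_demosaic(raw):
    """raw: (H, W) RGGB Bayer mosaic. Returns (H, W, 3) float RGB."""
    h, w = raw.shape
    masks = np.zeros((h, w, 3), dtype=bool)
    masks[0::2, 0::2, 0] = True                    # R sites
    masks[0::2, 1::2, 1] = True                    # G sites on R rows
    masks[1::2, 0::2, 1] = True                    # G sites on B rows
    masks[1::2, 1::2, 2] = True                    # B sites
    kernel = np.array([[1., 2., 1.],
                       [2., 4., 2.],
                       [1., 2., 1.]]) / 4.0        # bilinear weights
    out = np.zeros((h, w, 3))
    for c in range(3):
        samples = np.where(masks[..., c], raw, 0.0)
        # Divide interpolated samples by interpolated mask so each output
        # pixel is a weighted average of only the known cells of channel c.
        out[..., c] = convolve3x3(samples, kernel) / \
                      convolve3x3(masks[..., c].astype(float), kernel)
    return out
```

On a uniformly colored scene this reconstructs the color exactly; on real images, the bilinear baseline blurs edges and produces the color-fringing artifacts that more advanced de-mosaicking methods are designed to suppress.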
Dead Pixel Correction
A sensor, like an LCD display, may have dead pixels. A vendor may calibrate the sensor at the factory and provide a sensor defect map for the known defects, providing coordinates of those dead pixels for use in corrections in the camera module or driver software. In some cases, adaptive defect correction methods [390] are used on the sensor to monitor the adjacent pixels to actively look for defects and then to correct a range of defect types, such as single-pixel defects, column or line defects, and defects such as 2 × 2 or 3 × 3 clusters. A camera driver can also provide adaptive defect analysis to look for flaws in real time, and perhaps provide special compensation controls in a camera setup menu.
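Correction from a factory defect map can be sketched simply: replace each listed dead pixel with the median of its valid (non-defective) neighbors. This handles single-pixel defects; line and cluster defects need wider neighborhoods, which this sketch omits.

```python
import numpy as np

# Sketch: correct known dead pixels from a factory defect map by replacing
# each with the median of its valid 8-connected neighbors.

def correct_dead_pixels(img, defect_coords):
    """img: (H, W) array; defect_coords: iterable of (row, col) dead pixels."""
    out = img.astype(float).copy()
    dead = set(defect_coords)
    h, w = img.shape
    for r, c in dead:
        neighbors = [img[rr, cc]
                     for rr in range(max(0, r - 1), min(h, r + 2))
                     for cc in range(max(0, c - 1), min(w, c + 2))
                     if (rr, cc) != (r, c) and (rr, cc) not in dead]
        out[r, c] = np.median(neighbors)
    return out

img = np.full((5, 5), 100.0)
img[2, 2] = 0.0                          # a stuck-low dead pixel
fixed = correct_dead_pixels(img, [(2, 2)])
```

Excluding other dead pixels from the neighbor set matters for clustered defects, so one defect does not contaminate the correction of another.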
Color and Lighting Corrections
Color corrections are required to balance the overall color accuracy as well as the white balance. As shown in Fig. 1.2, color sensitivity is usually very good in silicon sensors for red and green, but less good for blue, so the opportunity for providing the most accurate color starts with understanding and calibrating the sensor.
Most image sensor processors contain a geometric processor for vignette correction, which manifests as darker illumination at the edges of the image, as discussed in Chap. 7, Table 7.1, on robustness criteria. The corrections are based on a geometric warp function, which is calibrated at the factory to match the optics vignette pattern, allowing for a programmable illumination function to increase illumination toward the edges. For a discussion of image warping methods applicable to vignetting, see Ref. [472].
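A programmable illumination function of this kind can be sketched as a radial gain map. The quadratic falloff model and its coefficient below are hypothetical; a factory calibration would fit the gain profile to the measured vignette pattern of the actual optics.

```python
import numpy as np

# Sketch: a simple radial gain map for vignette correction, brightening
# pixels toward the edges. The quadratic model and k are assumed examples.

def vignette_gain(h, w, k=0.5):
    """Gain = 1 + k * r2 / 2, where r2 is the squared distance from the
    image center, normalized so the corners have r2 = 2 (gain = 1 + k)."""
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r2 = ((y - cy) / cy) ** 2 + ((x - cx) / cx) ** 2
    return 1.0 + k * r2 / 2.0

gain = vignette_gain(101, 101)
# corrected = image * gain  -- unity gain at the center, 1.5x at the corners
```

Multiplying the captured image by this map compensates the darker edges; real implementations also clamp the result to the valid pixel range.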
Geometric Corrections
A lens may have geometric aberrations or may warp toward the edges, producing images with radial distortion, a problem that is related to the vignetting discussed above and shown in Chap. 7 (Fig. 7.6). To deal with lens distortion, most imaging systems have a dedicated sensor processor with a hardware-accelerated digital warp unit similar to the texture sampler in a GPU. The geometric corrections are calibrated and programmed in the factory for the optics. See Ref. [472] for a discussion of image warping methods.
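The inverse-warp operation such a unit performs can be sketched in software using a one-term radial distortion model. The model (x_d = x_u(1 + k1·r²)), the coefficient k1, and the nearest-neighbor sampling are all simplifications for illustration; hardware warp units use calibrated multi-term models and filtered (bilinear or better) sampling.

```python
import numpy as np

# Sketch: inverse warping for simple radial lens distortion using a
# one-term model x_d = x_u * (1 + k1 * r^2). Nearest-neighbor sampling
# and the single coefficient are simplifications for illustration.

def undistort(img, k1):
    """For each output (undistorted) pixel, sample the input (distorted)
    image at the location that pixel maps to under the distortion model."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            xn, yn = (x - cx) / cx, (y - cy) / cy      # normalized coords
            scale = 1.0 + k1 * (xn * xn + yn * yn)
            sx = int(round(xn * scale * cx + cx))      # source column
            sy = int(round(yn * scale * cy + cy))      # source row
            if 0 <= sx < w and 0 <= sy < h:
                out[y, x] = img[sy, sx]
    return out
```

With k1 = 0 the warp is the identity; positive k1 pulls samples from farther out, straightening the barrel distortion. The texture-sampler analogy in the text is apt: per-output-pixel source lookups like this are exactly what GPU samplers accelerate.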
Cameras and Computational Imaging
Many novel camera configurations are making their way into commercial applications using computational imaging methods to synthesize new images from raw sensor data, for example depth cameras and high dynamic range cameras. As shown in Fig. 1.6, a conventional camera system uses a single sensor, lens, and illuminator to create 2D images. However, a computational imaging camera may provide multiple optics, multiple programmable illumination patterns, and multiple sensors, enabling novel applications such as 3D depth sensing and image relighting: taking advantage of the depth information, mapping the image as a texture onto the depth map, introducing new light sources, and then re-rendering the image in a graphics pipeline. Since computational cameras are beginning to emerge in consumer devices and will become the front end of computer vision pipelines, we survey some of the methods used.
Figure 1.6 Comparison of computational imaging systems with conventional cameras. (Top) Simple camera model with single flash, single lens, and 2D sensor, followed by image enhancements such as color enhancements, filtering, and contrast. (Bottom) Computational imaging using programmable flash (pattern projectors, multi-flash), multi-lens optics arrays (plenoptic lens arrays, sphere/ball lenses), and 2D sensor arrays, followed by computational imaging applications: high dynamic range (HDR), high frame rates, 3D depth maps, focal plane refocusing, focal sweep, rolling shutter, panorama stitching, and image relighting. NOT SHOWN: super resolution [886], discussed earlier
Overview of Computational Imaging
Computational imaging [396, 429] provides options for synthesizing new images from the raw image data. A computational camera may control a programmable flash pattern projector, a lens array, and multiple image sensors, as well as synthesize new images from the raw data, as illustrated in Fig. 1.6. To dig deeper into computational imaging and explore the current research, see the CAVE Computer Vision Laboratory at Columbia University and the Rochester Institute of Technology Imaging Research. Here are some of the methods and applications in use.