Matrix and Tensor Decompositions in Signal Processing
Gérard Favier
First published 2021 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd
27-37 St George's Road
London SW19 4EU
UK

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.iste.co.uk
www.wiley.com
© ISTE Ltd 2021
The rights of Gérard Favier to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
Library of Congress Control Number: 2021938218
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-78630-155-0
Chapter 1. Matrix Decompositions
1.1. Introduction
1.2. Overview of the most common matrix decompositions
1.3. Eigenvalue decomposition
1.3.1. Reminders about the eigenvalues of a matrix
1.3.2. Eigendecomposition and properties
1.3.3. Special case of symmetric/Hermitian matrices
1.3.4. Application to compute the powers of a matrix and a matrix polynomial
1.3.5. Application to compute a state transition matrix
1.3.6. Application to compute the transfer function and the output of a discrete-time linear system
1.4. URV^H decomposition
1.5. Singular value decomposition
1.5.1. Definition and properties
1.5.2. Reduced SVD and dyadic decomposition
1.5.3. SVD and fundamental subspaces associated with a matrix
1.5.4. SVD and the Moore–Penrose pseudo-inverse
1.5.5. SVD computation
1.5.6. SVD and matrix norms
1.5.7. SVD and low-rank matrix approximation
1.5.8. SVD and orthogonal projectors
1.5.9. SVD and LS estimator
1.5.10. SVD and polar decomposition
1.5.11. SVD and PCA
1.5.12. SVD and blind source separation
1.6. CUR decomposition

Chapter 2. Hadamard, Kronecker and Khatri–Rao Products
2.1. Introduction
2.2. Notation
2.3. Hadamard product
2.3.1. Definition and identities
2.3.2. Fundamental properties
2.3.3. Basic relations
2.3.4. Relations between the diag operator and Hadamard product
2.4. Kronecker product
2.4.1. Kronecker product of vectors
2.4.2. Kronecker product of matrices
2.4.3. Rank, trace, determinant and spectrum of a Kronecker product
2.4.4. Structural properties of a Kronecker product
2.4.5. Inverse and Moore–Penrose pseudo-inverse of a Kronecker product
2.4.6. Decompositions of a Kronecker product
2.5. Kronecker sum
2.5.1. Definition
2.5.2. Properties
2.6. Index convention
2.6.1. Writing vectors and matrices with the index convention
2.6.2. Basic rules and identities with the index convention
2.6.3. Matrix products and index convention
2.6.4. Kronecker products and index convention
2.6.5. Vectorization and index convention
2.6.6. Vectorization formulae
2.6.7. Vectorization of partitioned matrices
2.6.8. Traces of matrix products and index convention
2.7. Commutation matrices
2.7.1. Definition
2.7.2. Properties
2.7.3. Kronecker product and permutation of factors
2.7.4. Multiple Kronecker product and commutation matrices
2.7.5. Block Kronecker product
2.7.6. Strong Kronecker product
2.8. Relations between the diag operator and the Kronecker product
2.9. Khatri–Rao product
2.9.1. Definition
2.9.2. Khatri–Rao product and index convention
2.9.3. Multiple Khatri–Rao product
2.9.4. Properties
2.9.5. Identities
2.9.6. Khatri–Rao product and permutation of factors
2.9.7. Trace of a product of matrices and Khatri–Rao product
2.10. Relations between vectorization and Kronecker and Khatri–Rao products
2.11. Relations between the Kronecker, Khatri–Rao and Hadamard products
2.12. Applications
2.12.1. Partial derivatives and index convention
2.12.2. Solving matrix equations

Chapter 3. Tensor Operations
3.1. Introduction
3.2. Notation and particular sets of tensors
3.3. Notion of slice
3.3.1. Fibers
3.3.2. Matrix and tensor slices
3.4. Mode combination
3.5. Partitioned tensors or block tensors
3.6. Diagonal tensors
3.6.1. Case of a tensor X ∈ K^{[N;I]}
3.6.2. Case of a square tensor
3.6.3. Case of a rectangular tensor
3.7. Matricization
3.7.1. Matricization of a third-order tensor
3.7.2. Matrix unfoldings and index convention
3.7.3. Matricization of a tensor of order N
3.7.4. Tensor matricization by index blocks
3.8. Subspaces associated with a tensor and multilinear rank
3.9. Vectorization
3.9.1. Vectorization of a tensor of order N
3.9.2. Vectorization of a third-order tensor
3.10. Transposition
3.10.1. Definition of a transpose tensor
3.10.2. Properties of transpose tensors
3.10.3. Transposition and tensor contraction
3.11. Symmetric/partially symmetric tensors
3.11.1. Symmetric tensors
3.11.2. Partially symmetric/Hermitian tensors
3.11.3. Multilinear forms with Hermitian symmetry and Hermitian tensors
3.11.4. Symmetrization of a tensor
3.12. Triangular tensors
3.13. Multiplication operations
3.13.1. Outer product of tensors
3.13.2. Tensor–matrix multiplication
3.13.3. Tensor–vector multiplication
3.13.4. Mode-(p,n) product
3.13.5. Einstein product
3.14. Inverse and pseudo-inverse tensors
3.15. Tensor decompositions in the form of factorizations
3.15.1. Eigendecomposition of a symmetric square tensor
3.15.2. SVD decomposition of a rectangular tensor
3.15.3. Connection between SVD and HOSVD
3.15.4. Full-rank decomposition
3.16. Inner product, Frobenius norm and trace of a tensor
3.16.1. Inner product of two tensors
3.16.2. Frobenius norm of a tensor
3.16.3. Trace of a tensor
3.17. Tensor systems and homogeneous polynomials
3.17.1. Multilinear systems based on the mode-n product
3.17.2. Tensor systems based on the Einstein product
3.17.3. Solving tensor systems using LS
3.18. Hadamard and Kronecker products of tensors
3.19. Tensor extension
3.20. Tensorization
3.21. Hankelization

Chapter 4. Eigenvalues and Singular Values of a Tensor
4.1. Introduction
4.2. Eigenvalues of a tensor of order greater than two
4.2.1. Different definitions of the eigenvalues of a tensor
4.2.2. Positive/negative (semi-)definite tensors
4.2.3. Orthogonally/unitarily similar tensors
4.3. Best rank-one approximation
4.4. Orthogonal decompositions
4.5. Singular values of a tensor

Chapter 5. Tensor Decompositions
5.1. Introduction
5.2. Tensor models
5.2.1. Tucker model
5.2.2. Tucker-(N1, N) model
5.2.3. Tucker model of a transpose tensor
5.2.4. Tucker decomposition and multidimensional Fourier transform
5.2.5. PARAFAC model
5.2.6. Block tensor models
5.2.7. Constrained tensor models
5.3. Examples of tensor models
5.3.1. Model of multidimensional harmonics
5.3.2. Source separation
5.3.3. Model of a FIR system using fourth-order output cumulants
The first book of this series was dedicated to introducing matrices and tensors (of order greater than two) from the perspective of their algebraic structure, presenting their similarities, differences and connections with representations of linear, bilinear and multilinear mappings. This second volume will now study tensor operations and decompositions in greater depth.
In this introduction, we will motivate the use of tensors by answering five questions that prospective users might and should ask:
– What are the advantages of tensor approaches?
– For what uses?
– In what fields of application?
– With what tensor decompositions?
– With what cost functions and optimization algorithms?
Although our answers are necessarily incomplete, our aim is to:
– present the advantages of tensor approaches over matrix approaches;
– show a few examples of how tensor tools can be used;
– give an overview of the extensive diversity of problems that can be solved using tensors, including a few example applications;
– introduce the three most widely used tensor decompositions, presenting some of their properties and comparing their parametric complexity;
– state a few problems based on tensor models in terms of the cost functions to be optimized;
– describe various types of tensor-based processing, with a brief glimpse of the optimization methods that can be used.
I.1. What are the advantages of tensor approaches?
In most applications, a tensor X of order N is viewed as an array of real or complex numbers. The current element of the tensor is denoted x_{i_1,...,i_N}, where each index i_n ∈ ⟨I_n⟩ = {1, ..., I_n}, for n ∈ ⟨N⟩ = {1, ..., N}, is associated with the nth mode, and I_n is its dimension, i.e. the number of elements for the nth mode. The order of the tensor is the number N of indices, i.e. the number of modes. Tensors are written with calligraphic letters¹. An Nth-order tensor with entries x_{i_1,...,i_N} is written X = [x_{i_1,...,i_N}] ∈ K^{I_1 ×···× I_N}, where K = R or C, depending on whether the tensor is real-valued or complex-valued, and I_1 ×···× I_N represents the size of X.

1. Scalars, vectors, and matrices are written in lowercase, bold lowercase, and bold uppercase letters, respectively: a, a, A.
In general, a mode (also called a way) can have one of the following interpretations: (i) as a source of information (user, patient, client, trial, etc.); (ii) as a type of entity attached to the data (items/products, types of music, types of film, etc.); (iii) as a tag that characterizes an item, a piece of music, a film, etc.; (iv) as a recording modality that captures diversity in various domains (space, time, frequency, wavelength, polarization, color, etc.). Thus, a digital image in color can be represented as a three-dimensional tensor (of pixels) with two spatial modes, one for the rows (width) and one for the columns (height), and one channel mode (RGB colors). For example, a color image can be represented as a tensor of size 1024 × 768 × 3, where the third mode corresponds to the intensity of the three RGB colors (red, green, blue). For a volumetric image, there are three spatial modes (width × height × depth), and the points of the image are called voxels. In the context of hyperspectral imagery, in addition to the two spatial dimensions, there is a third dimension corresponding to the emission wavelength within a spectral band.
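As a concrete illustration, here is a minimal sketch (in Python with NumPy, with a synthetic random image standing in for a real photograph) of a color image handled as a third-order tensor:

```python
import numpy as np

# A synthetic color image as a third-order tensor of size 1024 x 768 x 3:
# mode 1 = rows (width), mode 2 = columns (height), mode 3 = RGB channel.
rng = np.random.default_rng(0)
image = rng.random((1024, 768, 3))

print(image.ndim)               # order N = 3 (number of modes)
print(image.shape)              # dimensions (I1, I2, I3) = (1024, 768, 3)
red_channel = image[:, :, 0]    # a slice: the 1024 x 768 red channel
pixel_fiber = image[10, 20, :]  # a mode-3 fiber: the 3 RGB intensities of one pixel
```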
Tensor approaches benefit from the following advantages over matrix approaches:
– the essential uniqueness property², satisfied by some tensor decompositions, such as PARAFAC (parallel factors) (Harshman 1970) under certain mild conditions; for matrix decompositions, this property requires certain restrictive conditions on the factor matrices, such as orthogonality, non-negativity, or a specific structure (triangular, Vandermonde, Toeplitz, etc.);

2. A decomposition satisfies the essential uniqueness property if it is unique up to permutation and scaling factors in the columns of its factor matrices.
– the ability to solve certain problems, such as the identification of communication channels, directly from measured signals, without requiring the calculation of high-order statistics of these signals or the use of long pilot sequences. The resulting deterministic and semi-blind processings can be performed with signal recordings that are shorter than those required by statistical methods, based on the estimation of high-order moments or cumulants. For the blind source separation problem, tensor approaches can be used to tackle the case of underdetermined systems, i.e. systems with more sources than sensors;
– the possibility of compressing big data sets via a data tensorization and the use of a tensor decomposition, in particular, a low multilinear rank approximation;
– a greater flexibility in representing and processing multimodal data by considering the modalities separately, instead of stacking the corresponding data into a vector or a matrix. This allows the multilinear structure of data to be preserved, meaning that interactions between modes can be taken into account;
– a greater number of modalities can be incorporated into tensor representations of data, meaning that more complementary information is available, which allows the performance of certain systems to be improved, e.g. wireless communication, recommendation, diagnostic, and monitoring systems, by making detection, interpretation, recognition, and classification operations easier and more efficient. This led to a generalization of certain matrix algorithms, like SVD (singular value decomposition) to MLSVD (multilinear SVD), also known as HOSVD (higher-order SVD) (de Lathauwer et al. 2000a); similarly, certain signal processing algorithms were generalized, like PCA (principal component analysis) to MPCA (multilinear PCA) (Lu et al. 2008) or TRPCA (tensor robust PCA) (Lu et al. 2020) and ICA (independent component analysis) to MICA (multilinear ICA) (Vasilescu and Terzopoulos 2005) or tensor PICA (probabilistic ICA) (Beckmann and Smith 2005).
It is worth noting that, with a tensor model, the number of modalities considered in a problem can be increased either by increasing the order of the data tensor or by coupling tensor and/or matrix decompositions that share one or several modes. Such a coupling approach is called data fusion using a coupled tensor/matrix factorization. Two examples of this type of coupling are presented later in this introductory chapter. In the first, EEG signals are coupled with functional magnetic resonance imaging (fMRI) data to analyze the brain function; in the second, hyperspectral and multispectral images are merged for remote sensing.
The other approach, namely, increasing the number of modalities, will be illustrated in Volume 3 of this series by giving a unified presentation of various models of wireless communication systems designed using tensors. In order to improve system performance, both in terms of transmission and reception, the idea is to employ multiple types of diversity simultaneously in various domains (space, time, frequency, code, etc.), each type of diversity being associated with a mode of the tensor of received signals. Coupled tensor models will also be presented in the context of cooperative communication systems with relays.
I.2. For what uses?
In the big data³ era, digital information processing plays a key role in various fields of application. Each field has its own specificities and requires specialized, often multidisciplinary, skills to manage both the multimodality of the data and the processing techniques that need to be implemented. Thus, the "intelligent" information processing systems of the future will have to integrate representation tools, such as tensors and graphs, and signal and image processing methods, with artificial intelligence techniques based on artificial neural networks and machine learning.

3. Big data is characterized by 3Vs (Volume, Variety, Velocity) linked to the size of the data set, the heterogeneity of the data and the rate at which it is captured, stored and processed.
The needs of such systems are diverse and numerous – whether in terms of storage, visualization (3D representation, virtual reality, dissemination of works of art), transmission, imputation, prediction/forecasting, analysis, classification or fusion of multimodal and heterogeneous data. The reader is invited to refer to Lahat et al. (2015) and Papalexakis et al. (2016) for a presentation of various examples of data fusion and data mining based on tensor models.
Some of the key applications of tensor tools are as follows:
– decomposition or separation of heterogeneous data sets into components/factors or subspaces with the goal of exploiting the multimodal structure of the data and extracting useful information for users from uncertain or noisy data or measurements provided by different sources of information and/or types of sensor. Thus, features can be extracted in different domains (spatial, temporal, frequential) for classification and decision-making tasks;
– imputation of missing data within an incomplete database using a low-rank tensor model, where the missing data results from defective sensors or communication links, for example. This task is called tensor completion and is a higher-order generalization of matrix completion (Candès and Recht 2009; Signoretto et al. 2011; Liu et al. 2013);
– recovery of useful information from compressed data by reconstructing a signal or an image that has a sparse representation in a predefined basis, using compressive sampling (CS; also known as compressed sensing) techniques (Candès and Wakin 2008; Candès and Plan 2010), applied to sparse, low-rank tensors (Sidiropoulos and Kyrillidis 2012);
– fusion of data using coupled tensor and matrix decompositions;
– design of cooperative multi-antenna communication systems (also called MIMO (multiple-input multiple-output) systems); this type of application, which led to the development of several new tensor models, will be considered in the next two volumes of this series;
– multilinear compressive learning that combines compressed sensing with machine learning;
– reduction of the dimensionality of multimodal, heterogeneous databases with very large dimensions (big data) by solving a low-rank tensor approximation problem;
– multiway filtering and tensor data denoising.
Tensors can also be used to tensorize neural networks with fully connected layers by expressing the weight matrix of a layer as a tensor train (TT) whose cores represent the parameters of the layer. This considerably reduces the parametric complexity and, therefore, the storage space. This compression property of the information contained in layered neural networks when using tensor decompositions provides a way to increase the number of hidden units (Novikov et al. 2015). Tensors, when used together with multilayer perceptron neural networks to solve classification problems, achieve lower error rates with fewer parameters and less computation time than neural networks alone (Chien and Bao 2017). Neural networks can also be used to learn the rank of a tensor (Zhou et al. 2019), or to compute its eigenvalues and singular values, and hence the rank-one approximation of a tensor (Che et al. 2017).
I.3. In what fields of application?
Tensors have applications in many domains. The fields of psychometrics and chemometrics in the 1970s and 1990s paved the way for signal and image processing applications, such as blind source separation, digital communications, and computer vision in the 1990s and early 2000s. Today, there is a quantitative explosion of big data in medicine, astronomy, meteorology, with fifth-generation wireless communications (5G), for medical diagnostic aid, web services delivered by recommendation systems (video on demand, online sales, restaurant and hotel reservations, etc.), as well as for information searching within multimedia databases (texts, images, audio and video recordings) and with social networks. This explains why various scientific communities and the industrial world are showing a growing interest in tensors.
Among the many examples of applications of tensors for signal and image processing, we can mention:
– blind source separation and blind system identification. These problems play a fundamental role in signal processing. They involve separating the input signals (also called sources) and identifying a system from the knowledge of only the output signals and certain hypotheses about the input signals, such as statistical independence in the case of independent component analysis (Comon 1994), or the assumption of a finite alphabet in the context of digital communications. This type of processing is, in particular, used to jointly estimate communication channels and information symbols emitted by a transmitter. It can also be used for speech or music separation, or to process seismic signals;
– use of tensor decompositions to analyze biomedical signals (EEG, MEG, ECG, EOG⁴) in the space, time and frequency domains, in order to provide a medical diagnostic aid; for instance, Acar et al. (2007) used a PARAFAC model of EEG signals to analyze epileptic seizures; Becker et al. (2014) used the same type of decomposition to locate sources within EEG signals;
– analysis of brain activity by merging imaging data (fMRI) and biomedical signals (EEG and MEG) with the goal of enabling non-invasive medical tests (see Table I.4);
– analysis and classification of hyperspectral images used in many fields (medicine, environment, agriculture, monitoring, astrophysics, etc.). To improve the spatial resolution of hyperspectral images, Li et al. (2018) merged hyperspectral and multispectral images using a coupled Tucker decomposition with a sparse core (coupled sparse tensor factorization (CSTF)) (see Table I.4);
– design of semi-blind receivers for point-to-point or cooperative MIMO communication systems based on tensor models; see the overviews by de Almeida et al. (2016) and da Costa et al. (2018);
– modeling and identification of nonlinear systems via a tensor representation of Volterra kernels or Wiener–Hammerstein systems (see, for example, Kibangou and Favier 2009a, 2010; Favier and Kibangou 2009; Favier and Bouilloc 2009, 2010; Favier et al. 2012a);
– identification of tensor-based separable trilinear systems that are linear with respect to (w.r.t.) the input signal and trilinear w.r.t. the coefficients of the global impulse response, modeled as a Kronecker product of three individual impulse responses (Elisei-Iliescu et al. 2020). Note that such systems are to be compared with third-order Volterra filters that are linear w.r.t. the Volterra kernel coefficients and trilinear w.r.t. the input signal;
– facial recognition, based on face tensors, for purposes of authentication and identification in surveillance systems. For facial recognition, photos of people to recognize are stored in a database with different lighting conditions, different facial expressions, from multiple angles, for each individual. In Vasilescu and Terzopoulos (2002), the tensor of facial images is of order five, with dimensions 28 × 5 × 3 × 3 × 7943, corresponding to the modes: people × views × illumination × expressions × pixels per image. For an overview of various facial recognition systems, see Arachchilage and Izquierdo (2020);
4. Electroencephalography (EEG), magnetoencephalography (MEG), electrocardiography (ECG) and electrooculography (EOG).
– tensor-based anomaly detection used in monitoring and surveillance systems.
Table I.1 presents a few examples of signal and image tensors, specifying the nature of the modes in each case.

Table I.1. Signal and image tensors

Signals (modes, references):
– Antenna processing: space (antennas) × time × sensor subnetwork (Sidiropoulos et al. 2000a); space × time × polarization (Raimondi et al. 2017)
– Digital communications: space (antennas) × time × code (Sidiropoulos et al. 2000b); antennas × blocks × symbol periods × code × frequencies (Favier and de Almeida 2014b)
– ECG: space (electrodes) × time × frequencies (Acar et al. 2007; Padhy et al. 2019)
– EEG: space (electrodes) × time × frequencies × subjects or trials (Becker et al. 2014; Cong et al. 2015)
– EEG + fMRI: subjects × electrodes × time, coupled with subjects × voxels (model with matrix and tensor factorizations coupled via the "subjects" mode) (Acar et al. 2017)

Images (modes, references):
– Color images: space (width) × space (height) × channel (colors)
– Videos in grayscale: space (width) × space (height) × time
– Videos in color: space × space × channel × time
– Hyperspectral images: space × space × spectral bands (Makantasis et al. 2018)
– Computer vision: people × views × illumination × expressions × pixels (Vasilescu and Terzopoulos 2002)
Other fields of application are considered in Table I.2.
Below, we give some details about the application concerning recommendation systems, which play an important role in various websites. The goal of these systems is to help users to select items from tags that have been assigned to each item by users. These items could, for example, be movies, books, musical recordings, web pages, products for sale on an e-commerce site, etc. A standard recommendation system is based on the three following modes: users × items × tags.
Collaborative filtering techniques use the opinions of a set of people, or assessments from these people based on a rating system, to generate a list of recommendations for a specific user. This type of filtering is, for example, used by websites like Netflix for renting DVDs. Collaborative filtering methods are classified into three categories, depending on whether the filtering is based on: (a) history and a similarity metric; (b) a model based on matrix factorization using algorithms like SVD or non-negative matrix factorization (NMF); or (c) some combination of both, known as hybrid collaborative filtering techniques. See Luo et al. (2014) and Bokde et al. (2015) for approaches based on matrix factorization.
Other so-called passive filtering techniques exploit the data of a matrix of relations between items to deduce recommendations for a user from correlations between items and the user's previous choices, without using any kind of rating system. This is known as a content-based approach.
Table I.2. Other fields of application (domains, modes, references):
– Phonetics: subjects × vowels × formants (Harshman 1970)
– Chemometrics (fluorescence): excitation wavelengths × emission wavelengths × samples (Bro 1997, 2006; Smilde et al. 2004)
– Contextual recommendation systems: users × items × tags × context1 × ··· × contextN (Rendle and Schmidt-Thieme 2010; Symeonidis and Zioupos 2016; Frolov and Oseledets 2017)
– Transportation (speed measurements, over periods of 15 s and 24 h): space (sensors) × time (days) × time (weeks) (Goulart et al. 2017; Tan et al. 2013; Ran et al. 2016)
– Music: types of music × frequencies × frequencies (Panagakis et al. 2010); users × keywords × songs (Nanopoulos et al. 2010); recordings × (audio) characteristics × segments (Benetos and Kotropoulos 2008)
– Bioinformatics: medicine × targets × diseases (Wang et al. 2019)
Recommendation systems can also use information about the users (age, nationality, geographic location, participation on social networks, etc.) and the items themselves (types of music, types of film, classes of hotels, etc.). This is called contextual information. Taking this additional information into account allows the relevance of the recommendations to be improved, at the cost of increasing the dimensionality and the complexity of the data representation model and, therefore, of the processing algorithms. This is why tensor approaches are so important for this type of application today. Note that, for recommendation systems, the data tensors are sparse. Consequently, some tags can be automatically generated by the system based on similarity metrics between items. This is, for example, the case for music recommendations based on the acoustic characteristics of songs (Nanopoulos et al. 2010). Personalized tag recommendations take into account the user's profile, preferences, and interests. The system can also help the user select existing tags or create new ones (Rendle and Schmidt-Thieme 2010).
The articles by Bobadilla et al. (2013) and Frolov and Oseledets (2017) present various recommendation systems with many bibliographical references. Operating according to a similar principle as recommendation systems, social network websites, such as Wikipedia, Facebook, or Twitter, allow different types of data to be exchanged and shared, content to be produced and connections to be established.
I.4. With what tensor decompositions?
It is important to note that, for an Nth-order tensor X ∈ K^{I_1 ×···× I_N}, the number of elements is $\prod_{n=1}^{N} I_n$, and, assuming I_n = I for n ∈ ⟨N⟩, this number becomes I^N, which induces an exponential increase with the tensor order N. This is called the curse of dimensionality (Oseledets and Tyrtyshnikov 2009). For big data tensors, tensor decompositions play a fundamental role in alleviating this curse of dimensionality, due to the fact that the number of parameters that characterize the decompositions is generally much smaller than the number of elements in the original tensor.
We now introduce three basic decompositions: PARAFAC/CANDECOMP/CPD, TD and TT⁵. The first two are studied in depth in Chapter 5, whereas the third, briefly introduced in Chapter 3, will be considered in more detail in Volume 3.

5. PARAFAC for parallel factors; CANDECOMP for canonical decomposition; CPD for canonical polyadic decomposition; TD for Tucker decomposition; TT for tensor train.
Table I.3 gives the expression of the element x_{i_1,...,i_N} of a tensor X ∈ K^{I_1 ×···× I_N} of order N and size I_1 ×···× I_N, either real (K = R) or complex (K = C), for each of the three decompositions cited above. Their parametric complexity is compared in terms of the size of each matrix and tensor factor, assuming I_n = I and R_n = R for all n ∈ ⟨N⟩.
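To give an order of magnitude, here is a minimal sketch (Python) comparing the number of elements of a full tensor with the standard parameter counts NIR for CPD, NIR + R^N for TD and 2IR + (N−2)IR² for TT, under the assumption I_n = I and R_n = R:

```python
# Number of parameters of a full tensor versus its CPD, TD and TT
# representations, assuming I_n = I and R_n = R for all modes.
N, I, R = 8, 10, 5

full = I ** N                          # curse of dimensionality: 10^8 elements
cpd = N * I * R                        # N factor matrices of size I x R
tucker = N * I * R + R ** N            # N factors + core tensor of size R^N
tt = 2 * I * R + (N - 2) * I * R * R   # 2 boundary matrices + (N-2) cores

print(f"full tensor: {full:.3e}")      # 1.000e+08
print(f"CPD: {cpd}, Tucker: {tucker}, TT: {tt}")
```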
Figures I.1–I.3 show graphical representations of the PARAFAC model [[A^{(1)}, A^{(2)}, A^{(3)}; R]] and the TD model [[G; A^{(1)}, A^{(2)}, A^{(3)}]] for a third-order tensor X ∈ K^{I_1×I_2×I_3}, and of the TT model [[A^{(1)}, A^{(2)}, A^{(3)}, A^{(4)}]] for a fourth-order tensor X ∈ K^{I_1×I_2×I_3×I_4}. In the case of the PARAFAC model, we define A^{(n)} = [a^{(n)}_1, ..., a^{(n)}_R] ∈ K^{I_n×R} using its columns, for n ∈ {1, 2, 3}.
Figure I.1. Third-order PARAFAC model
We can make a few remarks about each of these decompositions:
– The PARAFAC decomposition (Harshman 1970), also known as CANDECOMP (Carroll and Chang 1970) or CPD (Hitchcock 1927), of an Nth-order tensor X is a sum of R rank-one tensors, each defined as the outer product of one column from each of the N matrix factors A^{(n)} ∈ K^{I_n×R}. When R is minimal, it is called the rank of the tensor. If the matrix factors satisfy certain conditions, this decomposition has the essential uniqueness property. See Figure I.1 for a third-order tensor (N = 3), and Chapter 5 for a detailed presentation.
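As an illustration of the sum of rank-one terms, here is a minimal sketch (Python/NumPy, with random factor matrices standing in for an actual decomposition) that builds a third-order PARAFAC tensor in two equivalent ways:

```python
import numpy as np

I1, I2, I3, R = 4, 5, 6, 3
rng = np.random.default_rng(1)
A1, A2, A3 = (rng.random((I, R)) for I in (I1, I2, I3))

# Sum of R rank-one tensors: outer product of one column of each factor.
X = np.zeros((I1, I2, I3))
for r in range(R):
    X += np.einsum('i,j,k->ijk', A1[:, r], A2[:, r], A3[:, r])

# Equivalent element-wise form: x_{ijk} = sum_r A1[i,r] A2[j,r] A3[k,r].
X_check = np.einsum('ir,jr,kr->ijk', A1, A2, A3)
assert np.allclose(X, X_check)
```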
Table I.3. Parametric complexity of the CPD, TD and TT decompositions (with I_n = I and R_n = R):
– CPD: $x_{i_1,\ldots,i_N} = \sum_{r=1}^{R} \prod_{n=1}^{N} a^{(n)}_{i_n,r}$, with A^{(n)} ∈ K^{I×R}; number of parameters: NIR
– TD: $x_{i_1,\ldots,i_N} = \sum_{r_1=1}^{R} \cdots \sum_{r_N=1}^{R} g_{r_1,\ldots,r_N} \prod_{n=1}^{N} a^{(n)}_{i_n,r_n}$, with A^{(n)} ∈ K^{I×R} and G ∈ K^{R×···×R}; number of parameters: NIR + R^N
– TT: $x_{i_1,\ldots,i_N} = \sum_{r_1=1}^{R} \cdots \sum_{r_{N-1}=1}^{R} a^{(1)}_{i_1,r_1}\, a^{(2)}_{r_1,i_2,r_2} \cdots a^{(N)}_{r_{N-1},i_N}$; number of parameters: 2IR + (N−2)IR²
Figure I.2. Third-order Tucker model
Figure I.3. Fourth-order TT model
– The Tucker decomposition (Tucker 1966) can be viewed as a generalization of the PARAFAC decomposition that takes into account all the interactions between the columns of the matrix factors A^{(n)} ∈ K^{I_n×R_n} via the introduction of a core tensor G ∈ K^{R_1×···×R_N}. This decomposition is not unique in general. Note that, if R_n ≤ I_n for all n ∈ ⟨N⟩, then the core tensor G provides a compressed form of X. If R_n, for n ∈ ⟨N⟩, is chosen as the rank of the mode-n matrix unfolding⁶ of X, then the N-tuple (R_1, ..., R_N) is minimal, and it is called the multilinear rank of the tensor.
Such a Tucker decomposition can be obtained using the truncated high-order SVD (THOSVD), under the constraint of column-orthonormal matrices A^{(n)} (de Lathauwer et al. 2000a). This algorithm is described in section 5.2.1.8. See Figure I.2 for a third-order tensor, and Chapter 5 for a detailed presentation.
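A minimal sketch of the Tucker model for a third-order tensor (Python/NumPy; the core tensor and factor matrices are random placeholders, not the result of an actual THOSVD):

```python
import numpy as np

I1, I2, I3 = 6, 7, 8
R1, R2, R3 = 2, 3, 4
rng = np.random.default_rng(2)
G = rng.random((R1, R2, R3))   # core tensor: all interactions between columns
A1 = rng.random((I1, R1))
A2 = rng.random((I2, R2))
A3 = rng.random((I3, R3))

# x_{ijk} = sum over (r1,r2,r3) of g_{r1,r2,r3} a1_{i,r1} a2_{j,r2} a3_{k,r3}
X = np.einsum('pqr,ip,jq,kr->ijk', G, A1, A2, A3)
print(X.shape)   # (6, 7, 8): the core of size (2, 3, 4) compresses X
```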
– The TT decomposition (Oseledets 2011) is composed of a train of third-order tensors A^{(n)} ∈ K^{R_{n−1}×I_n×R_n}, for n ∈ {2, 3, ..., N−1}, the first and last carriages of the train being matrices A^{(1)} ∈ K^{I_1×R_1} and A^{(N)} ∈ K^{R_{N−1}×I_N}, which implies R_0 = R_N = 1, and therefore a^{(1)}_{r_0,i_1,r_1} = a^{(1)}_{i_1,r_1} and a^{(N)}_{r_{N−1},i_N,r_N} = a^{(N)}_{r_{N−1},i_N}. The dimensions R_n, for n ∈ ⟨N−1⟩, called the TT ranks, are given by the ranks of some matrix unfoldings of the original tensor.
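The train structure can be sketched as follows (Python/NumPy, with random carriages; the contraction chain reproduces the element-wise formula of Table I.3):

```python
import numpy as np

I1, I2, I3, I4 = 3, 4, 5, 6
R1, R2, R3 = 2, 3, 2          # TT ranks (R0 = R4 = 1)
rng = np.random.default_rng(3)
A1 = rng.random((I1, R1))     # first carriage: a matrix
A2 = rng.random((R1, I2, R2)) # interior carriages: third-order tensors
A3 = rng.random((R2, I3, R3))
A4 = rng.random((R3, I4))     # last carriage: a matrix

# x_{i1,i2,i3,i4} = sum over (r1,r2,r3) of
#   A1[i1,r1] A2[r1,i2,r2] A3[r2,i3,r3] A4[r3,i4]
X = np.einsum('ap,pbq,qcr,rd->abcd', A1, A2, A3, A4)
print(X.shape)                # (3, 4, 5, 6)
```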
This decomposition has been used to solve the tensor completion problem (Grasedyck et al. 2015; Bengua et al. 2017), for facial recognition (Brandoni and Simoncini 2020) and for modeling MIMO communication channels (Zniyed et al. 2020), among many other applications. A brief description of the TT decomposition is given in section 3.13.4 using the mode-(p,n) product. Note that a specific SVD-based algorithm, called TT-SVD, was proposed by Oseledets (2011) for computing a TT decomposition.
This decomposition and the hierarchical Tucker (HT) one (Grasedyck and Hackbush 2011; Ballani et al. 2013) are special cases of tensor networks (TNs) (Cichocki 2014), as will be discussed in more detail in the next volume.
6. See definition [3.41], in Chapter 3, of the mode-n matrix unfolding X_n of a tensor X, whose columns are the mode-n vectors obtained by fixing all indices except the nth.
From this brief description of the three tensor models, one can conclude that, unlike matrices, the notion of rank is not unique for tensors, since it depends on the decomposition used. Thus, as mentioned above, one defines the tensor rank (also called the canonical rank or Kruskal's rank) associated with the PARAFAC decomposition, the multilinear rank that relies on the Tucker model, and the TT-ranks linked with the TT decomposition.
It is important to note that the number of characteristic parameters of the PARAFAC and TT decompositions is proportional to N, the order of the tensor, whereas the parametric complexity of the Tucker decomposition increases exponentially with N. This is why the first two decompositions are especially valuable for large-scale problems. Although the Tucker model is not unique in general, imposing an orthogonality constraint on the matrix factors yields the HOSVD decomposition, a truncated form of which gives an approximate solution to the best low multilinear rank approximation problem (de Lathauwer et al. 2000a). This solution, which is based on an a priori choice of the dimensions R_n of the core tensor, is to be compared with the truncated SVD in the matrix case, although it does not have the same optimality property. It is widely used to reduce the parametric complexity of data tensors.
From the above, it can be concluded that the TT model combines the advantages of the other two decompositions, in terms of parametric complexity (like PARAFAC) and numerical stability (like the Tucker model), due to a parameter estimation algorithm based on a calculation of SVDs.
To illustrate the use of the PARAFAC decomposition, let us consider the case of multi-user mobile communications with a CDMA (code-division multiple access) encoding system. The multiple access technique allows multiple emitters to simultaneously transmit information over the same communication channel by assigning a code to each emitter. The information is transmitted as symbols s_{n,m}, with n ∈ ⟨N⟩ and m ∈ ⟨M⟩, where N and M are the number of transmission time slots, i.e. the number of symbol periods, and the number of emitting antennas, respectively. The symbols belong to a finite alphabet that depends on the modulation being used. They are encoded with a space-time coding that introduces code diversity by repeating each symbol P times with a code c_{p,m} assigned to the mth emitting antenna, p ∈ ⟨P⟩, where P denotes the length of the spreading code. The signal received by the kth receiving antenna, during the nth symbol period and the pth chip period, is a linear combination of the symbols encoded and transmitted by the M emitting antennas:
$x_{k,n,p} = \sum_{m=1}^{M} h_{k,m}\, s_{n,m}\, c_{p,m}$, [I.1]

where h_{k,m} is the fading coefficient of the communication channel between the receiving antenna k and the emitting antenna m.
The received signals, which are complex-valued, therefore form a third-order tensor X ∈ C^{K×N×P} whose modes are space × time × code, associated with the indices (k, n, p). This signal tensor satisfies a PARAFAC decomposition [[H, S, C; M]] whose rank is equal to the number M of emitting antennas and whose matrix factors are the channel (H ∈ C^{K×M}), the matrix of transmitted symbols (S ∈ C^{N×M}) and the coding matrix (C ∈ C^{P×M}). This example is a simplified form of the DS-CDMA (direct-sequence CDMA) system proposed by Sidiropoulos et al. (2000b).
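A minimal sketch of this simplified DS-CDMA model (Python/NumPy; the random channel, symbols and codes are placeholders for a real system) that builds the received-signal tensor according to equation [I.1]:

```python
import numpy as np

K, N, M, P = 4, 50, 2, 8   # receive antennas, symbols, emitters, code length
rng = np.random.default_rng(4)
H = rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))  # fading channel
S = rng.choice([-1, 1], size=(N, M)).astype(complex)        # BPSK-like symbols
C = rng.choice([-1, 1], size=(P, M)).astype(complex)        # spreading codes

# Equation [I.1]: x_{k,n,p} = sum_m h_{k,m} s_{n,m} c_{p,m}
X = np.einsum('km,nm,pm->knp', H, S, C)
print(X.shape)   # (K, N, P): a rank-M PARAFAC tensor with factors H, S, C
```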
I.5. With what cost functions and optimization algorithms?
We will now briefly describe the most common processing operations carried out with tensors, as well as some of the optimization algorithms that are used. It is important to first present the preprocessing operations that need to be performed. Preprocessing typically involves data centering operations (offset elimination), scaling of non-homogeneous data, suppression of outliers and artifacts, image adjustment (size, brightness, contrast, alignment, etc.), denoising, signal transformation using certain transforms (wavelets, Fourier, etc.), and finally, in some cases, the calculation of statistics of the signals to be processed.
Preprocessing is fundamental, both to improve the quality of the estimated models and, therefore, of the subsequent processing operations, and to avoid numerical problems with optimization algorithms, such as conditioning problems that may cause the algorithms to fail to converge. Centering and scaling preprocessing operations are potentially problematic because they are interdependent and can be combined in several different ways. If data are missing, centering can also reduce the rank of the tensor model. For a more detailed description of these preprocessing operations, see Smilde et al. (2004).
For the processing operations themselves, we can distinguish between several different classes:
– supervised/non-supervised (blind or semi-blind), i.e. with or without training data, for example, to solve classification problems, or when a priori information, called a pilot sequence, is transmitted to the receiver for channel estimation;
– real-time (online)/batch (offline) processing;
– centralized/distributed;
– adaptive/blockwise (with respect to the data);
– with/without coupling of tensor and/or matrix models;
– with/without missing data.
It is important to distinguish batch processing, which is performed to analyze data recorded as signal and image sets, from the real-time processing required by wireless communication systems, recommendation systems, web searches and social networks. In real-time applications, the dimensionality of the model and the algorithmic complexity are predominant factors. The signals received by receiving antennas, the information exchanged between a website and the users and the messages exchanged between the users of a social network are time-dependent. For instance, a recommendation system interacts with the users in real time, via a possible extension of an existing database by means of machine learning techniques. For a description of various applications of tensors for data mining and machine learning, see Anandkumar et al. (2014) and Sidiropoulos et al. (2017).
Tensor-based processings lead to various types of optimization algorithms as follows:
– constrained/unconstrained optimization;
– iterative/non-iterative, or closed-form;
– alternating/global;
– sequential/parallel.
Furthermore, depending on the information that is available a priori, different types of constraints can be taken into account in the cost function to be optimized: low rank, sparseness, non-negativity, orthogonality and differentiability/smoothness. In the case of constrained optimization, weights need to be chosen in the cost function according to the relative importance of each constraint and the quality of the a priori information that is available.
Table I.4 presents a few examples of cost functions that can be minimized for the parameter estimation of certain third-order tensor models (CPD, Tucker, coupled matrix Tucker (CMTucker) and coupled sparse tensor factorization (CSTF)), for the imputation of missing data in a tensor and for the estimation of a sparse data tensor with a low-rank constraint expressed in the form of the nuclear norm of the tensor.
REMARK I.1.– We can make the following remarks:
– the cost functions presented in Table I.4 correspond to data fitting criteria. These criteria, expressed in terms of tensor and matrix Frobenius norms (‖·‖_F), are quadratic in the difference between the data tensor X and the output of CPD and TD models, as well as between the data matrix Y and a matrix factorization model, in the case of the CMTucker model. They are trilinear and quadrilinear, respectively, with respect to the parameters of the CPD and TD models to be estimated, and bilinear with respect to the parameters of the matrix factorization model;
– for the missing data imputation problem using a CPD or TD model, the binary tensor W, which has the same size as X, is defined as:

$w_{ijk} = \begin{cases} 1 & \text{if } x_{ijk} \text{ is known} \\ 0 & \text{if } x_{ijk} \text{ is missing} \end{cases}$
The purpose of the Hadamard product (denoted ⊙) of W with the difference between X and the output of the CPD and TD models is to fit the model to the available data only, ignoring any missing data for model estimation. This imputation problem, known as the tensor completion problem, was originally dealt with by Tomasi and Bro (2005) and Acar et al. (2011a) using a CPD model, followed by Filipovic and Jukic (2015) using a TD model. Various articles have discussed this problem in the context of different applications. An overview of the literature will be given in the next volume;
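The role of W can be sketched as follows (Python/NumPy; X_hat is a random placeholder standing in for the output of a CPD or TD model):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.random((4, 5, 6))                      # data tensor (some entries missing)
W = (rng.random(X.shape) < 0.7).astype(float)  # 1 where x_ijk is known, 0 otherwise
X_hat = rng.random(X.shape)                    # placeholder model output

# Hadamard product with W: the fitting error is evaluated on known entries only.
masked_error = np.linalg.norm(W * (X - X_hat)) ** 2
print(masked_error)
```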
Table I.4. Cost functions for model estimation and recovery of missing data (problems covered: estimation, imputation, and imputation with a low-rank constraint, each with its corresponding cost function)
– for the imputation problem with the low-rank constraint, the term ‖X‖_* in the cost function replaces the low-rank constraint with the nuclear norm of X, since the function rank(X) is not convex, and the nuclear norm is the closest convex approximation of the rank. In Liu et al. (2013), this term is replaced by $\sum_{n=1}^{3} \lambda_n \|X_n\|_*$, where X_n represents the mode-n unfolding of X⁷;
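A minimal sketch (Python/NumPy) of this convex surrogate, computing the weighted sum of the nuclear norms of the three mode-n unfoldings (the column ordering of the unfolding does not affect the nuclear norm):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.random((4, 5, 6))
lam = [1 / 3, 1 / 3, 1 / 3]   # weights lambda_n

total = 0.0
for n in range(3):
    Xn = np.moveaxis(X, n, 0).reshape(X.shape[n], -1)  # mode-n unfolding
    total += lam[n] * np.linalg.norm(Xn, ord='nuc')    # nuclear norm of X_n
print(total)
```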
– in the case of the CMTucker model, the coupling considered here relates to the first modes of the tensor X and the matrix Y of data via the common matrix factor A.
Coupled matrix and tensor factorization (CMTF) models were introduced in Acar et al. (2011b) by coupling a CPD model with a matrix factorization and using the gradient descent algorithm to estimate the parameters. This type of model was used by Acar et al. (2017) to merge EEG and fMRI data with the goal of analyzing brain activity. The EEG signals are modeled with a normalized CPD model (see Chapter 5), whereas the fMRI data are modeled with a matrix factorization. The data are coupled through the subjects mode (see Table I.1). The cost function to be minimized is therefore given by criterion [I.3], in which the column vectors of the matrix factors (A, B, C) have unit norm, Σ is a diagonal matrix whose diagonal elements are the coefficients of the vector σ, and α > 0 is a penalty parameter that allows the importance of the sparseness constraints on the weight vectors (g, σ) to be increased or decreased, modeled by means of the l1 norm. The advantage of merging EEG and fMRI data with the criterion [I.3] is that the acquisition and observation methods are complementary in terms of resolution, since EEG signals have a high temporal resolution but low spatial resolution, while fMRI imaging provides high spatial resolution;
– in the case of the CSTF model (Li et al. 2018), the tensor of high-resolution hyperspectral images (HR-HSI) is represented using a third-order Tucker model that has a sparse core (X = G ×₁ W ×₂ H ×₃ S), with the following modes: space (width) × space (height) × spectral bands. The matrices W ∈ R^{M×n_w}, H ∈ R^{N×n_h} and S ∈ R^{P×n_s} denote the dictionaries for the width, height and spectral modes, composed of n_w, n_h and n_s atoms, respectively, and the core tensor G contains the coefficients relative to the three dictionaries. The matrices W*, H* and S* are spatially and spectrally subsampled versions with respect to each mode. The term λ is a regularization parameter for the sparseness constraint on the core tensor, expressed in terms of the l1 norm of G.
The criteria listed in Table I.4 can be globally minimized using a nonlinear optimization method such as a gradient descent algorithm (with fixed or optimal step size), or the Gauss–Newton and Levenberg–Marquardt algorithms, the latter being a regularized form of the former. In the case of constrained optimization, the augmented Lagrangian method is very often used, as it allows the constrained optimization problem to be transformed into a sequence of unconstrained optimization problems.

7. See definition [3.41] of the unfolding X_n, and definitions [1.65] and [1.67] of the Frobenius norm (‖·‖_F) and the nuclear norm (‖·‖_*) of a matrix; for a tensor, see section 3.16.
The drawbacks of these optimization methods include slow convergence for gradient-type algorithms and high numerical complexity for the Gauss–Newton and Levenberg–Marquardt algorithms, due to the need to compute the Jacobian matrix of the criterion w.r.t. the parameters being estimated, as well as the inverse of a large matrix.
Alternating optimization methods are therefore often used instead of a global optimization w.r.t. all matrix and tensor factors to be estimated. These iterative methods perform a sequence of separate optimizations of criteria linear in each unknown factor, while fixing the other factors with the values estimated at previous iterations. An example is the standard ALS (alternating least squares) algorithm, presented in Chapter 5 for estimating PARAFAC models; a sketch of one ALS sweep is given below. For constrained optimization, the alternating direction method of multipliers (ADMM) is often used (Boyd et al. 2011).
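Here is a minimal sketch of one ALS sweep for the CPD of a third-order tensor (Python/NumPy; khatri_rao and als_step are illustrative helper names, and the updates use the pseudo-inverse for clarity rather than efficiency):

```python
import numpy as np

def khatri_rao(B, C):
    # Column-wise Kronecker (Khatri-Rao) product of B (J x R) and C (K x R).
    R = B.shape[1]
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, R)

def als_step(X, A, B, C):
    # One ALS sweep for the CPD of X (I x J x K): each factor is updated
    # by least squares with the other two fixed, using the matricized
    # model X1 = A (B kr C)^T and its mode-2 and mode-3 analogues.
    I, J, K = X.shape
    A = X.reshape(I, -1) @ np.linalg.pinv(khatri_rao(B, C).T)
    B = np.moveaxis(X, 1, 0).reshape(J, -1) @ np.linalg.pinv(khatri_rao(A, C).T)
    C = np.moveaxis(X, 2, 0).reshape(K, -1) @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C
```

Iterating als_step until the fitting error stagnates gives the standard (unconstrained, batch) ALS procedure described in Chapter 5.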
To complete this introductory chapter, let us outline the key knowledge needed to employ tensor tools, whose presentation constitutes the main objective of this second volume:
– arrangement (also called reshaping) operations that express the data tensor as a vector (vectorization), a matrix (matricization), or a lower-order tensor by combining modes; conversely, the tensorization and Hankelization operations allow us to construct tensors from data contained in large vectors or matrices;
– tensor operations such as transposition, symmetrization, Hadamard and Kronecker products, inversion and pseudo-inversion;
– the notions of eigenvalue and singular value of a tensor;
– tensor decompositions/models, and their uniqueness properties;
– algorithms used to solve dimensionality reduction problems and, hence, best low-rank approximation, parameter estimation and missing data imputation. This algorithmic aspect linked to tensors will be explored in more depth in Volume 3.
I.6. Brief description of content
Tensor operations and decompositions often use matrix tools, so we will begin by reviewing some matrix decompositions in Chapter 1, going into further detail on eigenvalue decomposition (EVD) and SVD, as well as a few of their applications.
The Hadamard, Kronecker and Khatri–Rao matrix products are presented in detail in Chapter 2, together with many of their properties and a few relations between them. To illustrate these operations, we will use them to represent first-order partial derivatives of a function, and to solve matrix equations, such as the Sylvester and Lyapunov equations. This chapter also introduces an index convention that is very useful for tensor computations. This convention, which generalizes Einstein's summation convention (Pollock 2011), will be used to represent various matrix products and to prove some matrix product vectorization formulae, as well as various relations between the Kronecker, Khatri–Rao and Hadamard products. It will be used in Chapter 3 for tensor matricization and vectorization in an original way, as well as in Chapter 5 to establish matrix forms of the Tucker and PARAFAC decompositions.
Chapter 3 presents various sets of tensors before introducing the notions of matrix and tensor slices and of mode combination, on which reshaping operations are based. The key tensor operations listed above are then presented. Several links between products of tensors and systems of tensor equations are also outlined, and some of these systems are solved with the least squares method.
Chapter 4 is dedicated to introducing the notions of eigenvalue and singular value for tensors. The problem of the best rank-one approximation of a tensor is also considered.
In Chapter 5, we will give a detailed presentation of various tensor decompositions, with a particular focus on the basic Tucker and CPD decompositions, which can be viewed as generalizations of matrix SVD to tensors of order greater than two. Block tensor models and constrained tensor models will also be described, as well as certain variants, such as HOSVD and BTD (block term decomposition). CPD-type decompositions are generally used to estimate latent parameters, whereas the Tucker decomposition is often used to estimate modal subspaces and reduce the dimensionality via low multilinear rank approximation and truncated HOSVD.
A description of the ALS algorithm for parameter estimation of PARAFAC models will also be given. The uniqueness properties of the Tucker and CPD decompositions will be presented, as well as the various notions of the rank of a tensor. The chapter will end with illustrations of BTD and CPD decompositions for the tensor modeling of multidimensional harmonics, the problem of source separation in an instantaneous linear mixture and the modeling and estimation of a finite impulse response (FIR) linear system, using a tensor of fourth-order cumulants of the system output.
High-order cumulants of random signals, which can be viewed as tensors, play a central role in various signal processing applications, as illustrated in Chapter 5. This motivated us to include an Appendix to present a brief overview of some basic results concerning the higher-order statistics (HOS) of random signals, with two applications to the HOS-based estimation of a linear time-invariant system and a homogeneous quadratic system.