
Statistical Topics and Stochastic Models for Dependent Data with Applications

Vlad Stefan Barbu
Nicolas Vergne

First published 2020 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com

© ISTE Ltd 2020

The rights of Vlad Stefan Barbu and Nicolas Vergne to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2020938718

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-78630-603-6

Preface
Vlad Stefan Barbu and Nicolas Vergne

Part 1. Markov and Semi-Markov Processes

Chapter 1. Variable Length Markov Chains, Persistent Random Walks: A Close Encounter
Peggy Cénac, Brigitte Chauvin, Frédéric Paccaut and Nicolas Pouyanne
1.1. Introduction
1.2. VLMCs: definition of the model
1.3. Definition and behavior of PRWs
1.3.1. PRWs in dimension one
1.3.2. PRWs in dimension two
1.4. VLMC: existence of stationary probability measures
1.5. Where VLMC and PRW meet
1.5.1. Semi-Markov chains and Markov additive processes
1.5.2. PRWs induce semi-Markov chains
1.5.3. Semi-Markov chain of the α-LIS in a stable VLMC
1.5.4. The meeting point
1.6. References

Chapter 2. Bootstraps of Martingale-difference Arrays Under the Uniformly Integrable Entropy
Salim Bouzebda and Nikolaos Limnios
2.1. Introduction and motivation
2.2. Some preliminaries and notation
2.3. Main results
2.4. Application for the semi-Markov kernel estimators
2.5. Proofs
2.6. References

Chapter 3. A Review of the Dividend Discount Model: From Deterministic to Stochastic Models
Guglielmo D’Amico and Riccardo De Blasis
3.1. Introduction
3.2. General model
3.3. Gordon growth model and extensions
3.3.1. Gordon model
3.3.2. Two-stage model
3.3.3. H model
3.3.4. Three-stage model
3.3.5. N-stage model
3.3.6. Other extensions
3.4. Markov chain stock models
3.4.1. Hurley and Johnson model
3.4.2. Yao model
3.4.3. Markov stock model
3.4.4. Multivariate Markov chain stock model
3.5. Conclusion
3.6. References

Chapter 4. Estimation of Piecewise-deterministic Trajectories in a Quantum Optics Scenario
Romain Azaïs and Bruno Leggio
4.1. Introduction
4.1.1. The postulates of quantum mechanics
4.1.2. Dynamics of open quantum Markovian systems
4.1.3. Stochastic wave function: quantum dynamics as PDPs
4.1.4. Estimation for PDPs
4.2. Problem formulation
4.2.1. Atom-field interaction
4.2.2. Piecewise-deterministic trajectories
4.2.3. Measures
4.3. Estimation procedure
4.3.1. Strategy
4.3.2. Least-square estimators
4.3.3. Numerical experiments
4.4. Physical interpretation
4.5. Concluding remarks
4.6. References

Chapter 5. Identification of Patterns in a Semi-Markov Chain
5.1. Introduction
5.2. The prefix chain
5.3. The semi-Markov setting
5.4. The hitting time of the pattern
5.5. A genomic application
5.6. Concluding remarks
5.7. References

Part 2. Autoregressive Processes

Chapter 6. Time Changes and Stationarity Issues for Continuous Time Autoregressive Processes of Order p
Valérie Girardin and Rachid Senoussi
6.1. Introduction
6.2. Basics
6.3. Stationary AR processes
6.3.1. Formulas for the two first-order moments
6.3.2. Examples
6.3.3. Conditions for stationarity of CAR1(p) processes
6.4. Time transforms
6.4.1. Properties of time transforms
6.4.2. MS processes
6.5. Conclusion
6.6. Appendix
6.7. References

Chapter 7. Sequential Estimation for Non-parametric Autoregressive Models
Ouerdia Arkoun, Jean-Yves Brua and Serguei Pergamenchtchikov
7.1. Introduction
7.2. Main conditions
7.3. Pointwise estimation with absolute error risk
7.3.1. Minimax approach
7.3.2. Adaptive minimax approach
7.3.3. Non-adaptive procedure
7.3.4. Sequential kernel estimator
7.3.5. Adaptive sequential procedure
7.4. Estimation with quadratic integral risk
7.4.1. Passage to a discrete time regression model
7.4.2. Model selection
7.4.3. Main results
7.5. References

Part 3. Divergence Measures and Entropies

Chapter 8. Inference in Parametric and Semi-parametric Models: The Divergence-based Approach
Michel Broniatowski
8.1. Introduction
8.1.1. Csiszár divergences, variational form
8.1.2. Dual form of the divergence and dual estimators in parametric models
8.1.3. Decomposable discrepancies
8.2. Models and selection of statistical criteria
8.3. Non-regular cases: the interplay between the model and the criterion
8.3.1. Test statistics
8.4. References

Chapter 9. Dynamics of the Group Entropy Maximization Processes and of the Relative Entropy Group Minimization Processes Based on the Speed-gradient Principle
Vasile Preda and Irina Bancescu
9.1. Introduction
9.1.1. The SG principle
9.1.2. Entropy groups
9.2. Group entropies and the SG principle
9.2.1. Total energy constraint
9.3. Relative entropy group and the SG principle
9.3.1. Equilibrium stability
9.3.2. Total energy constraint
9.4. A new (G, a) power relative entropy group and the SG principle
9.5. Conclusion
9.6. References

Chapter 10. Inferential Statistics Based on Measures of Information and Divergence
Alex Karagrigoriou and Christos Meselidis
10.1. Introduction
10.2. Divergence measures
10.2.1. ϕ-Divergences
10.2.2. α-Divergences
10.2.3. Bregman divergences
10.3. Properties of divergence measures
10.4. Model selection criteria
10.5. Goodness of fit tests
10.5.1. Simple null hypothesis
10.5.2. Composite null hypothesis
10.6. Simulation study
10.7. References

Chapter 11. Goodness-of-Fit Tests Based on Divergence Measures for Frailty Models
Filia Vonta
11.1. Introduction
11.2. The proposed goodness-of-fit test
11.3. Main results
11.4. Frailty models
11.5. Simulations
11.5.1. Linear models for the estimation of critical values
11.5.2. Size of the test
11.6. References

Preface

This collective book stems from a workshop that took place in Rouen, October 3–5, 2018, within the framework of the project Random Models and Statistical Tools, Informatics and Combinatorics (MOUSTIC – Modèles aléatoires et Outils Statistiques, Informatiques et Combinatoires), financed by the Region of Normandy and the European Regional Development Fund. The main idea was to bring together leading scientists working on probabilistic and statistical topics for dependent data, as well as on associated applications.

This is the way this book was written, with the intention to offer to the scientific community a part of the latest advances in the field of stochastic modeling for dependent data, authored by leading experts in the field. This is a crucial aspect in a time when we face an increasing need for more and more complex models capable of capturing the main features of increasingly complex applications. From a technical point of view, our book is important for gathering theoretical developments and applications related to Markov type models (semi-Markov processes, autoregressive processes, piecewise deterministic Markov processes, and variable length Markov chains) as well as probabilistic/statistical techniques issued from information theory, based on divergence measures and entropies.

This volume is divided into three parts: the first one examines Markov and semi-Markov processes, the second one deals with autoregressive processes and the last one presents divergence measures and entropies. Particular attention is given to applications of these methods in various fields such as finance, DNA analysis, quantum physics and survival analysis.

The workshop in Rouen and, consequently, the present book, would not have been possible without the support of the Laboratory of Mathematics Raphaël Salem, the Laboratory of Mathematics Nicolas Oresme, the University of Rouen Normandy, the Region of Normandy, the French Statistical Society (SFdS), the MAS group of the French Society of Applied and Industrial Mathematics (SMAI), the Normandie-Mathématiques Research Federation and the Normastic Research Federation.

We would like to thank all the speakers of the workshop who contributed, although some of them indirectly, to the quality of the present volume. Our thanks go also to the anonymous reviewers for their valuable work: without their support, this volume could not have been successfully completed.

We would also like to thank Professor Nikolaos Limnios for proposing and encouraging us to elaborate this collective volume as well as the editorial staff of ISTE Ltd for their technical support.

Vlad Stefan Barbu
Nicolas Vergne
Rouen, May 2020

PART 1. Markov and Semi-Markov Processes

Statistical Topics and Stochastic Models for Dependent Data with Applications, First Edition. Vlad Stefan Barbu and Nicolas Vergne. © ISTE Ltd 2020. Published by ISTE Ltd and John Wiley & Sons, Inc.

Variable Length Markov Chains, Persistent Random Walks: A Close Encounter

We consider a walker on the line that at each step keeps the same direction with a probability that depends on the time already spent in the direction the walker is currently moving. These walks with memories of variable length can be seen as generalizations of directionally reinforced random walks (DRRWs) introduced in Mauldin et al. (1996). We give a complete and usable characterization of the recurrence or transience in terms of the probabilities to switch the direction. These conditions are related to some characterizations of existence and uniqueness of a stationary probability measure for a particular Markov chain: in this chapter, we define the general model for words produced by a variable length Markov chain (VLMC) and we introduce a key combinatorial structure on words. For a subclass of these VLMC, this provides necessary and sufficient conditions for existence of a stationary probability measure.

1.1. Introduction

This is the story of the encounter between two worlds: the world of random walks and the world of VLMCs. The meeting point turns around the semi-Markov property of underlying processes.

In a VLMC, unlike fixed-order Markov chains, the probability of the next symbol depends on a possibly unbounded part of the past, the length of which depends on the past itself. These relevant parts of pasts are called contexts. They are stored in a context tree. With each context, a probability distribution is associated, prescribing the conditional probability of the next symbol, given this context.

VLMCs are now widely used as random models for character strings. They were introduced in Rissanen (1983) to perform data compression. When they have a finite memory, they provide a parsimonious alternative to fixed-order Markov chain models, in which the number of parameters to estimate grows exponentially fast with the order; they are also able to capture finer properties of character sequences. When they have infinite memory – this will be our case of study in this chapter – they are a tractable way to build non-Markov models and they may be considered as a subclass of “chaînes à liaisons complètes” (Doeblin and Fortet 1937) or “chains with infinite order” (Harris 1955).

VLMCs are used in bioinformatics, linguistics and coding theory to model how words grow or to classify words. In bioinformatics, both for protein families and DNA sequences, identifying patterns that have a biological meaning is a crucial issue. Using a VLMC as a model enables one to quantify the influence of a meaningful pattern by giving a transition probability on the following letter of the sequence. In this way, these patterns appear as contexts of a context tree. Note that their length may be unbounded (Bejerano and Yona 2001).

In addition, if the context tree is recognized to be a signature of a family (say, of proteins), this gives an efficient statistical method to test whether or not two samples belong to the same family (Busch et al. 2009).

Therefore, estimating a context tree is an issue of interest and many authors (statisticians or not, applied or not) stress the fact that the height of the context tree should not be supposed to be bounded. This is the case in Galves and Leonardi (2008) where the algorithm CONTEXT is used to estimate an unbounded context tree, or in Garivier and Leonardi (2011). Furthermore, as explained in Csiszár and Talata (2006), the height of the estimated context tree grows with the sample size, so that estimating a context tree by assuming a priori that its height is bounded is not realistic.

There is extensive literature on the construction of efficient estimators of context trees, for both finite and infinite context trees. This chapter is not a review of statistics issues, which would already be relevant for finite memory VLMC. This is a study of the probabilistic properties of infinite memory VLMC as random processes, and more specifically of the main property of interest for such processes: existence and uniqueness of a stationary measure.

As has already been said, VLMC are a natural generalization to infinite memory of Markov chains. It is usual to index a sequence of random variables forming a Markov chain with positive integers and to make the process grow to the right. The main drawback of this habit for an infinite memory process is that the sequence of the process is read from left to right, whereas the (possibly infinite) sequence giving the past needed to predict the next symbol is read in the context tree from right to left, thus giving rise to confusion and lack of readability. For this reason, in this chapter, the VLMC grows to the left. In this way, both the process sequence and the memory in the context tree are read from left to right.

Classical random walks have independent and identically distributed increments. In the literature, persistent random walks (PRWs), also called Goldstein-Kac random walks or correlated random walks, refer to random walks having a Markov chain of finite order as an increment process. For such walks, the dynamics of trajectories has a short memory of given length and the random walk itself is not Markovian anymore. What happens whenever the increments depend on a non-bounded past memory?

Consider a walker on $\mathbb{Z}$, allowed to increment its trajectory by $-1$ or $1$ at each step of time. Assume that the probability to keep the current direction $\pm 1$ depends on the time already spent in the said direction – the distribution of increments thus acts as a reinforcement of the dependency from the past. More precisely, the process of increments of such a one-dimensional random walk is a Markov chain on the set of (right-)infinite words, with variable – and unbounded – length memory: a VLMC. The concerned VLMC is defined in section 1.3.1. It is based on a context tree called a double comb. Later, section 1.3.2 deals with a two-dimensional persistent random walk defined in an analogous manner on $\mathbb{Z}^2$ by a VLMC based on a context tree called a quadruple comb.

These random walks that have an unbounded past memory can be seen as a generalization of “directionally reinforced random walks (DRRW)” introduced by Mauldin et al. (1996), in the sense that the persistence times are anisotropic ones. For a one-dimensional random walk associated with a double comb, a complete characterization of recurrence and transience, in terms of changing (or not) direction probabilities, is given in section 1.3.1. More precisely, when one of the random times spent in a given direction (the so-called persistence times) is an integrable random variable, the recurrence property is equivalent to a classical drift-vanishing. In all other cases, the walk is transient unless the weights of the tail distributions of both persistence times are equal. For the two-dimensional random walk, sufficient conditions of transience or recurrence are given in section 1.3.2.

Actually, because of the very specific form of the underlying driving VLMC, these PRWs turn out to be in one-to-one correspondence with so-called Markov additive processes. Section 1.5 examines the close links between PRWs, Markov additive processes, semi-Markov chains and VLMC.

In section 1.2, the definition of a general VLMC and a couple of examples are given. In section 1.3, the PRWs are defined and known results on their recurrence properties are collected. In view of section 1.5 where we show how PRW and VLMC meet through the world of semi-Markov chains, section 1.4 is devoted to results – together with a heuristic approach – on the existence and uniqueness of stationary measures for a VLMC.

1.2. VLMCs: definition of the model

Let $\mathcal{A}$ be a finite set, called the alphabet. Here $\mathcal{A}$ will most often be the standard alphabet $\mathcal{A} = \{0, 1\}$, but also $\mathcal{A} = \{d, u\}$ (for down and up) or $\mathcal{A} = \{n, e, w, s\}$ (for the cardinal directions). Let

$$\mathcal{R} = \{\alpha\beta\gamma\cdots : \alpha, \beta, \gamma, \cdots \in \mathcal{A}\}$$

be the set of right-infinite words over $\mathcal{A}$, written by simple concatenation. A VLMC on $\mathcal{A}$, defined below and most often denoted by $(U_n)_{n \in \mathbb{N}}$, is a particular type of $\mathcal{R}$-valued discrete time Markov chain where:

– the process evolves between time $n$ and time $n+1$ by adding one letter on the left of $U_n$;

– the transition probabilities between time $n$ and time $n+1$ depend on a finite – but not bounded – prefix¹ of the current word $U_n$.

Giving a formal frame of such a process leads to the following definitions. For a complete presentation of VLMC, one can also refer to Cénac et al. (2012).

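As usual, a tree on $\mathcal{A}$ is a set $\mathcal{T}$ of finite words – namely a subset of $\bigcup_{n \in \mathbb{N}} \mathcal{A}^n$ – which contains the empty word $\emptyset$ (the root of $\mathcal{T}$) and which is prefix-stable: for all finite words $u, v$, $uv \in \mathcal{T} \implies u \in \mathcal{T}$. A tree is made of internal nodes ($u \in \mathcal{T}$ is internal when $\exists \alpha \in \mathcal{A}$, $u\alpha \in \mathcal{T}$) and of leaves ($u \in \mathcal{T}$ is a leaf when it has no child: $\forall \alpha \in \mathcal{A}$, $u\alpha \notin \mathcal{T}$).

DEFINITION 1.1 (Context tree).– A context tree on $\mathcal{A}$ is a saturated tree on $\mathcal{A}$ having an at most countable set of infinite branches.

The tree $\mathcal{T}$ is saturated whenever any internal node has $\#(\mathcal{A})$ children: for any finite word $u$ and for any $\alpha \in \mathcal{A}$, $u\alpha \in \mathcal{T} \implies (\forall \beta \in \mathcal{A},\ u\beta \in \mathcal{T})$. A right-infinite word on $\mathcal{A}$ is an infinite branch of $\mathcal{T}$ when all its finite prefixes belong to $\mathcal{T}$.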
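The two structural conditions on a tree (prefix-stability and saturation) can be checked mechanically when the tree is finite. The following is a minimal Python sketch under that assumption; the set-of-strings representation, the function names and the small tree `TREE` are illustrative choices, not taken from the chapter.

```python
def is_prefix_stable(tree):
    """Every prefix of a word of the tree, including the empty word, is in the tree."""
    return all(word[:i] in tree for word in tree for i in range(len(word)))


def is_saturated(tree, alphabet):
    """Any node having one child has a child for every letter of the alphabet."""
    for word in tree:
        has_child = [word + letter in tree for letter in alphabet]
        if any(has_child) and not all(has_child):
            return False
    return True


def leaves(tree, alphabet):
    """Nodes of the tree that have no child."""
    return {w for w in tree if not any(w + a in tree for a in alphabet)}


# Hypothetical finite tree on the alphabet {0, 1}; "" stands for the empty word (the root).
ALPHABET = "01"
TREE = {"", "0", "1", "10", "11", "100", "101"}

print(is_prefix_stable(TREE), is_saturated(TREE, ALPHABET))
print(sorted(leaves(TREE, ALPHABET)))
```

For a finite saturated tree, the leaves returned by `leaves` are exactly its contexts; infinite branches are out of reach of this finite representation.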

Following the vocabulary introduced by Rissanen, a context of the tree is a leaf or an infinite branch. A finite or right-infinite word on $\mathcal{A}$ is an external node when it is neither internal nor a context. See Figure 1.1 which illustrates these definitions, as well as the pref function defined hereunder.

DEFINITION 1.2 (pref function).– Let $\mathcal{T}$ be a context tree. If $w$ is any external node or any context, the symbol $\operatorname{pref} w$ denotes the longest (finite or infinite) prefix of $w$ that belongs to $\mathcal{T}$.

¹ In fact, an infinite prefix might be needed in a denumerable number of cases.

In other words, $\operatorname{pref} w$ is the only context $c$ for which $w = c\cdots$. For a more visual presentation, hang $w$ by its head (its left-most letter) and insert it into the tree; the only context through which the word goes out of the tree is its pref.
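The "hang the word by its head" description translates directly into a walk down the tree. Below is a minimal Python sketch, assuming a finite context tree described by its set of internal nodes and a word given as a finite truncation; the name `pref` mirrors the text, while `INTERNAL` is a hypothetical example tree.

```python
def pref(word, internal_nodes):
    """Return the context (leaf) through which `word` leaves the tree.

    `internal_nodes` is the set of internal nodes of a finite saturated tree,
    with "" standing for the root. The word is read from its left-most letter.
    """
    prefix = ""
    for letter in word:
        if prefix not in internal_nodes:
            return prefix          # first prefix that is a leaf: the context
        prefix += letter
    if prefix not in internal_nodes:
        return prefix
    raise ValueError("the word is too short to leave the tree")


# Hypothetical tree on {0, 1} whose leaves are 0, 11, 101, 1001 and 1000.
INTERNAL = {"", "1", "10", "100"}

print(pref("1000110", INTERNAL))
print(pref("0110", INTERNAL))
```

Because the tree is saturated, the first prefix that is not an internal node is necessarily a leaf, which is why a single pass over the letters suffices.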

Figure 1.1. A context tree on the alphabet $\mathcal{A} = \{0, 1\}$. The dotted lines are possibly the beginning of infinite branches. Any word that writes $1000\cdots$, like the one represented by the dashed line, admits $1000$ as a pref. For a color version of this figure, see www.iste.co.uk/barbu/data.zip

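With these definitions, it is now possible to define a VLMC.

DEFINITION 1.3 (VLMC).– Let $\mathcal{T}$ be a context tree. For every context $c$ of $\mathcal{T}$, let $q_c$ be a probability measure on $\mathcal{A}$. The VLMC defined by $\mathcal{T}$ and by the $(q_c)_c$ is the $\mathcal{R}$-valued discrete-time Markov chain $(U_n)_{n \in \mathbb{N}}$ defined by the following transition probabilities:

$$\forall n \in \mathbb{N},\ \forall \alpha \in \mathcal{A},\quad \mathbb{P}(U_{n+1} = \alpha U_n \mid U_n) = q_{\operatorname{pref}(U_n)}(\alpha). \qquad [1.1]$$

To get a realization of a VLMC as a process on $\mathcal{R}$, take a (random) right-infinite word

$$U_0 = X_0 X_{-1} X_{-2} X_{-3} \cdots$$

At each step of time $n \ge 0$, one gets $U_{n+1}$ by adding a random letter $X_{n+1}$ on the left of $U_n$:

$$U_{n+1} = X_{n+1} U_n = X_{n+1} X_n \cdots X_1 X_0 X_{-1} X_{-2} \cdots$$

under the conditional distribution [1.1].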
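The construction of a realization (draw a letter from the distribution attached to the current context, then add it on the left) can be sketched as follows. This is an illustrative Python sketch only: it assumes a finite context tree, stores a finite truncation of the right-infinite word, and uses a hypothetical tree `INTERNAL` with made-up probabilities `Q`.

```python
import random


def pref(word, internal_nodes):
    """Context of `word`: its first prefix that is not an internal node."""
    prefix = ""
    for letter in word:
        if prefix not in internal_nodes:
            return prefix
        prefix += letter
    if prefix not in internal_nodes:
        return prefix
    raise ValueError("the word is too short to leave the tree")


def simulate_vlmc(initial_word, internal_nodes, q, n_steps, rng):
    """Grow the word to the left, one letter per step.

    `q[c]` maps each letter to its probability given the context `c`.
    """
    word = initial_word
    for _ in range(n_steps):
        context = pref(word, internal_nodes)
        letters, weights = zip(*sorted(q[context].items()))
        new_letter = rng.choices(letters, weights=weights)[0]
        word = new_letter + word   # the new letter is added on the left
    return word


# Hypothetical probabilized tree on {0, 1} with contexts 0, 10 and 11.
INTERNAL = {"", "1"}
Q = {
    "0": {"0": 0.3, "1": 0.7},
    "10": {"0": 0.5, "1": 0.5},
    "11": {"0": 0.9, "1": 0.1},
}

print(simulate_vlmc("10", INTERNAL, Q, 20, random.Random(0)))
```

Reading the output from left to right gives the most recent letters first, in line with the convention that the process grows to the left.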

REMARK 1.1.– Probabilizing a context tree consists, as in definition 1.3, of endowing it with a family of probability measures on the alphabet, indexed by the set of contexts. This vocabulary is used below.

REMARK 1.2.– Assume that the context tree is finite and denote its height by $h$; in this condition, the VLMC is just a Markov chain of order $h$ on $\mathcal{A}$. On the contrary, when the context tree is infinite, and this is mainly our case of interest, the VLMC is generally not a Markov process on $\mathcal{A}$.

EXAMPLE 1.1.– Take $\mathcal{A} = \{n, e, w, s\}$ as an (ordered) alphabet, so that the daughters of an internal node are represented, as shown on the left side of Figure 1.2. Making the transition probabilities $\mathbb{P}(U_{n+1} = \alpha U_n \mid U_n)$ depend only on the length of the largest prefix of the form $n^k$ ($k \ge 0$) of $U_n$ amounts to taking a comb as a context tree, as shown on the right side of Figure 1.2. Its finite contexts are the $n^k\alpha$ where $k \ge 0$ and $\alpha \in \mathcal{A} \setminus \{n\}$.

Figure 1.2. On the left: how one can represent trees on $\mathcal{A} = \{n, e, w, s\}$. On the right, the so-called left comb on $\mathcal{A} = \{n, e, w, s\}$

EXAMPLE 1.2.– Take again $\mathcal{A} = \{n, e, w, s\}$ as an alphabet. Making the transition probabilities $\mathbb{P}(U_{n+1} = \alpha U_n \mid U_n)$ depend only on the length of the largest prefix of the form $\alpha^k$ ($k \ge 1$) of $U_n$, where $\alpha$ is any letter, amounts to taking a quadruple comb as a context tree, as shown on the right side of Figure 1.3. In the same vein, if one takes $\mathcal{A} = \{u, d\}$, the double comb is the context tree, as shown on the left side of Figure 1.3. In the corresponding VLMC, the transitions depend only on the length of the last current run $u^k$ or $d^k$, $k \ge 1$. The double comb and the quadruple comb are used below to define PRWs.
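For the double comb and the quadruple comb, the context of a word is its leading run followed by the first different letter. A minimal Python sketch, assuming the word is given as a finite truncation in which the leading run ends (the function name `comb_context` is hypothetical):

```python
def comb_context(word):
    """Return the finite context of `word`: its leading run of one letter,
    followed by the first letter that differs from it."""
    head = word[0]
    k = 1
    while k < len(word) and word[k] == head:
        k += 1
    if k == len(word):
        raise ValueError("the leading run does not end within the truncated word")
    return word[:k + 1]


print(comb_context("uuuduu"))   # double comb on {u, d}
print(comb_context("nnnesw"))   # quadruple comb on {n, e, w, s}
```

The length of the returned context minus one is the length of the current run, which is the only information the transition probabilities use in these two examples.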

Figure 1.3. The double comb and the quadruple comb

EXAMPLE 1.3.– Take $\mathcal{A} = \{0, 1\}$ (naturally ordered for the drawings). The left comb of right combs, shown on the left side of Figure 1.4, is the context tree of a VLMC that makes its transition probabilities depend on the largest prefix of $U_n$ of the form $0^p 1^q$. If one has to take into consideration the largest prefix of the form $0^p 1^q$ or $1^p 0^q$, one has to use the double comb of opposite combs, as shown on the right side of Figure 1.4.

Figure 1.4. Context trees on $\mathcal{A} = \{0, 1\}$: the left comb of right combs (on the left) and a double comb of opposite combs (on the right)

DEFINITION 1.4 (Non-nullness).– A VLMC is called non-null when no transition probability vanishes, i.e. when $q_c(\alpha) > 0$ for every context $c$ and for every $\alpha \in \mathcal{A}$.

Non-nullness appears below as an irreducibility-like assumption made on the driving VLMC of PRWs and for existence and uniqueness of an invariant probability measure for a general VLMC as well.

1.3. Definition and behavior of PRWs

In this section, the so-called PRWs are defined. A PRW is a random walk driven by some VLMC. In dimensions one and two, results on transience and the recurrence of PRW are given. These results are detailed and proven in Cénac et al. (2018b, 2013) in dimension one and in Cénac et al. (2020) in dimension two.

1.3.1. PRWs in dimension one

In this section, we deal with one-dimensional PRWs. Note that, contrary to the classical random walk, a PRW is generally not Markovian. Let $\mathcal{A} := \{d, u\} = \{-1, 1\}$ ($d$ for down and $u$ for up) and consider the double comb on this alphabet as a context tree, probabilize it and denote by $(U_n)_n$ a realization of the associated VLMC. The $n$th increment $X_n$ of the PRW is given as the first letter of $U_n$: define the persistent random walk $S = (S_n)_{n \ge 0}$ by $S_0 = 0$ and, for $n \ge 1$,

$$S_n := \sum_{\ell=1}^{n} X_\ell, \qquad [1.2]$$

so that for any $n \ge 1$, $m \ge 0$,

$$\mathbb{P}(S_{m+1} = S_m + 1 \mid U_m = d^n u \cdots) = q_{d^n u}(u),$$

$$\mathbb{P}(S_{m+1} = S_m - 1 \mid U_m = u^n d \cdots) = q_{u^n d}(d).$$

Furthermore, for the sake of simplicity and without loss of generality, we condition the walk to start almost surely (a.s.) from $\{X_{-1} = u, X_0 = d\}$ – this amounts to changing the origin of time. In this model, a walker on a line keeps the same direction with a probability that depends on the discrete time already spent in the direction the walker is currently moving (see Figure 1.5). This model can be seen as a generalization of DRRWs introduced in Mauldin et al. (1996).
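A short simulation can make the dynamics concrete. The sketch below is only an illustration in Python: the switching probabilities $q_{d^k u}(u)$ and $q_{u^k d}(d)$ are passed as a hypothetical function `switch_probability(direction, run_length)`, and `example_switch` is a made-up choice, not one used in the chapter. The walk starts from $X_{-1} = u$, $X_0 = d$, so the current run is a descent of length one.

```python
import random


def simulate_prw(n_steps, switch_probability, rng):
    """Simulate S_0, ..., S_n for a one-dimensional PRW on a double comb.

    `switch_probability(direction, run_length)` is the probability of changing
    direction after `run_length` consecutive steps in `direction` (+1 or -1).
    """
    direction, run_length = -1, 1      # X_0 = d, preceded by X_{-1} = u
    position, path = 0, [0]            # S_0 = 0
    for _ in range(n_steps):
        if rng.random() < switch_probability(direction, run_length):
            direction, run_length = -direction, 1
        else:
            run_length += 1
        position += direction          # S_n = S_{n-1} + X_n
        path.append(position)
    return path


def example_switch(direction, run_length):
    """Hypothetical choice: the longer the run, the less likely a switch."""
    return 1.0 / (run_length + 1)


print(simulate_prw(20, example_switch, random.Random(42)))
```

Only the current direction and the length of the current run are stored, which is exactly the information carried by a context $d^k u$ or $u^k d$ of the double comb.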

Figure 1.5. A one-dimensional PRW. For a color version of this figure, see www.iste.co.uk/barbu/data.zip

Taking different probabilized context trees would lead to different probabilistic impacts on the asymptotic behavior of resulting PRWs. Moreover, the characterization of the recurrent versus transient behavior is difficult in general. We state here exhaustive recurrence criteria for PRWs defined from a double comb.

In order to avoid trivial cases, we assume that $S$ cannot be frozen in one of the two directions with a positive probability. Therefore, we make the following assumption.

ASSUMPTION 1.1 (Finiteness of the length of runs).– For any

Let $\tau^u_n$ and $\tau^d_n$ be, respectively, the length of the $n$th rise and of the $n$th descent.

Then, by a renewal-type property (see Cénac et al. 2013, proposition 2.3), $(\tau^d_n)_{n \ge 1}$ and $(\tau^u_n)_{n \ge 1}$ are independent sequences of i.i.d. random variables. Their distribution tails are straightforwardly given by: for any $\alpha, \beta \in \{u, d\}$, $\alpha \neq \beta$ and $n \ge 1$,

$$\mathbb{P}(\tau^\alpha_1 \ge n) = \prod_{k=1}^{n-1} q_{\alpha^k \beta}(\alpha). \qquad [1.4]$$

Note that assumption 1.1 amounts to supposing that the persistence times $\tau^d_n$ and $\tau^u_n$ are a.s. finite. The jump times (or breaking times) are: $B_0 = 0$ and, for $n \ge 1$,

In order to deal with a more tractable random walk built with the possibly unbounded but i.i.d. increments $Y_n := \tau^u_n - \tau^d_n$, we introduce the underlying skeleton random walk $(M_n)_{n \ge 1}$, which is the original walk observed at the random times of up-to-down turns:

Two main quantities play a key role in the asymptotic behavior, namely the expectations of the lengths of runs: with formula [1.4], let

$$\Theta_\alpha := \mathbb{E}[\tau^\alpha_1] = \sum_{n \ge 1} \prod_{k=1}^{n-1} q_{\alpha^k \beta}(\alpha), \qquad \alpha, \beta \in \{u, d\},\ \alpha \neq \beta.$$

Actually, $\Theta_d$ and $\Theta_u$ are already discussed in Cénac et al. (2013, proposition B1), where it is shown that the driving VLMC of a one-dimensional PRW admits a unique invariant probability measure if and only if $\Theta_d < \infty$ and $\Theta_u < \infty$.
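To see how the tails of a persistence time and its expectation follow from the transition probabilities, note that a run of $\alpha$'s that has lasted $k$ steps continues with probability $q_{\alpha^k\beta}(\alpha)$, so $\mathbb{P}(\tau^\alpha_1 \ge n)$ is the product of the first $n-1$ such probabilities and the expectation is the sum of the tails. The Python sketch below computes truncated versions of both; the function `keep`, standing for $k \mapsto q_{\alpha^k\beta}(\alpha)$, and the truncation level `n_max` are assumptions made for illustration.

```python
def run_length_tails(keep, n_max):
    """Return [P(tau >= 1), ..., P(tau >= n_max)] for a run whose continuation
    probability after k steps is keep(k)."""
    tails, tail = [], 1.0
    for n in range(1, n_max + 1):
        tails.append(tail)        # here tail == P(tau >= n)
        tail *= keep(n)           # survive one more step after a run of length n
    return tails


def mean_run_length(keep, n_max):
    """Truncated tail-sum approximation of E[tau] = sum over n >= 1 of P(tau >= n)."""
    return sum(run_length_tails(keep, n_max))


# Hypothetical constant continuation probability: geometric run lengths.
print(run_length_tails(lambda k: 0.5, 4))
print(mean_run_length(lambda k: 0.5, 60))
```

With a constant continuation probability $p$ the run length is geometric with mean $1/(1-p)$, which gives a quick sanity check; a slowly decaying switching probability such as $1/(k+1)$ makes the truncated sum grow without bound as `n_max` increases, the signature of an infinite mean.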

Note that the expectation of $Y_1$ is well defined in $[-\infty, +\infty]$ whenever at least one of the persistence times $\tau^u_1$ or $\tau^d_1$ is integrable. Thus, as soon as $\Theta_d < \infty$ or $\Theta_u < \infty$, let

$$d_M := \mathbb{E}[Y_1] = \Theta_u - \Theta_d$$

and

An elementary computation shows that $\mathbb{E}(M_n) = n\, d_M$ and $\mathbb{E}(S_n) \sim n\, d_S$ when $n$ tends to infinity. Thus, $d_M$ and $d_S$ appear as asymptotic drifts when the walks $(M_n)_n$ and $(S_n)_n$ respectively, turn out to be transient (see Table 1.1). The behavior of the walk also depends on quantities $J_{\alpha|\beta}$, defined for $\alpha$ and $\beta \in \mathcal{A}$, $\alpha \neq \beta$ by:

$$J_{\alpha|\beta} := \sum_{n=1}^{\infty} \frac{n\, \mathbb{P}(\tau^\alpha_1 = n)}{\sum_{k=1}^{n} \mathbb{P}(\tau^\beta_1 \ge k)}.$$

A complete and usable characterization of the recurrence and the transience of the PRW in terms of the probabilities to persist in the same direction or to switch is given in proposition 1.1. Its proof relies on a criterion of Erickson (1973), applied to the skeleton walk $(M_n)_n$, which is simpler to deal with because its increments are independent.

PROPOSITION 1.1.– Under the non-nullness assumption and assumption 1.1, the random walk $(S_n)_n$ is recurrent or transient as described in Table 1.1.

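Table 1.1. Recurrence versus transience (drifting) for $(S_n)_n$ in dimension one. Surviving entries, row $\Theta_d < \infty$: drifting to $+\infty$ when $d_S > 0$, recurrent when $d_S = 0$, drifting to $-\infty$ when $d_S < 0$; one further cell reads "Drifting $+\infty$".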

The most fruitful situation emerges when both running times $\tau^u_1$ and $\tau^d_1$ have infinite means. In that case, the recurrence properties of $(S_n)_n$ are related to the behavior of the skeleton random walk $(M_n)_n$ defined in [1.6], the drift of which, $d_M$, is not defined. Thus, the behavior of $(S_n)_n$ depends on the comparison between the distribution tails of $\tau^u_1$ and $\tau^d_1$ defined in [1.4], expressed by the quantities $J_{\alpha|\beta}$. Note that the case when both $J_{u|d}$ and $J_{d|u}$ are finite does not appear in the table since it would imply that $\Theta_u < \infty$ and $\Theta_d < \infty$ (see Erickson 1973).

In all three other cases, the drift $d_S$ is well defined and the PRW is recurrent if and only if $d_S = 0$. In that case, $\lim_{n \to \infty} \frac{S_n}{n} = d_S = 0$. Note that modifying one transition $q_c$ transforms a recurrent PRW into a transient one, since $d_S$ becomes non-zero.
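A heuristic way to see why a vanishing drift corresponds to balanced mean run lengths, when both $\Theta_u$ and $\Theta_d$ are finite, is a renewal-reward argument. This is a sketch under stated assumptions, not the chapter's proof: it ignores the boundary effect of the initial step and takes the walk after $n$ complete cycles, each made of one descent and one rise. The symbol $B^{(n)}$ for the elapsed time is introduced here for illustration only.

```latex
% After n complete cycles, the elapsed time and the position are
\[
B^{(n)} := \sum_{k=1}^{n} \bigl(\tau^u_k + \tau^d_k\bigr),
\qquad
S_{B^{(n)}} \approx \sum_{k=1}^{n} \bigl(\tau^u_k - \tau^d_k\bigr).
\]
% Dividing both sums by n and applying the strong law of large numbers
% to the i.i.d. sequences (\tau^u_k)_k and (\tau^d_k)_k gives
\[
\frac{S_{B^{(n)}}}{B^{(n)}}
\;\xrightarrow[n \to \infty]{\text{a.s.}}\;
\frac{\Theta_u - \Theta_d}{\Theta_u + \Theta_d}.
\]
```

Under this reading, the limit of $S_n/n$ is zero exactly when $\Theta_u = \Theta_d$, and changing a single transition probability $q_c$ changes one of the two means, which is consistent with the remark that such a modification makes the drift non-zero.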

1.3.2. PRWs in dimension two

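Take the alphabet $\mathcal{A} := \{\mathsf{n}, \mathsf{e}, \mathsf{w}, \mathsf{s}\}$. Here, $(\mathsf{e}, \mathsf{n})$ stands for the canonical basis of $\mathbb{Z}^2$, $\mathsf{w} = -\mathsf{e}$ and $\mathsf{s} = -\mathsf{n}$. Hence, the letters $\mathsf{e}$, $\mathsf{n}$, $\mathsf{w}$ and $\mathsf{s}$ stand for moves to the east, north, west and south, respectively. Having in mind a random walk with increments in $\mathcal{A}$, any word of the form $\alpha\beta$, with $\alpha, \beta \in \mathcal{A}$ and $\alpha \neq \beta$, is called a bend. For the sake of simplicity, we condition the walk to start a.s. with an $\mathsf{ne}$ bend: $\{X_{-1} = \mathsf{n}, X_0 = \mathsf{e}\}$.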

Figure 1.6. A walk in dimension two. For a color version of this figure, see www.iste.co.uk/barbu/data.zip

Take a non-null VLMC associated with a quadruple comb on $\mathcal{A}$, as shown in Figure 1.3: the contexts are $\alpha^n\beta$ for $\alpha, \beta \in \mathcal{A}$, $\alpha \neq \beta$, $n \geq 1$, and the attached probability distributions are denoted by $q_{\alpha^n\beta}$. The two-dimensional PRW $(S_n)_n$ is defined, using this VLMC, as in formula [1.2].

Contrary to the one-dimensional PRWs, as detailed below, the probability to change direction depends on the time spent in the current direction but also on the previous direction. As in dimension one, we intend to avoid that $S$ remains frozen in one of the four directions with a positive probability. Therefore, we make the following assumption, analogous to assumption 1.1 in dimension two.

ASSUMPTION 1.2 (Finiteness of the length of runs). – For any $\alpha, \beta \in \{\mathsf{n}, \mathsf{e}, \mathsf{w}, \mathsf{s}\}$, $\alpha \neq \beta$,

Let $(B_n)_{n \geq 0}$ be the breaking times defined inductively by $B_0 = 0$ and $B_{n+1} = \inf\{k > B_n : X_k \neq X_{k-1}\}$.

As in dimension one, assumption 1.2 implies that the breaking times $B_n$ are a.s. finite.

Define the so-called internal chain $(J_n)_{n \geq 0}$ by $J_0 = \mathsf{ne}$ and, for all $n \geq 1$,

$$J_n := X_{B_{n-1}} X_{B_n}.$$

Let us illustrate these random variables with a small example, in which $B_1 = 4$, $B_2 = 7$, $J_0 = X_{-1}X_0$ and $J_1 = X_{B_0}X_{B_1} = X_0X_4$.

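The process $(J_n)_{n \geq 0}$ is an irreducible Markov chain on the set of bends $\mathcal{S} := \{\alpha\beta \mid \alpha \in \mathcal{A}, \beta \in \mathcal{A}, \alpha \neq \beta\}$. Its Markov kernel is defined by: for every $\beta, \alpha, \gamma \in \mathcal{A}$ with $\beta \neq \alpha$ and $\alpha \neq \gamma$,

$P(\beta\alpha; \alpha\gamma) :=$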

the numbers $P(\alpha\beta, \gamma\delta)$ being 0 for every couple of bends not of the previous form. Remark that the non-nullness assumption (see definition 1.4) implies the irreducibility of $(J_n)_n$ and its aperiodicity. The state space $\mathcal{S}$ is finite so that $(J_n)_n$ is positive recurrent: it admits a unique invariant probability measure $\pi_J$.
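Since the state space has only twelve bends, $\pi_J$ can be approximated by iterating the kernel. The sketch below uses a hypothetical toy kernel (from a bend $\beta\alpha$, the next direction $\gamma \neq \alpha$ is chosen uniformly), not the kernel of the text, whose entries depend on the distributions $q_{\alpha^n\beta}$; it only illustrates how an irreducible aperiodic chain on the bends converges to its unique invariant measure.

```python
from itertools import permutations

LETTERS = "news"
BENDS = ["".join(p) for p in permutations(LETTERS, 2)]  # the 12 bends, two distinct letters


def toy_kernel(bend_from, bend_to):
    """Hypothetical kernel: from bend (beta, alpha), go to (alpha, gamma), gamma != alpha, uniformly."""
    if bend_from[1] == bend_to[0]:
        return 1.0 / 3.0
    return 0.0


def invariant_measure(kernel, states, start="ne", n_iter=200):
    """Approximate the invariant measure by pushing a Dirac mass through the kernel."""
    pi = {s: (1.0 if s == start else 0.0) for s in states}
    for _ in range(n_iter):
        pi = {t: sum(pi[s] * kernel(s, t) for s in states) for t in states}
    return pi


if __name__ == "__main__":
    for bend, mass in invariant_measure(toy_kernel, BENDS).items():
        print(bend, round(mass, 6))
```

For this symmetric toy kernel the limit is the uniform measure on the twelve bends; a kernel built from actual persistence time distributions would generally give a non-uniform $\pi_J$.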

Denote $T_0 = 0$ and $T_{n+1} := B_{n+1} - B_n$ for every $n \geq 0$. These waiting times (also called persistence times) are not independent, contrary to the one-dimensional case. The skeleton random walk $(M_n)_{n \geq 0}$ on $\mathbb{Z}^2$, which is the PRW observed at the breaking times, is then defined as

$$M_n := S_{B_n}.$$

Note that $(M_n)_n$ is generally not a classical RW with i.i.d. increments. Nevertheless, taking into account the additional information given by the internal Markov chain $(J_n)_n$, the pair $(J_n, M_n)_n$ is a Markov additive process (see Çinlar 1972), as will appear in section 1.5.
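The objects of this subsection can be read off a finite trajectory. The sketch below assumes that the PRW is $S_0 = 0$, $S_n = \sum_{k=1}^{n} X_k$ (formula [1.2] is not reproduced here, so this is an assumption), that $X_{-1} = \mathsf{n}$ is implicit, and that the skeleton is the walk observed at the breaking times; the function name `skeleton_data` is hypothetical.

```python
MOVES = {"n": (0, 1), "e": (1, 0), "w": (-1, 0), "s": (0, -1)}


def skeleton_data(path):
    """path is the string X_0 X_1 ... X_N over 'news'; X_{-1} = 'n' is implicit.

    Returns breaking times B_n, waiting times T_n (n >= 1), bends J_n and skeleton points M_n.
    """
    breaks = [0]                                   # B_0 = 0
    for k in range(1, len(path)):
        if path[k] != path[k - 1]:                 # X_k differs from X_{k-1}
            breaks.append(k)
    waits = [b - a for a, b in zip(breaks, breaks[1:])]           # T_{n+1} = B_{n+1} - B_n
    bends = ["n" + path[0]]                                       # J_0 = X_{-1} X_0
    bends += [path[a] + path[b] for a, b in zip(breaks, breaks[1:])]  # J_n = X_{B_{n-1}} X_{B_n}
    positions = [(0, 0)]                           # S_0 = 0, S_k = S_{k-1} + X_k (assumption)
    for letter in path[1:]:
        dx, dy = MOVES[letter]
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    skeleton = [positions[b] for b in breaks]      # M_n = S_{B_n}
    return breaks, waits, bends, skeleton


if __name__ == "__main__":
    print(skeleton_data("eeeennnw"))
```

The sample path is chosen so that $B_1 = 4$ and $B_2 = 7$, as in the small example of the text.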

Here, $(J_n)_n$ is positive recurrent but this does not imply the recurrence of $(S_n)_n$ or $(M_n)_n$. Moreover, $(S_n)_n$ and $(M_n)_n$ may have different behaviors. Explicit, necessary and sufficient conditions for the recurrence of $(M_n)_n$ in terms of characteristic functions and convergence of suitable series are given in Cénac et al. (2020, theorem 2.1). The following theorem states a dichotomy between recurrence and transience.

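THEOREM 1.1. – Under the non-nullness assumption, the following dichotomy holds:

i) the series $\sum_n \mathbb{P}(M_n = 0)$ diverges if and only if the process $(M_n)_n$ is recurrent in the following sense: $\exists r > 0$, $\mathbb{P}\left(\liminf_{n \to \infty} \|M_n\| < r\right) = 1$;

ii) the series $\sum_n \mathbb{P}(M_n = 0)$ converges if and only if the process $(M_n)_n$ is transient in the following sense: $\mathbb{P}\left(\lim_{n \to \infty} \|M_n\| = \infty\right) = 1$.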

Does the recurrence (respectively, the transience) of $(M_n)_n$ and $(S_n)_n$ occur at the same time? The answer to this 20-year-old question is no:

THEOREM 1.2 (Definitive invalidation of the conjecture in Mauldin et al. 1996). – There exist recurrent PRWs $(S_n)_n$ having an associated transient skeleton $(M_n)_n$.

Supposing that the persistence time distributions are horizontally and vertically symmetric is a natural necessary condition for the random walk $(S_n)_n$ to be recurrent. One example is given by the DRRW, originally introduced in Mauldin et al. (1996) (see Figure 1.7). Some particular values of the transition probabilities $q_{\alpha^n\beta}$ provide counterexamples. It is shown in Cénac et al. (2020) that the corresponding distributions of the persistence times must be non-integrable. In section 1.5, this non-integrability will be related to the non-existence of any invariant probability measure for the driving VLMC.

1.4. VLMC: existence of stationary probability measures

Consider a VLMC denoted by $U = (U_n)_{n \geq 0}$, defined by a pair $(\mathcal{T}, q)$ where $\mathcal{T}$ is a context tree on an alphabet $\mathcal{A}$ and $q = (q_c)_{c \in \mathcal{C}}$ a family of probability measures on $\mathcal{A}$, indexed by the contexts of $\mathcal{T}$. A probability measure $\pi$ on $\mathcal{R}$ is stationary or invariant (with regard to $U$) whenever $\pi$ is the distribution of every $U_n$ as soon as it is the distribution of $U_0$. The question of interest consists here of finding conditions on $(\mathcal{T}, q)$ for the process to admit at least one, or a unique, stationary probability measure. The heuristic presentation aims to show how combinatoric objects, namely the $\alpha$-LIS of contexts, and conditional probabilities, the cascades, naturally emerge.

Figure 1.7. The original directionally reinforced random walk (DRRW). For a color version of this figure, see www.iste.co.uk/barbu/data.zip

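Assume that $\pi$ is a stationary probability measure on $\mathcal{R}$:

– First step: finite words. Since $\mathcal{R}$ is endowed with the cylinder $\sigma$-algebra, $\pi$ is determined by its values $\pi(w\mathcal{R})$ on the cylinders $w\mathcal{R}$, where $w$ runs over all finite words on $\mathcal{A}$.

– Second step: longest internal suffixes of words. Assume that $e$ is a finite non-internal word and take $\alpha \in \mathcal{A}$. Then, $\operatorname{pref}(e)$ is well defined and, because of formula [1.1], since $\pi$ is stationary,

$$\pi(\alpha e \mathcal{R}) = q_{\operatorname{pref}(e)}(\alpha) \times \pi(e\mathcal{R}). \qquad \text{[1.15]}$$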

Iterating this formula as far as possible leads to the following definitions. Consider any non-empty finite word $w$. It is uniquely decomposed as $w = p\alpha s = \beta_1\beta_2\beta_3 \cdots \beta_\ell\,\alpha s$, where $\alpha$ and the $\beta_k$ are letters and $s$ is the longest internal suffix of $w$. The integer $\ell$ is non-negative and $p = \beta_1\beta_2 \cdots \beta_\ell$ is a prefix of $w$ that may be empty, in which case $\ell = 0$.

DEFINITION 1.5 (Lis and $\alpha$-LIS). – With these notations, the longest internal suffix $s$ is shortened as the lis of $w$. The word $\alpha s$ is called the $\alpha$-LIS of $w$.

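DEFINITION 1.6 (Cascade). – With the notation above, the cascade of $w$ is the product

$$\operatorname{casc}(w) = q_{\operatorname{pref}(\beta_2 \cdots \beta_\ell \alpha s)}(\beta_1)\, q_{\operatorname{pref}(\beta_3 \cdots \beta_\ell \alpha s)}(\beta_2) \cdots q_{\operatorname{pref}(\alpha s)}(\beta_\ell) = \prod_{k=1}^{\ell} q_{\operatorname{pref}(\beta_{k+1} \cdots \beta_\ell \alpha s)}(\beta_k). \qquad \text{[1.16]}$$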

Note that this definition makes sense because all the $\beta_k \cdots \beta_\ell\,\alpha s$ are non-internal words, $k \geq 2$. Moreover, if $w = \alpha s$ where $s$ is internal, then $\ell = 0$ and $\operatorname{casc}(w) = 1$.
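The decomposition and the cascade can be computed mechanically on a small finite context tree. The following sketch relies on several assumptions that are not restated in this section: the tree is given by its finite list of contexts, a word is internal when it is a strict prefix of some context, $\operatorname{pref}$ of a non-internal word is the context that is a prefix of it, and the lis is searched among strict suffixes. The tree `CONTEXTS`, the distributions `Q` and all function names are hypothetical.

```python
CONTEXTS = ["1", "01", "00"]                       # toy complete context tree on {0, 1}
Q = {                                              # toy distributions q_c attached to contexts
    "1": {"0": 0.5, "1": 0.5},
    "01": {"0": 0.1, "1": 0.9},
    "00": {"0": 0.25, "1": 0.75},
}


def is_internal(word, contexts):
    """Internal word: strict prefix of at least one context (the empty word included)."""
    return any(c.startswith(word) and c != word for c in contexts)


def pref(word, contexts):
    """Context that is a prefix of a non-internal word."""
    for c in contexts:
        if word.startswith(c):
            return c
    raise ValueError("no context is a prefix of %r" % word)


def split_word(word, contexts):
    """Return (p, alpha, s) with word = p + alpha + s and s the longest internal strict suffix."""
    for start in range(1, len(word) + 1):          # longest strict suffixes first
        s = word[start:]
        if is_internal(s, contexts):
            return word[:start - 1], word[start - 1], s
    raise ValueError("empty word")


def cascade(word, contexts, q):
    """Product of q_{pref(beta_{k+1} ... alpha s)}(beta_k) over the letters beta_k of p."""
    p, alpha, s = split_word(word, contexts)
    tail = alpha + s                               # the alpha-LIS of the word
    value = 1.0
    for beta in reversed(p):                       # peel off the last beta first
        value *= q[pref(tail, contexts)][beta]
        tail = beta + tail
    return value


if __name__ == "__main__":
    print(split_word("1100", CONTEXTS), cascade("1100", CONTEXTS, Q))
```

For the word `1100`, the lis is `0`, the $\alpha$-LIS is `00`, and the cascade is the product of `Q["00"]["1"]` and `Q["1"]["1"]`.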
