
Statistical Topics and Stochastic Models for Dependent Data with Applications

Series Editor Nikolaos Limnios


Edited by

Vlad Stefan Barbu

Nicolas Vergne

First published 2020 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com

The rights of Vlad Stefan Barbu and Nicolas Vergne to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2020938718

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-78630-603-6

Preface
Vlad Stefan BARBU and Nicolas VERGNE

Part 1. Markov and Semi-Markov Processes

Chapter 1. Variable Length Markov Chains, Persistent Random Walks: A Close Encounter
Peggy CÉNAC, Brigitte CHAUVIN, Frédéric PACCAUT and Nicolas POUYANNE

1.1. Introduction
1.2. VLMCs: definition of the model
1.3. Definition and behavior of PRWs
1.3.1. PRWs in dimension one
1.3.2. PRWs in dimension two
1.4. VLMC: existence of stationary probability measures
1.5. Where VLMC and PRW meet
1.5.1. Semi-Markov chains and Markov additive processes
1.5.2. PRWs induce semi-Markov chains
1.5.3. Semi-Markov chain of the α-LIS in a stable VLMC
1.5.4. The meeting point
1.6. References

Chapter 2. Bootstraps of Martingale-difference Arrays Under the Uniformly Integrable Entropy
Salim BOUZEBDA and Nikolaos LIMNIOS

2.1. Introduction and motivation
2.2. Some preliminaries and notation
2.3. Main results
2.4. Application for the semi-Markov kernel estimators
2.5. Proofs
2.6. References

Chapter 3. A Review of the Dividend Discount Model: From Deterministic to Stochastic Models
Guglielmo D’AMICO and Riccardo DE BLASIS

3.1. Introduction
3.2. General model
3.3. Gordon growth model and extensions
3.3.1. Gordon model
3.3.2. Two-stage model
3.3.3. H model
3.3.4. Three-stage model
3.3.5. N-stage model
3.3.6. Other extensions
3.4. Markov chain stock models
3.4.1. Hurley and Johnson model
3.4.2. Yao model
3.4.3. Markov stock model
3.4.4. Multivariate Markov chain stock model
3.5. Conclusion
3.6. References

Chapter 4. Estimation of Piecewise-deterministic Trajectories in a Quantum Optics Scenario
Romain AZAÏS and Bruno LEGGIO

4.1. Introduction
4.1.1. The postulates of quantum mechanics
4.1.2. Dynamics of open quantum Markovian systems
4.1.3. Stochastic wave function: quantum dynamics as PDPs
4.1.4. Estimation for PDPs
4.2. Problem formulation
4.2.1. Atom-field interaction
4.2.2. Piecewise-deterministic trajectories
4.2.3. Measures
4.3. Estimation procedure
4.3.1. Strategy
4.3.2. Least-square estimators
4.3.3. Numerical experiments
4.4. Physical interpretation
4.5. Concluding remarks
4.6. References

Chapter 5. Identification of Patterns in a Semi-Markov Chain
Brenda Ivette GARCIA-MAYA and Nikolaos LIMNIOS

5.1. Introduction
5.2. The prefix chain
5.3. The semi-Markov setting
5.4. The hitting time of the pattern
5.5. A genomic application
5.6. Concluding remarks
5.7. References

Part 2. Autoregressive Processes

Chapter 6. Time Changes and Stationarity Issues for Continuous Time Autoregressive Processes of Order p
Valérie GIRARDIN and Rachid SENOUSSI

6.1. Introduction
6.2. Basics
6.3. Stationary AR processes
6.3.1. Formulas for the two first-order moments
6.3.2. Examples
6.3.3. Conditions for stationarity of CAR1(p) processes
6.4. Time transforms
6.4.1. Properties of time transforms
6.4.2. MS processes
6.5. Conclusion
6.6. Appendix
6.7. References

Chapter 7. Sequential Estimation for Non-parametric Autoregressive Models
Ouerdia ARKOUN, Jean-Yves BRUA and Serguei PERGAMENCHTCHIKOV

7.1. Introduction
7.2. Main conditions
7.3. Pointwise estimation with absolute error risk
7.3.1. Minimax approach
7.3.2. Adaptive minimax approach
7.3.3. Non-adaptive procedure
7.3.4. Sequential kernel estimator
7.3.5. Adaptive sequential procedure
7.4. Estimation with quadratic integral risk
7.4.1. Passage to a discrete time regression model
7.4.2. Model selection
7.4.3. Main results
7.5. References

Part 3. Divergence Measures and Entropies

Chapter 8. Inference in Parametric and Semi-parametric Models: The Divergence-based Approach
Michel BRONIATOWSKI

8.1. Introduction
8.1.1. Csiszár divergences, variational form
8.1.2. Dual form of the divergence and dual estimators in parametric models
8.1.3. Decomposable discrepancies
8.2. Models and selection of statistical criteria
8.3. Non-regular cases: the interplay between the model and the criterion
8.3.1. Test statistics
8.4. References

Chapter 9. Dynamics of the Group Entropy Maximization Processes and of the Relative Entropy Group Minimization Processes Based on the Speed-gradient Principle
Vasile PREDA and Irina BANCESCU

9.1. Introduction
9.1.1. The SG principle
9.1.2. Entropy groups
9.2. Group entropies and the SG principle
9.2.1. Total energy constraint
9.3. Relative entropy group and the SG principle
9.3.1. Equilibrium stability
9.3.2. Total energy constraint
9.4. A new (G, a) power relative entropy group and the SG principle
9.5. Conclusion
9.6. References

Chapter 10. Inferential Statistics Based on Measures of Information and Divergence
Alex KARAGRIGORIOU and Christos MESELIDIS

10.1. Introduction
10.2. Divergence measures
10.2.1. ϕ-Divergences
10.2.2. α-Divergences
10.2.3. Bregman divergences
10.3. Properties of divergence measures
10.4. Model selection criteria
10.5. Goodness of fit tests
10.5.1. Simple null hypothesis
10.5.2. Composite null hypothesis
10.6. Simulation study
10.7. References

Chapter 11. Goodness-of-Fit Tests Based on Divergence Measures for Frailty Models
Filia VONTA

11.1. Introduction
11.2. The proposed goodness-of-fit test
11.3. Main results
11.4. Frailty models
11.5. Simulations
11.5.1. Linear models for the estimation of critical values
11.5.2. Size of the test
11.6. References

Preface

This collective book stems from a workshop that took place in Rouen, October 3–5, 2018, within the framework of the project Random Models and Statistical Tools, Informatics and Combinatorics (MOUSTIC – Modèles aléatoires et Outils Statistiques, Informatiques et Combinatoires), financed by the Region of Normandy and the European Regional Development Fund. The main idea was to bring together leading scientists working on probabilistic and statistical topics for dependent data, as well as on associated applications.

This is the way this book was written, with the intention to offer to the scientific community a part of the latest advances in the field of stochastic modeling for dependent data, authored by leading experts in the field. This is a crucial aspect in a time when we face an increasing need for more and more complex models capable of capturing the main features of increasingly complex applications. From a technical point of view, our book is important for gathering theoretical developments and applications related to Markov type models (semi-Markov processes, autoregressive processes, piecewise deterministic Markov processes, and variable length Markov chains) as well as probabilistic/statistical techniques issued from information theory, based on divergence measures and entropies.

This volume is divided into three parts: the first one examines Markov and semi-Markov processes, the second one deals with autoregressive processes and the last one presents divergence measures and entropies. Particular attention is given to applications of these methods in various fields such as finance, DNA analysis, quantum physics and survival analysis.

The workshop in Rouen and, consequently, the present book, would not have been possible without the support of the Laboratory of Mathematics Raphaël Salem, the Laboratory of Mathematics Nicolas Oresme, the University of Rouen Normandy, the Region of Normandy, the French Statistical Society - SFdS, the MAS group of the French Society of Applied and Industrial Mathematics - SMAI, the Normandie-Mathématiques Research Federation and the Normastic Research Federation.

We would like to thank all the speakers of the workshop who contributed, although some of them indirectly, to the quality of the present volume. Our thanks go also to the anonymous reviewers for their valuable work: without their support, this volume could not have been successfully completed.

We would also like to thank Professor Nikolaos Limnios for proposing and encouraging us to elaborate this collective volume as well as the editorial staff of ISTE Ltd for their technical support.

Vlad Stefan BARBU
Nicolas VERGNE
Rouen, May 2020

PART 1

Markov and Semi-Markov Processes

Statistical Topics and Stochastic Models for Dependent Data with Applications, First Edition. Vlad Stefan Barbu and Nicolas Vergne. © ISTE Ltd 2020. Published by ISTE Ltd and John Wiley & Sons, Inc.

Variable Length Markov Chains, Persistent Random Walks: A Close Encounter

We consider a walker on the line that at each step keeps the same direction with a probability that depends on the time already spent in the direction the walker is currently moving. These walks with memories of variable length can be seen as generalizations of directionally reinforced random walks (DRRWs) introduced in Mauldin et al. (1996). We give a complete and usable characterization of the recurrence or transience in terms of the probabilities to switch the direction. These conditions are related to some characterizations of existence and uniqueness of a stationary probability measure for a particular Markov chain: in this chapter, we define the general model for words produced by a variable length Markov chain (VLMC) and we introduce a key combinatorial structure on words. For a subclass of these VLMCs, this provides necessary and sufficient conditions for existence of a stationary probability measure.

1.1. Introduction

This is the story of the encounter between two worlds: the world of random walks and the world of VLMCs. The meeting point turns around the semi-Markov property of underlying processes.

In a VLMC, unlike fixed-order Markov chains, the probability to predict the next symbol depends on a possibly unbounded part of the past, the length of which depends on the past itself. These relevant parts of pasts are called contexts. They are stored in a context tree. With each context, a probability distribution is associated, prescribing the conditional probability of the next symbol, given this context.

Chapter written by Peggy CÉNAC, Brigitte CHAUVIN, Frédéric PACCAUT and Nicolas POUYANNE.

VLMCs are now widely used as random models for character strings. They were introduced in Rissanen (1983) to perform data compression. When they have a finite memory, they provide a parsimonious alternative to fixed-order Markov chain models, in which the number of parameters to estimate grows exponentially fast with the order; they are also able to capture finer properties of character sequences. When they have infinite memory – this will be our case of study in this chapter – they are a tractable way to build non-Markov models and they may be considered as a subclass of “chaînes à liaisons complètes” (Doeblin and Fortet 1937) or “chains with infinite order” (Harris 1955).

VLMCs are used in bioinformatics, linguistics and coding theory to model how words grow or to classify words. In bioinformatics, both for protein families and DNA sequences, identifying patterns that have a biological meaning is a crucial issue. Using a VLMC as a model enables one to quantify the influence of a meaningful pattern by giving a transition probability on the following letter of the sequence. In this way, these patterns appear as contexts of a context tree. Note that their length may be unbounded (Bejerano and Yona 2001).

In addition, if the context tree is recognized to be a signature of a family (say, of proteins), this gives an efficient statistical method to test whether or not two samples belong to the same family (Busch et al. 2009).

Therefore, estimating a context tree is an issue of interest and many authors (statisticians or not, applied or not) stress the fact that the height of the context tree should not be supposed to be bounded. This is the case in Galves and Leonardi (2008) where the algorithm CONTEXT is used to estimate an unbounded context tree, or in Garivier and Leonardi (2011). Furthermore, as explained in Csiszár and Talata (2006), the height of the estimated context tree grows with the sample size, so that estimating a context tree by assuming a priori that its height is bounded is not realistic.

There is extensive literature on the construction of efficient estimators of context trees, for finite as well as infinite context trees. This chapter is not a review of statistics issues, which would already be relevant for finite memory VLMCs. This is a study of the probabilistic properties of infinite memory VLMCs as random processes, and more specifically of the main property of interest for such processes: existence and uniqueness of a stationary measure.

As has already been said, VLMCs are a natural generalization to infinite memory of Markov chains. It is usual to index a sequence of random variables forming a Markov chain with positive integers and to make the process grow to the right. The main drawback of this habit for an infinite memory process is that the sequence of the process is read from left to right, whereas the (possibly infinite) sequence giving the past needed to predict the next symbol is read in the context tree from right to left, thus giving rise to confusion and lack of readability. For this reason, in this chapter, the VLMC grows to the left. In this way, both the process sequence and the memory in the context tree are read from left to right.

Classical random walks have independent and identically distributed increments. In the literature, persistent random walks (PRWs), also called Goldstein-Kac random walks or correlated random walks, refer to random walks having a Markov chain of finite order as an increment process. For such walks, the dynamics of trajectories has a short memory of given length and the random walk itself is not Markovian anymore. What happens whenever the increments depend on a non-bounded past memory?

Consider a walker on $\mathbb{Z}$, allowed to increment its trajectory by $-1$ or $1$ at each step of time. Assume that the probability to keep the current direction $\pm 1$ depends on the time already spent in the said direction – the distribution of increments thus acts as a reinforcement of the dependency from the past. More precisely, the process of increments of such a one-dimensional random walk is a Markov chain on the set of (right-)infinite words, with variable – and unbounded – length memory: a VLMC. The concerned VLMC is defined in section 1.3.1. It is based on a context tree called a double comb. Later, section 1.3.2 deals with a two-dimensional persistent random walk defined in an analogous manner on $\mathbb{Z}^2$ by a VLMC based on a context tree called a quadruple comb.

These random walks that have an unbounded past memory can be seen as a generalization of “directionally reinforced random walks (DRRW)” introduced by Mauldin et al. (1996), in the sense that the persistence times are anisotropic ones. For a one-dimensional random walk associated with a double comb, a complete characterization of recurrence and transience, in terms of changing (or not) direction probabilities, is given in section 1.3.1. More precisely, when one of the random times spent in a given direction (the so-called persistence times) is an integrable random variable, the recurrence property is equivalent to a classical drift-vanishing. In all other cases, the walk is transient unless the weights of the tail distributions of both persistence times are equal. For two-dimensional random walks, sufficient conditions of transience or recurrence are given in section 1.3.2.

Actually, because of the very specific form of the underlying driving VLMC, these PRWs turn out to be in one-to-one correspondence with so-called Markov additive processes. Section 1.5 examines the close links between PRWs, Markov additive processes, semi-Markov chains and VLMCs.

In section 1.2, the definition of a general VLMC and a couple of examples are given. In section 1.3, the PRWs are defined and known results on their recurrence properties are collected. In view of section 1.5, where we show how PRWs and VLMCs meet through the world of semi-Markov chains, section 1.4 is devoted to results – together with a heuristic approach – on the existence and uniqueness of stationary measures for a VLMC.

1.2. VLMCs: definition of the model

Let $\mathcal{A}$ be a finite set, called the alphabet. Here $\mathcal{A}$ will most often be the standard alphabet $\mathcal{A} = \{0, 1\}$, but also $\mathcal{A} = \{d, u\}$ (for down and up) or $\mathcal{A} = \{n, e, w, s\}$ (for the cardinal directions). Let

$$\mathcal{R} = \{\alpha\beta\gamma\cdots : \alpha, \beta, \gamma, \cdots \in \mathcal{A}\}$$

be the set of right-infinite words over $\mathcal{A}$, written by simple concatenation. A VLMC on $\mathcal{A}$, defined below and most often denoted by $(U_n)_{n\in\mathbb{N}}$, is a particular type of $\mathcal{R}$-valued discrete time Markov chain where:

– the process evolves between time $n$ and time $n+1$ by adding one letter on the left of $U_n$;

– the transition probabilities between time $n$ and time $n+1$ depend on a finite – but not bounded – prefix¹ of the current word $U_n$.

Giving a formal frame of such a process leads to the following definitions. For a complete presentation of VLMCs, one can also refer to Cénac et al. (2012).

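As usual, a tree on $\mathcal{A}$ is a set $\mathcal{T}$ of finite words – namely a subset of $\bigcup_{n\in\mathbb{N}} \mathcal{A}^n$ – which contains the empty word $\emptyset$ (the root of $\mathcal{T}$) and which is prefix-stable: for all finite words $u, v$, $uv \in \mathcal{T} \Longrightarrow u \in \mathcal{T}$. A tree is made of internal nodes ($u \in \mathcal{T}$ is internal when $\exists \alpha \in \mathcal{A}$, $u\alpha \in \mathcal{T}$) and of leaves ($u \in \mathcal{T}$ is a leaf when it has no child: $\forall \alpha \in \mathcal{A}$, $u\alpha \notin \mathcal{T}$).

DEFINITION 1.1 (Context tree).– A context tree on $\mathcal{A}$ is a saturated tree on $\mathcal{A}$ having an at most countable set of infinite branches.

The tree $\mathcal{T}$ is saturated whenever any internal node has $\#(\mathcal{A})$ children: for any finite word $u$ and for any $\alpha \in \mathcal{A}$, $u\alpha \in \mathcal{T} \Longrightarrow (\forall \beta \in \mathcal{A}, u\beta \in \mathcal{T})$. A right-infinite word on $\mathcal{A}$ is an infinite branch of $\mathcal{T}$ when all its finite prefixes belong to $\mathcal{T}$.

Following the vocabulary introduced by Rissanen, a context of the tree is a leaf or an infinite branch. A finite or right-infinite word on $\mathcal{A}$ is an external node when it is neither internal nor a context. See Figure 1.1 which illustrates these definitions, as well as the pref function defined hereunder.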
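The notions of internal node, leaf and saturation can be checked mechanically on a finite tree. The following is a minimal sketch, assuming the tree is finite and described by its set of leaves; the helper names `internal_nodes` and `is_saturated` and the example tree `LEAVES` are illustrative choices, not taken from the chapter.

```python
def internal_nodes(leaves):
    """Proper prefixes of the leaves; the empty word "" plays the role of the root."""
    return {leaf[:k] for leaf in leaves for k in range(len(leaf))}


def is_saturated(leaves, alphabet):
    """True when every internal node has one child per letter of the alphabet."""
    inner = internal_nodes(leaves)
    tree = inner | set(leaves)  # prefix-stable by construction
    return all(u + a in tree for u in inner for a in alphabet)


ALPHABET = "01"
LEAVES = {"00", "01", "1"}  # hypothetical finite context tree on {0, 1}

print(sorted(internal_nodes(LEAVES)))        # root "" and node "0"
print(is_saturated(LEAVES, ALPHABET))        # every internal node has 2 children
print(is_saturated({"00", "1"}, ALPHABET))   # node "0" lacks the child "01"
```

In this finite setting the contexts are exactly the leaves; infinite branches, which the chapter also allows, are outside the scope of the sketch.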

DEFINITION 1.2 (pref function).– Let $\mathcal{T}$ be a context tree. If $w$ is any external node or any context, the symbol $\mathrm{pref}\, w$ denotes the longest (finite or infinite) prefix of $w$ that belongs to $\mathcal{T}$.

In other words, $\mathrm{pref}\, w$ is the only context $c$ for which $w = c\cdots$. For a more visual presentation, hang $w$ by its head (its left-most letter) and insert it into the tree; the only context through which the word goes out of the tree is its pref.

¹ In fact, an infinite prefix might be needed in a denumerable number of cases.

Figure 1.1. A context tree on the alphabet $\mathcal{A} = \{0, 1\}$. The dotted lines are possibly the beginning of infinite branches. Any word that writes $1000\cdots$, like the one represented by the dashed line, admits $1000$ as a pref. For a color version of this figure, see www.iste.co.uk/barbu/data.zip

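With these definitions, it is now possible to define a VLMC.

DEFINITION 1.3 (VLMC).– Let $\mathcal{T}$ be a context tree. For every context $c$ of $\mathcal{T}$, let $q_c$ be a probability measure on $\mathcal{A}$. The VLMC defined by $\mathcal{T}$ and by the $(q_c)_c$ is the $\mathcal{R}$-valued discrete-time Markov chain $(U_n)_{n\in\mathbb{N}}$ defined by the following transition probabilities:

$$\forall n \in \mathbb{N},\ \forall \alpha \in \mathcal{A},\quad \mathbb{P}(U_{n+1} = \alpha U_n \mid U_n) = q_{\mathrm{pref}(U_n)}(\alpha).$$ [1.1]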

To get a realization of a VLMC as a process on $\mathcal{R}$, take a (random) right-infinite word

$$U_0 = X_0 X_{-1} X_{-2} X_{-3} \cdots$$

At each step of time $n \ge 0$, one gets $U_{n+1}$ by adding a random letter $X_{n+1}$ on the left of $U_n$:

$$U_{n+1} = X_{n+1} U_n = X_{n+1} X_n \cdots X_1 X_0 X_{-1} X_{-2} \cdots$$

under the conditional distribution [1.1].
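The construction above (read the context of the current word, draw the next letter from the attached distribution, add it on the left) can be sketched as follows. This is a minimal illustration under stated assumptions: the context tree is finite, the word is stored as a finite string whose left-most character is the newest letter, and the tree `CONTEXTS` and the distributions `Q` are hypothetical values chosen for the example.

```python
import random


def pref(word, contexts):
    """Return the unique context that is a prefix of `word`, read left to right."""
    for k in range(1, len(word) + 1):
        if word[:k] in contexts:
            return word[:k]
    raise ValueError("no finite context is a prefix of this word")


def vlmc_step(word, contexts, q, rng):
    """Draw the next letter from q[pref(word)] and add it on the left of `word`."""
    law = q[pref(word, contexts)]
    letters = sorted(law)
    letter = rng.choices(letters, weights=[law[a] for a in letters])[0]
    return letter + word


# Hypothetical probabilized context tree on the alphabet {0, 1}.
CONTEXTS = {"00", "01", "1"}
Q = {
    "00": {"0": 0.2, "1": 0.8},
    "01": {"0": 0.5, "1": 0.5},
    "1": {"0": 0.7, "1": 0.3},
}

rng = random.Random(0)
word = "1"  # finite stand-in for U_0 (assumption: the older past is never needed)
for _ in range(10):
    word = vlmc_step(word, CONTEXTS, Q, rng)
print(word)  # the newest letter is the left-most one
```

Because this example tree has height 2, the resulting chain is an ordinary Markov chain of order 2; an infinite context tree would require a different representation of the contexts.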

REMARK 1.1.– Probabilizing a context tree consists, as in definition 1.3, of endowing it with a family of probability measures on the alphabet, indexed by the set of contexts. This vocabulary is used below.

REMARK 1.2.– Assume that the context tree is finite and denote its height by $h$; in this condition, the VLMC is just a Markov chain of order $h$ on $\mathcal{A}$. On the contrary, when the context tree is infinite, and this is mainly our case of interest, the VLMC is generally not a Markov process on $\mathcal{A}$.

EXAMPLE 1.1.– Take $\mathcal{A} = \{n, e, w, s\}$ as an (ordered) alphabet, so that the daughters of an internal node are represented, as shown on the left side of Figure 1.2. Making the transition probabilities $\mathbb{P}(U_{n+1} = \alpha U_n \mid U_n)$ depend only on the length of the largest prefix of the form $n^k$ ($k \ge 0$) of $U_n$ amounts to taking a comb as a context tree, as shown on the right side of Figure 1.2. Its finite contexts are the $n^k\alpha$ where $k \ge 0$ and $\alpha \in \mathcal{A} \setminus \{n\}$.

Figure 1.2. On the left: how one can represent trees on $\mathcal{A} = \{n, e, w, s\}$. On the right, the so-called left comb on $\mathcal{A} = \{n, e, w, s\}$

EXAMPLE 1.2.– Take again $\mathcal{A} = \{n, e, w, s\}$ as an alphabet. Making the transition probabilities $\mathbb{P}(U_{n+1} = \alpha U_n \mid U_n)$ depend only on the length of the largest prefix of the form $\alpha^k$ ($k \ge 1$) of $U_n$, where $\alpha$ is any letter, amounts to taking a quadruple comb as a context tree, as shown on the right side of Figure 1.3. In the same vein, if one takes $\mathcal{A} = \{u, d\}$, the double comb is the context tree, as shown on the left side of Figure 1.3. In the corresponding VLMC, the transitions depend only on the length of the last current run $u^k$ or $d^k$, $k \ge 1$. The double comb and the quadruple comb are used below to define PRWs.

Figure 1.3. The double comb and the quadruple comb
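For the double comb and the quadruple comb, finding the context of a word reduces to measuring its leading run. A minimal sketch, assuming the word is stored as a finite string with its newest letter on the left (the names `comb_context` and `run_length` are illustrative):

```python
def comb_context(word):
    """Context alpha^k beta (k >= 1, beta != alpha) of `word` in a double or quadruple comb."""
    alpha = word[0]
    k = 1
    while k < len(word) and word[k] == alpha:
        k += 1
    if k == len(word):
        raise ValueError("the leading run does not end inside the stored word")
    return word[: k + 1]


def run_length(word):
    """Length k of the current run, i.e. of the largest prefix of the form alpha^k."""
    return len(comb_context(word)) - 1


print(comb_context("uuudud"))  # context u^3 d of the double comb
print(run_length("nnew"))      # current run n^2 in the quadruple comb
```

The error case corresponds to a word whose whole stored part is a single run; in the chapter's setting this is one of the infinite branches of the comb.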

EXAMPLE 1.3.– Take $\mathcal{A} = \{0, 1\}$ (naturally ordered for the drawings). The left comb of right combs, shown on the left side of Figure 1.4, is the context tree of a VLMC that makes its transition probabilities depend on the largest prefix of $U_n$ of the form $0^p 1^q$. If one has to take into consideration the largest prefix of the form $0^p 1^q$ or $1^p 0^q$, one has to use the double comb of opposite combs, as shown on the right side of Figure 1.4.

Figure 1.4. Context trees on $\mathcal{A} = \{0, 1\}$: the left comb of right combs (on the left) and a double comb of opposite combs (on the right)

DEFINITION 1.4 (Non-nullness).– A VLMC is called non-null when no transition probability vanishes, i.e. when $q_c(\alpha) > 0$ for every context $c$ and for every $\alpha \in \mathcal{A}$.

Non-nullness appears below as an irreducibility-like assumption made on the driving VLMC of PRWs, and also for the existence and uniqueness of an invariant probability measure for a general VLMC.

1.3. Definition and behavior of PRWs

In this section, the so-called PRWs are defined. A PRW is a random walk driven by some VLMC. In dimensions one and two, results on the transience and the recurrence of PRWs are given. These results are detailed and proven in Cénac et al. (2018b, 2013) in dimension one and in Cénac et al. (2020) in dimension two.

1.3.1. PRWs in dimension one

In this section, we deal with one-dimensional PRWs. Note that, contrary to the classical random walk, a PRW is generally not Markovian. Let $\mathcal{A} := \{d, u\} = \{-1, 1\}$ ($d$ for down and $u$ for up) and consider the double comb on this alphabet as a context tree, probabilize it and denote by $(U_n)_n$ a realization of the associated VLMC. The $n$th increment $X_n$ of the PRW is given as the first letter of $U_n$: define the persistent random walk $S = (S_n)_{n\ge 0}$ by $S_0 = 0$ and, for $n \ge 1$,

$$S_n := \sum_{\ell=1}^{n} X_\ell,$$ [1.2]

so that for any $n \ge 1$, $m \ge 0$,

$$\mathbb{P}(S_{m+1} = S_m + 1 \mid U_m = d^n u \cdots) = q_{d^n u}(u)$$

$$\mathbb{P}(S_{m+1} = S_m - 1 \mid U_m = u^n d \cdots) = q_{u^n d}(d).$$

Furthermore, for the sake of simplicity and without loss of generality, we condition the walk to start almost surely (a.s.) from $\{X_{-1} = u, X_0 = d\}$ – this amounts to changing the origin of time. In this model, a walker on a line keeps the same direction with a probability that depends on the discrete time already spent in the direction the walker is currently moving (see Figure 1.5). This model can be seen as a generalization of DRRWs introduced in Mauldin et al. (1996).

Figure 1.5. A one-dimensional PRW. For a color version of this figure, see www.iste.co.uk/barbu/data.zip
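Since the double comb only remembers the current direction and the length of the current run, a one-dimensional PRW can be simulated by tracking these two quantities. The sketch below is an illustration under stated assumptions: `persist(direction, k)` is a hypothetical user-supplied function standing for the probability $q_{\alpha^k\beta}(\alpha)$ of keeping direction $\alpha$ after $k$ steps in it, and the numerical values in `example_persist` are arbitrary.

```python
import random


def simulate_prw(n_steps, persist, rng):
    """Return [S_0, ..., S_n] for a one-dimensional PRW driven by a double comb.

    persist(direction, k) stands for q_{alpha^k beta}(alpha): the probability of
    keeping direction alpha (+1 for u, -1 for d) after a run of length k.
    """
    direction, run = -1, 1      # X_0 = d, preceded by X_{-1} = u
    position, path = 0, [0]     # S_0 = 0
    for _ in range(n_steps):
        if rng.random() < persist(direction, run):
            run += 1                          # same direction: context alpha^(k+1) beta
        else:
            direction, run = -direction, 1    # turn: the new run has length 1
        position += direction
        path.append(position)
    return path


def example_persist(direction, run):
    """Hypothetical choice: rises persist with probability 0.7, descents with 0.5."""
    return 0.7 if direction == 1 else 0.5


path = simulate_prw(1000, example_persist, random.Random(42))
print(path[-1] / 1000)  # empirical speed S_n / n of this trajectory
```

A run-length-dependent `persist`, for instance `lambda a, k: k / (k + 1)`, gives persistence times with heavy tails, which is the regime where the memory is genuinely unbounded.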

Taking different probabilized context trees would lead to different probabilistic impacts on the asymptotic behavior of resulting PRWs. Moreover, the characterization of the recurrent versus transient behavior is difficult in general. We state here exhaustive recurrence criteria for PRWs defined from a double comb.

In order to avoid trivial cases, we assume that $S$ cannot be frozen in one of the two directions with a positive probability. Therefore, we make the following assumption.

ASSUMPTION 1.1 (Finiteness of the length of runs).– For any $\alpha, \beta \in \{u, d\}$, $\alpha \ne \beta$,

$$\lim_{n\to\infty} \prod_{k=1}^{n} q_{\alpha^k\beta}(\alpha) = 0.$$

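Let $\tau_n^u$ and $\tau_n^d$ be, respectively, the length of the $n$th rise and of the $n$th descent.

Then, by a renewal-type property (see Cénac et al. 2013, proposition 2.3), $(\tau_n^d)_{n\ge 1}$ and $(\tau_n^u)_{n\ge 1}$ are independent sequences of i.i.d. random variables. Their distribution tails are straightforwardly given by: for any $\alpha, \beta \in \{u, d\}$, $\alpha \ne \beta$ and $n \ge 1$,

$$\mathbb{P}(\tau_1^\alpha \ge n) = \prod_{k=1}^{n-1} q_{\alpha^k\beta}(\alpha).$$ [1.4]

Note that assumption 1.1 amounts to supposing that the persistence times $\tau_n^d$ and $\tau_n^u$ are a.s. finite. The jump times (or breaking times) are: $B_0 = 0$ and, for $n \ge 1$,

$$B_{2n} := \sum_{k=1}^{n} (\tau_k^d + \tau_k^u) \quad\text{and}\quad B_{2n+1} := B_{2n} + \tau_{n+1}^d.$$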

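In order to deal with a more tractable random walk built with the possibly unbounded but i.i.d. increments $Y_n := \tau_n^u - \tau_n^d$, we introduce the underlying skeleton random walk $(M_n)_{n\ge 1}$, which is the original walk observed at the random times of up-to-down turns:

$$M_n := \sum_{k=1}^{n} Y_k.$$ [1.6]

Two main quantities play a key role in the asymptotic behavior, namely the expectations of the lengths of runs: with formula [1.4], let

$$\Theta_d := \mathbb{E}[\tau_1^d] = \sum_{n\ge 1} \prod_{k=1}^{n-1} q_{d^k u}(d) \quad\text{and}\quad \Theta_u := \mathbb{E}[\tau_1^u] = \sum_{n\ge 1} \prod_{k=1}^{n-1} q_{u^k d}(u).$$

Actually, $\Theta_d$ and $\Theta_u$ are already discussed in Cénac et al. (2013, proposition B1), where it is shown that the driving VLMC of a one-dimensional PRW admits a unique invariant probability measure if and only if $\Theta_d < \infty$ and $\Theta_u < \infty$.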

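Note that the expectation of $Y_1$ is well defined in $[-\infty, +\infty]$ whenever at least one of the persistence times $\tau_1^u$ or $\tau_1^d$ is integrable. Thus, as soon as $\Theta_d < \infty$ or $\Theta_u < \infty$, let

$$d_M := \mathbb{E}[Y_1] = \Theta_u - \Theta_d$$

and

$$d_S := \frac{\Theta_u - \Theta_d}{\Theta_u + \Theta_d}.$$

An elementary computation shows that $\mathbb{E}(M_n) = n\, d_M$ and $\mathbb{E}(S_n) \sim n\, d_S$ when $n$ tends to infinity. Thus, $d_M$ and $d_S$ appear as asymptotic drifts when the walks $(M_n)_n$ and $(S_n)_n$, respectively, turn out to be transient (see Table 1.1). The behavior of the walk also depends on quantities $J_{\alpha|\beta}$, defined for $\alpha$ and $\beta \in \mathcal{A}$, $\alpha \ne \beta$ by:

$$J_{\alpha|\beta} := \sum_{n=1}^{\infty} \frac{n\, \mathbb{P}(\tau_1^\alpha = n)}{\sum_{k=1}^{n} \mathbb{P}(\tau_1^\beta \ge k)}.$$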
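The mean run lengths and the drifts can be approximated numerically by truncating the series. The sketch below rests on assumptions that should be read as such: it takes the tail formula $\mathbb{P}(\tau_1^\alpha \ge n) = \prod_{k=1}^{n-1} q_{\alpha^k\beta}(\alpha)$, sums the tails to obtain $\Theta_\alpha = \mathbb{E}[\tau_1^\alpha]$, and uses $d_M = \Theta_u - \Theta_d$ together with the normalized drift $d_S = (\Theta_u - \Theta_d)/(\Theta_u + \Theta_d)$. The function names and the constant persistence probabilities 0.75 and 0.5 are hypothetical.

```python
def mean_run_length(persist_probs):
    """Truncated Theta = sum_{n>=1} P(tau >= n), with P(tau >= n) = prod_{k<n} persist_probs[k-1].

    persist_probs[k-1] stands for q_{alpha^k beta}(alpha), k = 1, ..., N.
    """
    theta, tail = 0.0, 1.0       # tail = P(tau >= 1) = 1
    for p in persist_probs:
        theta += tail
        tail *= p                # P(tau >= n + 1) = P(tau >= n) * q_{alpha^n beta}(alpha)
    return theta + tail          # adds the last computed tail, P(tau >= N + 1)


def drifts(theta_u, theta_d):
    """Return (d_M, d_S) for finite mean run lengths."""
    return theta_u - theta_d, (theta_u - theta_d) / (theta_u + theta_d)


# Hypothetical example: constant persistence probabilities (geometric run lengths).
theta_u = mean_run_length([0.75] * 400)   # close to 1 / (1 - 0.75)
theta_d = mean_run_length([0.5] * 400)    # close to 1 / (1 - 0.5)
d_M, d_S = drifts(theta_u, theta_d)
print(theta_u, theta_d, d_M, d_S)
```

Truncation is only meaningful when the series converges; for heavy-tailed persistence times with infinite mean, the partial sums keep growing and the quantities $J_{\alpha|\beta}$ become the relevant ones.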

A complete and usable characterization of the recurrence and the transience of the PRW in terms of the probabilities to persist in the same direction or to switch is given in proposition 1.1. Its proof relies on a criterion of Erickson (1973), applied to the skeleton walk $(M_n)_n$, which is simpler to deal with because its increments are independent.

PROPOSITION 1.1.– Under the non-nullness assumption and assumption 1.1, the random walk $(S_n)_n$ is recurrent or transient as described in Table 1.1.

                 Θ_u < ∞                        Θ_u = ∞

Θ_d < ∞          Drifting +∞   (d_S > 0)        Drifting +∞
                 Recurrent     (d_S = 0)
                 Drifting −∞   (d_S < 0)

Θ_d = ∞          Drifting −∞                    Recurrent     (J_{u|d} = J_{d|u} = ∞)
                                                Drifting +∞   (J_{u|d} = ∞ > J_{d|u})
                                                Drifting −∞   (J_{d|u} = ∞ > J_{u|d})

Table 1.1. Recurrence versus transience (drifting) for $(S_n)_n$ in dimension one

Themostfruitfulsituationemergeswhenbothrunningtimes τ u 1 and τ d 1 have infinitemeans.Inthatcase,therecurrencepropertiesof (Sn )n arerelatedtothe behavioroftheskeletonrandomwalk (Mn )n definedin[1.6],thedriftofwhich, dM ,isnotdefined.Thus,thebehaviorof (Sn )n dependsonthecomparisonbetween thedistributiontailsof τ u 1 and τ d 1 definedin[1.4],expressedbythequantities Jα|β . Notethatthecasewhenboth Ju|d and Jd|u arefinitedoesnotappearinthetable sinceitwouldimplythat Θu < ∞ and Θd < ∞ (seeErickson1973).

Inallthreeothercases,thedrift dS iswelldeﬁnedandthePRWisrecurrentifand onlyif dS =0.Inthatcase, lim n→∞ Sn n = dS =0.Notethatmodifyingonetransition qc transformsarecurrentPRWintoatransientone,since dS becomesnon-zero.
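The limit of $S_n/n$ can be checked numerically on a toy model. The following sketch rests on several assumptions that are not stated in this passage: the walk is parametrised by hypothetical functions `cont_up(k)` and `cont_down(k)`, giving the probability of persisting after a run of length $k$; the mean run lengths are computed as $\Theta = \sum_{n\ge 1} \prod_{k<n} \mathrm{cont}(k)$; and the drift $d_S$ is assumed to take the renewal-type value $(\Theta_u - \Theta_d)/(\Theta_u + \Theta_d)$ when both means are finite.

```python
import random


def mean_run_length(cont, n_max=10_000):
    """Truncated value of Theta = sum_{n>=1} P(tau >= n),
    assuming P(tau >= n) = prod_{k=1}^{n-1} cont(k)."""
    total, survival = 0.0, 1.0
    for n in range(1, n_max + 1):
        total += survival
        survival *= cont(n)   # probability of persisting after a run of length n
    return total


def simulate_prw(cont_up, cont_down, n_steps, seed=0):
    """Return S_n for a one-dimensional persistent walk started with an up step."""
    rng = random.Random(seed)
    direction, run, position = 1, 1, 1
    for _ in range(n_steps - 1):
        cont = cont_up(run) if direction == 1 else cont_down(run)
        if rng.random() < cont:
            run += 1                          # persist in the same direction
        else:
            direction, run = -direction, 1    # turn and start a new run
        position += direction
    return position


if __name__ == "__main__":
    # Hypothetical constant persistence probabilities (geometric run lengths).
    up = lambda k: 0.8
    down = lambda k: 0.5
    theta_u, theta_d = mean_run_length(up), mean_run_length(down)
    n = 200_000
    print("assumed drift :", (theta_u - theta_d) / (theta_u + theta_d))
    print("empirical S_n/n:", simulate_prw(up, down, n) / n)
```

Constant persistence probabilities reduce the model to a two-state Markov chain, which makes the check easy. Passing functions of $k$ instead gives genuinely variable-length memory.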

1.3.2. PRWs in dimension two

Take the alphabet $\mathcal{A} := \{n, e, w, s\}$. Here, $(e, n)$ stands for the canonical basis of $\mathbb{Z}^2$, $w = -e$ and $s = -n$. Hence, the letters $e$, $n$, $w$ and $s$ stand for moves to the east, north, west and south, respectively. Having in mind a random walk with increments in $\mathcal{A}$, any word of the form $\alpha\beta$, $\alpha, \beta \in \mathcal{A}$, $\alpha \neq \beta$, is called a bend. For the sake of simplicity, we condition the walk to start a.s. with an $ne$ bend: $\{X_{-1} = n, X_0 = e\}$.

Figure 1.6. A walk in dimension two. For a color version of this figure, see www.iste.co.uk/barbu/data.zip

Take a non-null VLMC associated with a quadruple comb on $\mathcal{A}$, as shown in Figure 1.3: the contexts are $\alpha^n\beta$ for $\alpha, \beta \in \mathcal{A}$, $\alpha \neq \beta$, $n \ge 1$ and the attached probability distributions are denoted by $q_{\alpha^n\beta}$. The two-dimensional PRW $(S_n)_n$ is defined, using this VLMC, as in formula [1.2].

Contrary to the one-dimensional PRWs, as detailed below, the probability to change direction depends on the time spent in the current direction but also on the previous direction. As in dimension one, we intend to avoid that $S$ remains frozen in one of the four directions with a positive probability. Therefore, we make the following assumption, analogous to assumption 1.1 in dimension two.

ASSUMPTION 1.2 (Finiteness of the length of runs).– For any $\alpha, \beta \in \{n, e, w, s\}$, $\alpha \neq \beta$,

Let $(B_n)_{n\ge 0}$ be the breaking times defined inductively by $B_0 = 0$ and $B_{n+1} = \inf\{k > B_n : X_k \neq X_{k-1}\}$.

As in dimension one, assumption 1.2 implies that the breaking times $B_n$ are a.s. finite.

Define the so-called internal chain $(J_n)_{n\ge 0}$ by $J_0 = ne$ and, for all $n \ge 1$,

Let us illustrate these random variables with a small example, in which: $B_1 = 4$, $B_2 = 7$, $J_0 = X_{-1}X_0$, $J_1 = X_{B_0}X_{B_1} = X_0X_4$.
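The breaking times and the internal chain are easy to extract from a finite trajectory. The sketch below is a minimal illustration under stated assumptions: the trajectory `"eeeennnw"` (the letters $X_0, \dots, X_7$) is a hypothetical path chosen so that $B_1 = 4$ and $B_2 = 7$, the function names are illustrative, and the internal chain is taken to be $J_n = X_{B_{n-1}}X_{B_n}$, following the pattern of $J_1$ in the example above.

```python
def breaking_times(steps):
    """steps[k] is the letter X_k for k >= 0.
    Returns [B_0, B_1, ...] with B_0 = 0 and
    B_{n+1} = inf{k > B_n : X_k != X_{k-1}}."""
    times = [0]
    for k in range(1, len(steps)):
        if steps[k] != steps[k - 1]:
            times.append(k)
    return times


def internal_chain(x_minus_1, steps):
    """Bends J_0 = X_{-1} X_0 and, as assumed here, J_n = X_{B_{n-1}} X_{B_n}."""
    b = breaking_times(steps)
    bends = [x_minus_1 + steps[0]]
    for previous, current in zip(b, b[1:]):
        bends.append(steps[previous] + steps[current])
    return bends


def run_lengths(steps):
    """Gaps B_{n+1} - B_n between consecutive breaking times."""
    b = breaking_times(steps)
    return [current - previous for previous, current in zip(b, b[1:])]


if __name__ == "__main__":
    path = "eeeennnw"                  # hypothetical X_0 ... X_7, with X_{-1} = n
    print(breaking_times(path))
    print(internal_chain("n", path))
    print(run_lengths(path))
```

Representing the trajectory as a string keeps the indexing identical to the text: position $k$ of the string is the letter $X_k$, and $X_{-1}$ is passed separately.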

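The process $(J_n)_{n\ge 0}$ is an irreducible Markov chain on the set of bends $\mathcal{S} := \{\alpha\beta \mid \alpha \in \mathcal{A}, \beta \in \mathcal{A}, \alpha \neq \beta\}$. Its Markov kernel is defined by: for every $\beta, \alpha, \gamma \in \mathcal{A}$ with $\beta \neq \alpha$ and $\alpha \neq \gamma$,

$P(\beta\alpha; \alpha\gamma) :=$

the numbers $P(\alpha\beta, \gamma\delta)$ being $0$ for every couple of bends not of the previous form. Note that the non-nullness assumption (see definition 1.4) implies the irreducibility of $(J_n)_n$ and its aperiodicity. The state space $\mathcal{S}$ is finite so that $(J_n)_n$ is positive recurrent: it admits a unique invariant probability measure $\pi_J$.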

Denote $T_0 = 0$ and $T_{n+1} := B_{n+1} - B_n$ for every $n \ge 0$. These waiting times (also called persistence times) are not independent, contrary to the one-dimensional case. The skeleton random walk $(M_n)_{n\ge 0}$ on $\mathbb{Z}^2$, which is the PRW observed at the breaking times, is then defined as $M_n := S_{B_n}$.

Note that $(M_n)_n$ is generally not a classical RW with i.i.d. increments. Nevertheless, taking into account the additional information given by the internal Markov chain $(J_n)_n$, then $(J_n, M_n)_n$ is a Markov additive process (see Çinlar 1972) as it will appear in section 1.5.

Variable Length Markov Chains, Persistent Random Walks: A Close Encounter

Here, $(J_n)_n$ is positive recurrent but this does not imply the recurrence of $(S_n)_n$ or $(M_n)_n$. Moreover, $(S_n)_n$ and $(M_n)_n$ may have different behaviors. Explicit, necessary and sufficient conditions for the recurrence of $(M_n)_n$ in terms of characteristic functions and convergence of suitable series are given in Cénac et al. (2020, theorem 2.1). The following theorem states a dichotomy between recurrence and transience.

THEOREM 1.1.– Under the non-nullness assumption, the following dichotomy holds:

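i) the series $\sum_n \mathbb{P}(M_n = 0)$ diverges if and only if the process $(M_n)_n$ is recurrent in the following sense: $\exists r > 0$, $\mathbb{P}\left(\liminf_{n\to\infty} \|M_n\| < r\right) = 1$

ii) the series $\sum_n \mathbb{P}(M_n = 0)$ converges if and only if the process $(M_n)_n$ is transient in the following sense: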

Does the recurrence (respectively, the transience) of $(M_n)_n$ and $(S_n)_n$ occur at the same time? The answer to this 20-year-old question is no:

THEOREM 1.2 (Definitive invalidation of the conjecture in Mauldin et al. 1996).– There exist recurrent PRWs $(S_n)_n$ having an associated transient skeleton $(M_n)_n$.

Supposing that the persistence time distributions are horizontally and vertically symmetric is a natural necessary condition for the random walk $(S_n)_n$ to be recurrent. One example is given by the DRRW, originally introduced in Mauldin et al. (1996) (see Figure 1.7). Some particular values of the transition probabilities $q_{\alpha^n\beta}$ provide counterexamples. It is shown in Cénac et al. (2020) that the corresponding distributions of the persistence times must be non-integrable. In section 1.5, this non-integrability will be related to the non-existence of any invariant probability measure for the driving VLMC.

1.4. VLMC: existence of stationary probability measures

Consider a VLMC denoted by $U = (U_n)_{n\ge 0}$, defined by a pair $(\mathcal{T}, q)$ where $\mathcal{T}$ is a context tree on an alphabet $\mathcal{A}$ and $q = (q_c)_{c\in\mathcal{C}}$ a family of probability measures on $\mathcal{A}$, indexed by the contexts of $\mathcal{T}$. A probability measure $\pi$ on $\mathcal{R}$ is stationary or invariant (with regard to $U$) whenever $\pi$ is the distribution of every $U_n$ as soon as it is the distribution of $U_0$. The question of interest consists here of finding conditions on $(\mathcal{T}, q)$ for the process to admit at least one, or a unique, stationary probability measure. The heuristic presentation aims to show how combinatoric objects, namely the $\alpha$-LIS of contexts, and conditional probabilities, the cascades, naturally emerge.

Figure 1.7. The original directionally reinforced random walk (DRRW). For a color version of this figure, see www.iste.co.uk/barbu/data.zip

Assume that $\pi$ is a stationary probability measure on $\mathcal{R}$:

– First step: finite words. Since $\mathcal{R}$ is endowed with the cylinder $\sigma$-algebra, $\pi$ is determined by its values $\pi(w\mathcal{R})$ on the cylinders $w\mathcal{R}$, where $w$ runs over all finite words on $\mathcal{A}$.

– Second step: longest internal suffixes of words. Assume that $e$ is a finite non-internal word and take $\alpha \in \mathcal{A}$. Then, its pref is well defined and, because of formula [1.1], since $\pi$ is stationary,

$$\pi(\alpha e\mathcal{R}) = q_{\mathrm{pref}(e)}(\alpha) \times \pi(e\mathcal{R}). \qquad [1.15]$$

Iterating this formula as far as possible leads to the following definitions. Consider any non-empty finite word $w$. It is uniquely decomposed as $w = p\alpha s = \beta_1\beta_2\beta_3\cdots\beta_\ell\,\alpha s$, where the $\alpha$ and the $\beta$ are letters and $s$ is the longest internal suffix of $w$. The integer $\ell$ is non-negative and $p = \beta_1\beta_2\cdots\beta_\ell$ is a prefix of $w$ that may be empty, in which case $\ell = 0$.

DEFINITION 1.5 (lis and $\alpha$-LIS).– With these notations, the longest internal suffix $s$ is shortened as the lis of $w$. The word $\alpha s$ is called the $\alpha$-LIS of $w$.

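DEFINITION 1.6 (Cascade).– With the notation above, the cascade of $w$ is the product

$$\operatorname{casc}(w) = q_{\mathrm{pref}(\beta_2\cdots\beta_\ell\alpha s)}(\beta_1)\, q_{\mathrm{pref}(\beta_3\cdots\beta_\ell\alpha s)}(\beta_2) \cdots q_{\mathrm{pref}(\alpha s)}(\beta_\ell) = \prod_{1\le k\le \ell} q_{\mathrm{pref}(\beta_{k+1}\cdots\beta_\ell\alpha s)}(\beta_k). \qquad [1.16]$$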

Note that this definition makes sense because all the $\beta_k\cdots\beta_\ell\,\alpha s$ are non-internal words, $k \ge 2$. Moreover, if $w = \alpha s$ where $s$ is internal, then $\ell = 0$ and $\operatorname{casc}(w) = 1$.
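A small computation may help to fix these notions. The sketch below is illustrative only and relies on assumptions that are not restated in this passage: a hypothetical finite context tree on the alphabet $\{0, 1\}$ with contexts `0`, `10` and `11`; a finite word is taken to be internal when it is a strict prefix of a context; pref of a non-internal word is taken to be the unique context that is a prefix of it; and the lis is searched among strict suffixes, as the decomposition $w = p\alpha s$ requires. The probabilities in `Q` are arbitrary.

```python
from math import prod

# Hypothetical context tree: internal nodes "" and "1", contexts "0", "10", "11".
CONTEXTS = {"0", "10", "11"}
# Arbitrary illustrative distributions q_c attached to the contexts.
Q = {
    "0": {"0": 0.5, "1": 0.5},
    "10": {"0": 0.3, "1": 0.7},
    "11": {"0": 0.9, "1": 0.1},
}


def is_internal(word):
    """Assumed definition: strict prefix of some context."""
    return any(c.startswith(word) and c != word for c in CONTEXTS)


def pref(word):
    """Unique context that is a prefix of a non-internal word."""
    matches = [c for c in CONTEXTS if word.startswith(c)]
    if len(matches) != 1:
        raise ValueError(f"{word!r} has no unique context prefix")
    return matches[0]


def decompose(w):
    """Return (p, alpha, s) with w = p + alpha + s and s the longest internal strict suffix."""
    for i in range(1, len(w) + 1):      # longest candidate suffix first
        if is_internal(w[i:]):
            return w[:i - 1], w[i - 1], w[i:]
    raise ValueError("no internal suffix found")


def cascade(w):
    """Product over k of q_{pref(beta_{k+1} ... beta_l alpha s)}(beta_k); equals 1 if p is empty."""
    p, alpha, s = decompose(w)
    tail = alpha + s
    return prod(Q[pref(p[k + 1:] + tail)][p[k]] for k in range(len(p)))


if __name__ == "__main__":
    print(decompose("0110"), cascade("0110"))
    print(decompose("01"), cascade("01"))
```

For $w = 0110$ the lis is empty, the $\alpha$-LIS is `0` and the cascade is $q_{11}(0)\,q_{10}(1)\,q_{0}(1)$, which is what three successive applications of formula [1.15] to $\pi(0110\mathcal{R})$ produce in front of $\pi(0\mathcal{R})$.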