Random Process Analysis With R Marco Bittelli

RANDOM PROCESS ANALYSIS WITH R

Marco Bittelli
University of Bologna

Roberto Olmi
National Research Council, Italy

Rodolfo Rosa
National Research Council, Italy

Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

© Marco Bittelli, Roberto Olmi, and Rodolfo Rosa 2022

The moral rights of the authors have been asserted

Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this work in any other form and you must impose this same condition on any acquirer

Published in the United States of America by Oxford University Press, 198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 2022935991

ISBN 978–0–19–886251–2 (hbk)
ISBN 978–0–19–886252–9 (pbk)
DOI: 10.1093/oso/9780198862512.001.0001

Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.

What men really want is not knowledge but certainty

Preface

A random or stochastic process is a process that can be defined by random variables. In other words, it is a process involving observations whose outcome at every time instant is not certain. Mathematically, this is reflected in working with functions whose arguments are characterized by a probability distribution instead of having a certain value.

The choice of describing a phenomenon by deterministic or probabilistic laws may depend upon several reasons. The phenomenon may appear to be conceptualizable only as a random process. On the other hand, we may know that it can be described by deterministic laws but, due to lack of knowledge about system parameters or high complexity, we decide to model it as a random process.

Overall, what is a random process? Is randomness an inherent feature of nature, or is it simply our lack of knowledge, and hence our inability to describe nature with deterministic laws, that brings us to describe it as a random process?

These differences can be referred to the 'objective' and 'subjective' viewpoints of randomness. The first viewpoint considers randomness an inherent feature of nature, while the second conceptualizes it as an 'anthropomorphic', 'subjective' interpretation of nature due to lack of knowledge. Is there a true distinction between these two viewpoints? These ideas have been the subject of scientific and philosophical discussion for centuries and a brief discussion is presented in the final chapter of this book.

However, when the phenomena we are studying appear as stochastic processes and have been subjected to rigorous mathematical tests and do not reveal a fully deterministic framework, probabilistic tools must be employed. In this book we present concepts, theory and computer code written in R for random process analysis.

Acknowledgements

We are grateful to Dr. Sonke Adlung for being most cooperative, considerate and helpful during the publication process.


Introduction

1.1 Introduction

The subject of random or stochastic process analysis is a very important part of scientific inquiry. The terms stochastic and random process are used interchangeably. Random processes are used as mathematical models for a large number of phenomena in physics, chemistry, biology, computer science, information theory, economics, environmental science and others. Many books about random processes have been published over the years. Over time, it has become more and more important to provide not only the theory and examples regarding specific processes, but also the computer code and example data. Therefore, this book is intended to present concepts, theory and computer code written in R that help readers with limited initial knowledge of random processes to become operational with the material. Each subject is described and problems are implemented in R code, with real data collected in experiments performed by the authors or taken from the literature. With this intent, the reader can promptly apply the analysis to her or his own data, making the subject operational. Consistent with modern trends in university instruction, this book makes readers active learners, with hands-on computer experiments directing readers through applications of random process analysis (RPA). Each chapter is also introduced with a brief historical background, with specific references, for further readings about each subject.

Chapter 2 provides a brief historical background about the origin of random processes theory. In Chapter 3, the reader will find an in-depth description of the fundamental theory of stochastic processes. The chapter introduces concepts of stationarity, ergodicity, Markov processes and Markov chains. Examples from mathematics and physics are presented to exemplify random processes, such as Buffon's needle and the Ehrenfest urn model. In Chapter 4, Poisson processes are presented. The derivation of the well-known distribution is presented, and homogeneous and non-homogeneous Poisson processes are discussed. One of the cornerstones of random processes, the random walk, is described in Chapter 5. The concepts of absorbing and reflecting barriers are presented along with the gambler's ruin example, as well as a two-dimensional random walk code and a discussion of the random walk applied to the process of Brownian motion.

Chapter 6 enters into stochastic time series analysis with the description of moving average, autoregressive and autoregressive moving average processes. Seasonal time series analysis is introduced with examples applied to measures of temperature and water budget. The analysis of random processes requires a thorough understanding of spectrum and noise analysis. In Chapter 7, Fourier transforms for deterministic and stochastic time series are presented, with application to spectrum analysis. The singular spectrum analysis technique is also presented, for analysis and removal of trends.

Chapter 8 presents the Markov Chain Monte Carlo method with a description of probably the most famous algorithm in stochastic theory, the Metropolis algorithm. After a thorough description of the theory and code, the travelling salesman problem is presented, with the simulated annealing approach. The concept of Bayesian analysis is briefly introduced here, leading into the next chapter. Chapter 9 focuses on a cornerstone of modern statistics, Bayesian inference, which is applied to a description of autoregressive processes. After introducing the main concepts, examples applied to real data of temperature and CO2 concentration in Antarctica, as well as radar detection, are presented. Bayesian analysis of the Poisson process is presented with the waiting-time paradox. The chapter ends with an application to lighthouse detection as a remarkable example of Bayesian inference.

Random processes are used as tools for random search in minimization algorithms, as an alternative to gradient-based search algorithms used for instance in least-squares optimization. Genetic algorithms are presented in Chapter 10, with application to non-linear fitting and autoregressive moving average models. As an example of improved optimization with respect to other approaches, the travelling salesman problem is here solved with genetic algorithms. The modelling of stochastic processes depends on the accuracy of the estimators derived in the process analysis. The problem of accuracy is discussed in Chapter 11, with examples on averaging of time series, batch means methods, moving bootstrap and other techniques to improve accuracy in random processes modelling.

Chapter 12 addresses a topic that is not traditionally described in books about random processes: spatial analysis. It is nevertheless an important subject dealing with the application of statistical concepts to properties varying in space. The chapter provides an introduction to geostatistical concepts and then presents a novel approach, where spatial and temporal analysis are combined into a stochastic analysis of spatio-temporal processes. At the end of the chapter, the optimization procedure for spatial parameters is also computed with a genetic algorithm, showing the possibility of connecting and applying various techniques presented in the book.

The book ends with Chapter 13, which discusses the very definition of a random process, the mathematical definition of randomness and the definition of entropies. This discussion is developed into a general framework and its implications for scientific inquiry. The book also has two appendices providing additional tools used in the main part of the book.

The codes presented in this book are written using the RStudio integrated development environment (IDE). RStudio includes a console, an editor that supports direct code execution, as well as tools for plotting, debugging and workspace management. There are many books about programming in R that can be used as reference, in particular the publications and links presented in the official Comprehensive R Archive Network (CRAN), available at https://cran.r-project.org/. The codes and example data in this book can be downloaded from the website http://www.marcobittelli.it under the section Computer codes for books. Exercises are presented at the end of each chapter and solutions are downloadable from the book's website.

Open source languages and related libraries are subject to changes, updates and modifications, therefore the packages presented here may undergo changes in the future. To obtain specific information and documentation about a library, the following instruction should be used: library(help=GA), where for example the library (GA) for genetic algorithms can be explored. Here we list the libraries necessary to run the examples in different chapters:

Chapter 7 requires the library lubridate; Chapter 9 the library rjags, described in detail in Appendix B; Chapter 12 requires ggplot2, gstat, lattice, mapview, GA, quantmod, reshape, sf, sp, stars, tidyverse, xts and zoo; and Chapter 13 requires entropy and tseriesEntropy.
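As a convenience, the list above can be checked against a local R installation. The following is only a minimal sketch: the variable names (pkgs, installed, missing_pkgs) are our own and not part of the book's codes, and the installation line is left commented out.

```r
# Packages named in the text, grouped by chapter
pkgs <- list(
  "Chapter 7"  = "lubridate",
  "Chapter 9"  = "rjags",
  "Chapter 12" = c("ggplot2", "gstat", "lattice", "mapview", "GA", "quantmod",
                   "reshape", "sf", "sp", "stars", "tidyverse", "xts", "zoo"),
  "Chapter 13" = c("entropy", "tseriesEntropy")
)
all_pkgs <- unlist(pkgs, use.names = FALSE)

# requireNamespace() returns TRUE if the package can be loaded (without attaching it)
installed <- vapply(all_pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- all_pkgs[!installed]
print(missing_pkgs)                # packages still to be installed, if any
# install.packages(missing_pkgs)   # uncomment to install them from CRAN
```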

Historical Background

It is not certain that everything is uncertain.

Blaise Pascal

2.1 The Philosopher and the Gambler

To introduce the role of computer studies in stochastic processes analysis, we will go back a few centuries to the invention of probability theory. It is the year 1654, according to a familiar story (Hacking, 1975). Antoine Gombaud, Chevalier de Méré, Sieur de Baussay (1607–1684), asks Blaise Pascal (1623–1662) some questions on a game of chance. Later, Siméon Denis Poisson (1781–1840) calls Antoine Gombaud a 'man of the world' and Blaise Pascal an 'austere Jansenist': Un problème relatif aux jeux de hasard, proposé à un austère janséniste par un homme du monde, a été l'origine du calcul des probabilités (A problem about games of chance proposed to an austere Jansenist by a man of the world was the origin of the calculus of probabilities) (Poisson, 1837). We know that Pascal was not only a philosopher, but also a physicist, a mathematician, a writer, a theologian. Antoine Gombaud was a writer and a philosopher, not only a gambler.

We now discuss one of the questions that our Chevalier asked of Pascal concerning the throws of two dice. We throw two dice and bet on the double six. How many throws do we need to have a chance of winning?

Antoine Gombaud said that a gambling rule, based on the mathematical analogy between the probabilities of obtaining a six with a single die or a double six with a couple of dice, indicates that you need at least 24 throws, but that from his personal gambling experience the throws must be at least 25. Pascal, after discussing the problem with Pierre de Fermat (1601–1665), answered that mathematics is not contrary to experience. Let us briefly discuss the topic.

Let $A_1$ be the event {6, 6} at the first throw, so $P\{A_1\} = 1/36$ (the symbol $P\{\cdot\}$ means 'probability'). Then, the probability of not obtaining two sixes is that of the complementary event $\bar{A}_1$: $P\{\bar{A}_1\} = 1 - 1/36 = 35/36$. At the second throw, the event $\bar{A}$: no double six at the first throw and no double six at the second throw has probability $P\{\bar{A}\} = P\{\bar{A}_1\} P\{\bar{A}_2\} = (35/36)^2$, and so on. The probability of not winning in 24 throws is:

$P\{\bar{A}\}_{[24]} = (35/36)^{24} = 0.5086,$

so the probability of winning is $P\{A\}_{[24]} = 1 - 0.5086 = 0.4914$. While in 25 throws it is $P\{\bar{A}\}_{[25]} = 0.4945$ and $P\{A\}_{[25]} = 0.5055$. Notice that the difference is very small, and this honours the power of observation of our Chevalier. But there are doubts about the truth of this story (Ore, 1960).
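The probabilities quoted above can be checked numerically. The lines below are a small sketch with variable names of our own choosing (p_no66, p_lose, p_win, n_min); they also compute the smallest number of throws for which the chance of at least one double six exceeds one half.

```r
p_no66   <- 35/36                 # probability of no double six in one throw
n.throws <- c(24, 25)             # number of throws in a game
p_lose   <- p_no66^n.throws       # no double six in any of the throws
p_win    <- 1 - p_lose            # at least one double six
print(round(data.frame(n.throws, p_lose, p_win), 4))

# smallest n such that 1 - (35/36)^n > 1/2
n_min <- ceiling(log(0.5) / log(p_no66))
print(n_min)
```

Under these assumptions the break-even point falls between 24 and 25 throws, which is why the two games are so hard to tell apart at the table.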

Let us imagine being the Chevalier de Méré who, for 30 nights, goes to the game table to throw two dice. Every night, we play 20 games, with 25 and 24 throws each. If in a game two sixes appear, we win the game. At the end of the night, that is after 20 games, if the victories are more than 10, we had a lucky night.

We can describe the throw of a die as a stochastic process. In the next chapter we will rigorously define 'stochastic process', but here we simply say that stochastic processes are mathematical models of dynamical systems that evolve over time or space in a probabilistic manner.

In our case, the dynamical system is the die, that at each throw shows a face with probability 1/6. The code below is the 'transcription' in R of the dice game above.

## Code_2_1.R  Throw of two dice
# 25 throws
# p_25 <- 0.5055: probability of getting two sixes in 25 throws
n.nights <- 30          # number of nights
n.games <- 20           # number of games
n.throws <- 25          # number of throws
spot <- c(1:6)          # spots of a 6-sided die
p_fair <- rep(1/6, 6)   # probabilities of a "fair" die
d6 <- numeric()
d6T <- numeric()
nseed <- 50
for (j in 1:n.nights)
{ # loop on the nights
  nseed <- nseed + 1
  set.seed(nseed)
  for (l in 1:n.games)
  { # loop on the games
    d6[l] <- 0
    for (i in 1:n.throws)
    { # loop on the throws
      die.1 <- sample(spot, 1, replace = TRUE, prob = p_fair)   # i-th throw with the die 1
      die.2 <- sample(spot, 1, replace = TRUE, prob = p_fair)   # i-th throw with the die 2
      s.points <- die.1 + die.2
      if (s.points == 12) d6[l] <- 1
    } # end loop on the throws
  } # end loop on the games
  d6T[j] <- sum(d6)
} # end loop on the nights
d6T

### 24 throws
## p_24 <- 0.4914  # probability of getting two sixes in 24 throws
n.nights <- 30          # number of nights
n.games <- 20           # number of games
n.throws <- 24          # number of throws
spot <- c(1:6)          # spots of a 6-sided die
p_fair <- rep(1/6, 6)   # probabilities of a "fair" die
d6 <- numeric()
d6T <- numeric()
nseed <- 500
for (j in 1:n.nights)
{ # loop on the nights
  nseed <- nseed + 1
  set.seed(nseed)
  for (l in 1:n.games)
  { # loop on the games
    d6[l] <- 0
    for (i in 1:n.throws)
    { # loop on the throws
      die.1 <- sample(spot, 1, replace = TRUE, prob = p_fair)   # i-th throw with the die 1
      die.2 <- sample(spot, 1, replace = TRUE, prob = p_fair)   # i-th throw with the die 2
      s.points <- die.1 + die.2
      if (s.points == 12) d6[l] <- 1
    } # end loop on the throws
  } # end loop on the games
  d6T[j] <- sum(d6)
} # end loop on the nights
d6T

For the sake of clarity, we repeat for 24 throws the instructions for 25 throws, changing only the lines n.throws <- 24 and nseed <- 500. The element of the vector d6 for a game is set to 0 at the beginning of the game; if a double six is obtained its value becomes 1. By summing the n.games components of d6, we know whether we won or lost that night. Notice that the R function set.seed(.) is used to set the initial seed of the (pseudo) random number generator (RNG). RNGs are in fact fully deterministic algorithms, so the same seed generates the same sequence; changing the seed, we get different sequences. The result of the code above for 25 throws is:

d6T: 10 10 12 9 11 13 12 10 12 10 13 9 9 8 13 9 12 11 9 11 11 11 15 12 12 11 10 8 11 13

We see that the first two nights were tied, we won the third and lost the fourth, and so on. We won 18 nights out of 30, lost 7 nights and tied 5 nights. The result of the code above for 24 throws is:

d6T: 11 7 6 11 8 17 11 9 12 10 13 10 7 7 11 16 6 12 9 7 9 13 12 11 13 13 9 7 14 11

In this case, we won 16 nights, so that after 30 nights we are again ahead. These results appear not to support the Chevalier's claim that 24 throws are not enough to hope to win. However, such a conclusion is not correct. For instance, if we put nseed <- 100 with n.throws <- 25 we have:

d6T: 12 13 10 10 10 12 9 10 8 12 12 9 9 10 11 9 12 10 9 9 8 9 6 9 12 13 9 9 11 10

We won only 10 nights out of 30. The reason for this variability is that the sample size is too small, that is the number of games is not enough to give reliable results. Let us increase the number of games. For each night we play 100 games and the nights are 180. In Code_2_1.R, it is now: n.nights <- 180 and n.games <- 100. The result for 25 throws is:

d6T:
[1] 59 44 57 53 54 49 51 54 50 45 43 50 50 45 41 ...
[176] 59 55 50 58 45

that is, the first night we won 59 games out of 100, the second only 44, and so on. The result for 24 throws is:

d6T:
[1] 46 49 52 48 52 41 47 53 40 46 55 50 51 42 55 ...
[176] 53 48 52 49 54
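The triple loop of Code_2_1.R follows the game throw by throw. If only the nightly totals are of interest, a shortcut is possible: each game is a success/failure trial with winning probability $1 - (35/36)^n$, so the number of games won in a night is binomial. The sketch below is an alternative to the book's code, not a transcription of it; the seed and the names wins24, wins25 and p_hat are arbitrary choices of ours, and the numbers will differ from those listed above because the random streams are different.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n.nights <- 180                   # number of nights
n.games  <- 100                   # number of games per night
p_win <- 1 - (35/36)^c(24, 25)    # theoretical winning probabilities

# games won in each night: one binomial draw per night
wins24 <- rbinom(n.nights, size = n.games, prob = p_win[1])
wins25 <- rbinom(n.nights, size = n.games, prob = p_win[2])

# estimated winning probabilities over all 18000 games
p_hat <- c(sum(wins24), sum(wins25)) / (n.nights * n.games)
print(round(rbind(p_win, p_hat), 4))
```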

We can show the results both with 25 and 24 throws as in Fig. 2.1. The figure is obtained by adding the following lines after d6T, both for 25 and 24 throws.

Fig. 2.1 Number of nights (counts) in which the number of double sixes obtained in 100 games takes the value in the abscissa. Solid line: 25 throws, dashed line: 24 throws.

For n.throws <- 25 and nseed <- 100:
lbin <- 2
par(lwd = 3)
hist(d6T, main = "", freq = T, xlab = "N{6-6}", ylab = "counts", cex.lab = 1.3, lty = 1,
     border = "black", ylim = c(0, 40), breaks = seq(36, 64, by = lbin), font.lab = 3)

For n.throws <- 24 and nseed <- 301:
hist(d6T, lty = 2, breaks = seq(36, 64, by = lbin), add = T)

Note that freq=T means that the histogram reports the counts component of the result; if freq=F, the histogram reports probability densities, in which case the total area of the plot is 1.
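The difference between counts and densities can be inspected without drawing anything, since hist() returns both components. The following sketch uses a hypothetical stand-in vector x instead of the d6T of the text; it also shows that, by default, the bins of hist() are closed on the right.

```r
set.seed(5)                                   # arbitrary seed
x <- rbinom(180, size = 100, prob = 0.5)      # hypothetical stand-in for d6T
brk <- seq(20, 80, by = 2)                    # bins of width 2

h <- hist(x, breaks = brk, plot = FALSE)
sum(h$counts)                                 # freq = T: the counts add up to the sample size
sum(h$density * diff(h$breaks))               # freq = F: the total area is 1

# bins are right-closed: the value 50 is counted in (48, 50], whose midpoint is 49
h50 <- hist(50, breaks = brk, plot = FALSE)
h50$mids[h50$counts == 1]
```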

In the abscissa it reports the number of times the double six was obtained in 100 games. The symbol N{6-6} indicates the double six. In the ordinate it reports the number of times this event occurred in 180 nights. For instance, the bin (50, 52] is 38 for the games with 25 throws, meaning that in 38 times out of 180 the double six occurred 51 or 52 times in 100 games.

In the histogram each bin is closed on the right and open on the left, therefore the 50 occurrences of the double six are not counted in the (50, 52] bin, but rather in the (48, 50] bin. The results show that in the total number of games, which is 180 × 100 = 18000, the double six occurred in 9204 games, so the probability of a double six estimated in 18000 games is $\hat{P}\{A\}_{[25]} = 9204/18000 \approx 0.5113$ (the hat stands for estimate), very close to the 'theoretical' probability. Here 'theoretical' means the probability of perfect dice, that is that expected assuming equiprobability for each face. Rigorously speaking we should not define probability by counting on 'equiprobable' events, because that makes the definition recursive. However, putting aside philosophy, if the die is fair, its faces are 'equally' likely to occur and the probability of outcomes can be computed as we did.

For games with 24 throws, a double six occurred in 8797 games, so the estimated probability of the double six is $\hat{P}\{A\}_{[24]} \approx 0.4887$, in agreement with the 'theoretical' one $P\{A\}_{[24]} = 0.4914$.

Let us consider the 18000 games as N independent trials, each with probability p of success, and let n be the number of successes. The standard error of the estimate of the proportion p of 1's in the N-long sequence is $\hat{\sigma} = \sqrt{\hat{p}(1 - \hat{p})/N}$, where $\hat{p}$ is an estimate of p, denoted above as $\hat{P}\{A\}_{[25]}$. In our case $\hat{\sigma}_{[25]} = 0.0037$, practically equal to the theoretical one. Obviously $\hat{\sigma}_{[24]}$ gives the same result.
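The standard errors can be recomputed from the counts reported in the text (9204 and 8797 wins out of 18000 games). This is only a sketch; the names p_hat, se_hat, p_theo and se_theo are ours.

```r
N <- 18000                                            # total number of games
p_hat  <- c("25 throws" = 9204, "24 throws" = 8797) / N
se_hat <- sqrt(p_hat * (1 - p_hat) / N)               # estimated standard error

p_theo  <- 1 - (35/36)^c(25, 24)                      # theoretical probabilities
se_theo <- sqrt(p_theo * (1 - p_theo) / N)            # theoretical standard error

print(round(rbind(p_hat, se_hat, p_theo, se_theo), 4))
```

Because p(1 - p) is almost flat around p = 1/2, the estimated and theoretical standard errors agree to the fourth decimal for both games.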

We have seen that $\hat{P}\{A\}_{[25]} = 0.5113$ and $\hat{P}\{A\}_{[24]} = 0.4887$; we could ask ourselves if the difference between the two means $\hat{d} = 0.5113 - 0.4887 = 0.0226$ is significant. We can test for the significance of the difference between two population means using Student's t, which can be done in R by the line:

t.test(z, y, alt = "greater", var.equal = TRUE)

where z are the winnings in the 180 nights with 25 throws in each game (59, 54, ..., 58, 45) and y with 24 throws in each game (46, 49, ..., 49, 54). The option alt="greater" specifies a one-tailed test and var.equal=TRUE specifies equal variances. The significance level (p-value) is p-value = 5.159e-06, that is highly statistically significant. In passing, 20 games per night for 30 nights yield no significant difference, confirming what we said above about the small number of games.
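Since the full vectors z and y are not listed here, the call can be tried on simulated stand-ins. In the sketch below z and y are drawn from binomial distributions with the theoretical winning probabilities; this is an assumption made for illustration, so the p-value will not coincide with the one quoted above.

```r
set.seed(2)                                             # arbitrary seed
z <- rbinom(180, size = 100, prob = 1 - (35/36)^25)     # nightly wins, 25 throws (simulated)
y <- rbinom(180, size = 100, prob = 1 - (35/36)^24)     # nightly wins, 24 throws (simulated)

# one-tailed two-sample t test with pooled variance
res <- t.test(z, y, alternative = "greater", var.equal = TRUE)
res$statistic    # t value
res$parameter    # degrees of freedom: 180 + 180 - 2
res$p.value      # one-tailed p-value
```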

We can test the difference of the means also by the bootstrap method. We will discuss this method in Appendix A; here we limit ourselves to showing the result in Fig. 2.2, which is also presented as Exercise 2.1. As can be seen in Fig. 2.2, the difference $\hat{d}$ is significant, since out of B = 1000 replications $\hat{d}^*$, none of them is less than 0.

Fig. 2.2 Distribution of 1000 bootstrap replications $\hat{d}^*$ of the difference of the means of the two samples obtained with 25 and 24 throws (sample size = 180). The dotted line locates the observed difference $\hat{d} = 0.0226$. There are no replications less than 0.

2.2 Comments

It is difficult to believe that a real gentleman such as Antoine Gombaud went to play with dice for about three months playing 100 games each night. Regardless of whether the story is true or false, it teaches us something. In the doubt expressed by the Chevalier, different concepts of probability are involved. Doubtless, the term 'probability' had not yet appeared, but on the one side he speaks about a theoretical mathematical argument to calculate the number of chances to get two sixes. On the other side he relies on his experience as a gambler to evaluate the frequencies of the results. The tensions between theory and experience are already recognizable, between probability as the subject of study of a purely mathematical discipline and probability as a property of a real physical random process evolving over time.

We could ask ourselves why the Chevalier affirmed that mathematics was wrong in alleging 24 throws. According to some historians, perhaps he believed that, if the probability of success in one throw is 1/n, in m throws it is m/n, that is 24/36 = 0.667. Of course, this reasoning is wrong: the probabilities of failure in the individual throws have to be multiplied, rather than the probabilities of success summed.
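A quick numerical comparison shows why the additive rule cannot be right: for more than 36 throws it would give a 'probability' larger than 1. The sketch below uses names of our own (wrong, right).

```r
n <- 36                          # one chance in n of a double six per throw
m <- c(24, 36, 48)               # number of throws
wrong <- m / n                   # additive rule attributed to the Chevalier
right <- 1 - (1 - 1/n)^m         # complement of the product of the failure probabilities
print(round(data.frame(m, wrong, right), 4))
```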

Historians say that similar problems about games of chance were present before Pascal and Fermat. Gerolamo Cardano, for instance, wrote Liber de Ludo Aleae ('The Book on Games of Chance'), written about 1560 and published posthumously in 1663, in which many results in various games with dice are discussed. In particular, the chances of various combinations of points in games with three dice are presented. The same problems about three dice were studied also by Galileo Galilei (Todhunter, 1865), in about 1610–1620 in his Sopra le scoperte dei dadi, translated in different ways, for instance, 'Analysis of Dice Games', 'On a Discovery Concerning Dice', 'Concerning an Investigation on Dice', and so on. Actually the word 'scoperte' means the faces of the dice that appear, so we could translate it simply as 'On the Outcomes of Dice'.

Galileo was asked why, playing with three dice, the sums of points 10 or 11 are observed more frequently than the sums of points 9 or 12. Galileo's answer was:

Che nel giuoco de dadi alcuni punti sieno più vantaggiosi di altri, vi ha la sua ragione assai manifesta, la quale è il poter quelli più facilmente e più frequentemente scoprirsi che questi (The fact that in dice games certain outcomes are more advantageous than others has a very clear reason, which is that certain outcomes can appear more easily and more frequently than others.)

For instance, the sum 9 is obtained with the following six triple numbers (triplicità), that is the scoperte of the three dice:

1.2.6.; 1.3.5.; 1.4.4.; 2.2.5.; 2.3.4.; 3.3.3.

Six triple numbers are also necessary to get the sum 10:

1.3.6.; 1.4.5.; 2.2.6.; 2.3.5.; 2.4.4.; 3.3.4.

However the triple 3.3.3., for instance, can be produced by only one throw, while the triple 3.3.4. can be produced by three throws: 3.3.4., 3.4.3., 4.3.3. In conclusion, the sum of points 10 can be produced by 27 different throws, while the sum of points 9 by 25 only.
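Galileo's count can be verified by listing all the ordered throws of three dice. The sketch below enumerates them with expand.grid(); the names throws and ways are ours.

```r
# all 6^3 = 216 ordered throws of three dice
throws <- expand.grid(d1 = 1:6, d2 = 1:6, d3 = 1:6)

# number of ordered throws giving each sum of points
ways <- table(rowSums(throws))
print(ways[c("9", "10", "11", "12")])

# corresponding probabilities for fair dice
print(round(ways[c("9", "10", "11", "12")] / nrow(throws), 4))
```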

Let us jump forwards in time and read the following quotation:

From an urn, in which many black and an equal number of white but otherwise identical spheres are placed, let 20 purely random drawings be made. The case that only black balls are drawn is not a hair less probable than the case that on the first draw one gets a black sphere, on the second a white, on the third a black, etc. The fact that one is more likely to get 10 black spheres and 10 white spheres in 20 drawings than one is to get 20 black spheres is due to the fact that the former event can come about in many more ways than the latter. The relative probability of the former event as compared to the latter is the number 20!/10!10!, which indicates how many permutations one can make of the terms in the series of 10 white and 10 black spheres [...]. Each one of these permutations represents an event that has the same probability as the event of all black spheres.

That is what Ludwig Boltzmann wrote in 1896 in his Vorlesungen über Gastheorie, translated by Brush (1964). It is not impossible to draw 20 black balls, since this extraction has the same probability as any other one, but the number of ways to draw 10 black balls and 10 white balls is far greater than that of all black balls.
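The number quoted by Boltzmann is a binomial coefficient and is easily evaluated. In the sketch below we additionally assume that the urn is so large that every ordered sequence of 20 colours has probability $(1/2)^{20}$; this assumption, and the variable names, are ours.

```r
n_mixed <- choose(20, 10)        # sequences with 10 black and 10 white: 20!/(10! 10!)
n_black <- 1                     # only one sequence is all black

p_seq   <- (1/2)^20              # probability of any single ordered sequence (large urn assumed)
p_mixed <- n_mixed * p_seq       # probability of 10 black and 10 white, in any order
p_black <- n_black * p_seq       # probability of 20 black

print(c(n_mixed = n_mixed, ratio = p_mixed / p_black))
print(c(p_mixed = p_mixed, p_black = p_black))
```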

In his Foundations of Statistical Mechanics, Boltzmann explicitly introduces the postulate of equal a priori probability of microstates being compatible with a given macroscopic state. On this Ansatz, Boltzmann explains why the 'arrow of time' points to the more probable macrostate. We will, so to speak, hold these concepts in our hands by studying the stochastic process 'Ehrenfest's urn model' in the next chapter.

Both Cardano and Galileo found the solution of the problems of three dice, assuming that the possible outcomes are equally possible and counting the chances of compound events. In statistical mechanics, each microstate describes the position and velocity of each molecule. A macrostate is a state description of the macroscopic properties of the system: for instance its pressure, volume and such. Each macrostate is made up of many microstates. To have an idea of microstates and macrostates, let us think of macrostates as the sum of the points of the three dice and of microstates as the number of favourable outcomes. So we say that the 'system' (the system is formed by the three dice) is in the macrostate 10 (that is, the sum of the points is 10), which is realized by 27 microstates. In Boltzmann's example of the 20 balls, each extracted sequence is a microstate, while the number of white (or black) balls is a macrostate.

The hypothesis of equal a priori probability (explicitly stated or implicitly assumed), in its turn, rests on the principle of indifference: equal probabilities have to be assigned to each occurrence if there is no reason to think otherwise. With a small time leap, we learn from Einstein (1925) that, in what will be called 'Bose-Einstein statistics', the microstates are not equally possible, even though there is no reason to consider any one of these microstates either more or less likely to occur than any other (on this subject, see Rosa (1993)).

To conclude this introductory chapter, we notice that in the computer experiments performed with two dice (Code_2_1.R) (the term most used is simulation, and more exactly Monte Carlo simulation), it is possible to experience the notion of probability. In other words, the computer is regarded as something like a 'statistical laboratory' with which probabilistic experiments can be performed, experiments not quite feasible in practice. We will encounter expressions like 'measurements', 'statistical and systematic errors', 'error propagation', and so on, just as in a laboratory experiment.

2.3 Exercises

Exercise 2.1 We have seen in Code_2_1.R, relative to the throw of two dice, that the estimated probability of the double six in 18000 games with 24 throws is $\hat{P}\{A\}_{[24]} \approx 0.4887$, while with 25 throws it is $\hat{P}\{A\}_{[25]} = 0.5113$, a result practically equal to the theoretical one. The difference $\hat{d} = 0.5113 - 0.4887 = 0.0226$ turned out to be significant. Write a code to obtain Fig. 2.2, showing whether the difference of the means is significant with the bootstrap method. You should first read Appendix A if you are not familiar with the bootstrap method.

Exercise 2.2 In the dice game Unders and Overs (U&O), two dice are rolled. Players bet on one of the following alternatives: (1) the result (sum of the dice faces) is below 7, (2) the result is 7, (3) the result is above 7.

In cases (1) and (3) the payoff odds are 1:1, i.e. if you bet £1 the house gives you back your money plus an additional £1. In case (2) the odds are 4:1, i.e. betting £1 you gain £4 (you get £5).

Suppose you bet £1 on outcome (1). What is your expected average win/loss (i.e. in an infinite number of throws)?

Exercise 2.3 Referring to the previous exercise, write a code to simulate a finite game consisting of 10, 100 or 1000 throws. Discuss the result of the simulations, compared to the theoretical win/loss expectation.

Hint: Use the R function sample for sampling a die face, i.e. an integer number from 1:6.

Exercise 2.4 A variant of the U&O game allows the player to bet on up to two alternatives (placing £1 on each one). How does the win/loss expectation change?

Exercise 2.5 Justify the following assertion: 'the house always wins'. Hint: If you can bet £1 on each of the three alternatives, what is the expected outcome?

Exercise 2.6 With reference to Exercise 2.5, compare the theoretical result for an infinite number of throws with those obtained in a small number of them, e.g. 10. Simulating the problem in R, in 100 repetitions how many times do you win and how many do you lose your money?

Introduction to Stochastic Processes

Noi corriamo sempre in una direzione,
ma qual sia e che senso abbia chi lo sa...

We are all headed in one direction,
but which it is and what sense it makes, who knows...

Probability theory is essential to the understanding of many processes (physical, chemical, biological, economic, etc.). By means of random variables, we build models of such processes, that is of systems that evolve over time. We are interested in what happens in the future. If we know the probability distribution until now, how will it be modified if carried forwards, through time? The answer is a matter of stochastic processes. For further reading many books are available on the subject (Feller, 1970; Lawler, 2006; Yates and Goodman, 2015; Jones and Smith, 2018; Grimmett, 2018).

3.1 Basic notion

In dictionaries of classical Greek, the word στοχάζεσθαι (stochazesthai) means 'to aim at something', 'to aim at a target, at a goal', at a στόχος (stóchos). Later, figuratively, 'to aim at something' becomes 'to have something in view', or 'to conjecture', from which στοχαστικός (stochastikós), 'skilful in aiming at', 'able to conjecture'. So the 'target' becomes the 'conjecture'. Conjecture of what? Of something below the apparent chance? Of undisclosed causes? Is there a hidden 'determinism' even in (seemingly) random phenomena? That is the question. The interested reader can refer to Chapter 2, where some historical answers were succinctly recalled.

A stochastic (or random) process is defined as a family of random variables:

$X_1, X_2, \ldots, X_t, \ldots,$

indexed by a parameter $t$ and defined on the same probability space $(\Omega, \mathcal{F}, \mathbb{P})$, formally defined as follows: $\Omega$ is the sample space, i.e. the space of all possible outcomes; $\mathcal{F}$ is a family of subsets of $\Omega$, mathematically defined as a $\sigma$-algebra, with particular properties (for example, that of including the whole sample space and all countable unions of its members) that make $(\Omega, \mathcal{F})$ a measurable space; $\mathbb{P}$ is a probability measure function operating on $\mathcal{F}$, such that $\mathbb{P}(\Omega) = 1$ and $\mathbb{P}(\emptyset) = 0$, $\emptyset$ being the empty set. The index $t$ often, but not always, stands for a time (days, years, seconds, nanoseconds, etc.). The 'time' can also be a non-physical time, as for instance 'Monte Carlo steps'.

Astochasticprocessiswrittenas {Xt; t ∈ T}.Theset T isthe parametricspace, itcanbeasubsetofnaturalnumbersorintegers,thatis T = {0, 1, 2,... },or T = {..., 2, 1, 0, 1, 2,... },or T = {0, 1, 2,...,n}.Inthesecases {Xt} issaidtobe a discrete-timestochasticprocess.If T istherealline R oritssubset,forinstance T =(−∞, ∞),or T =[0, ∞),or T =[a,b),or T =[a,b], {Xt} issaidtobea continuous-timestochasticprocess.

Discrete-timeandcontinuous-timeprocessesessentiallydifferinthetimescale:in theformercaseeventsoccurinapredeterminedsuccessionoftimepoints t1,t2,... , inthelattereventscanoccurateachtimepoint t ofacontinuousrangeofpossible values.

Thename stochasticprocess refersthereforetotwoinherentaspects:theterm ‘process’referstoatimefunction;theadjective‘stochastic’referstorandomness,in thesensethatarandomvariableisassociatedtoeacheventinthetimescale.Insome casesthe stochasticprocess canalsobeassociatedtospaceandnotjusttime.

Timehasanarrow.Theprocesshasthereforeabeforeandanafter,apastanda future.Therealization xt attime t oftherandomvariable Xt issupposedtobecloser toobservations xt 1 and xt+1,ratherthantothosefartherintime.Thismeansthat the chronologicalorder ofobservationsplaysanessentialrole.

Wesaidthatthe Xt’sarerandomvariables.Theyaredefinedonthesameprobabilityspace(Ω, F , P).Theytakevaluesinameasurablespace,whosevaluesarecalled states.Wesaythat‘theprocessattime t isinthestate xi’,ormoresimply,‘the processat t isin i’,tomeanthattherandomvariable Xt hastakenthevalue xi.The setofallvaluestakenbythevariablesoftheprocessiscalledthe statespace andit willbedenotedas S.Wecansaythat‘the system attime t isinthestate i’,orthat it‘occupies’or‘visits’thestate i,if Xt = xi for xi ∈S

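A process is discrete, or is in discrete values, if $\mathcal{S}$ is discrete, that is if $\mathcal{S}$ is countable (finite or infinite): $\mathcal{S} \subseteq \mathbb{N}$ or $\mathcal{S} \subseteq \mathbb{Z}$. The process is continuous, or is in continuous values, if $\mathcal{S} \subseteq \mathbb{R}$. Stochastic processes may therefore be classified into four types: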

1. discrete time and discrete state space,

2. discrete time and continuous state space,

3. continuous time and discrete state space,

4. continuous time and continuous state space.

In other words, discrete or continuous time concerns the domain of the time variable $t$, while discrete or continuous state concerns the domain of $X_t$ for a given $t$.

Let us take a practical example. We are interested in continuously recording the temporal variations of the temperature of a device. For technical reasons, the temperature does not remain constant, but fluctuates within a certain range. Suppose the measurements are read on an analogue scale and, to be specific, suppose also that measurements are down to thousandths of a degree, with precision of the order of 1%. We can consider the measurements expressed in real numbers, even though, obviously, any measurement has a finite number of digits. With such premises, the sequence in time of the random variable ‘temperature’ can be represented as a continuous-time stochastic process in continuous values. If we decide to group the measurements within a range, say, of tenths of a degree, the process is still a continuous-time process, but in discrete values. If we group the data readings within a predefined time interval, for instance every 20 or 60 seconds, the process will be a discrete-time process, in (approximately) continuous values (thousandths of a degree) or discrete values (tenths of a degree).

Further classification concerns the dimension of $\Omega$ and $T$. If the space $\Omega$ has dimension greater than 1, we refer to a multivariate stochastic process; space-time processes are an instance. Since all the variables $X_t$ are defined on the same space $\Omega$, each $X_t$ is a random function of two arguments of different nature: the variable of probabilistic nature $\omega \in \Omega$ indicates the event, while the variable of mathematical nature $t \in T$ creates an order in the family of random variables. The stochastic process $\{X_t;\ t \in T\}$ should, more completely, be written as:

$$\{X(\omega, t);\ \omega \in \Omega,\ t \in T\}$$

to highlight that the particular realization of the stochastic process at time $t$ depends on the particular event $\omega \in \Omega$.

Let us fix $t$, $t = \bar{t}$. Then $X_{\bar{t}}(\omega) \equiv X(\omega, \bar{t})$ is a random variable and, if the possible outcomes of the ‘trial’ are $\omega_1, \omega_2, \dots$, the possible realizations of $X_{\bar{t}}(\omega)$ are given by:

$$X_{\bar{t}}(\omega_1) = x_1,\quad X_{\bar{t}}(\omega_2) = x_2,\ \dots,$$

where the subscript $i$ ($i = 1, 2, \dots$) of the $x_i$'s numbers the different possible realizations of the same random variable $X_{\bar{t}}(\omega)$ at time $t = \bar{t}$.

It is possible to regard $X(\omega, t)$ as a function of $t$, for fixed $\omega = \bar{\omega}$ in $\Omega$, so we have $X_t(\bar{\omega})$. In this case, we consider a particular outcome at times $t = t_1, t = t_2, \dots$, that is:

$$X_{t_1}(\bar{\omega}_1) = x_1,\quad X_{t_2}(\bar{\omega}_2) = x_2,\ \dots, \tag{3.1}$$

where the subscripts $i$ ($i = 1, 2, \dots$) of the $x_i$'s number the time points: $t_1$ at which the event $\bar{\omega}_1$ has occurred, $t_2$ at which the event $\bar{\omega}_2$ has occurred, etc. In this case, for each fixed $\omega$, the sequence $(x_1, x_2, \dots)$ is called realization or history or sample path or trajectory of the process. All the possible sample paths resulting from an experiment constitute an ensemble. Therefore a stochastic process can be regarded as formed by the whole of all its possible realizations. A finite portion of a realization is called a time series:

$$(\dots,\ \underbrace{x_{k+1}, x_{k+2}, \dots, x_{k+t}, \dots, x_{k+n}}_{\text{time series}},\ \dots) \tag{3.2}$$

So, to recap, $X_t(\omega)$ means, depending on the context:

1. $X_t(\omega)$ ($t$ and $\omega$ variable): a family of time-dependent real-valued functions, that is a stochastic process.

2. $X_{\bar{t}}(\omega)$ ($t$ constant and $\omega$ variable): a random variable, that is, by definition, a measurable function defined on a probability space.

3. $X_t(\bar{\omega})$ ($t$ variable and $\omega$ constant): a single mathematical function depending on time.

4. $X_{\bar{t}}(\bar{\omega})$ ($t$ constant and $\omega$ constant): a real number.

Coming back to the example of the temperature measurement of a device, suppose that the measurements are taken every minute and we round the records to tenths of a degree. The stochastic process modelling the time variation of the temperature is therefore a discrete-valued process in discrete time. Suppose we have carried out the measurements for one hour on 30 similar devices ($k = 1, \dots, 30$).

The records of the measurements at every minute will not be the same for the 30 devices, therefore the records will be displayed as in Table 3.1.

Table 3.1 Temperature vs time dependence of 30 devices.

If $t$ is fixed, for instance $t = \bar{t} = 3$, the corresponding column shows the temperature of the 30 devices at time $\bar{t}$. The values $x_1 = 13.9$, $x_2 = 17.1$, ..., $x_{30} = 20.1$ are then 30 realizations of the random variable $X_{\bar{t}}(\omega)$. If $\omega$ is fixed, it means picking a device and following its change in temperature over time. For instance, the sequence $\{18.4, 16.3, 17.1, 17.8, \dots, 14.4, 15.0\}$ shows a possible sample path corresponding to the device $k = 2$. Figure 3.1 shows the temperature as a function of time for the first four devices and it represents four realizations, or time series, of a discrete-valued stochastic process in discrete time.

Fig. 3.1 Realizations of a discrete-valued stochastic process in discrete time. Dashed line: device temperatures at time $t = 3$. The lines between points are to guide the eye.
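The table-reading rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the temperatures are simulated (independent Gaussian fluctuations around 17 degrees, a model chosen here for convenience and not taken from Table 3.1), and the names `table`, `column_t3`, `path_k2` and `value` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

n_devices, n_minutes = 30, 60  # k = 1..30 devices, one reading per minute for one hour

# Hypothetical data (assumption): readings fluctuating around 17 degrees.
raw = 17.0 + 2.0 * rng.standard_normal((n_devices, n_minutes))
table = np.round(raw, 1)  # records rounded to tenths of a degree

# t fixed (third minute): one column holds 30 realizations of the random variable X_t
column_t3 = table[:, 2]

# omega fixed (device k = 2): one row is a sample path (time series) of the process
path_k2 = table[1, :]

# t and omega both fixed: a single real number
value = table[1, 2]

print(column_t3.shape, path_k2.shape, value)
```

Rows play the role of $\omega$ and columns the role of $t$, so the whole array is one finite ensemble of sample paths.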

Introduction to Stochastic Processes

Let us consider some well-known stochastic processes, based on the Bernoulli process, i.e. on a sequence of independent and identically distributed (i.i.d.) random variables:

$$X_1, X_2, \dots, X_i, \dots, \quad \text{with } X_i \sim \mathrm{Bern}(p)$$

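This sequence is called a Bernoulli process $\{X_t;\ t \in \mathbb{N}\}$ if the $X_i$'s are independent of each other and, $\forall i \in \mathbb{N}$, $P\{X_i = 1\} = p$ and $P\{X_i = 0\} = 1 - p$. It is a discrete-valued stochastic process in discrete time. The index set is the set of natural numbers ($T = \mathbb{N}$, or $T = \mathbb{N} \setminus \{0\}$), and the state space $\mathcal{S}$ is the set $\{0, 1\}$.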

Define the random variable sequence:

$$S_t = \sum_{i=1}^{t} X_i$$

where $\{X_t;\ t \in [1, \infty)\}$ is the Bernoulli process. The sequence $\{S_t;\ t \in [1, \infty)\}$ is also a discrete-valued stochastic process in discrete time, with $\mathcal{S} = \mathbb{N}$.

Consider the random variable sequence:

$$Y_t = \frac{S_t}{t}$$

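The sequence $\{Y_t;\ t \in [1, \infty)\}$ is a discrete-time stochastic process, but with continuous values, that is $\mathcal{S} \subseteq \mathbb{R}$ (the values of $Y_t$ lie in $[0, 1]$). We also know, from the law of large numbers for binomial random variables, that the sequence $\{Y_t\}$ converges in probability to $\mathrm{E}[X_i] = p$.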
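A minimal simulation may help to see the three sequences side by side. It assumes that the partial sum is $S_t = X_1 + \dots + X_t$ and that $Y_t = S_t / t$ is the running proportion of successes; the value $p = 0.3$, the number of steps and the variable names are arbitrary choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
p = 0.3      # assumed success probability
n = 10_000   # number of time steps t = 1, ..., n

x = rng.binomial(1, p, size=n)  # Bernoulli process X_1, ..., X_n with values in {0, 1}
s = np.cumsum(x)                # S_t = X_1 + ... + X_t, values in the natural numbers
t = np.arange(1, n + 1)
y = s / t                       # Y_t = S_t / t, real values in [0, 1]

# Y_t after 10, 100 and n steps: it should settle near p as t grows
print(y[9], y[99], y[-1])
```

A single run shows one sample path only; convergence in probability is a statement about the whole ensemble, so repeating the run with other seeds is the natural next check.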

Some general considerations about stochastic processes are now given. A first question is: what does it mean to have a complete knowledge of a process $\{X_t;\ t \in T\}$? We have a complete knowledge of a random variable $X$, and we say that $X$ is known, when we know its repartition (or cumulative distribution) function $F(x) = P\{X \le x\},\ \forall x \in \mathbb{R}$.

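As a consequence, one might say that a stochastic process is known when the repartition function of each variable of the family is known, that is all $F_t(x) = P\{X_t \le x\},\ \forall x \in \mathbb{R}$ and $\forall t \in T$. But this is not enough. We have to know all the joint probability distributions of the variables $X_t$, that is all the double cumulative functions:

$$F_{t_1,t_2}(x_1, x_2) = P\{X_{t_1} \le x_1,\ X_{t_2} \le x_2\}$$

and, in general, all the $n$-tuples:

$$F_{t_1,t_2,\dots,t_n}(x_1, x_2, \dots, x_n) = P\{X_{t_1} \le x_1,\ X_{t_2} \le x_2,\ \dots,\ X_{t_n} \le x_n\}$$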

The family of functions $F_{t_1,t_2,\dots,t_n}(x_1, x_2, \dots, x_n)$ is called the temporal law of the process. For a family of finite-dimensional distributions, and under certain precise conditions, the probabilistic structure of the complete process may be specified. Such a formidable theoretical problem was faced by Kolmogorov in the first half of the 20th century (Kolmogorov, 1950).
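To see why the one-dimensional distribution functions alone do not determine the process, the sketch below estimates two marginal functions and one joint function from a simulated ensemble. It uses the partial sums of a Bernoulli process as the test case (an assumption of this sketch, with $p = 0.5$); the helper names `marginal_cdf` and `joint_cdf` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
p, n_steps, n_paths = 0.5, 10, 20_000  # assumed parameters

# Ensemble of sample paths: row = one outcome omega, column t-1 = time t
x = rng.binomial(1, p, size=(n_paths, n_steps))
s = np.cumsum(x, axis=1)  # s[:, t - 1] holds the partial sum S_t


def marginal_cdf(t, x_val):
    """Ensemble estimate of F_t(x) = P{S_t <= x}."""
    return np.mean(s[:, t - 1] <= x_val)


def joint_cdf(t1, x1, t2, x2):
    """Ensemble estimate of F_{t1,t2}(x1, x2) = P{S_t1 <= x1, S_t2 <= x2}."""
    return np.mean((s[:, t1 - 1] <= x1) & (s[:, t2 - 1] <= x2))


f1 = marginal_cdf(5, 2)
f2 = marginal_cdf(10, 4)
f12 = joint_cdf(5, 2, 10, 4)

# If S_5 and S_10 were independent, f12 would be close to f1 * f2; it is not.
print(f1, f2, f12, f1 * f2)
```

The gap between `f12` and `f1 * f2` reflects the dependence between $S_5$ and $S_{10}$, which the marginal functions cannot reveal.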

For practical purposes, it is advantageous to search for values summarizing the main properties of the process. Such summaries are the finite-order moments, in particular the first- and second-order moments. Suppose that such moments exist for each $X_t$.

The first-order moment, the expected value, is the mean of each $X_t$, defined as:

$$\mu_t = \mathrm{E}[X_t]$$

which, in general, is different for each $t$. The second central moment, or autocovariance function, at the lag $k$, is defined as:

$$\gamma_{t,t-k} = \mathrm{Cov}[X_t, X_{t-k}] = \mathrm{E}[(X_t - \mu_t)(X_{t-k} - \mu_{t-k})] \tag{3.4}$$

The autocovariance function represents the covariance between the random variables $X_t$ and $X_{t-k}$, $k = 0, 1, 2, \dots$, that is the covariance of the process with itself at pairs of time points. This function measures the joint variation of $X_t$ and $X_{t-k}$, either in the same direction (positive values of $\gamma_{t,t-k}$) or in the opposite direction (negative values of $\gamma_{t,t-k}$), at the lags $k = 0, 1, 2, \dots$

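Remark 3.1 In bivariate, or generally multivariate processes, covariances are named ‘cross-covariances’, when the dependence of one process on another (or more than one) is investigated. For instance, for two processes $\{X_t\}$ and $\{Y_t\}$, the cross-covariance is given by:

$$\gamma^{XY}_{t,t-k} = \mathrm{Cov}[X_t, Y_{t-k}] = \mathrm{E}[(X_t - \mu^{X}_t)(Y_{t-k} - \mu^{Y}_{t-k})]$$

with a slight change of symbols to indicate a bivariate process.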

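If $k = 0$, from eqn (3.4):

$$\gamma_{t,t} = \mathrm{Var}[X_t] = \mathrm{E}[(X_t - \mu_t)^2] \tag{3.6}$$

is the variance of the process, also denoted by $\sigma_t^2$.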

The (linear) autocorrelation function $\rho_{t,t-k}$ at the lag $k$ is given by normalizing the autocovariance $\gamma_{t,t-k}$, eqn (3.4), to the variance (3.6):

$$\rho_{t,t-k} = \frac{\gamma_{t,t-k}}{\sigma_t\,\sigma_{t-k}}$$

The function $\rho_{t,t-k}$ is dimensionless and does not vary by interchanging $X_t$ and $X_{t-k}$. It reaches its maximum when $k = 0$. If $X_t$ and $X_{t-k}$ are linearly independent (uncorrelated), $\rho_{t,t-k}$ is equal to 0, while it is $+1$ or $-1$ in the presence of a perfect correlation (there is an exact overlapping when time is shifted by $k$) or a perfect anti-correlation, respectively. The quantities $\rho_{t,t-k}$ and $\gamma_{t,t-k}$ are essential to characterize stochastic processes; indeed, they measure the internal structure of processes and their memory.
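As a worked illustration, the mean, the autocovariance and the autocorrelation can be estimated by averaging over an ensemble of sample paths. The sketch below assumes the partial-sum process $S_t = X_1 + \dots + X_t$ of a Bernoulli process with $p = 0.3$, for which the exact values are known: $\mu_t = tp$, $\gamma_{t,t-k} = (t-k)\,p(1-p)$ and $\rho_{t,t-k} = \sqrt{(t-k)/t}$. The autocorrelation is taken here as the autocovariance divided by the product of the two standard deviations, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
p, n_steps, n_paths = 0.3, 50, 50_000  # assumed parameters

# Ensemble of partial-sum paths: s[:, t - 1] holds S_t for every sample path
s = np.cumsum(rng.binomial(1, p, size=(n_paths, n_steps)), axis=1)


def autocov(t, k):
    """Ensemble estimate of Cov[S_t, S_{t-k}]."""
    a = s[:, t - 1]
    b = s[:, t - k - 1]
    return np.mean((a - a.mean()) * (b - b.mean()))


def autocorr(t, k):
    """Autocovariance normalized by the standard deviations at t and t-k."""
    return autocov(t, k) / np.sqrt(autocov(t, 0) * autocov(t - k, 0))


t, k = 40, 10
mu_hat = s[:, t - 1].mean()
gamma_hat = autocov(t, k)
rho_hat = autocorr(t, k)

# Estimate next to the exact value for this particular process
print(mu_hat, t * p)
print(gamma_hat, (t - k) * p * (1 - p))
print(rho_hat, np.sqrt((t - k) / t))
```

Both $\mu_t$ and $\gamma_{t,t-k}$ depend on $t$ here, not only on the lag $k$, which is why the two-index notation is needed in general.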
