R Programming for Actuarial Science
Peter McQuire, University of Kent
Canterbury, UK
Alfred Kume, University of Kent
Canterbury, UK
This edition first published 2024 © 2024 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions
The right of Peter McQuire and Alfred Kume to be identified as the authors of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
A catalogue record for this book is available from the Library of Congress
Hardback ISBN: 9781119754978; ePub ISBN: 9781119754992; ePDF ISBN: 9781119754985; oBook ISBN: 9781119755005
Cover Design: Wiley
Cover Image: © Peter McQuire
Set in 9.5/12.5pt STIX Two Text by Integra Software Services Pvt. Ltd, Pondicherry, India
To my wife, Jenny, and daughter, Lauren, for their constant support and encouragement. (Peter McQuire)
To my wife Ortenca, for her support throughout the process. (Alfred Kume)
About the Companion Website
Introduction
1 Main Objectives of This Book
2 Who Is This Book For?
3 How to Use This Book
4 Book Structure
5 Chapter Style
6 Examples and Exercises
7 Verification of Code and Calculations – Best Practice
8 Website: www.wiley.com/go/rprogramming.com
9 R or Microsoft Excel?
10 Caveats
11 Acknowledgements
1 R: What You Need to Know to Get Started
1.1 Introduction
1.2 Getting Started: Installation of R and RStudio
1.2.1 Installing R
1.2.2 What Is RStudio?
1.2.3 Inputting R Commands
1.3 Assigning Values
1.4 Help in R
1.5 Data Objects in R
1.6 Vectors
1.6.1 Numeric Vectors
1.6.2 Logical Vectors
1.6.3 Character Vectors
1.6.4 Factor Vectors
1.7 Matrices
1.8 Dataframes
1.9 Lists
1.10 Simple Plots and Histograms
1.11 Packages
1.12 Script Files
1.13 Workspace, Saving Objects, and Miscellany
1.14 Setting Your Working Directory
1.15 Importing and Exporting Data
1.15.1 Importing Data
1.15.2 Exporting Data
1.16 Common Errors Made in Coding
1.17 Next Steps
1.18 Recommended Reading
1.19 Appendix: Coercion
2 Functions in R
2.1 Introduction
2.1.1 Objectives
2.1.2 Core and Package Functions
2.1.3 User-Defined Functions
2.2 An Introduction to Applying Core and Package Functions
2.2.1 Examples of Simple, Common Functions
2.3 User-Defined Functions
2.3.1 What does a “udf” consist of?
2.3.2 Naming Conventions
2.3.3 Examples and Exercises
2.4 Using Loops in R - the “for” Function
2.5 Integral Calculus in R
2.5.1 The “Integrate” Function
2.5.2 Numerical Integration
2.6 Recommended Reading
3 Financial Mathematics (1): Interest Rates and Valuing Cashflows
3.1 Introduction
3.2 The Force of Interest
3.3 Present Value of Future Cashflows
3.4 Instantaneous Forward Rates and Spot Rates
3.5 Non-Constant Force of Interest
3.5.1 Discrete Cashflows
3.5.2 Cashflows Which Are Continuous
3.6 Effective and Nominal Rates of Interest
3.6.1 Effective Rates of Interest
3.6.2 Why Do We Use Effective Rates?
3.6.3 Nominal Interest Rates
3.7 Appendix: Force of Interest – An Analogy with Mortality Rates
3.8 Recommended Reading
4 Financial Mathematics (2): Miscellaneous Examples
4.1 Introduction
4.2 Writing Annuity Functions
4.2.1 Writing a function for an annuity certain
4.3 The ‘presentValue’ Function
4.4 Annuity Function
4.5 Bonds – Pricing and Yield Calculations
4.6 Bond Pricing: Non-Constant Interest Rates
4.7 The Effect of Future Yield Changes on Bond Prices Throughout the Term of the Bond
4.8 Loan Schedules
4.8.1 Introduction
4.8.2 Method 1
4.8.3 Method 2
4.9 Recommended Reading
5 Fundamental Statistics: A Selection of Key Topics – Dr A Kume
5.1 Introduction
5.2 Basic Distributions in Statistics
5.3 Some Useful Functions for Descriptive Statistics
5.3.1 Introduction
5.3.2 Bivariate or Higher Order Data Structure
5.4 Statistical Tests
5.4.1 Exploring for Normality or Any Other Distribution in the Data
5.4.2 Goodness-of-fit Testing for Fitted Distributions to Data
5.4.2.1 Continuous distributions
5.4.2.2 Discrete distributions
5.4.3 T-tests
5.4.3.1 One sample test for the mean
5.4.3.2 Two sample tests for the mean
5.4.4 F-test for Equal Variances
5.5 Main Principles of Maximum Likelihood Estimation
5.5.1 Introduction
5.5.2 MLE of the Exponential Distribution
5.5.2.1 Obtaining the MLE numerically using R
5.5.2.2 Obtaining the MLE analytically
5.5.3 Large Sample (Asymptotic) Properties of MLE
5.5.4 Fitting Distributions to Data in R Using MLE
5.5.5 Likelihood Ratio Test, LRT
5.6 Regression: Basic Principles
5.6.1 Simple Linear Regression
5.6.2 Quantifying Uncertainty on β̂
5.6.3 Analysis of Variance in Regression
5.6.3.1 R² and adjusted R² Coefficient of Determination
5.6.4 Some Visual Diagnostics for the Proposed Simple Regression Model
5.7 Multiple Regression
5.7.1 Introduction
5.7.2 Regression and MLE
5.7.2.1 Multivariate Regression
5.7.3 Tests
5.7.3.1 Likelihood Ratio Test in Regression
5.7.3.2 Akaike Information Criterion: AIC
5.7.3.3 AIC and Regression model selection
5.7.3.4 Bayesian Information Criterion: BIC
5.7.4 Variable Selection, Finding the Most Appropriate Sub-Model
5.7.5 Backward Elimination
5.7.6 Forward Selection
5.7.7 Using AIC/BIC Criteria
5.7.8 LRT in Model Selection
5.7.9 Automatic Search Using R-squared Criteria
5.7.10 Concluding Remarks on Test Data
5.7.11 Modelling Beyond Linearity
5.8 Dummy/Indicator Variable Regression
5.8.1 Introducing Categorical Variables
5.8.2 Continuous and Indicator Variable Predictors – Including Load in the Model
5.9 Recommended Reading
6 Multivariate Distributions, and Sums of Random Variables
6.1 Multivariate Distributions – Examples in Finance
6.2 Simulating Multivariate Normal Variables
6.3 The Summation of a Number of Random Variables
6.4 Conclusion
6.5 Recommended Reading
7 Benefits of Diversification
7.1 Introduction
7.2 Background
7.3 Key Mathematical Ideas
7.4 Running Simulations
7.5 Recommended Reading
8 Modern Portfolio Theory
8.1 Introduction
8.2 2-Asset Portfolio
8.3 3-Asset Portfolio
8.4 Introduction of a Risk-free Asset to the Portfolio
8.4.1 Adding a Risk-free Asset
8.4.2 Capital Market Line and the Sharpe Ratio
8.4.3 Borrowing to Obtain Higher Returns
8.5 Appendix: Lagrange Multiplier Method
8.6 Recommended Reading
9 Duration – A Measure of Interest Rate Sensitivity
9.1 Introduction
9.2 Duration – Definitions and Interpretation
9.3 Duration Function in R
9.4 Practical Applications of Duration
9.5 Recommended Reading
10 Asset-Liability Matching: An Introduction
10.1 Introduction
10.2 What Interest Rates Do Institutions Use To Measure Their Liabilities?
10.3 Variance of the Solvency Position
10.4 Characteristics of Various Asset Classes and Liabilities
10.5 Our Scenarios
10.6 Results
10.7 Simulations
10.8 Exercise and Discussion – an Insurer With Predominately Short-Term Liabilities
10.9 Potential Exercise
10.10 Conclusions
10.11 Recommended Reading
11 Hedging: Protecting Against a Fall in Equity Markets
11.1 Introduction
11.2 Our Example
11.2.1 Futures Contracts – A Brief Explanation
11.2.2 Our Task
11.3 Adopting a Better Hedge
11.4 Allowance for Contract and Portfolio Sizes
11.5 Negative Hedge Ratio
11.6 Parameter and Model Risk
11.7 A Final Reminder on Hedging
11.8 Recommended Reading
12 Immunisation – Redington and Beyond
12.1 Introduction
12.2 Outline of Redington Theory and Alternatives
12.3 Redington’s Theory of Immunisation
12.4 Changes in the Shape of the Yield Curve
12.5 A More Realistic Example
12.5.1 Determining a Suitable Bond Allocation
12.5.2 Change in Yield Curve Shape
12.5.3 Liquidity Risk
12.6 Conclusion
12.7 Recommended Reading
13 Copulas
13.1 Introduction
13.2 Copula Theory – The Basics
13.3 Commonly Used Copulas
13.3.1 The Independent Copula
13.3.2 The Gaussian Copula
13.3.3 Archimedean Copulas
13.3.4 Clayton Copula
13.3.5 Gumbel Copula
13.4 Copula Density Functions
13.5 Mapping from Copula Space to Data Space
13.6 Multi-dimensional Data and Copulas
13.7 Further Insight into the Gaussian Copula: A Non-rigorous View
13.8 The Real Power of Copulas
13.9 General Method of Fitting Distributions and Simulations – A Copula Approach
13.9.1 Fitting the Model
13.9.2 Simulating Data Using the mvdc and rMvdc Functions
13.10 How Non-Gaussian Copulas Can Improve Modelling
13.11 Tail Correlations
13.12 Exercise (Challenging)
13.13 Appendix 1 – Copula Properties
13.14 Appendix 2 – Rank Correlation and Kendall’s Tau, τ
13.15 Recommended Reading
14 Copulas – A Modelling Exercise
14.1 Introduction
14.2 Modelling Future Claims
14.2.1 Data
14.2.2 Fitting Appropriate Marginal Distributions
14.2.3 Fitting The Copula
14.2.4 Assessing Risk From the Analysis of Simulated Values
14.2.5 Comparison with the Gaussian Copula Model
14.2.6 Comparison of the Models with the Data
14.3 Another Example: Banking Regulator
14.4 Conclusion
15 Bond Portfolio Valuation: A Simple Credit Risk Model
15.1 Introduction
15.2 Our Example Bond Portfolio
15.2.1 Description
15.2.2 The Transition Matrix
15.2.3 Correlation Matrix
15.2.4 Simulations and Results
15.2.5 Incorporating Interest Rate Risk – A Simple Adjustment
15.2.6 Portfolio Consisting of Highly Correlated Bonds
15.3 Further Development of this Model
15.4 Recommended Reading
16 The Markov 2-State Mortality Model
16.1 Introduction
16.2 Markov 2-State Model
16.3 Simple Applications of the 2-State Model
16.4 Estimating Mortality Rates from Data
16.5 An Example: Calculating Mortality Rates for One Age Band
16.6 Uncertainty in Our Estimates
16.7 Next Steps?
16.8 Appendices
16.8.1 Informal Discussion of ��
16.8.2 Intuitive meaning of ����(��)
16.9 Recommended Reading
17 Approaches to Fitting Mortality Models: The Markov 2-state Model and an Introduction to Splines
17.1 Introduction
17.2 Graduation of Mortality Rates
17.3 Fitting Our Data
17.3.1 Objective
17.3.2 Summarised Data
17.4 Model Fitting with Least Squares
17.5 Individual Member Data
17.6 Comparing Life Tables with a Parametric Formula
17.7 Splines: An Introduction
17.7.1 Overview
17.7.2 Data
17.7.3 Fitting the Model: Spline regression
17.7.4 Adjusted Dataset
17.8 Summary
17.9 Recommended Reading
18 Assessing the Suitability of Mortality Models: Statistical Tests
18.1 Introduction
18.2 Theory
18.3 Our Mortality Data and Various Proposed Mortality Rates
18.4 Testing the Standard Table Rates – Table 1, ��s1 x
18.4.1 Data and initial plot
18.4.2 χ² test
18.4.3 Signs Test – for Overall Bias
18.4.4 Serial Correlations Test; Testing for Bias Over Age Ranges
18.4.5 Analysing the Distribution of Deviances
18.4.6 logL, AIC Calculations
18.4.7 Conclusions on ��s1 x
18.5 Graduation of Mortality Rates by Adjusting a Standard Table
18.5.1 Testing Table 2, ��s2 x
18.5.2 Adjusting Table 2
18.6 Testing Graduated Rates Obtained from a Parametric Formula, ��par x
18.7 Comparing Our Candidate Rates
18.8 Over-fitting
18.9 Other Thoughts
18.10 Appendix – Alternative Calculations of LogL’s
18.11 Recommended Reading
19 The Lee-Carter Model
19.1 Introduction
19.2 Using the L-C Model to Create Data and Fit the Model
19.2.1 Introducing the Lee-Carter Model
19.2.2 Calculating the Parameter Values
19.2.3 Interpretation of a_x, b_x, and k_t
19.3 Using L-C to Model Actual Mortality Data from HMD
19.4 Using the lca Function in the Demography Package
19.5 Constructing Your Own Demogdata Object
19.6 Forecasting Mortality Rates
19.7 Case Study: The Impact of the HIV Virus on Mortality Rates
19.8 Recommended Reading
20 The Kaplan-Meier Estimator
20.1 Introduction
20.2 What Is Censoring?
20.2.1 Non-Informative Censoring
20.3 Defining the Relevant Event
20.4 K-M Theory
20.5 Introductory Example: Monitoring Delays in Making Claim Payments
20.6 Lung Cancer Example
20.6.1 Basic Results
20.6.2 Comparison of Male and Female Rates
20.6.3 Doctor Assessment Scores – ph.ecog
20.7 Issues with the Kaplan-Meier Model
20.8 Recommended Reading
21 Cox Proportionate Hazards Regression Model
21.1 Introduction
21.2 Cox Model Equation
21.3 Applications
21.3.1 Smokers’ Mortality: Small Data Set
21.3.2 Smokers’ Mortality: Larger Data Set
21.3.3 Multiple covariates and interactions
21.4 Comparison of Cox and Kaplan Meier Analyses of Lung Cancer Data
21.5 Recommended Reading
22 Markov Multiple State Models: Applications to Life Contingencies
22.1 Introduction
22.2 The Markov Property
22.3 Markov Chains and Jump Models
22.3.1 Examples
22.3.2 Differences between Markov Chain and Markov Jump Models
22.4 Markov Chains (Discrete Time)
22.4.1 Applying Markov Chains to Estimate Future Probabilities
22.4.2 Markov Chain Model - NCD
22.4.3 Coding Exercise for Markov Chains
22.5 Markov Jump Models
22.5.1 Example - Simple 3-State Model (All Transitions Possible)
22.5.2 Example – H-S-D Model
22.6 Non-Constant Rates
22.7 Premium Calculations
22.8 Transition Rate Estimation
22.9 Multiple Decrement Models
22.9.1 Introduction
22.9.2 Using a Numerical Approach for the above Fixed Rate Problems
22.9.3 An Exact Approach
22.9.4 Age-Dependent Rates
22.10 Recommended Reading
23 Contingencies I
23.1 Introduction
23.2 What is Meant by “Contingencies” in an Actuarial Context?
23.3 The Life Table
23.4 Expected Present Values of the Key Contingency Functions
23.5 Writing Our Own Code – Some Introductory Exercises
23.6 The Lifecontingencies Package
23.6.1 The Lifetable and Actuarialtable Objects
23.6.2 Application to Actual Mortality Tables: AM92 and AF92
23.6.3 Annuities
23.6.4 Annuities Paid more Frequently than Annually
23.6.5 Increasing Annuities
23.6.6 Reversionary Annuities
23.6.7 Example: Annuity Company Valuation
23.6.8 Life Assurance functions
23.6.9 Assurance Policies with immediate Payment on Death: Ax
23.7 Simulation of Future Lifetimes
23.8 Recommended Reading
24 Contingencies II
24.1 Introduction
24.2 Mortality Tables: AM92
24.3 Uncertainty in Present Values: Variance
24.4 Simulations
24.4.1 Single Policy
24.4.2 Portfolios with 100 Policies – Portfolio Claim Distribution from Simulations
24.5 Simulation of Annuities
24.6 Premium Calculations
24.7 Profits – Probability Distributions of Single Policies and Portfolios
24.8 Progression of expected profits throughout the lifetime of a policy: no reserves held
24.9 Policy Values
24.9.1 Calculating Policy Values
24.9.2 Recursive Formulae – Discrete and Continuous (Thiele)
24.9.3 Recursive Equation with 3 States – HSD Model
24.10 Profits from Policies where Reserves Are Held
24.10.1 Calculating the Profit Vector
24.10.2 Measures of Profit and Profit Testing
24.11 Profit Uncertainty: Interest Rate and Mortality Risk
24.12 Risk Capital and Risk-adjusted Return Measures
24.13 Unit-linked Policies
24.13.1 Introduction
24.13.2 Example with Deterministic and Stochastic Projections
24.14 Additional Exercises
24.15 Appendix: Dependent and Independent Rates
24.16 Recommended Reading
25 Actuarial Risk Theory – An Introduction: Collective and Individual Risk Models
25.1 Introduction
25.2 Collective Risk Model
25.3 Poisson Compound Collective Risk Model
25.4 Applications of the Model
25.4.1 Setting Appropriate Reserves and Premium Pricing
25.4.2 Increasing the Number of Independent Policies
25.4.3 Adopting a Normal Distribution Approximation
25.4.4 Return on Capital
25.4.5 Skewness of the Compound Poisson Model
25.4.6 Sum of Compound Poisson Distributions
25.5 Compound Binomial Collective Risk Model
25.6 Compound Negative Binomial Distribution
25.7 Panjer’s Recursion Formula
25.8 Closing Thoughts on Collective Risks Models
25.9 Individual Risk Model
25.9.1 Standard Individual Risk Model
25.9.2 Alternative Model – ‘The Poisson Individual Risk Model’
25.10 Issues with Heterogeneity
25.11 Policies Which Are Not Independent
25.12 Incorporating Parameter Uncertainty in the Models
25.13 Claim Amount Distributions: Alternatives to the Gamma Distribution
25.14 Conclusions
25.15 Recommended Reading
26 Collective Risk Models: Exercise
26.1 Introduction
26.2 Analysis of Claims Data
26.3 Running Simulations
26.4 Tails of the Distribution
26.5 Allowing for Parameter Uncertainty
26.6 Conclusions
26.7 Recommended Reading
27 Generalised Linear Models: Poisson Regression
27.1 Introduction
27.2 Examples/Exercises/Data
27.3 Brief Recap on Multiple Linear Regression
27.4 Generalised Linear Models (“GLMs”)
27.5 Goodness of Fit of GLMs
27.6 Poisson Regression
27.6.1 Introduction
27.6.2 Using Poisson Regression to Model Claim Numbers
27.7 Data with Varying Exposure Periods
27.7.1 Claim Rates and the Offset
27.7.2 Application to Aggregated Data in Section 27.1
27.8 Categorical and Continuous Variables
27.8.1 Problem with Continuous Variables
27.8.2 Categorical Variables
27.9 Interaction between Variables
27.10 Over-dispersion
27.11 Miscellaneous Exercises
27.12 Further Study/Next Steps
27.13 Recommended Reading
28 Extreme Value Theory
28.1 Introduction
28.2 Why Use EVT?
28.3 Generalised Pareto Distribution – “GPD”
28.4 EVT Analysis of Historic Daily Equity Market Returns (S&P 500)
28.4.1 Basic EVT Analysis
28.4.2 Will a Normal Distribution (and Other Alternatives) Do Just as Well?
28.5 Data for Further EVT Analysis
28.6 Recommended Reading
29 Introduction to Machine Learning: k-Nearest Neighbours (kNN)
29.1 Introduction
29.2 Example 1 – Identifying a Fruit Type
29.2.1 Data
29.2.2 Overview of the Process
29.2.3 How does the kNN Algorithm Work?
29.2.4 Normalising Our Data
29.2.5 Varying k
29.2.6 Using Our Model
29.3 Analysis of Our Model – the Confusion Matrix
29.4 Example 2 – Cancer Diagnoses
29.5 Conclusion
29.6 Recommended Reading
30 Time Series Modelling in R – Dr A Kume
30.1 Introduction
30.2 Linear Regression Versus Autoregressive Model
30.3 Three Components for Time Series Modelling
30.4 Stationarity
30.5 Main Tools in R for ARIMA Modelling
30.5.1 PACF as a Derivation of ACF and Their General Behaviour for ARMA(p,q) Models
30.5.2 How to Simulate and Obtain the Theoretical Values of ACF and PACF for ARMA Models
30.6 Identifying a Set of Possible Models for the Data Including the Order of Differencing
30.6.1 Model Fitting to Time Series Data
30.6.2 Parameter Estimation for Pure Auto-Regressive Models
30.6.3 Diagnostic Plots
30.6.4 Forecasting
30.7 Dealing with Real Data far from Stationary
30.7.1 Non Parametric Approaches
30.7.2 Airline Data Modelling Using Multiplicative Seasonal Models
30.8 Recommended Reading
31 Volatility Models – GARCH
31.1 Introduction
31.2 Why Use GARCH Models?
31.3 Outline of the Chapter
31.4 Key Theoretical Concepts with GARCH
31.5 Simulation of Data Using a GARCH Model
31.6 Fitting a GARCH Model to Data, and Analysis
31.6.1 Fitting a GARCH Model
31.6.2 Further Analysis of the Data; Comparison with the Normal Distribution
31.6.3 Further Analysis of the Data; Volatility Clustering
31.7 A Note on Correlation and Dependency
31.8 GARCH Long-Term Variance
31.9 Exercise: Shocks to Global Equity Markets – The Global Financial Crisis 2008, and COVID-19
31.10 Extensions to the GARCH Model
31.11 Appendix – A Mixture of Normal Distributions
31.12 Recommended Reading
32 Modelling Future Stock Prices Using Geometric Brownian Motion: An Introduction
32.1 Introduction
32.1.1 Discrete Gaussian Random Walk
32.2 Geometric Brownian Motion
32.3 Applications of GBM, and Simulating Prices
32.4 Recommended Reading
33 Financial Options: Pricing, Characteristics, and Strategies
33.1 Introduction
33.2 What is a Financial Option?
33.3 What are Financial Options Used for?
33.4 Black, Scholes and Merton Differential Equation
33.4.1 Assumptions Underlying B-S-M Formulation
33.4.2 Solution to B-S-M Equation for European Call Options
33.4.3 Call Option Price Function
33.5 Calculating the Option Price Using Simulations
33.6 Factors Which Affect the Price of a Call Option
33.6.1 Share Price
33.6.2 Time to Expiry
33.6.3 Combined Effect of Share Price and Time to Expiry
33.6.4 Other Factors
33.7 Greeks
33.8 Volatility of Call Option Positions
33.9 Put Options
33.10 Delta Hedging
33.11 Sketch of the B-S-M Derivation
33.12 Further Tasks
33.13 Appendix
33.14 Recommended Reading
Index
Introduction
1 Main Objectives of This Book
The overriding objective of this book is to help students of actuarial mathematics and related disciplines, such as financial mathematics, develop programming skills which will enhance their understanding of actuarial, financial, and statistical concepts, enabling them to solve real-world problems encountered in these fields. Breaking this down further, the purpose of the book is two-fold:
1. To provide an introduction to the programming language, R. This is achieved through worked examples and exercises of the kind commonly seen in the fields of actuarial and financial mathematics.
2. To improve the reader’s level of understanding of actuarial and financial topics by using these programming skills. We believe that most students can develop a deeper understanding of mathematical material by solving problems using a programming language. In our experience of teaching actuarial mathematics and statistics, students often confirm that their understanding of a topic has vastly improved following the completion of a computer-based exercise or project.
A similar effect is noted in students who opt to take a year out from their studies to work in the financial industry, often applying extensive programming skills to solve real-world problems. Such students invariably notice a similar level of improvement in their understanding of concepts. It is hoped that this learning experience can, to some extent, be mirrored throughout this book.
The authors have significant teaching experience at both undergraduate and postgraduate levels, enhanced with experience in assessment processes for universities and the actuarial profession. This has given insights into the typical issues students experience with actuarial mathematics – problems often arise from a fundamental misunderstanding of introductory material. For example, a final-year undergraduate may only fully understand a concept introduced in their first year whilst undertaking programming coursework in their final year on a specific application of that material, experiencing that “Eureka” moment.
The reader should not underestimate the extent to which learning a programming language, such as R, to a level such that most exercises in this book can be completed, will help the reader in the employment market. Having a good working knowledge of R or similar languages should improve the career prospects of the graduate.
A further motivating factor for writing this book originates from the decision taken by the Institute and Faculty of Actuaries (IFoA) in 2018 to choose the R programming language as an integral part of its syllabus. Indeed, much of the IFoA’s syllabi for subjects CM1, CM2, CS1, and, in particular, CS2 is covered in the book.
2 Who Is This Book For?
This book is aimed at two main groups:
1. It has been written principally for university-level actuarial and financial maths students, together with graduates undertaking professional actuarial exams (e.g. with the IFoA and SOA), and more generally for anyone aspiring to a career in actuarial mathematics and finance. The book should be useful to the student throughout their studies, whether first-year undergraduate or postgraduate, spanning topics from fundamentals of financial mathematics and Brownian motion, to a variety of mortality models and analysing investment strategies such as asset–liability matching and hedging.
2. Secondly, we hope the book appeals to more experienced professionals in related disciplines wishing to develop skills in a programming language, who may have had limited opportunities to do so earlier in their career. By undertaking examples and exercises related to material with which they are already familiar, such readers should find this book an efficient route to acquiring programming skills. Such users of this book may therefore wish to review Chapters 3, 4, 23, and 25, which include traditional material most actuaries would be familiar with.
In writing this book, we have attempted to cater for a wide range of experiences and abilities. The overall style of the book aims to ensure that the basics of each topic are covered, with appropriate text, examples, and exercises, whilst including several more advanced tasks. As noted elsewhere, the reader should aim to expand on the tasks included in this book.
It is assumed that the reader will have a knowledge of statistics and mathematics at the level expected of a first-year undergraduate in a maths-based university degree.
The book includes the majority of topics covered in a typical undergraduate course in actuarial science. There is also perhaps a greater emphasis placed on a number of actuarial concepts which may not directly be assessed in traditional university courses; indeed, several examples involve addressing practical problems which the student will see in the workplace. For example, we introduce models which may help in improving how correlations are dealt with by insurance companies, and develop an understanding of fundamental risk management techniques such as hedging, asset-liability matching, and diversification. Ultimately, we hope the reader develops a good understanding of the problem-solving approaches used in the workplace.
3 How to Use This Book
To get the most from this book it is anticipated that during each study session the user will simultaneously:
● study the material in this book,
● access the book’s website (code and data), and
● write and run code on their computer.
It is expected that the user will then write their own code and reproduce the results. The suggested code for each example/exercise is one of many possible solutions; it may be quite reasonable, depending on the scenario, for your code to be quite different to that set out in this book. It is important that the user practises writing their own, independent code, and does not try to learn, by rote, the code in the book. As noted in Chapter 1, the reader may wish to save a script file in respect of each chapter. Indeed, the reader may wish to write functions incorporating and combining several sections of code from the website, improving the efficiency of their code.
We would expect most users to have had some prior exposure to, and knowledge of, the material in a chapter before embarking on it, either following an initial period of independent study, or attendance at related university lectures or tutorials; it is anticipated that readers will have access to alternative study material for each topic.
The website contains the majority of the R code included in the book, together with suggested code relating to the exercises. It is intended that the reader will treat the book and website as companions; it is not expected that most users will use the book and website separately (for the most part at least). Note that a small amount of code is not included on the website (the missing code can simply be copied from the book) – this is to encourage more active learning of the material.
The vast majority of students will gain most benefit from frequent practice of writing code; occasional engagement is likely to end in less satisfactory results. It is hoped that the style of the book will lend itself to encouraging a greater level of creativity from the student, developing their own examples and exercises as their skills and knowledge increase.
4 Book Structure
We start by covering the fundamentals of R in Chapter 1, “R: What you need to know to get started”, and Chapter 2, “Functions in R”. If you are new to R we recommend that you first read these two chapters, and revisit them when required. Chapter 1 explains the key aspects of R, e.g. writing your first code in R, how objects are used etc. From experience, most students find it beneficial to initially read this chapter relatively quickly, referring back to it frequently. Readers new to R should benefit from spending some time digesting the examples in Chapter 2 to get a feel for writing basic R code and applying existing functions.
The typical actuarial and financial mathematics student is then likely to cover Chapters 3 and 4 – “Financial Mathematics I” (and “II”); the material included in these two chapters is usually covered in the first year of actuarial mathematics programmes at university.
As noted above, we think most readers will benefit from only a relatively brief study of Chapters 1 and 2 before moving on to the main chapters and starting to practise! It is unlikely to be beneficial to spend days memorising the material in these introductory chapters.
Most chapters are largely self-contained, with a few obvious exceptions, e.g. Financial Mathematics I and II, Contingencies I and II, the chapters on copulas, and Markov mortality models. There is a certain amount of grouping of chapters where the material is strongly related, and it is likely that most readers will tend to read a group of topics together.
A number of chapters lend themselves particularly to a relatively brief initial study, with the reader subsequently re-visiting them when studying a later chapter which uses that material. For example, the material in Chapters 5 and 6 is applied in several later chapters of the book.
5 Chapter Style
Most chapters begin by setting key objectives and giving a broad discussion of the main ideas behind the topic of the chapter. This is usually followed by a certain amount of theory, the length of which is based on our experience of how well students generally tend to grasp the concepts. Compared to other texts, there will, in general, be less theory included in this book. Many topics covered in this book already have a wealth of excellent texts – repetition of the same theory is not warranted here. The importance of mathematical rigour should be stressed at this point; the student will benefit greatly over the long term by developing a deeper understanding of the material which can be adapted to various scenarios (such comparisons are highlighted in several chapters of the book). Each chapter ends with a Recommended Reading list.
As noted in Section 3, most readers will require additional principal reading material on each topic to supplement the material in this book. This book focuses on solving actuarial problems by using the R programming language, and is not intended to be used as a student’s sole source of learning for each subject.
The reader may also find it beneficial to own a copy of the “Formulae and Tables” issued by the Institute and Faculty of Actuaries (2002) (also freely available online at the time of publishing).
6 Examples and Exercises
There are over 400 examples and exercises included in this book which readers can use to develop their programming skills and understanding of the mathematical concepts. The book includes two main types of tasks:
1. Analysis of datasets (such as claims data, investment data, mortality data etc.), fitting various models to data, and testing the results. You will find these datasets on the book’s website.
2. Other tasks do not require datasets. Code is used to develop a better understanding of actuarial concepts, often with the use of simulations. The book includes, for the most part, the use of relatively simple code, aimed at communicating the fundamental ideas of the mathematics involved – it is the intention that the reader will develop their coding skills through self-study.
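As a flavour of the second type of task, the following minimal sketch uses a simulation to check a standard result: if a payment of 1 is made at a random time which is exponentially distributed with rate mu, and the force of interest is a constant delta, the expected present value is mu / (mu + delta). The parameter values and object names (mu, delta, n_sims, lifetimes, pv) are assumptions chosen purely for illustration and are not taken from the book's own code.

```r
# Illustrative (assumed) parameters, not taken from the book
set.seed(1)          # make the simulation reproducible
mu     <- 0.02       # constant rate at which the payment event occurs
delta  <- 0.04       # constant force of interest
n_sims <- 100000     # number of simulated payment times

# Simulate the random payment times and discount a payment of 1 from each
lifetimes <- rexp(n_sims, rate = mu)
pv        <- exp(-delta * lifetimes)

# Compare the simulated mean with the closed-form expected present value
epv_simulated <- mean(pv)
epv_exact     <- mu / (mu + delta)

c(simulated = epv_simulated, exact = epv_exact)
```

Increasing n_sims should bring the simulated value closer to the exact one, which is itself a useful way of seeing how sampling error behaves.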
It is hoped that readers will also combine code from various parts of the book, developing their own more advanced models. For example, by combining code from various chapters on asset modelling, claims models, and mortality models, one could develop a model for an insurance company.
Ultimately, actuaries are involved in the management of risk – much of this book relates to measuring risk and uncertainty, and how to manage the risks identified. Indeed, inadequate risk management has contributed to many corporate failures, both on the macro or global level, and also within firms and industries. Many of the examples and exercises aim to develop these analytical skills. We mainly discuss risk in the context of financial risk (such as interest rate risk and market price risk), and demographic risk (such as mortality risk), although many of the principles could be applied to operational-type risks. Most of these discussions will relate to the fields of finance and insurance (both life and non-life).
A word of warning – the material in this book mainly relates to the quantitative management of risk, that is, analysing data and proposing statistical distributions and models to predict financial outcomes. It is important when analysing real-world risk that a qualitative approach is taken alongside such a quantitative approach, with the relative weights assigned to the two approaches depending on the particular scenario. Over-reliance on quantitative financial models, at the expense of qualitative analysis and the exercise of judgement, is a risk in itself.
The student is likely to benefit from a review of case study material relating to risk management. Study of such cases will provide a more rounded education and knowledge base of risk management, rather than solely understanding the mathematical approach discussed in this book. Examples of such case studies include: Robert Maxwell and the Mirror Group newspapers, Barings Bank, Equitable Life Assurance Company, Long-Term Capital Management, GFC 2008, Northern Rock, Lehman Brothers, UK pension schemes/LDI crisis 2022, Silicon Valley Bank; and relevant regulations, such as: Basel Accords, Solvency II, The Dodd-Frank Act, The Sarbanes-Oxley Act.
7 Verification of Code and Calculations – Best Practice
A key skill of the actuary is to verify complex calculations efficiently. For example, actuarial valuations of insurance companies and company pension schemes typically involve millions of calculations; clearly it is not sensible to check all of them. The actuary must be able to check calculations in an appropriate, cost-effective manner such that they, and other stakeholders, have sufficient confidence in them and can rely on their accuracy. Errors in these calculations may result in advice which has a significant impact on company balance sheets, solvency levels, profits, amount of additional funding required, dividend payouts, and even future career prospects. We will often provide more than one coded solution to a problem, e.g. by performing an alternative, approximate calculation. This is a skill the authors believe most undergraduates would benefit from improving prior to entering the workplace.
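As a small illustration of checking one calculation against another, the sketch below values an annuity-certain in two independent ways and compares the results. The interest rate and term are hypothetical values chosen purely for illustration, and the variable names are our own; this is a minimal sketch of the idea rather than a prescribed verification procedure.

```r
# Illustrative assumptions (not taken from any example in the book):
# 4% annual effective interest, payments of 1 at the end of each of 25 years
i <- 0.04
n <- 25
v <- 1 / (1 + i)  # one-year discount factor

# Method 1: sum the discounted payments directly
pv_sum <- sum(v^(1:n))

# Method 2: closed-form annuity-certain formula (1 - v^n) / i
pv_formula <- (1 - v^n) / i

# The two results should agree up to floating-point rounding
all.equal(pv_sum, pv_formula)
```

If the two figures disagreed, at least one of the methods would contain an error; agreement gives some comfort, although it does not guard against an assumption that is wrong in both.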
8 Website: www.wiley.com/go/rprogramming.com
The book’s website includes code from each chapter of the book, together with all data files used. It will also, periodically, be updated with extra coding exercises and solutions. We welcome feedback from our readers regarding areas which require further examples.
As referred to earlier, the website is fundamental to using this book efficiently. As well as providing solutions to exercises in the book, it allows copying of the suggested R code, thus saving significant time.
9 R or Microsoft Excel?
It is expected that many readers will have some level of experience in using Microsoft Excel. Excel is a fantastic calculation tool. A significant benefit of Excel is its intuitive nature, making it relatively straightforward to learn the basics and quickly reach a reasonable level of competency. Indeed many financial institutions use Excel as a principal piece of software. Programming languages such as R have a significantly steeper learning curve than Excel; most new users, particularly those with no programming language background, will take several days to familiarise themselves with the basic workings of the R language.
In general, Excel is likely to be preferred to R for simpler tasks, or for more involved calculations which are unlikely to require numerous re-runs with various adjustments; the extra cost and time involved in writing R code may not be justified in such cases given the relatively small savings over the long term. In the same way that there are occasions when a calculator is a preferable tool to Excel, there are many occasions when Excel will be preferable to R.
It is important for the reader who has experience with Excel to develop an understanding of whether a programming language such as R will be more suited to solving a particular problem, or set of problems, than Excel. This should be achieved as the reader makes progress with this book. The reader is encouraged to tackle exercises in Excel (where possible) and to compare the process with R. An obvious example is that of the Loan Schedule discussed in Chapter 4 (where the reader is specifically encouraged to reproduce the calculations in Excel); it may well be the case here that using R code is not warranted. A number of calculation schedules involving Life Contingency examples (Chapters 23 and 24) may also prove more user-friendly with Excel. However, as these models become more complex (e.g. incorporating stochastic interest rate models) at some point it is likely that R will become more efficient.
For example, a pension scheme’s valuation calculations may take several minutes to run in Excel compared to a few seconds in R; such calculations may be re-run hundreds of times throughout the analysis and verification process of the valuation, thus benefiting from faster running speeds. A decision is therefore often required which weighs programming time, running time and the related costs when comparing Excel and a programming language such as R.
For a basic level of statistical analysis, Excel may be the preferred choice; however, R will be the preferred route for tasks which require anything more than basic analysis. To understand the benefits of using a programming language such as R requires a certain amount of practice and application; all the above should become clearer with experience.
For the casual data user, Excel is better, given the steeper learning curve required to learn most programming languages such as R. Excel is indeed used in several of our actuarial maths classes to demonstrate small-scale, simplified calculations; however, these often tend not to be particularly realistic, and the same calculations ultimately lead to a better learning and teaching experience when carried out in R. Indeed, with students migrating towards languages like R and Python, a greater proportion of our assessments now involve R programming.
There are tasks which are particularly unsuited to, or just not possible in, Excel. For example, it is not a simple task to calculate eigenvectors in Excel, but these can be calculated almost instantly with one line of R code; similarly, running large numbers of simulations using complex models, large matrix calculations, complex regression analysis etc. are problematic. Many student projects require the use of R (or similar language), and are simply not possible using Excel. R has many statistical functions which run significantly quicker than their Excel equivalents, or are unavailable in Excel. We will see many examples of tasks in this book where using Excel is extremely slow and impractical, such as when running large tasks; Excel also does not deal particularly well with huge datasets, which often cause it to slow down and crash. Tasks which are simple in R, such as analysing millions of rows of data across several databases, can be extremely time-consuming in Excel, and prone to calculation errors. Several R script files (the files which contain the code) can be used to combine various tasks; with Excel the solution is significantly less elegant, and more difficult to verify and audit.
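To make the eigenvector remark concrete, the following minimal sketch uses the base R function eigen(). The matrix is a hypothetical one chosen only for illustration.

```r
# Hypothetical 2 x 2 symmetric matrix, chosen only for illustration
A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

# One line computes both the eigenvalues and the eigenvectors
e <- eigen(A)

e$values   # eigenvalues, in decreasing order
e$vectors  # columns are the corresponding unit-length eigenvectors
```

The same single call applies unchanged to much larger matrices, which is where the comparison with a spreadsheet becomes most apparent.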
Also, we are frequently required to solve problems numerically, e.g. when an exact solution may not be possible or easy to obtain. Such a numerical approach is usually more suited to R than Excel.
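One common instance of a numerical solution is finding the interest rate at which a series of cash flows has zero net present value, for which no closed-form solution exists in general. The sketch below uses the base R root-finder uniroot(); the cash flows, the search interval and the function name npv are all hypothetical choices made for illustration.

```r
# Hypothetical cash flows: pay 100 now, receive 12 at the end of each of 10 years
price   <- 100
payment <- 12
n       <- 10

# Net present value as a function of the annual effective interest rate i
npv <- function(i) payment * sum((1 + i)^(-(1:n))) - price

# uniroot() searches the interval for the rate at which the NPV is zero;
# the NPV is positive at the lower end and negative at the upper end
sol <- uniroot(npv, interval = c(0.0001, 1), tol = 1e-10)

sol$root       # the solved rate (the yield on the transaction)
npv(sol$root)  # should be very close to zero
```

Substituting the solved rate back into npv(), as in the final line, is a simple check that the root-finder has converged to a sensible answer.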
The use of R is also likely to reduce the risk of data corruption and other errors being made. Physically manipulating data and formulae (cutting, copying, deleting, pasting) in Excel is generally quick and easy, but not particularly robust. Human error introduces mouse-slips, moving to incorrect cells etc. Such processes may be required to be performed several times on many similar datasets – with R we run the same program requiring no manual interaction with the data. Less experienced users may suggest that simply applying care when handling data is sufficient; eventually, however, errors will be made. The experienced Excel user will be only too aware of such problems and the potential for calculation disaster. Ultimately, R is likely to be more robust, and less prone to manual errors.
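The point about running the same program on many similar datasets can be sketched as follows. The three datasets are simulated stand-ins, not real claims data, and the names datasets and summarise_claims are hypothetical; in practice the data would typically be read from files.

```r
# Simulated stand-ins for several similar claims datasets (hypothetical data)
set.seed(1)
datasets <- list(
  motor  = data.frame(claim = rexp(1000, rate = 1 / 500)),
  home   = data.frame(claim = rexp(1000, rate = 1 / 1200)),
  travel = data.frame(claim = rexp(1000, rate = 1 / 150))
)

# The analysis is written, and checked, once
summarise_claims <- function(d) {
  c(n = nrow(d), mean = mean(d$claim), max = max(d$claim))
}

# The identical steps are then applied to every dataset with no manual handling
results <- sapply(datasets, summarise_claims)
results
```

Adding a further dataset requires only one extra entry in the list; none of the calculation steps is retyped, copied or pasted.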
A similar comparison could be made with the requirement for database systems. For small-scale data scenarios a spreadsheet may perform the required tasks adequately; similar to the comparison with R, Excel will tend to be initially more user-friendly than a database programme. However, with added complexity a robust, dedicated database system is required. Readers may wish to review the recent high-profile case relating