Applied Survival Analysis Using R

Series Editors: Robert Gentleman, Kurt Hornik, Giovanni Parmigiani

More information about this series at http://www.springer.com/series/6991

Use R!

Wickham: ggplot2 (2nd ed. 2016)

Luke: A User’s Guide to Network Analysis in R

Monogan: Political Analysis Using R

Cano/M. Moguerza/Prieto Corcoba: Quality Control with R

Schwarzer/Carpenter/Rücker: Meta-Analysis with R

Gondro: Primer to Analysis of Genomic Data Using R

Chapman/Feit: R for Marketing Research and Analytics

Willekens: Multistate Analysis of Life Histories with R

Cortez: Modern Optimization with R

Kolaczyk/Csàrdi: Statistical Analysis of Network Data with R

Swenson/Nathan: Functional and Phylogenetic Ecology in R

Nolan/Temple Lang: XML and Web Technologies for Data Sciences with R

Nagarajan/Scutari/Lèbre: Bayesian Networks in R

van den Boogaart/Tolosana-Delgado: Analyzing Compositional Data with R

Bivand/Pebesma/Gòmez-Rubio: Applied Spatial Data Analysis with R (2nd ed. 2013)

Eddelbuettel: Seamless R and C++ Integration with Rcpp

Knoblauch/Maloney: Modeling Psychophysical Data in R

Lin/Shkedy/Yekutieli/Amaratunga/Bijnens: Modeling Dose-Response Microarray Data in Early Drug Development Experiments Using R

Cano/M. Moguerza/Redchuk: Six Sigma with R

Soetaert/Cash/Mazzia: Solving Differential Equations in R

Rutgers School of Public Health

Piscataway, NJ, USA

ISSN 2197-5736    ISSN 2197-5744 (electronic)
Use R!

ISBN 978-3-319-31243-9    ISBN 978-3-319-31245-3 (eBook)
DOI 10.1007/978-3-319-31245-3

Library of Congress Control Number: 2016940055

© Springer International Publishing Switzerland 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Preface

This book serves as an introductory guide for students and analysts who need to work with survival time data. The minimum prerequisites are basic applied courses in linear regression and categorical data analysis. Students who also have taken a master’s level course in statistical theory will be well prepared to work through this book, since frequent reference is made to maximum likelihood theory. Students lacking this training may still be able to understand most of the material, provided they have an understanding of the basic concepts of differential and integral calculus. Specifically, students should understand the concept of the limit, and they should know what derivatives and integrals are and be able to evaluate them in some basic cases.

The material for this book has come from two sources. The first source is an introductory class in survival analysis for graduate students in epidemiology and biostatistics at the Rutgers School of Public Health. Biostatistics students, as one would expect, have a much firmer grasp of more mathematical aspects of statistics than do epidemiology students. Still, I have found that those epidemiology students with strong quantitative backgrounds have been able to understand some mathematical statistical procedures such as score and likelihood ratio tests, provided that they are not expected to symbolically differentiate or integrate complex formulas. In this book I have, when possible, used the numerical capabilities of the R system to substitute for symbolic manipulation. The second source of material is derived from collaborations with physicians and epidemiologists at the Rutgers Cancer Institute of New Jersey and at the Rutgers Robert Wood Johnson Medical School. A number of the data sets in this text are derived from these collaborations. Also, the experience of training statistical analysts to work on these data sets provided additional inspiration for the book.

The first chapter introduces the concepts of survival times and how right censoring occurs and describes several of the data sets that will be used throughout the book. Chapter 2 presents fundamentals of survival theory. This includes hazard, probability density, survival functions, and how they are related. The hazard function is illustrated using both life table data and using some common parametric distributions. The chapter ends with a brief introduction to properties of maximum

I would like to thank Rebecca Moss for permission to use the “pancreatic” data and Michael Steinberg for permission to use the “pharmacoSmoking” data. Both of these data sets are used repeatedly throughout the text. I would also like to thank Grace Lu-Yao, Weichung Joe Shih, and Yong Lin for years-long collaborations on using the SEER-Medicare data for studying the survival trajectories of prostate cancer patients. These collaborations led to the development of the “prostateSurvival” data set discussed in this text in Chapter 9. I thank the Division of Cancer Epidemiology and Genetics of the US National Cancer Institute for providing the “ashkenazi” data. I also thank Wan Yee Lau for making the “hepatoCellular” data publicly available in the online Dryad data repository and for allowing me to include it in the “asaur” R package.

Piscataway, NJ, USA    Dirk F. Moore
October 2015

5 Regression Analysis Using the Proportional Hazards Model
5.1 Covariates and Nonparametric Survival Models
5.2 Comparing Two Survival Distributions Using a Partial Likelihood Function
5.3 Partial Likelihood Hypothesis Tests
5.3.1 The Wald Test
5.3.2 The Score Test
5.3.3 The Likelihood Ratio Test
5.4 The Partial Likelihood with Multiple Covariates
5.5 Estimating the Baseline Survival Function
5.6 Handling of Tied Survival Times
5.7 Left Truncation
5.8 Additional Notes

6 Model Selection and Interpretation
6.1 Covariate Adjustment
6.2 Categorical and Continuous Covariates
6.3 Hypothesis Testing for Nested Models
6.4 The Akaike Information Criterion for Comparing Non-nested Models
6.5 Including Smooth Estimates of Continuous Covariates in a Survival Model
6.6 Additional Note

7.1 Assessing Goodness of Fit Using Residuals
7.1.1 Martingale and Deviance Residuals
7.1.2 Case Deletion Residuals
7.2 Checking the Proportional Hazards Assumption
7.2.1 Log Cumulative Hazard Plots
7.2.2 Schoenfeld Residuals
7.3 Additional Note

8 Time Dependent Covariates
8.2 Predictable Time Dependent Variables
8.2.1 Using the Time Transfer Function
8.2.2 Time Dependent Variables That Increase Linearly with Time
8.3 Additional Note

9 Multiple Survival Outcomes and Competing Risks
9.1 Clustered Survival Times and Frailty Models
9.1.1 Marginal Survival Models
9.1.2 Frailty Survival Models
9.1.3 Accounting for Family-Based Clusters in the “ashkenazi” Data
9.1.4 Accounting for Within-Person Pairing of Eye Observations in the Diabetes Data
9.2 Cause-Specific Hazards
9.2.1 Kaplan-Meier Estimation with Competing Risks
9.2.2 Cause-Specific Hazards and Cumulative Incidence Functions
9.2.3 Cumulative Incidence Functions for Prostate Cancer Data
9.2.4 Regression Methods for Cause-Specific Hazards
9.2.5 Comparing the Effects of Covariates on Different Causes of Death
9.3 Additional Notes

10.2 The Exponential Distribution
10.3 The Weibull Model
10.3.1 Assessing the Weibull Distribution as a Model for Survival Data in a Single Sample
10.3.2 Maximum Likelihood Estimation of Weibull Parameters for a Single Group of Survival Data
10.3.3 Profile Weibull Likelihood
10.3.4 Selecting a Weibull Distribution to Model Survival Data
10.3.5 Comparing Two Weibull Distributions Using the Accelerated Failure Time and Proportional Hazards Models
10.3.6 A Regression Approach to the Weibull Model
10.3.7 Using the Weibull Distribution to Model Survival Data with Multiple Covariates
10.3.8 Model Selection and Residual Analysis with Weibull Survival Data
10.4 Other Parametric Survival Distributions

11.1 Power and Sample Size for a Single Arm Study
11.2 Determining the Probability of Death in a Clinical Trial
11.3 Sample Size for Comparing Two Exponential Survival Distributions
11.4 Sample Size for Comparing Two Survival Distributions Using the Log-Rank Test
11.5 Determining the Probability of Death from a Non-parametric Survival Curve Estimate
11.6 Example: Calculating the Required Number of Patients for a Randomized Study of Advanced Gastric Cancer Patients
11.7 Example: Calculating the Required Number of Patients for a Randomized Study of Patients with Metastatic Colorectal Cancer
11.8 Using Simulations to Estimate Power
11.9 Additional Notes

12 Additional Topics
12.1 Using Piecewise Constant Hazards to Model Survival Data
12.2 Interval Censoring
12.3 The Lasso Method for Selecting Predictive Biomarkers

A A Basic Guide to Using R for Survival Analysis
A.1 The R System
A.1.1 A First R Session
A.1.2 Scatterplots and Fitting Linear Regression Models
A.1.3 Accommodating Non-linear Relationships
A.1.4 Data Frames and the Search Path for Variable Names
A.1.5 Defining Variables Within a Data Frame
A.1.6 Importing and Exporting Data Frames
A.2 Working with Dates in R
A.2.1 Dates and Leap Years
A.2.2 Using the “as.date” Function
A.3 Presenting Coefficient Estimates Using Forest Plots
A.4 Extracting the Log Partial Likelihood and Coefficient Estimates from a coxph Object

those who are not, an overview of R may be found in the appendix, and links to more extensive R guides and manuals may be found on the main R website. Readers who master the techniques in this book will be equipped to use R to carry out survival analyses in a practical setting, and those who are familiar with one of the many excellent commercial statistical packages should be able to adapt what they have learned to the particular command syntax and output style of that package.

1.2 What You Need to Know to Use This Book

Survival analysis resembles linear and logistic regression analysis in several ways: there is (typically) a single outcome variable and one or more predictors; testing statistical hypotheses about the relationship of the predictors to the outcome variable is of particular interest; adjusting for confounding covariates is crucial; and model selection and checking of assumptions through analysis of residuals and other methods are key requirements. Thus, readers should be familiar with basic concepts of classical hypothesis testing and with principles of regression analysis. Familiarity with categorical data analysis methods, including contingency tables, stratified contingency tables, and Poisson and logistic regression, is also important. However, survival analysis differs from these classical statistical methods in that censoring plays a central role in nearly all cases, and the theoretical underpinnings of the subject are far more complex. While I have striven to keep the mathematical level of this book as accessible as possible, many concepts in survival analysis depend on some understanding of mathematical statistics. Readers at a minimum must understand key ideas from calculus such as limits and the meaning of derivatives and integrals. The definition of the hazard function, for example, underlies everything we will do; that definition depends on limits, and its connection to the survival function depends on an integral. Those who are already familiar with basic concepts of likelihood theory at the level of a master’s program in statistics or biostatistics will have the easiest time working through this book. For those who are less familiar with these topics I have endeavored to use the numerical capabilities of R to illustrate likelihood principles as they arise. Also, as already mentioned, the reader is expected to be familiar with the basics of using the R system, including such concepts as vectors, matrices, data structures and components, and data frames. He or she should also be sufficiently familiar with R to carry out basic data analyses and make data plots, as well as understand how to install R packages from the main CRAN (Comprehensive R Archive Network) repository.

1.3 Survival Data and Censoring

A key characteristic of survival data is that the response variable is a non-negative discrete or continuous random variable, and represents the time from a well-defined origin to a well-defined event. A second characteristic of survival analysis, censoring, arises when the starting or ending events are not precisely observed. The most common example of this is right censoring, which results when the final endpoint is only known to exceed a particular value. Formally, if $T^{*}$ is a random variable representing the time to failure and $U$ is a random variable representing the time to a censoring event, what we observe is $T = \min(T^{*}, U)$ and a censoring indicator $\delta = I[T^{*} < U]$. That is, $\delta$ is 0 or 1 according to whether $T$ is a censored time or an observed failure time. Less commonly one may have left censoring, where events are known to have occurred before a certain time, or interval censoring, where the failure time is only known to have occurred within a specified interval of time. For now we will address the more prevalent right-censoring situation.
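The right-censoring mechanism just described can be sketched with a small simulation. This is only an illustration under assumed distributions: the exponential failure times, the uniform censoring times, the sample size, and the names `t_star`, `u`, `t_obs`, and `delta` are all choices made here, not part of the text's data sets.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
n <- 8                                    # hypothetical sample size

t_star <- rexp(n, rate = 0.1)             # true failure times (assumed exponential)
u      <- runif(n, min = 0, max = 20)     # censoring times (assumed uniform)

t_obs <- pmin(t_star, u)                  # observed time: whichever comes first
delta <- as.numeric(t_star < u)           # 1 = failure observed, 0 = censored

sim <- data.frame(t_star, u, t_obs, delta)
print(round(sim, 2))
```

In a real study only `t_obs` and `delta` would be available; `t_star` and `u` are shown here solely to make the mechanism visible.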

Censoring may be classified into three types: Type I, Type II, or random. In Type I censoring, the censoring times are pre-specified. For example, in an animal experiment, a cohort of animals may start at a specific time, and all are followed until a pre-specified ending time. Animals which have not experienced the event of interest before the end of the study are then censored at that time. Another example, discussed in detail in Example 1.5, is a smoking cessation study, where by design each subject is followed until relapse (return to smoking) or 180 days, whichever comes first. Those subjects who did not relapse within the 180-day period were censored at that time.

Type II censoring occurs when the experimental objects are followed until a pre-specified fraction have failed. Such a design is rare in biomedical studies, but may be used in industrial settings, where time to failure of a device is of primary interest. An example would be one where the study stops after, for instance, 25 out of 100 devices are observed to fail. The remaining 75 devices would then be censored. In this example, the smallest 25 % of the ordered failure times are observed, and the remainder are censored.
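The difference between Type I and Type II censoring can be made concrete with a short sketch. The device lifetimes below are simulated from an assumed exponential distribution, and the cutoff `c_time` is a hypothetical value; only the 25-out-of-100 stopping rule comes from the example in the text.

```r
set.seed(2)                              # arbitrary seed
n <- 100
fail_time <- rexp(n, rate = 0.01)        # hypothetical device lifetimes

# Type I: every unit is censored at one pre-specified time
c_time  <- 50                            # hypothetical study end
time_I  <- pmin(fail_time, c_time)
delta_I <- as.numeric(fail_time <= c_time)

# Type II: the study stops as soon as the r-th failure is observed
r   <- 25
t_r <- sort(fail_time)[r]                # time of the 25th failure
time_II  <- pmin(fail_time, t_r)
delta_II <- as.numeric(fail_time <= t_r)

c(failures_I = sum(delta_I), failures_II = sum(delta_II))
```

Under Type I censoring the number of observed failures is random and the study length is fixed; under Type II the number of failures is fixed (here 25, with 75 censored) and the study length is random.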

The last general category of censoring is random censoring. Careful attention to the cause of the censoring is essential in order to avoid biased survival estimates. In biomedical settings, one cause of random censoring is patient dropout. If the dropout occurs truly at random, and is unrelated to the disease process, such censoring may not cause any problems with bias in the analysis. But if patients who are near death are more likely to drop out than other patients, serious biases may arise. Another cause of random censoring is competing events. For instance, in Example 1.4, the primary outcome is time to death from prostate cancer. But when a patient dies of another cause first, then that patient will be censored, since the time he would have died of prostate cancer (had he not died first of the other cause) is unknown. The question of independence of the competing causes is, of course, an important issue, and will be discussed in Sect. 9.2.

In clinical trials, the most common source of random censoring is administrative censoring, which results because some patients in a clinical trial have not yet died at the time the analysis is carried out. This concept is illustrated in the following example.

Example 1.1. Consider a hypothetical cancer clinical trial where subjects enter the trial over a certain period of time, known as the accrual period, and are followed for an additional period of time, known as the follow-up period, to determine their survival times. That is, for each patient, we would like to observe the time between when a patient entered the trial and when that patient died. But unless the type of cancer being studied is quickly fatal, some patients will still be alive at the end of the follow-up time, and indeed many patients may survive long after this time. For these patients, the survival times are only partially observed; we know that these patients survived until the end of follow-up, but we don’t know how much longer they will survive. Such times are said to be right-censored, and this type of censoring is both the most common and the most easily accommodated. Other types of censoring, as we have seen, include left and interval censoring. We will discuss these briefly in the last chapter.

Figure 1.1 presents data from a hypothetical clinical trial. Here, five patients were entered over a 2.5-year accrual period which ran from January 1, 2000 until June 30, 2002. This was followed by 4.5 years of additional follow-up time, which lasted until December 31, 2007. In this example, the data were meant to be analyzed at this time, but three patients (Patients 1, 3 and 4) were still alive. Also shown in this example is the ultimate fate of these three patients, but this would not have been known at the time of analysis. Thus, for these three patients, we have incomplete information about their survival time. For example, we know that Patient 1 survived at least 7 years, but as of the end of 2007 it would not have been known how long the patient would ultimately live.
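Administrative censoring of this kind amounts to comparing each death date with the analysis date. The following sketch uses invented entry and death dates (they are not the values plotted in Fig. 1.1) together with the end of follow-up stated above; the data frame `trial` and its column names are assumptions made for illustration.

```r
analysis_date <- as.Date("2007-12-31")   # end of follow-up, as stated in the text

# Hypothetical entry and (eventual) death dates for five patients
trial <- data.frame(
  entry = as.Date(c("2000-03-01", "2000-09-15", "2001-02-01",
                    "2001-11-20", "2002-05-10")),
  death = as.Date(c("2009-06-30", "2004-01-10", "2008-03-05",
                    "2010-02-14", "2005-08-22"))
)

# A death counts as observed only if it occurs by the analysis date
trial$delta <- as.numeric(trial$death <= analysis_date)

# Follow-up stops at death or at the analysis date, whichever is first
last_date <- trial$death
last_date[trial$delta == 0] <- analysis_date

# Survival time in patient time (years since entry)
trial$years <- as.numeric(last_date - trial$entry) / 365.25
print(trial)
```

The patients with `delta` equal to 0 contribute only a lower bound on their survival time, which is exactly the partial information described in the example.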

Fig. 1.1 Clinical trial accrual and follow-up periods. The vertical dashed lines indicate the trial start, end of accrual, and end of follow-up. The X’s denote deaths and the open circles denote censoring events. (Figure not reproduced; its two regions are labeled “Accrual” and “Follow-up”.)

Informative censoring, by contrast, may (for example) result if individuals in a clinical trial tend to drop out of the study (and become lost to follow-up) for reasons related to the failure process. This type of censoring can introduce biases into the analysis that are difficult to adjust for. The methods we discuss will require the assumption that censoring is non-informative.

The goals of survival analysis are to estimate the survival distribution, to compare two or more survival distributions, or (more generally) to assess the effects of a number of factors on survival. The techniques bear some resemblance to regression analysis, with the important distinctions that the outcome variable (time) is always positive and often censored.

1.4 Some Examples of Survival Data Sets

Following are a few examples of studies using survival analysis which we will refer to throughout the text. The data sets may be obtained by installing the text’s package “asaur” from the main CRAN repository. Data for these examples are presented in a number of different formats, reflecting the formats that a data analyst may see in practice. For example, most data sets present survival time in terms of time from the origin (typically entry into a trial). One contains specific dates (date of entry into a trial and date of death) from which we compute the survival time. All contain additional variables, such as censoring variables, which indicate that partial time information on some subjects is available. Most also contain treatment indicators and other covariate information.

Example 1.2. Xelox in patients with advanced gastric cancer

This is a Phase II (single sample) clinical trial of Xeloda and oxaliplatin (XELOX) chemotherapy given before surgery to 48 advanced gastric cancer patients with para-aortic lymph node metastasis (Wang et al. [74]). An important survival outcome of interest is progression-free survival, which is the time from entry into the clinical trial until progression or death, whichever comes first. The data, which have been extracted from the paper, are in the data set “gastricXelox” in the “asaur” package; a sample of the observations (for patients 23 through 27) is as follows:

> library(asaur)
> gastricXelox[23:27,]
   timeWeeks delta
23        42     1
24        43     1
25        43     0
26        46     1
27        48     0

The first column is the patient (row) number. The second is a list of survival times, rounded to the nearest week, and the third is “delta”, which is the censoring indicator. For example, for patient number 23, the time is 42 and delta is 1, indicating that the observed endpoint (progression or death) had been observed 42 weeks after entry into the trial. For patient number 25, the time is 43 and delta is 0, indicating that the patient was alive at 43 weeks after entry and no progression had been observed. We will discuss this data set further in Chap. 3.
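To see how the censoring indicator enters even the simplest summaries, here is a minimal sketch that rebuilds the five printed rows by hand (so it runs without the “asaur” package) and computes the number of events, the person-weeks of follow-up, and the crude event rate. The data frame name `gx` is an assumption, and the result describes only these five rows, not the full 48-patient data set.

```r
# Rows 23 to 27 of "gastricXelox", transcribed from the printed output
gx <- data.frame(
  timeWeeks = c(42, 43, 43, 46, 48),
  delta     = c(1, 1, 0, 1, 0),
  row.names = 23:27
)

n_events     <- sum(gx$delta)        # patients with progression or death
person_weeks <- sum(gx$timeWeeks)    # total follow-up, censored or not
event_rate   <- n_events / person_weeks

c(events = n_events, person_weeks = person_weeks, rate = event_rate)
```

Censored patients add follow-up time to the denominator but no event to the numerator; dropping them instead would overstate the event rate.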

Example 1.3. Pancreatic cancer in patients with locally advanced or metastatic disease

This is also a single sample Phase II study of a chemotherapeutic compound, and the main purpose was to assess overall survival and also “progression-free survival”, which is defined as the time from entry into the trial until disease progression or death, whichever comes first. A secondary interest in the study is to compare the prognosis of patients with locally advanced disease as compared to metastatic disease. The results were published in Moss et al. [51]. The data are available in the data set “pancreatic” in the “asaur” package. Here are the first few observations:

> head(pancreatic)
  stage    onstudy progression      death
1     M 12/16/2005    2/2/2006 10/19/2006
2     M   1/6/2006   2/26/2006  4/19/2006
3    LA   2/3/2006    8/2/2006  1/19/2007
4     M  3/30/2006           .  5/11/2006
5    LA  4/27/2006   3/11/2007  5/29/2007
6     M   5/7/2006   6/25/2006 10/11/2006

For example, Patient #3, a patient with locally advanced disease (stage = “LA”), entered the study on February 3, 2006. That person was found to have progressive disease on August 2 of that year, and died on January 19 of the following year. The progression-free survival for that patient is the difference of the progression date and the on-study date. Patient #4, a patient with metastatic disease (stage = “M”), entered on March 30, 2006 and died on May 11 of that year, with no recorded date of progression. The progression-free survival time for that patient is thus the difference of the death date and the on-study date. For both patients, the overall survival is the difference between the date of death and the on-study date. In this study there was no censoring, since none of these seriously ill patients survived for very long. In Chap. 3 we will see how to compare the survival of the two groups of patients.
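The date arithmetic described for Patients #3 and #4 can be sketched in base R. The six rows below are typed in from the printed output so that the example runs without the “asaur” package; the object names (`panc`, `os`, `pfs`) and the month/day/year format string are assumptions made for this illustration.

```r
# First six rows of "pancreatic", transcribed from the printed output
panc <- data.frame(
  stage       = c("M", "M", "LA", "M", "LA", "M"),
  onstudy     = c("12/16/2005", "1/6/2006", "2/3/2006",
                  "3/30/2006", "4/27/2006", "5/7/2006"),
  progression = c("2/2/2006", "2/26/2006", "8/2/2006",
                  ".", "3/11/2007", "6/25/2006"),
  death       = c("10/19/2006", "4/19/2006", "1/19/2007",
                  "5/11/2006", "5/29/2007", "10/11/2006"),
  stringsAsFactors = FALSE
)

fmt <- "%m/%d/%Y"
onstudy_d     <- as.Date(panc$onstudy, format = fmt)
progression_d <- as.Date(panc$progression, format = fmt)  # "." becomes NA
death_d       <- as.Date(panc$death, format = fmt)

# Overall survival: days from study entry to death
os <- as.numeric(death_d - onstudy_d)

# Progression-free survival: days to progression, or to death
# when no progression date was recorded
pfs <- as.numeric(progression_d - onstudy_d)
pfs[is.na(pfs)] <- os[is.na(pfs)]

data.frame(stage = panc$stage, pfs, os)
```

For Patient #3 this gives a progression-free survival of 180 days and an overall survival of 350 days; for Patient #4 both are 42 days, since death is the first event.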

Example 1.4. Survival prospects of prostate cancer patients with high-risk disease

In this data set there are two outcomes of interest, death from prostate cancer and death from other causes, so we have what is called a competing risks survival analysis problem. In this example, we have simulated data from 14,294 prostate cancer patients based on detailed competing risks analyses published by Lu-Yao et al. [46]. For each patient we have grade (poorly or moderately differentiated), age of diagnosis (66-70, 71-75, 76-80, and 80+), cancer stage (T1c if screen-diagnosed using a prostate-specific antigen blood test, T1ab if clinically diagnosed without screening, or T2 if palpable at diagnosis), survival time (days from diagnosis to death or date last seen), and an indicator (“status”) for whether the patient died

> pharmacoSmoking[1:6, 2:8]
  ttr relapse         grp age gender     race employment
1 182       0   patchOnly  36   Male    white         ft
2  14       1   patchOnly  41   Male    white      other
3   5       1 combination  25 Female    white      other
4  16       1 combination  54   Male    white         ft
5   0       1 combination  45   Male    white      other
6 182       0 combination  43   Male hispanic         ft

The variable “ttr” is the number of days without smoking (“time to relapse”), and “relapse=1” indicates that the subject started smoking again at the given time. The variable “grp” is the treatment indicator, and “employment” can take the values “ft” (full time), “pt” (part time), or “other”. The primary objectives were to compare the two treatment therapies with regard to time to relapse, and to identify other factors related to this outcome.
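As a first look at how such data are read, the sketch below rebuilds three of the printed columns by hand and cross-tabulates the relapse indicator against treatment group. The data frame name `smoke` is an assumption, and six rows are far too few to say anything about the treatments; the point is only to show how “relapse” separates observed relapse times from censored ones.

```r
# Columns ttr, relapse and grp for the six printed rows
smoke <- data.frame(
  ttr     = c(182, 14, 5, 16, 0, 182),
  relapse = c(0, 1, 1, 1, 1, 0),
  grp     = c("patchOnly", "patchOnly", "combination",
              "combination", "combination", "combination"),
  stringsAsFactors = FALSE
)

# Counts of censored (0) and relapsed (1) subjects by treatment group
tab <- table(smoke$grp, smoke$relapse)
print(tab)

# Follow-up times of the subjects who had not relapsed
censored_ttr <- smoke$ttr[smoke$relapse == 0]
print(censored_ttr)
```

In these rows both subjects without a relapse have the same value of “ttr”, as expected when censoring happens at a fixed end of follow-up.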

Example 1.6. Prediction of survival of hepatocellular carcinoma patients using biomarkers

This study (Li et al. [42, 43]) focused on using expression of a chemokine known as CXCL17, and other clinical and biomarker factors, to predict overall and recurrence-free survival. This example contains data on 227 patients, each with a wide range of clinical and biomarker values. The “hepatoCellular” data are publicly available in the Dryad online data repository [43] as well as in the “asaur” R package that accompanies this text. Here, for illustration, is a small selection of cases and covariates.

> hepatoCellular[c(1,2,3,65,71), c(2,3,16:20,24,47)]
   Age Gender OS Death RFS Recurrence   CXCL17T CD4N     Ki67
1   57      0 83     0  13          1 113.94724    0  6.04350
2   58      1 81     0  81          0  54.07154   NA       NA
3   65      1 79     0  79          0  22.18883   NA       NA
65  38      1  5     1   5          1 106.78169    0 44.24411
71  57      1 11     1  11          1 198.49680    0 99.59232

The survival outcomes are “OS” (overall survival) and “RFS” (recurrence-free survival), and the corresponding censoring indicators are “Death” and “Recurrence”. The full data set has 48 columns. In columns 23 to 48 there are many patients with missing values, with only 117 patients having complete data.
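Counting patients with complete data is a one-line operation in R. The sketch below applies `complete.cases()` to the five printed rows, typed in by hand; the values are a best-effort transcription of the printed output, the data frame name `hep` is an assumption, and the count it produces refers to these five rows only, not to the 117 complete patients in the full data set.

```r
# Five printed rows of "hepatoCellular" (selected columns), transcribed by hand
hep <- data.frame(
  Age        = c(57, 58, 65, 38, 57),
  Gender     = c(0, 1, 1, 1, 1),
  OS         = c(83, 81, 79, 5, 11),
  Death      = c(0, 0, 0, 1, 1),
  RFS        = c(13, 81, 79, 5, 11),
  Recurrence = c(1, 0, 0, 1, 1),
  CXCL17T    = c(113.94724, 54.07154, 22.18883, 106.78169, 198.49680),
  CD4N       = c(0, NA, NA, 0, 0),
  Ki67       = c(6.04350, NA, NA, 44.24411, 99.59232),
  row.names  = c(1, 2, 3, 65, 71)
)

ok <- complete.cases(hep)      # TRUE where no column is missing
n_complete <- sum(ok)
hep_complete <- hep[ok, ]

print(n_complete)
print(rownames(hep_complete))
```

Restricting an analysis to complete cases is simple but discards patients, so it is worth checking how many rows survive before fitting a model with many biomarkers.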

1.5 Additional Notes

1. Another type of incomplete observation with survival data is truncation, a result of length-biased sampling. We discuss left truncation in Sect. 3.5. Right truncation is less common and more difficult to model. See Klein and Moeschberger [36] for further discussion.

2. The Healthcare Delivery Research Program of the Division of Cancer Control and Population Sciences, National Cancer Institute, USA maintains the SEER-Medicare Linked Database, which provided the data used in Lu-Yao et al. [46]. This NCI-based research program makes this data available for research only, and will not permit it to be distributed for educational purposes. Thus it cannot be used in this book. Fortunately, however, the Lu-Yao publication contains detailed cause-specific survival curves for patients cross-classified by four age groups, three stage categories, and two Gleason stages, as well as precise counts of the numbers of patients in each category. This information was used to simulate a survival data set that maintains many of the characteristics of the original SEER-Medicare data used in the paper. This simulated data set, “prostateSurvival”, is what is used in this book for instructional purposes.

3. Numerous excellent illustrative survival analysis data sets are freely available to all. The standard “survival” library that is distributed with the R system has a number of survival analysis data sets. Also, the “KMsurv” R package contains a rich additional set of data sets that were discussed in Klein and Moeschberger [36]. The “asaur” R package contains data sets used in the current text.

Exercises

1.1. Consider a simple example of five cancer patients who enter a clinical trial as illustrated in the following diagram:

Re-write these survival times in terms of patient time, and create a simple data set listing the survival time and censoring indicator for each patient. How many patients died? How many person-years are there in this trial? What is the death rate per person-year?

1.2. For the “gastricXelox” data set, use R to determine how many patients had the event (death or progression), the number of person-weeks of follow-up time, and the event rate per person-week.
