Statistical Methods for Survival Data Analysis

Third Edition

ELISA T. LEE
JOHN WENYU WANG

Department of Biostatistics and Epidemiology and Center for American Indian Health Research, College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma

Copyright © 2003 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: permreq@wiley.com.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.

Library of Congress Cataloging-in-Publication Data:

Lee, Elisa T.

Statistical methods for survival data analysis.--3rd ed./Elisa T. Lee and John Wenyu Wang. p. cm.--(Wiley series in probability and statistics) Includes bibliographical references and index. ISBN 0-471-36997-7 (cloth: alk. paper)

1. Medicine--Research--Statistical methods. 2. Failure time data analysis. 3. Prognosis--Statistical methods. I. Wang, John Wenyu. II. Title. III. Series.

R853.S7 L43 2003

610.72--dc21 2002027025

Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1

To the memory of our parents

Mr. Chi-Lan Tan and Mrs. Hwei-Chi Lee Tan (E.T.L.)

Mr. Beijun Zhang and Mrs. Xiangyi Wang (J.W.W.)

Preface

1 Introduction
    1.1 Preliminaries
    1.2 Censored Data
    1.3 Scope of the Book
    Bibliographical Remarks

2 Functions of Survival Time
    2.1 Definitions
    2.2 Relationships of the Survival Functions
    Bibliographical Remarks
    Exercises

3 Examples of Survival Data Analysis
    3.1 Example 3.1: Comparison of Two Treatments and Three Diets
    3.2 Example 3.2: Comparison of Two Survival Patterns Using Life Tables
    3.3 Example 3.3: Fitting Survival Distributions to Remission Data
    3.4 Example 3.4: Relative Mortality and Identification of Prognostic Factors
    3.5 Example 3.5: Identification of Risk Factors
    Bibliographical Remarks
    Exercises

4 Nonparametric Methods of Estimating Survival Functions
    4.1 Product-Limit Estimates of Survivorship Function
    4.2 Life-Table Analysis
    4.3 Relative, Five-Year, and Corrected Survival Rates
    4.4 Standardized Rates and Ratios
    Bibliographical Remarks
    Exercises

5 Nonparametric Methods for Comparing Survival Distributions
    5.1 Comparison of Two Survival Distributions
    5.2 Mantel-Haenszel Test
    5.3 Comparison of K (K > 2) Samples
    Bibliographical Remarks
    Exercises

6 Some Well-Known Parametric Survival Distributions and Their Applications
    6.1 Exponential Distribution
    6.2 Weibull Distribution
    6.3 Lognormal Distribution
    6.4 Gamma and Generalized Gamma Distributions
    6.5 Log-Logistic Distribution
    6.6 Other Survival Distributions
    Bibliographical Remarks
    Exercises

7 Estimation Procedures for Parametric Survival Distributions without Covariates
    7.1 General Maximum Likelihood Estimation Procedure
    7.2 Exponential Distribution
    7.3 Weibull Distribution
    7.4 Lognormal Distribution
    7.5 Standard and Generalized Gamma Distributions
    7.6 Log-Logistic Distribution
    7.7 Other Parametric Survival Distributions
    Bibliographical Remarks
    Exercises

8 Graphical Methods for Survival Distribution Fitting
    8.1 Introduction
    8.2 Probability Plotting
    8.3 Hazard Plotting
    8.4 Cox-Snell Residual Method
    Bibliographical Remarks
    Exercises

9 Tests of Goodness of Fit and Distribution Selection
    9.1 Goodness-of-Fit Test Statistics Based on Asymptotic Likelihood Inferences
    9.2 Tests for Appropriateness of a Family of Distributions
    9.3 Selection of a Distribution Using BIC or AIC Procedures
    9.4 Tests for a Specific Distribution with Known Parameters
    9.5 Hollander and Proschan's Test for Appropriateness of a Given Distribution with Known Parameters
    Bibliographical Remarks
    Exercises

10 Parametric Methods for Comparing Two Survival Distributions
    10.1 Likelihood Ratio Test for Comparing Two Survival Distributions
    10.2 Comparison of Two Exponential Distributions
    10.3 Comparison of Two Weibull Distributions
    10.4 Comparison of Two Gamma Distributions
    Bibliographical Remarks
    Exercises

11 Parametric Methods for Regression Model Fitting and Identification of Prognostic Factors
    11.1 Preliminary Examination of Data
    11.2 General Structure of Parametric Regression Models and Their Asymptotic Likelihood Inference
    11.3 Exponential Regression Model
    11.4 Weibull Regression Model
    11.5 Lognormal Regression Model
    11.6 Extended Generalized Gamma Regression Model
    11.7 Log-Logistic Regression Model
    11.8 Other Parametric Regression Models
    11.9 Model Selection Methods
    Bibliographical Remarks
    Exercises

12 Identification of Prognostic Factors Related to Survival Time: Cox Proportional Hazards Model
    12.1 Partial Likelihood Function for Survival Times
    12.2 Identification of Significant Covariates
    12.3 Estimation of the Survivorship Function with Covariates
    12.4 Adequacy Assessment of the Proportional Hazards Model
    Bibliographical Remarks
    Exercises

13 Identification of Prognostic Factors Related to Survival Time: Nonproportional Hazards Models
    13.1 Models with Time-Dependent Covariates
    13.2 Stratified Proportional Hazards Models
    13.3 Competing Risks Model
    13.4 Recurrent Events Models
    13.5 Models for Related Observations
    Bibliographical Remarks
    Exercises

14 Identification of Risk Factors Related to Dichotomous and Polychotomous Outcomes
    14.1 Univariate Analysis
    14.2 Logistic and Conditional Logistic Regression Models for Dichotomous Responses
    14.3 Models for Polychotomous Outcomes
    Bibliographical Remarks
    Exercises

Preface

Statistical methods for survival data analysis have continued to flourish in the last two decades. Applications of the methods have been widened from their historical use in cancer and reliability research to business, criminology, epidemiology, and social and behavioral sciences. The third edition of Statistical Methods for Survival Data Analysis is intended to provide a comprehensive introduction to the most commonly used methods for analyzing survival data. It begins with basic definitions and interpretations of survival functions. From there, the reader is guided through methods, parametric and nonparametric, for estimating and comparing these functions and the search for a theoretical distribution (or model) to fit the data. Parametric and nonparametric approaches to the identification of prognostic factors that are related to survival are then discussed. Finally, regression methods, primarily linear logistic regression models, to identify risk factors for dichotomous and polychotomous outcomes are introduced.

The third edition continues to be application-oriented, with a minimum level of mathematics. In a few chapters, some knowledge of calculus and matrix algebra is needed. The few sections that introduce the general mathematical structure for the methods can be skipped without loss of continuity. A large number of practical examples are given to assist the reader in understanding the methods and applications and in interpreting the results. Readers with only college algebra should find the book readable and understandable.

There are many excellent books on clinical trials. We therefore have deleted the two chapters on the subject that were in the second edition. Instead, we have included discussions of more statistical methods for survival data analysis. A brief summary of the improvements made for the third edition is given below.

1. Two additional distributions, the log-logistic distribution and a generalized gamma distribution, have been added to the application of parametric models that can be used in model fitting and prognostic factor identification (Chapters 6, 7, and 11).

2. In several sections (Sections 7.1, 9.1, 10.1, 11.2, and 12.1), discussions of the asymptotic likelihood inference of the methods covered in the chapters are given. These sections are intended to provide a more general mathematical structure for statisticians.

3. The Cox-Snell residual method has been added to the chapter on graphical methods for survival distribution fitting (Chapter 8). In addition, the sections on probability and hazard plotting have been revised so that no special graphical papers are required to make the plots.

4. More tests of goodness of fit are given, including the BIC and AIC procedures (Chapters 9 and 11).

5. For Cox's proportional hazards model (Chapter 12), we have now included methods to assess its adequacy and procedures to estimate the survivorship function with covariates.

6. The concept of nonproportional hazards models is introduced (Chapter 13), which includes models with time-dependent covariates, stratified models, competing risks models, recurrent event models, and models for related observations.

7. The chapter on linear logistic regression (Chapter 14) has been expanded to cover regression models for polychotomous outcomes. In addition, methods for a general m : n matching design have been added to the section on conditional logistic regression for case-control studies.

8. Computer programming codes for software packages BMDP, SAS, and SPSS are provided for most examples in the text.

We would like to thank the many researchers, teachers, and students who have used the second edition of the book. The suggestions for improvement that many of them have provided are invaluable. Special thanks go to Xing Wang, Linda Hutton, Tracy Mankin, and Imran Ahmed for typing the manuscript. Steve Quigley of John Wiley convinced us to work on a third edition. We thank him for his enthusiasm.

Finally, we are most grateful to our families, Sam, Vivian, Benedict, Jennifer, and Annelisa (E.T.L.), and Alice and Xing (J.W.W.), for the constant joy, love, and support they have given us.

Oklahoma City, OK

April 18, 2001

CHAPTER 1

Introduction

1.1 PRELIMINARIES

This book is for biomedical researchers, epidemiologists, consulting statisticians, students taking a first course on survival data analysis, and others interested in survival time study. It deals with statistical methods for analyzing survival data derived from laboratory studies of animals, clinical and epidemiologic studies of humans, and other appropriate applications.

Survival time can be defined broadly as the time to the occurrence of a given event. This event can be the development of a disease, response to a treatment, relapse, or death. Therefore, survival time can be tumor-free time, the time from the start of treatment to response, length of remission, and time to death. Survival data can include survival time, response to a given treatment, and patient characteristics related to response, survival, and the development of a disease. The study of survival data has focused on predicting the probability of response, survival, or mean lifetime; comparing the survival distributions of experimental animals or of human patients; and identifying risk and/or prognostic factors related to response, survival, and the development of a disease. In this book, special consideration is given to the study of survival data in biomedical sciences, although all the methods are suitable for applications in industrial reliability, social sciences, and business. Examples of survival data in these fields are the lifetime of electronic devices, components, or systems (reliability engineering); felons' time to parole (criminology); duration of first marriage (sociology); length of newspaper or magazine subscription (marketing); and worker's compensation claims (insurance) and their various influencing risk or prognostic factors.

1.2 CENSORED DATA

Many researchers consider survival data analysis to be merely the application of two conventional statistical methods to a special type of problem: parametric if the distribution of survival times is known to be normal and nonparametric if the distribution is unknown. This assumption would be true if the survival times of all the subjects were exact and known; however, some survival times are not. Further, the survival distribution is often skewed, or far from being normal. Thus there is a need for new statistical techniques. One of the most important developments is due to a special feature of survival data in the life sciences that occurs when some subjects in the study have not experienced the event of interest at the end of the study or time of analysis. For example, some patients may still be alive or disease-free at the end of the study period. The exact survival times of these subjects are unknown. These are called censored observations or censored times and can also occur when people are lost to follow-up after a period of study. When there are no censored observations, the set of survival times is complete. There are three types of censoring.

Type I Censoring

Animal studies usually start with a fixed number of animals, to which the treatment or treatments is given. Because of time and/or cost limitations, the researcher often cannot wait for the death of all the animals. One option is to observe for a fixed period of time, say six months, after which the surviving animals are sacrificed. Survival times recorded for the animals that died during the study period are the times from the start of the experiment to their death. These are called exact or uncensored observations. The survival times of the sacrificed animals are not known exactly but are recorded as at least the length of the study period. These are called censored observations. Some animals could be lost or die accidentally. Their survival times, from the start of experiment to loss or death, are also censored observations. In type I censoring, if there are no accidental losses, all censored observations equal the length of the study period.

For example, suppose that six rats have been exposed to carcinogens by injecting tumor cells into their foot pads. The times to develop a tumor of a given size are observed. The investigator decides to terminate the experiment after 30 weeks. Figure 1.1 is a plot of the development times of the tumors. Rats A, B, and D developed tumors after 10, 15, and 25 weeks, respectively. Rats C and E did not develop tumors by the end of the study; their tumor-free times are thus 30-plus weeks. Rat F died accidentally without tumors after 19 weeks of observation. The survival data (tumor-free times) are 10, 15, 30+, 25, 30+, and 19+ weeks. (The plus indicates a censored observation.)
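The six-rat example can be written out as a small Python sketch. This is only an illustration under stated assumptions: the tuple layout and the names `observe` and `label` are hypothetical and are not taken from the book; only the 30-week cutoff and the rat data come from the example above.

```python
# Illustrative sketch of type I censoring for the six-rat example.
STUDY_END = 30  # weeks; the experiment is terminated at this fixed time

# (rat, week a tumor developed or None, week of accidental loss or None)
rats = [
    ("A", 10, None),
    ("B", 15, None),
    ("C", None, None),
    ("D", 25, None),
    ("E", None, None),
    ("F", None, 19),
]

def observe(tumor_week, loss_week, study_end=STUDY_END):
    """Return (observed_time, event_observed) under type I censoring."""
    candidates = []
    if tumor_week is not None:
        candidates.append((tumor_week, True))    # exact observation
    if loss_week is not None:
        candidates.append((loss_week, False))    # accidental loss: censored
    candidates.append((study_end, False))        # end of study: censored
    # Whichever happens first is what the investigator records.
    return min(candidates, key=lambda c: c[0])

def label(time, event):
    """Format a time with the book's '+' convention for censored values."""
    return f"{time}" if event else f"{time}+"

observations = [observe(tumor, loss) for _, tumor, loss in rats]
labels = [label(time, event) for time, event in observations]
print(", ".join(labels))
```

Storing each observation as a (time, event indicator) pair is a common convention in survival software; the "+" suffix is only a display format.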

Type II Censoring

Another option in animal studies is to wait until a fixed portion of the animals have died, say 80 of 100, after which the surviving animals are sacrificed. In this case, type II censoring, if there are no accidental losses, the censored observations equal the largest uncensored observation. For example, in an experiment of six rats (Figure 1.2), the investigator may decide to terminate the study after four of the six rats have developed tumors. The survival or tumor-free times are then 10, 15, 35+, 25, 35, and 19+ weeks.
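A minimal Python sketch of the stopping rule follows, again under assumptions: the function name `type_ii_censor` is hypothetical, and `math.inf` is used purely as a placeholder for tumor times that were never observed (rats C and F), since the example does not give them.

```python
import math

def type_ii_censor(tumor_times, loss_times, r):
    """Stop the study at the r-th observed tumor; return (stop_time, observations).

    tumor_times: latent tumor time per animal (math.inf if never observed).
    loss_times:  accidental loss time per animal, or None.
    """
    # Tumor times that could actually be seen (not preceded by a loss).
    visible = sorted(
        t for t, l in zip(tumor_times, loss_times) if l is None or t <= l
    )
    stop = visible[r - 1]  # time of the r-th tumor ends the study
    observations = []
    for t, l in zip(tumor_times, loss_times):
        if l is not None and l < min(t, stop):
            observations.append((l, False))     # lost before tumor or stop
        elif t <= stop:
            observations.append((t, True))      # exact observation
        else:
            observations.append((stop, False))  # censored at the stop time
    return stop, observations

# Rats A-F; inf marks an unobserved tumor time (placeholder, not data).
tumor_times = [10, 15, math.inf, 25, 35, math.inf]
loss_times = [None, None, None, None, None, 19]

stop, observations = type_ii_censor(tumor_times, loss_times, r=4)
labels = [f"{t}" if event else f"{t}+" for t, event in observations]
print(stop, labels)
```

Unlike type I censoring, the stopping time here is itself random: it is determined by the data rather than fixed in advance.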

Figure 1.1 Example of type I censored data.

Figure 1.2 Example of type II censored data.

Type III Censoring

In most clinical and epidemiologic studies the period of study is fixed and patients enter the study at different times during that period. Some may die before the end of the study; their exact survival times are known. Others may withdraw before the end of the study and are lost to follow-up. Still others may be alive at the end of the study. For "lost" patients, survival times are at least from their entrance to the last contact. For patients still alive, survival times are at least from entry to the end of the study. The latter two kinds of observations are censored observations. Since the entry times are not simultaneous, the censored times are also different. This is type III censoring. For example, suppose that six patients with acute leukemia enter a clinical study during a total study period of one year. Suppose also that all six respond to treatment and achieve remission. The remission times are plotted in Figure 1.3. Patients A, C, and E achieve remission at the beginning of the second, fourth, and ninth months, and relapse after four, six, and three months, respectively. Patient B achieves remission at the beginning of the third month but is lost to follow-up four months later; the remission duration is thus at least four months. Patients D and F achieve remission at the beginning of the fifth and tenth months, respectively, and are still in remission at the end of the study; their remission times are thus at least eight and three months. The respective remission times of the six patients are 4, 4+, 6, 8+, 3, and 3+ months.

Figure 1.3 Example of type III censored data.
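The remission times in the leukemia example can be reproduced with a short Python sketch. The data layout and the function name `remission_time` are assumptions made for illustration; the sketch treats "the beginning of month m" as time m - 1 on a 12-month study clock.

```python
STUDY_LENGTH = 12  # months; total study period of one year

# (patient, month in which remission begins, months until relapse or None,
#  months until lost to follow-up or None)
patients = [
    ("A", 2, 4, None),
    ("B", 3, None, 4),
    ("C", 4, 6, None),
    ("D", 5, None, None),
    ("E", 9, 3, None),
    ("F", 10, None, None),
]

def remission_time(start_month, relapse_after, lost_after,
                   study_length=STUDY_LENGTH):
    """Return (duration, event_observed) under type III (random) censoring."""
    entry = start_month - 1                # beginning of month m is time m - 1
    time_to_study_end = study_length - entry
    relapse_seen = (
        relapse_after is not None
        and relapse_after <= time_to_study_end
        and (lost_after is None or relapse_after <= lost_after)
    )
    if relapse_seen:
        return relapse_after, True         # exact remission duration
    if lost_after is not None and lost_after < time_to_study_end:
        return lost_after, False           # censored at last contact
    return time_to_study_end, False        # censored at the end of the study

results = [remission_time(m, rel, lost) for _, m, rel, lost in patients]
labels = [f"{t}" if event else f"{t}+" for t, event in results]
print(", ".join(labels))
```

Because each patient has a different entry time, the censoring time differs from patient to patient even though the calendar end of the study is the same for all.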

Type I and type II censored observations are also called singly censored data, and type III, progressively censored data, by Cohen (1965). Another commonly used name for type III censoring is random censoring. All of these types of censoring are right censoring or censoring to the right. There are also left censoring and interval censoring cases. Left censoring occurs when it is known that the event of interest occurred prior to a certain time t, but the exact time of occurrence is unknown. For example, an epidemiologist wishes to know the age at diagnosis in a follow-up study of diabetic retinopathy. At the time of the examination, a 50-year-old participant was found to have already developed retinopathy, but there is no record of the exact time at which initial evidence was found. Thus the age at examination (i.e., 50) is a left-censored observation. It means that the age of diagnosis for this patient is at most 50 years.

Interval censoring occurs when the event of interest is known to have occurred between times a and b. For example, if medical records indicate that at age 45, the patient in the example above did not have retinopathy, his age at diagnosis is between 45 and 50 years.
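One way to see how the censoring types relate is to record every observation as a pair of bounds on the true event time. The sketch below is an illustrative convention, not something prescribed by the book; the function name `censoring_type` and the use of 0 and `math.inf` as "no information" bounds are assumptions.

```python
import math

def censoring_type(lower, upper):
    """Classify an observation given bounds lower <= T <= upper on the event time."""
    if lower == upper:
        return "exact"
    if math.isinf(upper):
        return "right-censored"     # only known that T exceeds `lower`
    if lower == 0:
        return "left-censored"      # only known that T is at most `upper`
    return "interval-censored"      # T known to lie between the two bounds

# Retinopathy example: found at the age-50 examination, onset age unknown.
left_example = (0, 50)
# Same patient with a record showing no retinopathy at age 45.
interval_example = (45, 50)
# A tumor-free time of 30+ weeks from the type I example.
right_example = (30, math.inf)

for bounds in (left_example, interval_example, right_example, (10, 10)):
    print(bounds, censoring_type(*bounds))
```

Under this convention, exact, right-censored, and left-censored observations are all special cases of an interval.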

We will study descriptive and analytic methods for complete, singly censored, and progressively censored survival data using numerical and graphical techniques. Analytic methods discussed include parametric and nonparametric. Parametric approaches are used either when a suitable model or distribution is fitted to the data or when a distribution can be assumed for the population from which the sample is drawn. Commonly used survival distributions are the exponential, Weibull, lognormal, and gamma. If a survival distribution is found to fit the data properly, the survival pattern can then be described by the parameters in a compact way. Statistical inference can be based on the distribution chosen. If the search for an appropriate model or distribution is too time consuming or not economical or no theoretical distribution adequately fits the data, nonparametric methods, which are generally easy to apply, should be considered.

1.3 SCOPE OF THE BOOK

This book is divided into four parts.

Part I (Chapters 1, 2, and 3) defines survival functions and gives examples of survival data analysis. Survival distribution is most commonly described by three functions: the survivorship function (also called the cumulative survival rate or survival function), the probability density function, and the hazard function (hazard rate or age-specific rate). In Chapter 2 we define these three functions and their equivalence relationships. Chapter 3 illustrates survival data analysis with five examples taken from actual research situations. Clinical and laboratory data are systematically analyzed in progressive steps and the results are interpreted. Section and chapter numbers are given for quick reference. The actual calculations are given as examples or left as exercises in the chapters where the methods are discussed. Four sets of data are provided in the exercise section for the reader to analyze. These data are referred to in the various chapters.

In Part II (Chapters 4 and 5) we introduce some of the most widely used nonparametric methods for estimating and comparing survival distributions. Chapter 4 deals with the nonparametric methods for estimating the three survival functions: the Kaplan and Meier product-limit (PL) estimate and the life-table technique (population life tables and clinical life tables). Also covered is standardization of rates by direct and indirect methods, including the standardized mortality ratio. Chapter 5 is devoted to nonparametric techniques for comparing survival distributions. A common practice is to compare the survival experiences of two or more groups differing in their treatment or in a given characteristic. Several nonparametric tests are described.

Part III (Chapters 6 to 10) introduces the parametric approach to survival data analysis. Although nonparametric methods play an important role in survival studies, parametric techniques cannot be ignored. In Chapter 6 we introduce and discuss the exponential, Weibull, lognormal, gamma, and log-logistic survival distributions. Practical applications of these distributions taken from the literature are included.

An important part of survival data analysis is model or distribution fitting. Once an appropriate statistical model for survival time has been constructed and its parameters estimated, its information can help predict survival, develop optimal treatment regimens, plan future clinical or laboratory studies, and so on. The graphical technique is a simple informal way to select a statistical model and estimate its parameters. When a statistical distribution is found to fit the data well, the parameters can be estimated by analytical methods. In Chapter 7 we discuss analytical estimation procedures for survival distributions. Most of the estimation procedures are based on the maximum likelihood method. Mathematical derivations are omitted; only formulas for the estimates and examples are given. In Chapter 8 we introduce three kinds of graphical methods: probability plotting, hazard plotting, and the Cox-Snell residual method for survival distribution fitting. In Chapter 9 we discuss several tests of goodness of fit and distribution selection. In Chapter 10 we describe several parametric methods for comparing survival distributions.

A topic that has received increasing attention is the identification of prognostic factors related to survival time. For example, who is likely to survive longest after mastectomy, and what are the most important factors that influence that survival? Another subject important to both biomedical researchers and epidemiologists is identification of the risk factors related to the development of a given disease and the response to a given treatment. What are the factors most closely related to the development of a given disease? Who is more likely to develop lung cancer, diabetes, or coronary disease? In many diseases, such as cancer, patients who respond to treatment have a better prognosis than patients who do not. The question, then, relates to what the factors are that influence response. Who is more likely to respond to treatment and thus perhaps survive longer?

Part IV (Chapters 11 to 14) deals with prognostic/risk factors and survival times. In Chapter 11 we introduce parametric methods for identifying important prognostic factors. Chapters 12 and 13 cover, respectively, the Cox proportional hazards model and several nonproportional hazards models for the identification of prognostic factors. In the final chapter, Chapter 14, we introduce the linear logistic regression model for binary outcome variables and its extension to handle polychotomous outcomes.

In Appendix A we describe a numerical procedure for solving nonlinear equations, the Newton-Raphson method. This method is suggested in Chapters 7, 11, 12, and 13. Appendix B comprises a number of statistical tables.

Most nonparametric techniques discussed here are easy to understand and simple to apply. Parametric methods require an understanding of survival distributions. Unfortunately, most survival distributions are not simple. Readers without calculus may find it difficult to apply them on their own. However, if the main purpose is not model fitting, most parametric techniques can be substituted for by their nonparametric competitors. In fact, a large percentage of survival studies in clinical or epidemiological journals are analyzed by nonparametric methods. Researchers not interested in survival model fitting should read the chapters and sections on nonparametric methods. Computer programs for survival data analysis are available in several commercially available software packages: for example, BMDP, SAS, and SPSS. These computer programs are referred to in various chapters when applicable. Computer programming codes are given for many of the examples.

Bibliographical Remarks

Gross and Clark (1975) was the first book to discuss parametric models and nonparametric and graphical techniques for both complete and censored survival data. Since then, several other books have been published in addition to the first edition of this book (Lee, 1980, 1992). Elandt-Johnson and Johnson (1980) discuss extensively the construction of life tables, model fitting, competing risk, and mathematical models of biological processes of disease progression and aging. Kalbfleisch and Prentice (1980) focus on regression problems with survival data, particularly Cox's proportional hazards model. Miller (1981) covers a number of parametric and nonparametric methods for survival analysis. Cox and Oakes (1984) also cover the topic concisely with an emphasis on the examination of explanatory variables.

Nelson (1982) provides a good discussion of parametric, nonparametric, and graphical methods. The book is more suited for industrial reliability engineers than for biomedical researchers, as are Hahn and Shapiro (1967) and Mann et al. (1974). In addition, Lawless (1982) gives a broad coverage of the area with applications in engineering and biomedical sciences.

More recent publications include Marubini and Valsecchi (1994), Kleinbaum (1995), Klein and Moeschberger (1997), and Hosmer and Lemeshow (1999). Most of these books take a more rigorous mathematical approach and require knowledge of mathematical statistics.

CHAPTER 2

Functions of Survival Time

Survival time data measure the time to a certain event, such as failure, death, response, relapse, the development of a given disease, parole, or divorce. These times are subject to random variations, and like any random variables, form a distribution. The distribution of survival times is usually described or characterized by three functions: (1) the survivorship function, (2) the probability density function, and (3) the hazard function. These three functions are mathematically equivalent: if one of them is given, the other two can be derived.

In practice, the three functions can be used to illustrate different aspects of the data. A basic problem in survival data analysis is to estimate from the sampled data one or more of these three functions and to draw inferences about the survival pattern in the population. In Section 2.1 we define the three functions and in Section 2.2, discuss the equivalence relationship among the three functions.

2.1 DEFINITIONS

Let $T$ denote the survival time. The distribution of $T$ can be characterized by three equivalent functions.

Survivorship Function (or Survival Function)

This function, denoted by $S(t)$, is defined as the probability that an individual survives longer than $t$:

$$S(t) = P(\text{an individual survives longer than } t) = P(T > t) \tag{2.1.1}$$

From the definition of the cumulative distribution function $F(t)$ of $T$,

$$S(t) = 1 - P(\text{an individual fails before } t) = 1 - F(t) \tag{2.1.2}$$

Here $S(t)$ is a nonincreasing function of time $t$ with the properties

$$S(t) = \begin{cases} 1 & \text{for } t = 0 \\ 0 & \text{for } t = \infty \end{cases}$$

That is, the probability of surviving at least at the time zero is 1 and that of surviving an infinite time is zero.

The function $S(t)$ is also known as the cumulative survival rate. To depict the course of survival, Berkson (1942) recommended a graphic presentation of $S(t)$. The graph of $S(t)$ is called the survival curve. A steep survival curve, such as the one shown in Figure 2.1a, represents low survival rate or short survival time. A gradual or flat survival curve such as in Figure 2.1b represents high survival rate or longer survival.

The survivorship function or the survival curve is used to find the 50th percentile (the median) and other percentiles (e.g., 25th and 75th) of survival time and to compare survival distributions of two or more groups. The median survival times in Figure 2.1a and b are approximately 5 and 36 units of time, respectively. The mean is generally used to describe the central tendency of a distribution, but in survival distributions the median is often better because a small number of individuals with exceptionally long or short lifetimes will cause the mean survival time to be disproportionately large or small.

Figure 2.1 Two examples of survival curves.

In practice, if there are no censored observations, the survivorship function is estimated as the proportion of patients surviving longer than $t$:

$$\hat{S}(t) = \frac{\text{number of patients surviving longer than } t}{\text{total number of patients}} \tag{2.1.3}$$

where the circumflex denotes an estimate of the function. When censored observations are present, the numerator of (2.1.3) cannot always be determined. For example, consider the following set of survival data: 4, 6, 6+, 10+, 15, 20. Using (2.1.3), we can compute $\hat{S}(5) = 5/6 = 0.833$. However, we cannot obtain $\hat{S}(11)$ since the exact number of patients surviving longer than 11 is unknown. Either the third or the fourth patient (6+ and 10+) could survive longer than or less than 11. Thus, when censored observations are present, (2.1.3) is no longer appropriate for estimating $S(t)$. Nonparametric methods of estimating $S(t)$ for censored data are discussed in Chapter 4.
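The difficulty with (2.1.3) under censoring can be made concrete with a small Python sketch. The function `survival_bounds` is a hypothetical helper, not a method from the book: it simply counts the patients known to survive past $t$ and those whose status at $t$ is unknown, giving the range within which the proportion in (2.1.3) must fall. Treating an observation censored exactly at $t$ as unknown is an assumption of this sketch.

```python
from fractions import Fraction

# (time, event_observed); False marks a censored ("+") observation.
data = [(4, True), (6, True), (6, False), (10, False), (15, True), (20, True)]

def survival_bounds(data, t):
    """Return (lowest, highest) possible value of the proportion in (2.1.3) at t."""
    n = len(data)
    # Patients whose recorded time exceeds t certainly survived longer than t.
    known_survivors = sum(1 for time, _ in data if time > t)
    # Patients censored at or before t may or may not have survived past t.
    unknown = sum(1 for time, event in data if not event and time <= t)
    return Fraction(known_survivors, n), Fraction(known_survivors + unknown, n)

low5, high5 = survival_bounds(data, 5)
low11, high11 = survival_bounds(data, 11)
print("t = 5: ", low5, high5)    # bounds coincide, so (2.1.3) is usable
print("t = 11:", low11, high11)  # bounds differ, so (2.1.3) is undetermined
```

At $t = 5$ no observation has yet been censored, so the two bounds agree at 5/6; at $t = 11$ the two censored patients leave the proportion anywhere between 2/6 and 4/6.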

Probability Density Function (or Density Function)

Like any other continuous random variable, the survival time $T$ has a probability density function defined as the limit of the probability that an individual fails in the short interval $t$ to $t + \Delta t$ per unit width $\Delta t$, or simply the probability of failure in a small interval per unit time. It can be expressed as

$$f(t) = \lim_{\Delta t \to 0} \frac{P[\text{an individual dying in the interval } (t, t + \Delta t)]}{\Delta t} \tag{2.1.4}$$

The graph of $f(t)$ is called the density curve. Figure 2.2a and b give two examples of the density curve. The density function has the following two properties:

1. $f(t)$ is a nonnegative function:

$$f(t) \begin{cases} \ge 0 & \text{for all } t \ge 0 \\ = 0 & \text{for } t < 0 \end{cases}$$

2. The area between the density curve and the $t$ axis is equal to 1.

Figure 2.2 Two examples of density curves.

In practice, if there are no censored observations, the probability density function $f(t)$ is estimated as the proportion of patients dying in an interval per unit width:

$$\hat{f}(t) = \frac{\text{number of patients dying in the interval beginning at time } t}{(\text{total number of patients})(\text{interval width})} \tag{2.1.5}$$

Similar to the estimation of $S(t)$, when censored observations are present, (2.1.5) is not applicable. We discuss an appropriate method in Chapter 4.
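As a rough illustration of (2.1.5), the Python sketch below estimates the density on consecutive intervals for a complete (uncensored) sample. The survival times and the interval width are invented for the example, and `density_estimate` is a hypothetical helper name.

```python
def density_estimate(times, start, width):
    """Estimate f(t) on [start, start + width) for uncensored data, as in (2.1.5)."""
    deaths = sum(1 for t in times if start <= t < start + width)
    return deaths / (len(times) * width)

# Hypothetical complete sample of ten survival times (no censoring).
times = [1, 2, 2, 3, 5, 7, 8, 8, 9, 12]
width = 5

estimates = {start: density_estimate(times, start, width) for start in (0, 5, 10)}
for start, f_hat in estimates.items():
    print(f"[{start}, {start + width}): f_hat = {f_hat:.2f}")

# The estimated density times the interval width, summed over intervals that
# cover every observation, gives total area 1 (property 2 of the density).
total_area = sum(f_hat * width for f_hat in estimates.values())
print("total area:", round(total_area, 10))
```

The check on the total area mirrors the second property of the density function: the estimate is a histogram scaled so that the area under it equals 1.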

The proportion of individuals that fail in any time interval and the peaks of high frequency of failure can be found from the density function. The density curve in Figure 2.2a gives a pattern of high failure rate at the beginning of the study and decreasing failure rate as time increases. In Figure 2.2b, the peak of high failure frequency occurs at approximately 1.7 units of time. The proportion of individuals that fail between 1 and 2 units of time is equal to the shaded area between the density curve and the axis. The density function is also known as the unconditional failure rate.

HazardFunction

Thehazardfunction h(t)ofsurvivaltime T givesthe conditionalfailurerate. Thisisdefinedastheprobabilityoffailureduringaverysmalltimeinterval, assumingthattheindividualhassurvivedtothebeginningoftheinterval,or asthelimitoftheprobabilitythatanindividualfailsinaveryshortinterval, t t,giventhattheindividualhassurvivedtotime t:

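$$h(t) = \lim_{\Delta t \to 0} \frac{P[\text{an individual fails in the time interval } (t,\, t + \Delta t) \text{ given the individual has survived to } t]}{\Delta t} \tag{2.1.6}$$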

The hazard function can also be defined in terms of the cumulative distribution function F(t) and the probability density function f(t):

$$h(t) = \frac{f(t)}{1 - F(t)} \tag{2.1.7}$$

The hazard function is also known as the instantaneous failure rate, force of mortality, conditional mortality rate, and age-specific failure rate. If t in (2.1.6) is age, it is a measure of the proneness to failure as a function of the age of the individual in the sense that the quantity $\Delta t\, h(t)$ is the expected proportion of age t individuals who will fail in the short time interval t to $t + \Delta t$. The hazard function thus gives the risk of failure per unit time during the aging process. It plays an important role in survival data analysis.

In practice, when there are no censored observations the hazard function is estimated as the proportion of patients dying in an interval per unit time, given that they have survived to the beginning of the interval:

$$\hat{h}(t) = \frac{\text{number of patients dying in the interval beginning at time } t}{(\text{number of patients surviving at } t) \times (\text{interval width})} = \frac{\text{number of patients dying per unit time in the interval}}{\text{number of patients surviving at } t} \tag{2.1.8}$$

Actuaries usually use the average hazard rate of the interval in which the number of patients dying per unit time in the interval is divided by the average number of survivors at the midpoint of the interval:

$$\hat{h}(t) = \frac{\text{number of patients dying per unit time in the interval}}{(\text{number of patients surviving at } t) - (\text{number of deaths in the interval})/2} \tag{2.1.9}$$

The actuarial estimate in (2.1.9) gives a higher hazard rate than (2.1.8) and thus a more conservative estimate.
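A small sketch can show why the actuarial estimate is the larger of the two: both share the same numerator, but (2.1.9) subtracts half the deaths from the denominator. The function names and the counts used here (100 patients at risk, 20 deaths, interval width 2) are hypothetical and chosen only for illustration.

```python
def hazard_simple(deaths, at_risk, width):
    """Estimate in the form of (2.1.8): deaths per unit time over survivors at t."""
    return (deaths / width) / at_risk

def hazard_actuarial(deaths, at_risk, width):
    """Estimate in the form of (2.1.9): denominator reduced by half the deaths."""
    return (deaths / width) / (at_risk - deaths / 2)

# Hypothetical interval: 100 patients alive at its start, 20 deaths, width 2.
h_simple = hazard_simple(20, 100, 2)        # (20 / 2) / 100
h_actuarial = hazard_actuarial(20, 100, 2)  # (20 / 2) / (100 - 10)
print(round(h_simple, 4), round(h_actuarial, 4))
```

With these assumed counts the two estimates are 0.1 and about 0.1111. The gap widens as the share of patients dying within the interval grows, and vanishes when no one dies.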

The hazard function may increase, decrease, remain constant, or indicate a more complicated process. Figure 2.3 is a plot of several kinds of hazard function. For example, patients with acute leukemia who do not respond to treatment have an increasing hazard rate, $h_1(t)$. $h_2(t)$ is a decreasing hazard function that, for example, indicates the risk of soldiers wounded by bullets who undergo surgery. The main danger is the operation itself and this danger decreases if the surgery is successful. An example of a constant hazard function, $h_3(t)$, is the risk of healthy persons between 18 and 40 years of age whose main risks of death are accidents. The bathtub curve, $h_4(t)$, describes the process of human life. During an initial period, the risk is high (high infant mortality). Subsequently, h(t) stays approximately constant until a certain time, after which it increases because of wear-out failures. Finally, patients with tuberculosis have risks that increase initially, then decrease after treatment. Such an increasing, then decreasing hazard function is described by $h_5(t)$.

Figure 2.3 Examples of the hazard function.

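The cumulative hazard function is defined as

$$H(t) = \int_0^t h(x)\,dx$$

It is shown in Section 2.2 that

$$H(t) = -\log S(t)$$

Thus, at t = 0, S(t) = 1, H(t) = 0, and at t = ∞, S(t) = 0, H(t) = ∞. The cumulative hazard function can be any value between zero and infinity. All log functions in this book are natural logs (base e) unless otherwise indicated.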

The following example illustrates how these functions can be estimated from a complete sample of grouped survival times without censored observations.

Example 2.1 The first three columns of Table 2.1 give the survival data of 40 patients with myeloma. The survival times are grouped into intervals of five months. The estimated survivorship function, density function, and hazard function are also given, with the corresponding graphs plotted in Figure 2.4a to c.

Table 2.1 Survival Data and Estimated Survival Functions of 40 Myeloma Patients

Figure 2.4 Estimated survival functions of myeloma patients.


The estimated survivorship function, $\hat{S}(t)$, is calculated following (2.1.3) at the beginning or the end of each interval. For example, at the beginning of the first interval, all 40 patients are alive, $\hat{S}(0) = 1$, and at the beginning of the second interval, 35 of the 40 patients are still alive, $\hat{S}(5) = 35/40 = 0.875$. Similarly, $\hat{S}(10) = 28/40 = 0.700$. The estimated density function $\hat{f}(t)$ is computed following (2.1.5). For example, the density function of the first interval (0 to 5) is $5/(40 \times 5) = 0.025$, and that of the second interval (5 to 10) is $7/(40 \times 5) = 0.035$. The estimated density function is plotted at the midpoint of each interval (Figure 2.4b). The estimated hazard function, $\hat{h}(t)$, is computed following the actuarial method given in (2.1.9). For example, the hazard function of the first interval is $5/[5(40 - 5/2)] = 0.027$ and that of the second interval is $7/[5(35 - 7/2)] = 0.044$. The estimated hazard function is also plotted at the midpoint of each interval (Figure 2.4c).

From Table 2.1 or Figure 2.4a, the median survival time of myeloma patients is approximately 17.5 months, and the peak of high frequency of death occurs in 5 to 10 months. In addition, the hazard function shows an increasing trend and reaches its peak at approximately 32.5 months and then fluctuates.
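The arithmetic for the first two intervals of Example 2.1 can be retraced with a minimal sketch. It uses only the counts quoted in the text (40 patients at t = 0, 35 alive at t = 5, 28 alive at t = 10, intervals of five months); the remaining rows of Table 2.1 are not reproduced here, and the variable names are illustrative.

```python
# Counts quoted in Example 2.1; later intervals of Table 2.1 are omitted.
total = 40
width = 5
alive_at_start = {0: 40, 5: 35, 10: 28}

rows = []
for start in (0, 5):
    at_risk = alive_at_start[start]
    deaths = at_risk - alive_at_start[start + width]
    s_hat = at_risk / total                            # survivorship, as in (2.1.3)
    f_hat = deaths / (total * width)                   # density, as in (2.1.5)
    h_hat = (deaths / width) / (at_risk - deaths / 2)  # actuarial hazard, as in (2.1.9)
    rows.append((start, deaths, s_hat, f_hat, h_hat))

for start, deaths, s_hat, f_hat, h_hat in rows:
    print(f"{start} to {start + width} months: deaths={deaths} "
          f"S={s_hat:.3f} f={f_hat:.3f} h={h_hat:.3f}")
```

The printed values (S = 1.000 and 0.875, f = 0.025 and 0.035, h = 0.027 and 0.044) agree with the figures worked out in the example.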

2.2 RELATIONSHIPS OF THE SURVIVAL FUNCTIONS

The three functions defined in Section 2.1 are mathematically equivalent. Given any one of them, the other two can be derived. Readers not interested in the mathematical relationship among the three survival functions can skip this section without loss of continuity.

1. From (2.1.2) and (2.1.7),

$$h(t) = \frac{f(t)}{S(t)} \tag{2.2.1}$$

This relationship can also be derived from (2.1.6) using basic definitions of conditional probabilities.

2. Since the probability density function is the derivative of the cumulative distribution function,

$$f(t) = \frac{d}{dt}\,[1 - S(t)] = -S'(t) \tag{2.2.2}$$

3. Substituting (2.2.2) into (2.2.1) yields

$$h(t) = -\frac{S'(t)}{S(t)} = -\frac{d}{dt} \log S(t) \tag{2.2.3}$$

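4. Integrating (2.2.3) from zero to t and using S(0) = 1, we have

$$-\int_0^t h(x)\,dx = \log S(t)$$

or

$$H(t) = -\log S(t)$$

or

$$S(t) = \exp[-H(t)] = \exp\left[-\int_0^t h(x)\,dx\right] \tag{2.2.4}$$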

5. From (2.2.1) and (2.2.4) we obtain

$$f(t) = h(t) \exp[-H(t)] \tag{2.2.5}$$

Hence, if f(t) is known, the survivorship function can be obtained from the basic relationship between f(t), F(t), and (2.1.2). The hazard function can then be determined from (2.2.1). If S(t) is known, f(t) and h(t) can be determined from (2.2.2) and (2.2.1), respectively, or h(t) can be derived first from (2.2.3) and then f(t) from (2.2.1). If h(t) is given, S(t) and f(t) can be obtained, respectively, from (2.2.4) and (2.2.5). Thus, given any one of the three survival functions, the other two can easily be derived. The following example illustrates these equivalence relationships.
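Separately from the book's own example, the relationships can be spot-checked numerically. The sketch below assumes a hypothetical linear hazard h(t) = A·t; the constant A, the evaluation point, and the function names are illustrative choices, not taken from the text. Starting from h(t) alone, it builds H(t) by numerical integration, obtains S(t) and f(t) in the forms of (2.2.4) and (2.2.5), and then compares f(t) with a finite-difference approximation of -S'(t) as in (2.2.2).

```python
import math

A = 0.5  # assumed slope of a hypothetical linear hazard h(t) = A * t

def h(t):
    """Hazard function chosen for illustration only."""
    return A * t

def cumulative_hazard(t, steps=10_000):
    """H(t): trapezoidal approximation of the integral of h from 0 to t."""
    dx = t / steps
    total = 0.5 * (h(0.0) + h(t))
    total += sum(h(i * dx) for i in range(1, steps))
    return total * dx

def S(t):
    """Survivorship function in the form of (2.2.4): exp[-H(t)]."""
    return math.exp(-cumulative_hazard(t))

def f(t):
    """Density function in the form of (2.2.5): h(t) * exp[-H(t)]."""
    return h(t) * S(t)

t, eps = 1.5, 1e-4
# Central difference approximating -S'(t), the density in the form of (2.2.2).
f_from_S = -(S(t + eps) - S(t - eps)) / (2 * eps)

print(f(t), f_from_S)     # the two values should agree closely
print(f(t) / S(t), h(t))  # ratio f/S recovers the hazard, as in (2.2.1)
```

For this assumed hazard the closed forms are H(t) = A t²/2 and S(t) = exp(-A t²/2), so the numerical results can also be compared against those expressions directly.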
