Statistical Methods for Survival Data Analysis
Third Edition
ELISA T. LEE
JOHN WENYU WANG
Department of Biostatistics and Epidemiology and Center for American Indian Health Research
College of Public Health
University of Oklahoma Health Sciences Center
Oklahoma City, Oklahoma
Copyright © 2003 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: permreq@wiley.com.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.
Library of Congress Cataloging-in-Publication Data:
Lee, Elisa T.
Statistical methods for survival data analysis.--3rd ed. / Elisa T. Lee and John Wenyu Wang.
p. cm.--(Wiley series in probability and statistics)
Includes bibliographical references and index.
ISBN 0-471-36997-7 (cloth: alk. paper)
1. Medicine--Research--Statistical methods. 2. Failure time data analysis. 3. Prognosis--Statistical methods. I. Wang, John Wenyu. II. Title. III. Series.
R853.S7 L43 2003
610'.72--dc21 2002027025
Printed in the United States of America.
10 9 8 7 6 5 4 3 2 1
To the memory of our parents
Mr. Chi-Lan Tan and Mrs. Hwei-Chi Lee Tan (E.T.L.)
Mr. Beijun Zhang and Mrs. Xiangyi Wang (J.W.W.)
Preface
1 Introduction
1.1 Preliminaries
1.2 Censored Data
1.3 Scope of the Book
Bibliographical Remarks
2 Functions of Survival Time
2.1 Definitions
2.2 Relationships of the Survival Functions
Bibliographical Remarks
Exercises
3 Examples of Survival Data Analysis
3.1 Example 3.1: Comparison of Two Treatments and Three Diets
3.2 Example 3.2: Comparison of Two Survival Patterns Using Life Tables
3.3 Example 3.3: Fitting Survival Distributions to Remission Data
3.4 Example 3.4: Relative Mortality and Identification of Prognostic Factors
3.5 Example 3.5: Identification of Risk Factors
Bibliographical Remarks
Exercises
4 Nonparametric Methods of Estimating Survival Functions
4.1 Product-Limit Estimates of Survivorship Function
4.2 Life-Table Analysis
4.3 Relative, Five-Year, and Corrected Survival Rates
4.4 Standardized Rates and Ratios
Bibliographical Remarks
Exercises
5 Nonparametric Methods for Comparing Survival Distributions
5.1 Comparison of Two Survival Distributions
5.2 Mantel–Haenszel Test
5.3 Comparison of K (K > 2) Samples
Bibliographical Remarks
Exercises
6 Some Well-Known Parametric Survival Distributions and Their Applications
6.1 Exponential Distribution
6.2 Weibull Distribution
6.3 Lognormal Distribution
6.4 Gamma and Generalized Gamma Distributions
6.5 Log-Logistic Distribution
6.6 Other Survival Distributions
Bibliographical Remarks
Exercises
7 Estimation Procedures for Parametric Survival Distributions without Covariates
7.1 General Maximum Likelihood Estimation Procedure
7.2 Exponential Distribution
7.3 Weibull Distribution
7.4 Lognormal Distribution
7.5 Standard and Generalized Gamma Distributions
7.6 Log-Logistic Distribution
7.7 Other Parametric Survival Distributions
Bibliographical Remarks
Exercises
8 Graphical Methods for Survival Distribution Fitting
8.1 Introduction
8.2 Probability Plotting
8.3 Hazard Plotting
8.4 Cox–Snell Residual Method
Bibliographical Remarks
Exercises
9 Tests of Goodness of Fit and Distribution Selection
9.1 Goodness-of-Fit Test Statistics Based on Asymptotic Likelihood Inferences
9.2 Tests for Appropriateness of a Family of Distributions
9.3 Selection of a Distribution Using BIC or AIC Procedures
9.4 Tests for a Specific Distribution with Known Parameters
9.5 Hollander and Proschan’s Test for Appropriateness of a Given Distribution with Known Parameters
Bibliographical Remarks
Exercises
10 Parametric Methods for Comparing Two Survival Distributions
10.1 Likelihood Ratio Test for Comparing Two Survival Distributions
10.2 Comparison of Two Exponential Distributions
10.3 Comparison of Two Weibull Distributions
10.4 Comparison of Two Gamma Distributions
Bibliographical Remarks
Exercises
11 Parametric Methods for Regression Model Fitting and Identification of Prognostic Factors
11.1 Preliminary Examination of Data
11.2 General Structure of Parametric Regression Models and Their Asymptotic Likelihood Inference
11.3 Exponential Regression Model
11.4 Weibull Regression Model
11.5 Lognormal Regression Model
11.6 Extended Generalized Gamma Regression Model
11.7 Log-Logistic Regression Model
11.8 Other Parametric Regression Models
11.9 Model Selection Methods
Bibliographical Remarks
Exercises
12 Identification of Prognostic Factors Related to Survival Time: Cox Proportional Hazards Model
12.1 Partial Likelihood Function for Survival Times
12.2 Identification of Significant Covariates
12.3 Estimation of the Survivorship Function with Covariates
12.4 Adequacy Assessment of the Proportional Hazards Model
Bibliographical Remarks
Exercises
13 Identification of Prognostic Factors Related to Survival Time: Nonproportional Hazards Models
13.1 Models with Time-Dependent Covariates
13.2 Stratified Proportional Hazards Models
13.3 Competing Risks Model
13.4 Recurrent Events Models
13.5 Models for Related Observations
Bibliographical Remarks
Exercises
14 Identification of Risk Factors Related to Dichotomous and Polychotomous Outcomes
14.1 Univariate Analysis
14.2 Logistic and Conditional Logistic Regression Models for Dichotomous Responses
14.3 Models for Polychotomous Outcomes
Bibliographical Remarks
Exercises
Preface
Statistical methods for survival data analysis have continued to flourish in the last two decades. Applications of the methods have been widened from their historical use in cancer and reliability research to business, criminology, epidemiology, and social and behavioral sciences. The third edition of Statistical Methods for Survival Data Analysis is intended to provide a comprehensive introduction of the most commonly used methods for analyzing survival data. It begins with basic definitions and interpretations of survival functions. From there, the reader is guided through methods, parametric and nonparametric, for estimating and comparing these functions and the search for a theoretical distribution (or model) to fit the data. Parametric and nonparametric approaches to the identification of prognostic factors that are related to survival are then discussed. Finally, regression methods, primarily linear logistic regression models, to identify risk factors for dichotomous and polychotomous outcomes are introduced.
The third edition continues to be application-oriented, with a minimum level of mathematics. In a few chapters, some knowledge of calculus and matrix algebra is needed. The few sections that introduce the general mathematical structure for the methods can be skipped without loss of continuity. A large number of practical examples are given to assist the reader in understanding the methods and applications and in interpreting the results. Readers with only college algebra should find the book readable and understandable.
There are many excellent books on clinical trials. We therefore have deleted the two chapters on the subject that were in the second edition. Instead, we have included discussions of more statistical methods for survival data analysis. A brief summary of the improvements made for the third edition is given below.
1. Two additional distributions, the log-logistic distribution and a generalized gamma distribution, have been added to the application of parametric models that can be used in model fitting and prognostic factor identification (Chapters 6, 7, and 11).
2. In several sections (Sections 7.1, 9.1, 10.1, 11.2, and 12.1), discussions of the asymptotic likelihood inference of the methods covered in the chapters are given. These sections are intended to provide a more general mathematical structure for statisticians.
3. The Cox–Snell residual method has been added to the chapter on graphical methods for survival distribution fitting (Chapter 8). In addition, the sections on probability and hazard plotting have been revised so that no special graphical papers are required to make the plots.
4. More tests of goodness of fit are given, including the BIC and AIC procedures (Chapters 9 and 11).
5. For Cox’s proportional hazards model (Chapter 12), we have now included methods to assess its adequacy and procedures to estimate the survivorship function with covariates.
6. The concept of nonproportional hazards models is introduced (Chapter 13), which includes models with time-dependent covariates, stratified models, competing risks models, recurrent event models, and models for related observations.
7. The chapter on linear logistic regression (Chapter 14) has been expanded to cover regression models for polychotomous outcomes. In addition, methods for a general m : n matching design have been added to the section on conditional logistic regression for case–control studies.
8. Computer programming codes for software packages BMDP, SAS, and SPSS are provided for most examples in the text.
We would like to thank the many researchers, teachers, and students who have used the second edition of the book. The suggestions for improvement that many of them have provided are invaluable. Special thanks go to Xing Wang, Linda Hutton, Tracy Mankin, and Imran Ahmed for typing the manuscript. Steve Quigley of John Wiley convinced us to work on a third edition. We thank him for his enthusiasm.
Finally, we are most grateful to our families, Sam, Vivian, Benedict, Jennifer, and Annelisa (E.T.L.), and Alice and Xing (J.W.W.), for the constant joy, love, and support they have given us.
Oklahoma City, OK
April 18, 2001
CHAPTER 1
Introduction
1.1 PRELIMINARIES
This book is for biomedical researchers, epidemiologists, consulting statisticians, students taking a first course on survival data analysis, and others interested in survival time study. It deals with statistical methods for analyzing survival data derived from laboratory studies of animals, clinical and epidemiologic studies of humans, and other appropriate applications.
Survival time can be defined broadly as the time to the occurrence of a given event. This event can be the development of a disease, response to a treatment, relapse, or death. Therefore, survival time can be tumor-free time, the time from the start of treatment to response, length of remission, and time to death. Survival data can include survival time, response to a given treatment, and patient characteristics related to response, survival, and the development of a disease. The study of survival data has focused on predicting the probability of response, survival, or mean lifetime, comparing the survival distributions of experimental animals or of human patients, and the identification of risk and/or prognostic factors related to response, survival, and the development of a disease. In this book, special consideration is given to the study of survival data in biomedical sciences, although all the methods are suitable for applications in industrial reliability, social sciences, and business. Examples of survival data in these fields are the lifetime of electronic devices, components, or systems (reliability engineering); felons’ time to parole (criminology); duration of first marriage (sociology); length of newspaper or magazine subscription (marketing); and worker’s compensation claims (insurance) and their various influencing risk or prognostic factors.
1.2 CENSORED DATA
Many researchers consider survival data analysis to be merely the application of two conventional statistical methods to a special type of problem: parametric if the distribution of survival times is known to be normal and nonparametric if the distribution is unknown. This assumption would be true if the survival times of all the subjects were exact and known; however, some survival times are not. Further, the survival distribution is often skewed, or far from being normal. Thus there is a need for new statistical techniques. One of the most important developments is due to a special feature of survival data in the life sciences that occurs when some subjects in the study have not experienced the event of interest at the end of the study or time of analysis. For example, some patients may still be alive or disease-free at the end of the study period. The exact survival times of these subjects are unknown. These are called censored observations or censored times and can also occur when people are lost to follow-up after a period of study. When there are no censored observations, the set of survival times is complete. There are three types of censoring.
Type I Censoring
Animal studies usually start with a fixed number of animals, to which the treatment or treatments is given. Because of time and/or cost limitations, the researcher often cannot wait for the death of all the animals. One option is to observe for a fixed period of time, say six months, after which the surviving animals are sacrificed. Survival times recorded for the animals that died during the study period are the times from the start of the experiment to their death. These are called exact or uncensored observations. The survival times of the sacrificed animals are not known exactly but are recorded as at least the length of the study period. These are called censored observations. Some animals could be lost or die accidentally. Their survival times, from the start of experiment to loss or death, are also censored observations. In type I censoring, if there are no accidental losses, all censored observations equal the length of the study period.
For example, suppose that six rats have been exposed to carcinogens by injecting tumor cells into their footpads. The times to develop a tumor of a given size are observed. The investigator decides to terminate the experiment after 30 weeks. Figure 1.1 is a plot of the development times of the tumors. Rats A, B, and D developed tumors after 10, 15, and 25 weeks, respectively. Rats C and E did not develop tumors by the end of the study; their tumor-free times are thus 30-plus weeks. Rat F died accidentally without tumors after 19 weeks of observation. The survival data (tumor-free times) are 10, 15, 30+, 25, 30+, and 19+ weeks. (The plus indicates a censored observation.)
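As a minimal sketch of how the six-rat example could be encoded, the Python fragment below stores each observation as a (time, exact) pair and applies the fixed 30-week cutoff. The function name `observe_type_one`, the dictionary layout, and the use of `None` for a tumor that was never seen are illustrative assumptions, not notation from the text.

```python
STUDY_LENGTH = 30  # weeks; the experiment is terminated at this point

def observe_type_one(event_time, loss_time=None, study_length=STUDY_LENGTH):
    """Return (observed_time, is_exact) under type I censoring.

    event_time: week the tumor appeared, or None if it was never seen.
    loss_time:  week of accidental loss or death, or None if there was none.
    """
    cutoff = study_length if loss_time is None else min(loss_time, study_length)
    if event_time is not None and event_time <= cutoff:
        return event_time, True   # exact (uncensored) observation
    return cutoff, False          # censored: tumor-free time is at least `cutoff`

# (tumor week, accidental-loss week) for rats A to F, as described in the example.
rats = {"A": (10, None), "B": (15, None), "C": (None, None),
        "D": (25, None), "E": (None, None), "F": (None, 19)}

observed = {name: observe_type_one(*record) for name, record in rats.items()}
labels = [f"{t}" if exact else f"{t}+" for t, exact in observed.values()]
print(labels)
```

The printed labels reproduce the notation of the example, with a trailing plus marking each censored time.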
Type II Censoring
Another option in animal studies is to wait until a fixed portion of the animals have died, say 80 of 100, after which the surviving animals are sacrificed. In this case, type II censoring, if there are no accidental losses, the censored observations equal the largest uncensored observation. For example, in an experiment of six rats (Figure 1.2), the investigator may decide to terminate the study after four of the six rats have developed tumors. The survival or tumor-free times are then 10, 15, 35+, 25, 35, and 19+ weeks.
Figure 1.1 Example of type I censored data.
Figure 1.2 Example of type II censored data.
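Type II censoring differs from type I only in how the stopping time is chosen: it is the time of the r-th observed event rather than a fixed date. The Python sketch below illustrates this for the six-rat example. The function name, the use of `math.inf` for a tumor time that is never observed or for the absence of an accidental loss, and the 35-week tumor time for rat E (read off the recorded data) are assumptions made for the illustration.

```python
import math

def observe_type_two(event_times, loss_times, r):
    """Return a list of (observed_time, is_exact) when the study stops at the r-th event."""
    # Events that would actually be seen, i.e. those occurring before any accidental loss.
    seen = sorted(e for e, l in zip(event_times, loss_times)
                  if math.isfinite(e) and e <= l)
    stop = seen[r - 1]  # time of the r-th observed event
    result = []
    for e, l in zip(event_times, loss_times):
        cutoff = min(l, stop)
        result.append((e, True) if e <= cutoff else (cutoff, False))
    return result

# Rats A to F: tumor times (inf = not observed) and accidental-loss times (inf = none).
tumor = [10, 15, math.inf, 25, 35, math.inf]
lost = [math.inf, math.inf, math.inf, math.inf, math.inf, 19]

observed = observe_type_two(tumor, lost, r=4)
labels = [f"{t}" if exact else f"{t}+" for t, exact in observed]
print(labels)
```

With no accidental losses, every censored time produced this way equals the largest uncensored time, which is the property stated above.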
Type III Censoring
In most clinical and epidemiologic studies the period of study is fixed and patients enter the study at different times during that period. Some may die before the end of the study; their exact survival times are known. Others may withdraw before the end of the study and are lost to follow-up. Still others may be alive at the end of the study. For ‘‘lost’’ patients, survival times are at least from their entrance to the last contact. For patients still alive, survival times are at least from entry to the end of the study. The latter two kinds of observations are censored observations. Since the entry times are not simultaneous, the censored times are also different. This is type III censoring. For example, suppose that six patients with acute leukemia enter a clinical study during a total study period of one year. Suppose also that all six respond to treatment and achieve remission. The remission times are plotted in Figure 1.3. Patients A, C, and E achieve remission at the beginning of the second, fourth, and ninth months, and relapse after four, six, and three months, respectively. Patient B achieves remission at the beginning of the third month but is lost to follow-up four months later; the remission duration is thus at least four months. Patients D and F achieve remission at the beginning of the fifth and tenth months, respectively, and are still in remission at the end of the study; their remission times are thus at least eight and three months. The respective remission times of the six patients are 4, 4+, 6, 8+, 3, and 3+ months.
Figure 1.3 Example of type III censored data.
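The arithmetic behind these six remission times can be sketched in Python as follows. The function and argument names are hypothetical, and the sketch assumes that "the beginning of month k" corresponds to k − 1 months elapsed since the start of a 12-month study.

```python
STUDY_END = 12  # months; total study period of one year

def observe_type_three(start_month, relapse_after=None, lost_after=None,
                       study_end=STUDY_END):
    """Return (remission_duration, is_exact) for one patient.

    start_month:   remission begins at the beginning of this month (1-based).
    relapse_after: months from remission to relapse, or None if no relapse was seen.
    lost_after:    months from remission to loss to follow-up, or None.
    """
    start = start_month - 1           # months elapsed when remission begins
    follow_up = study_end - start     # longest duration that can be observed
    if lost_after is not None:
        follow_up = min(follow_up, lost_after)
    if relapse_after is not None and relapse_after <= follow_up:
        return relapse_after, True    # exact remission time
    return follow_up, False           # censored remission time

patients = {
    "A": dict(start_month=2, relapse_after=4),
    "B": dict(start_month=3, lost_after=4),
    "C": dict(start_month=4, relapse_after=6),
    "D": dict(start_month=5),
    "E": dict(start_month=9, relapse_after=3),
    "F": dict(start_month=10),
}

observed = {name: observe_type_three(**info) for name, info in patients.items()}
labels = [f"{t}" if exact else f"{t}+" for t, exact in observed.values()]
print(labels)
```

Because entry times differ, the censored durations differ from patient to patient, unlike the single cutoff of type I or type II censoring.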
Type I and type II censored observations are also called singly censored data, and type III, progressively censored data, by Cohen (1965). Another commonly used name for type III censoring is random censoring. All of these types of censoring are right censoring or censoring to the right. There are also left censoring and interval censoring cases. Left censoring occurs when it is known that the event of interest occurred prior to a certain time $t$, but the exact time of occurrence is unknown. For example, an epidemiologist wishes to know the age at diagnosis in a follow-up study of diabetic retinopathy. At the time of the examination, a 50-year-old participant was found to have already developed retinopathy, but there is no record of the exact time at which initial evidence was found. Thus the age at examination (i.e., 50) is a left-censored observation. It means that the age of diagnosis for this patient is at most 50 years.
Interval censoring occurs when the event of interest is known to have occurred between times $a$ and $b$. For example, if medical records indicate that at age 45, the patient in the example above did not have retinopathy, his age at diagnosis is between 45 and 50 years.
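One way to treat exact, right-censored, left-censored, and interval-censored observations uniformly is to record each as an interval known to contain the true event time. The Python sketch below illustrates the idea; the `Observation` class, its field names, and the choice of 0 as the lower bound for a left-censored age are assumptions made for the illustration.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    lower: float  # the event is known not to have occurred before this time
    upper: float  # the event is known to have occurred by this time (inf if unknown)

    @property
    def kind(self):
        if self.lower == self.upper:
            return "exact"
        if math.isinf(self.upper):
            return "right-censored"
        if self.lower == 0:
            return "left-censored"
        return "interval-censored"

examples = [
    Observation(25, 25),        # tumor observed at exactly 25 weeks
    Observation(30, math.inf),  # 30+ weeks: event not yet seen at the end of study
    Observation(0, 50),         # retinopathy already present at the age-50 examination
    Observation(45, 50),        # free of retinopathy at 45, present at 50
]
kinds = [obs.kind for obs in examples]
print(kinds)
```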
We will study descriptive and analytic methods for complete, singly censored, and progressively censored survival data using numerical and graphical techniques. Analytic methods discussed include parametric and nonparametric. Parametric approaches are used either when a suitable model or distribution is fitted to the data or when a distribution can be assumed for the population from which the sample is drawn. Commonly used survival distributions are the exponential, Weibull, lognormal, and gamma. If a survival distribution is found to fit the data properly, the survival pattern can then be described by the parameters in a compact way. Statistical inference can be based on the distribution chosen. If the search for an appropriate model or distribution is too time consuming or not economical or no theoretical distribution adequately fits the data, nonparametric methods, which are generally easy to apply, should be considered.
1.3 SCOPE OF THE BOOK
This book is divided into four parts.
Part I (Chapters 1, 2, and 3) defines survival functions and gives examples of survival data analysis. Survival distribution is most commonly described by three functions: the survivorship function (also called the cumulative survival rate or survival function), the probability density function, and the hazard function (hazard rate or age-specific rate). In Chapter 2 we define these three functions and their equivalence relationships. Chapter 3 illustrates survival data analysis with five examples taken from actual research situations. Clinical and laboratory data are systematically analyzed in progressive steps and the results are interpreted. Section and chapter numbers are given for quick reference. The actual calculations are given as examples or left as exercises in the chapters where the methods are discussed. Four sets of data are provided in the exercise section for the reader to analyze. These data are referred to in the various chapters.
In Part II (Chapters 4 and 5) we introduce some of the most widely used nonparametric methods for estimating and comparing survival distributions. Chapter 4 deals with the nonparametric methods for estimating the three survival functions: the Kaplan and Meier product-limit (PL) estimate and the life-table technique (population life tables and clinical life tables). Also covered is standardization of rates by direct and indirect methods, including the standardized mortality ratio. Chapter 5 is devoted to nonparametric techniques for comparing survival distributions. A common practice is to compare the survival experiences of two or more groups differing in their treatment or in a given characteristic. Several nonparametric tests are described.
Part III (Chapters 6 to 10) introduces the parametric approach to survival data analysis. Although nonparametric methods play an important role in survival studies, parametric techniques cannot be ignored. In Chapter 6 we introduce and discuss the exponential, Weibull, lognormal, gamma, and log-logistic survival distributions. Practical applications of these distributions taken from the literature are included.
An important part of survival data analysis is model or distribution fitting. Once an appropriate statistical model for survival time has been constructed and its parameters estimated, its information can help predict survival, develop optimal treatment regimens, plan future clinical or laboratory studies, and so on. The graphical technique is a simple informal way to select a statistical model and estimate its parameters. When a statistical distribution is found to fit the data well, the parameters can be estimated by analytical methods. In Chapter 7 we discuss analytical estimation procedures for survival distributions. Most of the estimation procedures are based on the maximum likelihood method. Mathematical derivations are omitted; only formulas for the estimates and examples are given. In Chapter 8 we introduce three kinds of graphical methods: probability plotting, hazard plotting, and the Cox–Snell residual method for survival distribution fitting. In Chapter 9 we discuss several tests of goodness of fit and distribution selection. In Chapter 10 we describe several parametric methods for comparing survival distributions.
A topic that has received increasing attention is the identification of prognostic factors related to survival time. For example, who is likely to survive longest after mastectomy, and what are the most important factors that influence that survival? Another subject important to both biomedical researchers and epidemiologists is identification of the risk factors related to the development of a given disease and the response to a given treatment. What are the factors most closely related to the development of a given disease? Who is more likely to develop lung cancer, diabetes, or coronary disease? In many diseases, such as cancer, patients who respond to treatment have a better prognosis than patients who do not. The question, then, relates to what the factors are that influence response. Who is more likely to respond to treatment and thus perhaps survive longer?
Part IV (Chapters 11 to 14) deals with prognostic/risk factors and survival times. In Chapter 11 we introduce parametric methods for identifying important prognostic factors. Chapters 12 and 13 cover, respectively, the Cox proportional hazards model and several nonproportional hazards models for the identification of prognostic factors. In the final chapter, Chapter 14, we introduce the linear logistic regression model for binary outcome variables and its extension to handle polychotomous outcomes.
In Appendix A we describe a numerical procedure for solving nonlinear equations, the Newton–Raphson method. This method is suggested in Chapters 7, 11, 12, and 13. Appendix B comprises a number of statistical tables.
Most nonparametric techniques discussed here are easy to understand and simple to apply. Parametric methods require an understanding of survival distributions. Unfortunately, most survival distributions are not simple. Readers without calculus may find it difficult to apply them on their own. However, if the main purpose is not model fitting, most parametric techniques can be replaced by their nonparametric competitors. In fact, a large percentage of survival studies in clinical or epidemiological journals are analyzed by nonparametric methods. Researchers not interested in survival model fitting should read the chapters and sections on nonparametric methods. Computer programs for survival data analysis are available in several commercial software packages: for example, BMDP, SAS, and SPSS. These computer programs are referred to in various chapters when applicable. Computer programming codes are given for many of the examples.
Bibliographical Remarks
Gross and Clark (1975) was the first book to discuss parametric models and nonparametric and graphical techniques for both complete and censored survival data. Since then, several other books have been published in addition to the first edition of this book (Lee, 1980, 1992). Elandt-Johnson and Johnson (1980) discuss extensively the construction of life tables, model fitting, competing risk, and mathematical models of biological processes of disease progression and aging. Kalbfleisch and Prentice (1980) focus on regression problems with survival data, particularly Cox’s proportional hazards model. Miller (1981) covers a number of parametric and nonparametric methods for survival analysis. Cox and Oakes (1984) also cover the topic concisely with an emphasis on the examination of explanatory variables.
Nelson (1982) provides a good discussion of parametric, nonparametric, and graphical methods. The book is more suited for industrial reliability engineers than for biomedical researchers, as are Hahn and Shapiro (1967) and Mann et al. (1974). In addition, Lawless (1982) gives a broad coverage of the area with applications in engineering and biomedical sciences.
More recent publications include Marubini and Valsecchi (1994), Kleinbaum (1995), Klein and Moeschberger (1997), and Hosmer and Lemeshow (1999). Most of these books take a more rigorous mathematical approach and require knowledge of mathematical statistics.
CHAPTER 2
Functions of Survival Time
Survival time data measure the time to a certain event, such as failure, death, response, relapse, the development of a given disease, parole, or divorce. These times are subject to random variations, and like any random variables, form a distribution. The distribution of survival times is usually described or characterized by three functions: (1) the survivorship function, (2) the probability density function, and (3) the hazard function. These three functions are mathematically equivalent: if one of them is given, the other two can be derived.
In practice, the three functions can be used to illustrate different aspects of the data. A basic problem in survival data analysis is to estimate from the sampled data one or more of these three functions and to draw inferences about the survival pattern in the population. In Section 2.1 we define the three functions and in Section 2.2, discuss the equivalence relationship among the three functions.
2.1 DEFINITIONS
Let $T$ denote the survival time. The distribution of $T$ can be characterized by three equivalent functions.
Survivorship Function (or Survival Function)
This function, denoted by $S(t)$, is defined as the probability that an individual survives longer than $t$:
$$S(t) = P(\text{an individual survives longer than } t) = P(T > t) \tag{2.1.1}$$
From the definition of the cumulative distribution function $F(t)$ of $T$,
$$S(t) = 1 - P(\text{an individual fails before } t) = 1 - F(t) \tag{2.1.2}$$
Here $S(t)$ is a nonincreasing function of time $t$ with the properties
$$S(t) = \begin{cases} 1 & \text{for } t = 0 \\ 0 & \text{for } t = \infty \end{cases}$$
That is, the probability of surviving at least at the time zero is 1 and that of surviving an infinite time is zero.
The function $S(t)$ is also known as the cumulative survival rate. To depict the course of survival, Berkson (1942) recommended a graphic presentation of $S(t)$. The graph of $S(t)$ is called the survival curve. A steep survival curve, such as the one shown in Figure 2.1a, represents low survival rate or short survival time. A gradual or flat survival curve such as in Figure 2.1b represents high survival rate or longer survival.
The survivorship function or the survival curve is used to find the 50th percentile (the median) and other percentiles (e.g., 25th and 75th) of survival time and to compare survival distributions of two or more groups. The median survival times in Figure 2.1a and b are approximately 5 and 36 units of time, respectively. The mean is generally used to describe the central tendency of a distribution, but in survival distributions the median is often better because a small number of individuals with exceptionally long or short lifetimes will cause the mean survival time to be disproportionately large or small.
Figure 2.1 Two examples of survival curves.
In practice, if there are no censored observations, the survivorship function is estimated as the proportion of patients surviving longer than $t$:
$$\hat{S}(t) = \frac{\text{number of patients surviving longer than } t}{\text{total number of patients}} \tag{2.1.3}$$
where the circumflex denotes an estimate of the function. When censored observations are present, the numerator of (2.1.3) cannot always be determined. For example, consider the following set of survival data: 4, 6, 6+, 10+, 15, 20. Using (2.1.3), we can compute $\hat{S}(5) = 5/6 = 0.833$. However, we cannot obtain $\hat{S}(11)$ since the exact number of patients surviving longer than 11 is unknown. Either the third or the fourth patient (6+ and 10+) could survive longer than or less than 11. Thus, when censored observations are present, (2.1.3) is no longer appropriate for estimating $S(t)$. Nonparametric methods of estimating $S(t)$ for censored data are discussed in Chapter 4.
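The difficulty with (2.1.3) under censoring can be made concrete with a short Python sketch. Rather than returning a single proportion, the hypothetical function `survival_bounds` below returns the smallest and largest values the numerator of (2.1.3) could imply; the two coincide exactly when the estimate is determined. Treating a censored time equal to t as undetermined is an assumption of the sketch.

```python
from fractions import Fraction

def survival_bounds(data, t):
    """Bounds on the proportion of patients surviving longer than t.

    data: list of (time, exact) pairs; exact=False marks a censored time.
    Returns (lower, upper); the two agree when the proportion is determined.
    """
    n = len(data)
    # Anyone observed past t, exactly or censored, certainly survived longer than t.
    known_alive = sum(1 for time, _ in data if time > t)
    # A patient censored at or before t may or may not have survived longer than t.
    unknown = sum(1 for time, exact in data if not exact and time <= t)
    return Fraction(known_alive, n), Fraction(known_alive + unknown, n)

# The data set 4, 6, 6+, 10+, 15, 20 from the text.
data = [(4, True), (6, True), (6, False), (10, False), (15, True), (20, True)]

print(survival_bounds(data, 5))    # both bounds equal 5/6
print(survival_bounds(data, 11))   # bounds differ: 1/3 and 2/3
```

At t = 5 every censored time exceeds t, so the estimate is 5/6; at t = 11 the two censored patients leave the proportion anywhere between 1/3 and 2/3.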
Probability Density Function (or Density Function)
Like any other continuous random variable, the survival time $T$ has a probability density function defined as the limit of the probability that an individual fails in the short interval $t$ to $t + \Delta t$ per unit width $\Delta t$, or simply the probability of failure in a small interval per unit time. It can be expressed as
$$f(t) = \lim_{\Delta t \to 0} \frac{P[\text{an individual dying in the interval } (t, t + \Delta t)]}{\Delta t} \tag{2.1.4}$$
The graph of $f(t)$ is called the density curve. Figure 2.2a and b give two examples of the density curve. The density function has the following two properties:
1. $f(t)$ is a nonnegative function: $f(t) \ge 0$ for all $t \ge 0$, and $f(t) = 0$ for $t < 0$.
2. The area between the density curve and the $t$ axis is equal to 1.
Figure 2.2 Two examples of density curves.
In practice, if there are no censored observations, the probability density function $f(t)$ is estimated as the proportion of patients dying in an interval per unit width:
$$\hat{f}(t) = \frac{\text{number of patients dying in the interval beginning at time } t}{(\text{total number of patients})(\text{interval width})} \tag{2.1.5}$$
Similar to the estimation of $S(t)$, when censored observations are present, (2.1.5) is not applicable. We discuss an appropriate method in Chapter 4.
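A minimal Python sketch of (2.1.5) for grouped, uncensored data follows. The counts and interval widths are made up for illustration and are not taken from the book; the function name is likewise hypothetical. Exact fractions are used so that the check on the total area is not affected by rounding.

```python
from fractions import Fraction

def density_estimate(deaths, widths):
    """Estimate f(t) in each interval as deaths / (total patients * interval width)."""
    total = sum(deaths)  # with no censoring, every patient dies in some interval
    return [Fraction(d, total * w) for d, w in zip(deaths, widths)]

# Hypothetical grouped data: 10 patients, intervals of width 1, 1, and 2 time units.
deaths = [5, 3, 2]
widths = [1, 1, 2]

f_hat = density_estimate(deaths, widths)
area = sum(f * w for f, w in zip(f_hat, widths))

print(f_hat)   # estimated density in each interval
print(area)    # total area under the estimated density
```

The area under the estimated density equals 1, mirroring the second property of the density function.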
The proportion of individuals that fail in any time interval and the peaks of high frequency of failure can be found from the density function. The density curve in Figure 2.2a gives a pattern of high failure rate at the beginning of the study and decreasing failure rate as time increases. In Figure 2.2b, the peak of high failure frequency occurs at approximately 1.7 units of time. The proportion of individuals that fail between 1 and 2 units of time is equal to the shaded area between the density curve and the axis. The density function is also known as the unconditional failure rate.
Hazard Function
The hazard function $h(t)$ of survival time $T$ gives the conditional failure rate. This is defined as the probability of failure during a very small time interval, assuming that the individual has survived to the beginning of the interval, or as the limit of the probability that an individual fails in a very short interval, $t$ to $t + \Delta t$, given that the individual has survived to time $t$:
$$h(t) = \lim_{\Delta t \to 0} \frac{P[\text{an individual fails in the time interval } (t, t + \Delta t) \text{ given the individual has survived to } t]}{\Delta t} \tag{2.1.6}$$
The hazard function can also be defined in terms of the cumulative distribution function $F(t)$ and the probability density function $f(t)$:
$$h(t) = \frac{f(t)}{1 - F(t)} \tag{2.1.7}$$
The hazard function is also known as the instantaneous failure rate, force of mortality, conditional mortality rate, and age-specific failure rate. If $t$ in (2.1.6) is age, it is a measure of the proneness to failure as a function of the age of the individual in the sense that the quantity $\Delta t\, h(t)$ is the expected proportion of age $t$ individuals who will fail in the short time interval $(t, t + \Delta t)$. The hazard function thus gives the risk of failure per unit time during the aging process. It plays an important role in survival data analysis.
In practice, when there are no censored observations, the hazard function is estimated as the proportion of patients dying in an interval per unit time, given that they have survived to the beginning of the interval:

$$\hat{h}(t) = \frac{\text{number of patients dying in the interval beginning at time } t}{(\text{number of patients surviving at } t) \times (\text{interval width})} = \frac{\text{number of patients dying per unit time in the interval}}{\text{number of patients surviving at } t} \tag{2.1.8}$$
Actuaries usually use the average hazard rate of the interval, in which the number of patients dying per unit time in the interval is divided by the average number of survivors at the midpoint of the interval:

$$\hat{h}(t) = \frac{\text{number of patients dying per unit time in the interval}}{(\text{number of patients surviving at } t) - (\text{number of deaths in the interval})/2} \tag{2.1.9}$$
The actuarial estimate in (2.1.9) gives a higher hazard rate than (2.1.8) and thus a more conservative estimate.
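A minimal sketch comparing the two estimates may help. The interval counts used here (100 survivors at the start, 10 deaths, unit width) are hypothetical, as are the function names.

```python
def hazard_simple(deaths, at_risk, width):
    """Estimate (2.1.8): deaths per unit time over survivors at the start of the interval."""
    return (deaths / width) / at_risk

def hazard_actuarial(deaths, at_risk, width):
    """Estimate (2.1.9): deaths per unit time over survivors at the start minus half the deaths."""
    return (deaths / width) / (at_risk - deaths / 2)

# Hypothetical interval: 100 patients alive at t, 10 deaths, width of 1 time unit.
at_risk, deaths, width = 100, 10, 1.0

simple = hazard_simple(deaths, at_risk, width)
actuarial = hazard_actuarial(deaths, at_risk, width)
print(f"(2.1.8) estimate: {simple:.4f}")
print(f"(2.1.9) estimate: {actuarial:.4f}")
```

Because the actuarial denominator subtracts half the deaths, it is never larger than the denominator in (2.1.8), so the actuarial estimate is at least as high whenever any deaths occur.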
The hazard function may increase, decrease, remain constant, or indicate a more complicated process. Figure 2.3 is a plot of several kinds of hazard function. For example, patients with acute leukemia who do not respond to treatment have an increasing hazard rate, h₁(t). A decreasing hazard function, h₂(t), describes, for example, the risk of soldiers wounded by bullets who undergo surgery: the main danger is the operation itself, and this danger decreases if the surgery is successful. An example of a constant hazard function, h₃(t), is the risk of healthy persons between 18 and 40 years of age whose main risks of death are accidents. The bathtub curve, h₄(t), describes the process of human life. During an initial period, the risk is high (high infant mortality). Subsequently, h(t) stays approximately constant until a certain time, after which it increases because of wear-out failures. Finally, patients with tuberculosis have risks that increase initially, then decrease after treatment. Such an increasing, then decreasing hazard function is described by h₅(t).

Figure 2.3 Examples of the hazard function.
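The cumulative hazard function is defined as

$$H(t) = \int_0^t h(x)\,dx$$

and it is related to the survivorship function by

$$H(t) = -\log S(t)$$

Thus, at t = 0, S(t) = 1, H(t) = 0, and at t = ∞, S(t) = 0, H(t) = ∞. The cumulative hazard function can be any value between zero and infinity. All log functions in this book are natural logs (base e) unless otherwise indicated.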
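The limiting behavior of H(t) and S(t) can be illustrated with a short numerical sketch. It takes H(t) as the integral of h from 0 to t and S(t) = exp[-H(t)]; the linearly increasing hazard h(t) = 0.5t is a hypothetical choice made only for illustration, and the function names are not from the text.

```python
import math

def hazard(t):
    """Hypothetical linearly increasing hazard, h(t) = 0.5 * t (illustrative only)."""
    return 0.5 * t

def cumulative_hazard(t, steps=1000):
    """Trapezoidal approximation of H(t), the integral of h(x) from 0 to t."""
    if t == 0:
        return 0.0
    dx = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * dx)
    return total * dx

def survival(t):
    """S(t) = exp(-H(t)), the inverse of H(t) = -log S(t)."""
    return math.exp(-cumulative_hazard(t))

# H grows without bound while S falls from 1 toward 0.
for t in (0.0, 1.0, 2.0, 10.0):
    print(f"t={t:5.1f}  H(t)={cumulative_hazard(t):8.4f}  S(t)={survival(t):.3e}")
```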
The following example illustrates how these functions can be estimated from a complete sample of grouped survival times without censored observations.
Example 2.1 The first three columns of Table 2.1 give the survival data of 40 patients with myeloma. The survival times are grouped into intervals of five months. The estimated survivorship function, density function, and hazard function are also given, with the corresponding graphs plotted in Figure 2.4a-c.
Table 2.1 Survival Data and Estimated Survival Functions of 40 Myeloma Patients
Figure 2.4 Estimated survival functions of myeloma patients.
The estimated survivorship function, S(t), is calculated following (2.1.3) at the beginning or the end of each interval. For example, at the beginning of the first interval, all 40 patients are alive, S(0) = 1, and at the beginning of the second interval, 35 of the 40 patients are still alive, S(5) = 35/40 = 0.875. Similarly, S(10) = 28/40 = 0.700. The estimated density function f(t) is computed following (2.1.5). For example, the density function of the first interval (0-5) is 5/(40 × 5) = 0.025, and that of the second interval (5-10) is 7/(40 × 5) = 0.035. The estimated density function is plotted at the midpoint of each interval (Figure 2.4b). The estimated hazard function, h(t), is computed following the actuarial method given in (2.1.9). For example, the hazard function of the first interval is 5/[5(40 − 5/2)] = 0.027 and that of the second interval is 7/[5(35 − 7/2)] = 0.044. The estimated hazard function is also plotted at the midpoint of each interval (Figure 2.4c).
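The arithmetic above can be reproduced with a short sketch. It uses only the counts quoted in the text for the first two intervals (40 patients, 5 and 7 deaths, interval width of 5 months); the remaining rows of Table 2.1 are not reproduced here, and the function name `grouped_estimates` is hypothetical.

```python
def grouped_estimates(n_total, width, deaths_per_interval):
    """Return (start, S, f, h) per interval for grouped data without censoring.

    S is the proportion alive at the start of the interval,
    f follows (2.1.5), and h follows the actuarial formula (2.1.9).
    """
    rows = []
    at_risk = n_total
    for i, deaths in enumerate(deaths_per_interval):
        start = i * width
        s_hat = at_risk / n_total
        f_hat = deaths / (n_total * width)
        h_hat = deaths / (width * (at_risk - deaths / 2))
        rows.append((start, s_hat, f_hat, h_hat))
        at_risk -= deaths  # survivors entering the next interval
    return rows

# Counts quoted in Example 2.1 for the intervals 0-5 and 5-10 months.
for start, s_hat, f_hat, h_hat in grouped_estimates(40, 5, [5, 7]):
    print(f"t={start:2d}  S={s_hat:.3f}  f={f_hat:.3f}  h={h_hat:.3f}")
```

Supplying the full list of deaths per interval from Table 2.1 would extend the same calculation to every row.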
From Table 2.1 or Figure 2.4a, the median survival time of myeloma patients is approximately 17.5 months, and the peak of high frequency of death occurs in 5 to 10 months. In addition, the hazard function shows an increasing trend, reaches its peak at approximately 32.5 months, and then fluctuates.
2.2 RELATIONSHIPS OF THE SURVIVAL FUNCTIONS

The three functions defined in Section 2.1 are mathematically equivalent. Given any one of them, the other two can be derived. Readers not interested in the mathematical relationship among the three survival functions can skip this section without loss of continuity.
1. From (2.1.2) and (2.1.7),

$$h(t) = \frac{f(t)}{S(t)} \tag{2.2.1}$$

This relationship can also be derived from (2.1.6) using basic definitions of conditional probabilities.
2. Since the probability density function is the derivative of the cumulative distribution function,

$$f(t) = \frac{d}{dt}\,[1 - S(t)] = -S'(t) \tag{2.2.2}$$
3. Substituting (2.2.2) into (2.2.1) yields

$$h(t) = -\frac{S'(t)}{S(t)} = -\frac{d}{dt} \log S(t) \tag{2.2.3}$$
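4. Integrating (2.2.3) from zero to t and using S(0) = 1, we have

$$-\int_0^t h(x)\,dx = \log S(t)$$

or

$$H(t) = -\log S(t)$$

or

$$S(t) = \exp[-H(t)] = \exp\left[-\int_0^t h(x)\,dx\right] \tag{2.2.4}$$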
5. From (2.2.1) and (2.2.4) we obtain

$$f(t) = h(t)\exp[-H(t)] \tag{2.2.5}$$
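As a numerical check on relationships (2.2.1) to (2.2.5), the sketch below starts from a single survivorship function and derives the other quantities from it. The choice S(t) = exp(-t²) is hypothetical and made only for illustration, as are the function names, the finite-difference step, and the evaluation point t = 1.5.

```python
import math

def S(t):
    """Hypothetical survivorship function S(t) = exp(-t**2) (illustrative choice)."""
    return math.exp(-t * t)

def derivative(func, t, eps=1e-6):
    """Central-difference approximation to func'(t)."""
    return (func(t + eps) - func(t - eps)) / (2 * eps)

def f(t):
    """Density from (2.2.2): f(t) = -S'(t)."""
    return -derivative(S, t)

def h(t):
    """Hazard from (2.2.1): h(t) = f(t) / S(t)."""
    return f(t) / S(t)

def H(t, steps=2000):
    """Cumulative hazard: midpoint-rule integral of h from 0 to t."""
    dx = t / steps
    return sum(h((i + 0.5) * dx) for i in range(steps)) * dx

t = 1.5
# (2.2.3): the hazard equals minus the derivative of log S(t).
print(h(t), -derivative(lambda x: math.log(S(x)), t))
# (2.2.4): S(t) is recovered from the cumulative hazard.
print(S(t), math.exp(-H(t)))
# (2.2.5): f(t) is recovered from the hazard and cumulative hazard.
print(f(t), h(t) * math.exp(-H(t)))
```

Each printed pair should agree up to numerical error, which is what the equivalence of the three functions implies.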
Hence, if f(t) is known, the survivorship function can be obtained from the basic relationship between f(t), F(t), and (2.1.2). The hazard function can then be determined from (2.2.1). If S(t) is known, f(t) and h(t) can be determined from (2.2.2) and (2.2.1), respectively, or h(t) can be derived first from (2.2.3) and then f(t) from (2.2.1). If h(t) is given, S(t) and f(t) can be obtained, respectively, from (2.2.4) and (2.2.5). Thus, given any one of the three survival functions, the other two can easily be derived. The following example illustrates these equivalence relationships.