Statistical Methods for Survival Data Analysis

Third Edition

ELISA T. LEE
JOHN WENYU WANG

Department of Biostatistics and Epidemiology and Center for American Indian Health Research, College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma

Copyright © 2003 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: permreq@wiley.com.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.

Library of Congress Cataloging-in-Publication Data:

Lee, Elisa T.

Statistical methods for survival data analysis.--3rd ed./Elisa T. Lee and John Wenyu Wang. p. cm.--(Wiley series in probability and statistics) Includes bibliographical references and index. ISBN 0-471-36997-7 (cloth: alk. paper)

1. Medicine--Research--Statistical methods. 2. Failure time data analysis. 3. Prognosis--Statistical methods. I. Wang, John Wenyu. II. Title. III. Series.

R853.S7 L43 2003

610'.72--dc21    2002027025

Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1

To the memory of our parents

Mr. Chi-Lan Tan and Mrs. Hwei-Chi Lee Tan (E.T.L.)

Mr. Beijun Zhang and Mrs. Xiangyi Wang (J.W.W.)

Preface

1 Introduction
1.1 Preliminaries
1.2 Censored Data
1.3 Scope of the Book
Bibliographical Remarks

2 Functions of Survival Time
2.1 Definitions
2.2 Relationships of the Survival Functions
Bibliographical Remarks
Exercises

3 Examples of Survival Data Analysis
3.1 Example 3.1: Comparison of Two Treatments and Three Diets
3.2 Example 3.2: Comparison of Two Survival Patterns Using Life Tables
3.3 Example 3.3: Fitting Survival Distributions to Remission Data
3.4 Example 3.4: Relative Mortality and Identification of Prognostic Factors
3.5 Example 3.5: Identification of Risk Factors
Bibliographical Remarks
Exercises

4 Nonparametric Methods of Estimating Survival Functions
4.1 Product-Limit Estimates of Survivorship Function
4.2 Life-Table Analysis
4.3 Relative, Five-Year, and Corrected Survival Rates
4.4 Standardized Rates and Ratios
Bibliographical Remarks
Exercises

5 Nonparametric Methods for Comparing Survival Distributions
5.1 Comparison of Two Survival Distributions
5.2 Mantel-Haenszel Test
5.3 Comparison of K (K > 2) Samples
Bibliographical Remarks
Exercises

6 Some Well-Known Parametric Survival Distributions and Their Applications
6.1 Exponential Distribution
6.2 Weibull Distribution
6.3 Lognormal Distribution
6.4 Gamma and Generalized Gamma Distributions
6.5 Log-Logistic Distribution
6.6 Other Survival Distributions
Bibliographical Remarks
Exercises

7 Estimation Procedures for Parametric Survival Distributions without Covariates
7.1 General Maximum Likelihood Estimation Procedure
7.2 Exponential Distribution
7.3 Weibull Distribution
7.4 Lognormal Distribution
7.5 Standard and Generalized Gamma Distributions
7.6 Log-Logistic Distribution
7.7 Other Parametric Survival Distributions
Bibliographical Remarks
Exercises

8 Graphical Methods for Survival Distribution Fitting
8.1 Introduction
8.2 Probability Plotting
8.3 Hazard Plotting
8.4 Cox-Snell Residual Method
Bibliographical Remarks
Exercises

9 Tests of Goodness of Fit and Distribution Selection
9.1 Goodness-of-Fit Test Statistics Based on Asymptotic Likelihood Inferences
9.2 Tests for Appropriateness of a Family of Distributions
9.3 Selection of a Distribution Using BIC or AIC Procedures
9.4 Tests for a Specific Distribution with Known Parameters
9.5 Hollander and Proschan's Test for Appropriateness of a Given Distribution with Known Parameters
Bibliographical Remarks
Exercises

10 Parametric Methods for Comparing Two Survival Distributions
10.1 Likelihood Ratio Test for Comparing Two Survival Distributions
10.2 Comparison of Two Exponential Distributions
10.3 Comparison of Two Weibull Distributions
10.4 Comparison of Two Gamma Distributions
Bibliographical Remarks
Exercises

11 Parametric Methods for Regression Model Fitting and Identification of Prognostic Factors
11.1 Preliminary Examination of Data
11.2 General Structure of Parametric Regression Models and Their Asymptotic Likelihood Inference
11.3 Exponential Regression Model
11.4 Weibull Regression Model
11.5 Lognormal Regression Model
11.6 Extended Generalized Gamma Regression Model
11.7 Log-Logistic Regression Model
11.8 Other Parametric Regression Models
11.9 Model Selection Methods
Bibliographical Remarks
Exercises

12 Identification of Prognostic Factors Related to Survival Time: Cox Proportional Hazards Model
12.1 Partial Likelihood Function for Survival Times
12.2 Identification of Significant Covariates
12.3 Estimation of the Survivorship Function with Covariates
12.4 Adequacy Assessment of the Proportional Hazards Model
Bibliographical Remarks
Exercises

13 Identification of Prognostic Factors Related to Survival Time: Nonproportional Hazards Models
13.1 Models with Time-Dependent Covariates
13.2 Stratified Proportional Hazards Models
13.3 Competing Risks Model
13.4 Recurrent Events Models
13.5 Models for Related Observations
Bibliographical Remarks
Exercises

14 Identification of Risk Factors Related to Dichotomous and Polychotomous Outcomes
14.1 Univariate Analysis
14.2 Logistic and Conditional Logistic Regression Models for Dichotomous Responses
14.3 Models for Polychotomous Outcomes
Bibliographical Remarks
Exercises

Preface

Statistical methods for survival data analysis have continued to flourish in the last two decades. Applications of the methods have been widened from their historical use in cancer and reliability research to business, criminology, epidemiology, and social and behavioral sciences. The third edition of Statistical Methods for Survival Data Analysis is intended to provide a comprehensive introduction of the most commonly used methods for analyzing survival data. It begins with basic definitions and interpretations of survival functions. From there, the reader is guided through methods, parametric and nonparametric, for estimating and comparing these functions and the search for a theoretical distribution (or model) to fit the data. Parametric and nonparametric approaches to the identification of prognostic factors that are related to survival are then discussed. Finally, regression methods, primarily linear logistic regression models, to identify risk factors for dichotomous and polychotomous outcomes are introduced.

The third edition continues to be application-oriented, with a minimum level of mathematics. In a few chapters, some knowledge of calculus and matrix algebra is needed. The few sections that introduce the general mathematical structure for the methods can be skipped without loss of continuity. A large number of practical examples are given to assist the reader in understanding the methods and applications and in interpreting the results. Readers with only college algebra should find the book readable and understandable.

There are many excellent books on clinical trials. We therefore have deleted the two chapters on the subject that were in the second edition. Instead, we have included discussions of more statistical methods for survival data analysis. A brief summary of the improvements made for the third edition is given below.

1. Two additional distributions, the log-logistic distribution and a generalized gamma distribution, have been added to the application of parametric models that can be used in model fitting and prognostic factor identification (Chapters 6, 7, and 11).

2. In several sections (Sections 7.1, 9.1, 10.1, 11.2, and 12.1), discussions of the asymptotic likelihood inference of the methods covered in the chapters are given. These sections are intended to provide a more general mathematical structure for statisticians.

3. The Cox-Snell residual method has been added to the chapter on graphical methods for survival distribution fitting (Chapter 8). In addition, the sections on probability and hazard plotting have been revised so that no special graphical papers are required to make the plots.

4. More tests of goodness of fit are given, including the BIC and AIC procedures (Chapters 9 and 11).

5. For Cox's proportional hazards model (Chapter 12), we have now included methods to assess its adequacy and procedures to estimate the survivorship function with covariates.

6. The concept of nonproportional hazards models is introduced (Chapter 13), which includes models with time-dependent covariates, stratified models, competing risks models, recurrent event models, and models for related observations.

7. The chapter on linear logistic regression (Chapter 14) has been expanded to cover regression models for polychotomous outcomes. In addition, methods for a general m : n matching design have been added to the section on conditional logistic regression for case-control studies.

8. Computer programming codes for software packages BMDP, SAS, and SPSS are provided for most examples in the text.

We would like to thank the many researchers, teachers, and students who have used the second edition of the book. The suggestions for improvement that many of them have provided are invaluable. Special thanks go to Xing Wang, Linda Hutton, Tracy Mankin, and Imran Ahmed for typing the manuscript. Steve Quigley of John Wiley convinced us to work on a third edition. We thank him for his enthusiasm.

Finally, we are most grateful to our families, Sam, Vivian, Benedict, Jennifer, and Annelisa (E.T.L.), and Alice and Xing (J.W.W.), for the constant joy, love, and support they have given us.

Oklahoma City, OK

April 18, 2001

CHAPTER 1

Introduction

1.1 PRELIMINARIES

This book is for biomedical researchers, epidemiologists, consulting statisticians, students taking a first course on survival data analysis, and others interested in survival time study. It deals with statistical methods for analyzing survival data derived from laboratory studies of animals, clinical and epidemiologic studies of humans, and other appropriate applications.

Survival time can be defined broadly as the time to the occurrence of a given event. This event can be the development of a disease, response to a treatment, relapse, or death. Therefore, survival time can be tumor-free time, the time from the start of treatment to response, length of remission, and time to death. Survival data can include survival time, response to a given treatment, and patient characteristics related to response, survival, and the development of a disease. The study of survival data has focused on predicting the probability of response, survival, or mean lifetime; comparing the survival distributions of experimental animals or of human patients; and identifying risk and/or prognostic factors related to response, survival, and the development of a disease. In this book, special consideration is given to the study of survival data in biomedical sciences, although all the methods are suitable for applications in industrial reliability, social sciences, and business. Examples of survival data in these fields are the lifetime of electronic devices, components, or systems (reliability engineering); felons' time to parole (criminology); duration of first marriage (sociology); length of newspaper or magazine subscription (marketing); and worker's compensation claims (insurance) and their various influencing risk or prognostic factors.

1.2 CENSORED DATA

Many researchers consider survival data analysis to be merely the application of two conventional statistical methods to a special type of problem: parametric if the distribution of survival times is known to be normal and nonparametric if the distribution is unknown. This assumption would be true if the survival times of all the subjects were exact and known; however, some survival times are not. Further, the survival distribution is often skewed, or far from being normal. Thus there is a need for new statistical techniques. One of the most important developments is due to a special feature of survival data in the life sciences that occurs when some subjects in the study have not experienced the event of interest at the end of the study or time of analysis. For example, some patients may still be alive or disease-free at the end of the study period. The exact survival times of these subjects are unknown. These are called censored observations or censored times and can also occur when people are lost to follow-up after a period of study. When there are no censored observations, the set of survival times is complete. There are three types of censoring.

Type I Censoring

Animal studies usually start with a fixed number of animals, to which the treatment or treatments is given. Because of time and/or cost limitations, the researcher often cannot wait for the death of all the animals. One option is to observe for a fixed period of time, say six months, after which the surviving animals are sacrificed. Survival times recorded for the animals that died during the study period are the times from the start of the experiment to their death. These are called exact or uncensored observations. The survival times of the sacrificed animals are not known exactly but are recorded as at least the length of the study period. These are called censored observations. Some animals could be lost or die accidentally. Their survival times, from the start of experiment to loss or death, are also censored observations. In type I censoring, if there are no accidental losses, all censored observations equal the length of the study period.

For example, suppose that six rats have been exposed to carcinogens by injecting tumor cells into their foot pads. The times to develop a tumor of a given size are observed. The investigator decides to terminate the experiment after 30 weeks. Figure 1.1 is a plot of the development times of the tumors. Rats A, B, and D developed tumors after 10, 15, and 25 weeks, respectively. Rats C and E did not develop tumors by the end of the study; their tumor-free times are thus 30-plus weeks. Rat F died accidentally without tumors after 19 weeks of observation. The survival data (tumor-free times) are 10, 15, 30+, 25, 30+, and 19+ weeks. (The plus indicates a censored observation.)
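The plus notation in this example can be turned into (time, event observed) pairs, which is how most survival software expects censored data. The following is only a minimal sketch; the function name `parse_times` and the variable names are illustrative assumptions, not something taken from the book.

```python
def parse_times(raw):
    """Split '30+' style entries into (time, event_observed) pairs."""
    parsed = []
    for token in raw:
        censored = token.endswith("+")  # a trailing plus marks a censored time
        parsed.append((float(token.rstrip("+")), not censored))
    return parsed


# Tumor-free times (weeks) for rats A-F from the type I example.
type1 = parse_times(["10", "15", "30+", "25", "30+", "19+"])
study_length = 30.0

# Censored times shorter than the study length come from accidental losses
# (rat F here); the other censored times equal the study length.
accidental = [t for t, observed in type1 if not observed and t < study_length]
n_events = sum(observed for _, observed in type1)
```

Here `accidental` picks out rat F, the one censored observation that does not equal the 30-week study length.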

Type II Censoring

Another option in animal studies is to wait until a fixed portion of the animals have died, say 80 of 100, after which the surviving animals are sacrificed. In this case, type II censoring, if there are no accidental losses, the censored observations equal the largest uncensored observation. For example, in an experiment of six rats (Figure 1.2), the investigator may decide to terminate the study after four of the six rats have developed tumors. The survival or tumor-free times are then 10, 15, 35+, 25, 35, and 19+ weeks.
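The stopping rule for type II censoring can be sketched in a few lines. The latent tumor times below are hypothetical (in a real study the time for a censored animal is never observed), and the accidentally lost rat F is left out so that the "no accidental losses" case is shown; `type2_censor` is an illustrative name.

```python
def type2_censor(latent_times, r):
    """Stop the study at the r-th smallest failure time.

    Returns (recorded_time, event_observed) pairs; animals that have not
    failed by the cutoff are censored at the cutoff.
    """
    cutoff = sorted(latent_times)[r - 1]
    return [(min(t, cutoff), t <= cutoff) for t in latent_times]


# Hypothetical latent tumor times (weeks) for rats A-E; 42 is invented.
latent = [10, 15, 42, 25, 35]
recorded = type2_censor(latent, r=4)
```

With these assumed values the study stops at 35 weeks, the fourth tumor, and the remaining rat is recorded as 35+, matching the pattern described for rat C.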

Figure 1.1 Example of type I censored data.

Figure 1.2 Example of type II censored data.

Type III Censoring

In most clinical and epidemiologic studies the period of study is fixed and patients enter the study at different times during that period. Some may die before the end of the study; their exact survival times are known. Others may withdraw before the end of the study and are lost to follow-up. Still others may be alive at the end of the study. For "lost" patients, survival times are at least from their entrance to the last contact. For patients still alive, survival times are at least from entry to the end of the study. The latter two kinds of observations are censored observations. Since the entry times are not simultaneous, the censored times are also different. This is type III censoring. For example, suppose that six patients with acute leukemia enter a clinical study during a total study period of one year. Suppose also that all six respond to treatment and achieve remission. The remission times are plotted in Figure 1.3. Patients A, C, and E achieve remission at the beginning of the second, fourth, and ninth months, and relapse after four, six, and three months, respectively. Patient B achieves remission at the beginning of the third month but is lost to follow-up four months later; the remission duration is thus at least four months. Patients D and F achieve remission at the beginning of the fifth and tenth months, respectively, and are still in remission at the end of the study; their remission times are thus at least eight and three months. The respective remission times of the six patients are 4, 4+, 6, 8+, 3, and 3+ months.

Figure 1.3 Example of type III censored data.
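The arithmetic behind the six remission times can be checked with a short sketch. The tuple layout and the names `STUDY_MONTHS` and `remission_time` are assumptions made for illustration; only the entry months, durations, and outcomes come from the example.

```python
STUDY_MONTHS = 12  # total study period of one year

# (patient, month in which remission begins, months to relapse or loss, outcome)
# outcome is "relapse", "lost", or "ongoing" (still in remission at study end)
patients = [
    ("A", 2, 4, "relapse"),
    ("B", 3, 4, "lost"),
    ("C", 4, 6, "relapse"),
    ("D", 5, None, "ongoing"),
    ("E", 9, 3, "relapse"),
    ("F", 10, None, "ongoing"),
]


def remission_time(start_month, duration, outcome):
    """Return (months, censored) for one patient."""
    if outcome == "ongoing":
        # Remission starts at the beginning of start_month, that is, after
        # start_month - 1 full months of the study have elapsed.
        return STUDY_MONTHS - (start_month - 1), True
    return duration, outcome == "lost"


times = [remission_time(s, d, o) for _, s, d, o in patients]
labels = [f"{months}+" if censored else str(months) for months, censored in times]
```

The resulting `labels` list reproduces the remission times stated for patients A through F, with a plus marking the censored ones.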

Type I and type II censored observations are also called singly censored data, and type III, progressively censored data, by Cohen (1965). Another commonly used name for type III censoring is random censoring. All of these types of censoring are right censoring or censoring to the right. There are also left censoring and interval censoring cases. Left censoring occurs when it is known that the event of interest occurred prior to a certain time t, but the exact time of occurrence is unknown. For example, an epidemiologist wishes to know the age at diagnosis in a follow-up study of diabetic retinopathy. At the time of the examination, a 50-year-old participant was found to have already developed retinopathy, but there is no record of the exact time at which initial evidence was found. Thus the age at examination (i.e., 50) is a left-censored observation. It means that the age of diagnosis for this patient is at most 50 years.

Interval censoring occurs when the event of interest is known to have occurred between times a and b. For example, if medical records indicate that at age 45, the patient in the example above did not have retinopathy, his age at diagnosis is between 45 and 50 years.
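One common way to hold right-, left-, and interval-censored observations in a single structure is as a pair of bounds on the event time. The sketch below is an illustrative assumption, not the book's notation: `Observation`, `lower`, and `upper` are hypothetical names, and a lower bound of 0 is used to mean "no information about when the event did not yet occur".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    lower: float  # event is known not to have occurred before this time
    upper: float  # event is known to have occurred by this time (inf if unknown)

    @property
    def kind(self):
        if self.lower == self.upper:
            return "exact"
        if math.isinf(self.upper):
            return "right-censored"
        if self.lower == 0:
            return "left-censored"
        return "interval-censored"


left = Observation(0, 50)           # retinopathy already present at age 50
interval = Observation(45, 50)      # absent at age 45, present at age 50
right = Observation(30, math.inf)   # tumor-free at 30 weeks (30+)
exact = Observation(25, 25)         # tumor observed at 25 weeks
```

Under this convention the retinopathy patient moves from left-censored to interval-censored once the age-45 record is taken into account.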

We will study descriptive and analytic methods for complete, singly censored, and progressively censored survival data using numerical and graphical techniques. Analytic methods discussed include parametric and nonparametric. Parametric approaches are used either when a suitable model or distribution is fitted to the data or when a distribution can be assumed for the population from which the sample is drawn. Commonly used survival distributions are the exponential, Weibull, lognormal, and gamma. If a survival distribution is found to fit the data properly, the survival pattern can then be described by the parameters in a compact way. Statistical inference can be based on the distribution chosen. If the search for an appropriate model or distribution is too time consuming or not economical or no theoretical distribution adequately fits the data, nonparametric methods, which are generally easy to apply, should be considered.

1.3 SCOPE OF THE BOOK

This book is divided into four parts.

Part I (Chapters 1, 2, and 3) defines survival functions and gives examples of survival data analysis. Survival distribution is most commonly described by three functions: the survivorship function (also called the cumulative survival rate or survival function), the probability density function, and the hazard function (hazard rate or age-specific rate). In Chapter 2 we define these three functions and their equivalence relationships. Chapter 3 illustrates survival data analysis with five examples taken from actual research situations. Clinical and laboratory data are systematically analyzed in progressive steps and the results are interpreted. Section and chapter numbers are given for quick reference. The actual calculations are given as examples or left as exercises in the chapters where the methods are discussed. Four sets of data are provided in the exercise section for the reader to analyze. These data are referred to in the various chapters.

In Part II (Chapters 4 and 5) we introduce some of the most widely used nonparametric methods for estimating and comparing survival distributions. Chapter 4 deals with the nonparametric methods for estimating the three survival functions: the Kaplan and Meier product-limit (PL) estimate and the life-table technique (population life tables and clinical life tables). Also covered is standardization of rates by direct and indirect methods, including the standardized mortality ratio. Chapter 5 is devoted to nonparametric techniques for comparing survival distributions. A common practice is to compare the survival experiences of two or more groups differing in their treatment or in a given characteristic. Several nonparametric tests are described.

Part III (Chapters 6 to 10) introduces the parametric approach to survival data analysis. Although nonparametric methods play an important role in survival studies, parametric techniques cannot be ignored. In Chapter 6 we introduce and discuss the exponential, Weibull, lognormal, gamma, and log-logistic survival distributions. Practical applications of these distributions taken from the literature are included.

An important part of survival data analysis is model or distribution fitting. Once an appropriate statistical model for survival time has been constructed and its parameters estimated, its information can help predict survival, develop optimal treatment regimens, plan future clinical or laboratory studies, and so on. The graphical technique is a simple informal way to select a statistical model and estimate its parameters. When a statistical distribution is found to fit the data well, the parameters can be estimated by analytical methods. In Chapter 7 we discuss analytical estimation procedures for survival distributions. Most of the estimation procedures are based on the maximum likelihood method. Mathematical derivations are omitted; only formulas for the estimates and examples are given. In Chapter 8 we introduce three kinds of graphical methods: probability plotting, hazard plotting, and the Cox-Snell residual method for survival distribution fitting. In Chapter 9 we discuss several tests of goodness of fit and distribution selection. In Chapter 10 we describe several parametric methods for comparing survival distributions.

A topic that has received increasing attention is the identification of prognostic factors related to survival time. For example, who is likely to survive longest after mastectomy, and what are the most important factors that influence that survival? Another subject important to both biomedical researchers and epidemiologists is identification of the risk factors related to the development of a given disease and the response to a given treatment. What are the factors most closely related to the development of a given disease? Who is more likely to develop lung cancer, diabetes, or coronary disease? In many diseases, such as cancer, patients who respond to treatment have a better prognosis than patients who do not. The question, then, relates to what the factors are that influence response. Who is more likely to respond to treatment and thus perhaps survive longer?

Part IV (Chapters 11 to 14) deals with prognostic/risk factors and survival times. In Chapter 11 we introduce parametric methods for identifying important prognostic factors. Chapters 12 and 13 cover, respectively, the Cox proportional hazards model and several nonproportional hazards models for the identification of prognostic factors. In the final chapter, Chapter 14, we introduce the linear logistic regression model for binary outcome variables and its extension to handle polychotomous outcomes.

In Appendix A we describe a numerical procedure for solving nonlinear equations, the Newton-Raphson method. This method is suggested in Chapters 7, 11, 12, and 13. Appendix B comprises a number of statistical tables.

Most nonparametric techniques discussed here are easy to understand and simple to apply. Parametric methods require an understanding of survival distributions. Unfortunately, most survival distributions are not simple. Readers without calculus may find it difficult to apply them on their own. However, if the main purpose is not model fitting, most parametric techniques can be replaced by their nonparametric competitors. In fact, a large percentage of survival studies in clinical or epidemiological journals are analyzed by nonparametric methods. Researchers not interested in survival model fitting should read the chapters and sections on nonparametric methods. Computer programs for survival data analysis are available in several commercially available software packages: for example, BMDP, SAS, and SPSS. These computer programs are referred to in various chapters when applicable. Computer programming codes are given for many of the examples.

Bibliographical Remarks

Gross and Clark (1975) was the first book to discuss parametric models and nonparametric and graphical techniques for both complete and censored survival data. Since then, several other books have been published in addition to the first edition of this book (Lee, 1980, 1992). Elandt-Johnson and Johnson (1980) discuss extensively the construction of life tables, model fitting, competing risk, and mathematical models of biological processes of disease progression and aging. Kalbfleisch and Prentice (1980) focus on regression problems with survival data, particularly Cox's proportional hazards model. Miller (1981) covers a number of parametric and nonparametric methods for survival analysis. Cox and Oakes (1984) also cover the topic concisely with an emphasis on the examination of explanatory variables.

Nelson (1982) provides a good discussion of parametric, nonparametric, and graphical methods. The book is more suited for industrial reliability engineers than for biomedical researchers, as are Hahn and Shapiro (1967) and Mann et al. (1974). In addition, Lawless (1982) gives a broad coverage of the area with applications in engineering and biomedical sciences.

More recent publications include Marubini and Valsecchi (1994), Kleinbaum (1995), Klein and Moeschberger (1997), and Hosmer and Lemeshow (1999). Most of these books take a more rigorous mathematical approach and require knowledge of mathematical statistics.

CHAPTER 2

Functions of Survival Time

Survival time data measure the time to a certain event, such as failure, death, response, relapse, the development of a given disease, parole, or divorce. These times are subject to random variations, and like any random variables, form a distribution. The distribution of survival times is usually described or characterized by three functions: (1) the survivorship function, (2) the probability density function, and (3) the hazard function. These three functions are mathematically equivalent: if one of them is given, the other two can be derived.

In practice, the three functions can be used to illustrate different aspects of the data. A basic problem in survival data analysis is to estimate from the sampled data one or more of these three functions and to draw inferences about the survival pattern in the population. In Section 2.1 we define the three functions and in Section 2.2, discuss the equivalence relationship among the three functions.
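As a compact preview of what "mathematically equivalent" means here, the standard identities linking the three functions are sketched below. They assume that T is a continuous, nonnegative random variable with a differentiable survivorship function; the book's own treatment in Section 2.2 may be organized differently.

```latex
\begin{aligned}
f(t) &= \frac{d}{dt}F(t) = -\frac{d}{dt}S(t), \\
h(t) &= \frac{f(t)}{S(t)} = -\frac{d}{dt}\log S(t), \\
S(t) &= \exp\!\left[-\int_0^t h(u)\,du\right].
\end{aligned}
```

Each line lets one function be recovered from another, which is why estimating any one of the three is enough in principle.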

2.1 DEFINITIONS

Let T denote the survival time. The distribution of T can be characterized by three equivalent functions.

Survivorship Function (or Survival Function)

This function, denoted by S(t), is defined as the probability that an individual survives longer than t:

$$S(t) = P(\text{an individual survives longer than } t) = P(T > t) \tag{2.1.1}$$

From the definition of the cumulative distribution function F(t) of T,

$$S(t) = 1 - P(\text{an individual fails before } t) = 1 - F(t) \tag{2.1.2}$$

Here S(t) is a nonincreasing function of time t with the properties

$$S(t) = \begin{cases} 1 & \text{for } t = 0 \\ 0 & \text{for } t = \infty \end{cases}$$

That is, the probability of surviving at least at the time zero is 1 and that of surviving an infinite time is zero.

The function S(t) is also known as the cumulative survival rate. To depict the course of survival, Berkson (1942) recommended a graphic presentation of S(t). The graph of S(t) is called the survival curve. A steep survival curve, such as the one shown in Figure 2.1a, represents low survival rate or short survival time. A gradual or flat survival curve such as in Figure 2.1b represents high survival rate or longer survival.

The survivorship function or the survival curve is used to find the 50th percentile (the median) and other percentiles (e.g., 25th and 75th) of survival time and to compare survival distributions of two or more groups. The median survival times in Figure 2.1a and b are approximately 5 and 36 units of time, respectively. The mean is generally used to describe the central tendency of a distribution, but in survival distributions the median is often better because a small number of individuals with exceptionally long or short lifetimes will cause the mean survival time to be disproportionately large or small.
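The sensitivity of the mean to a few extreme lifetimes is easy to see numerically. The survival times below are hypothetical and uncensored, chosen only to illustrate the point; they do not come from the book.

```python
import statistics

# Hypothetical complete (uncensored) survival times in months.
times = [3, 4, 5, 6, 7]
with_outlier = times + [120]  # add one exceptionally long survivor

mean_before = statistics.mean(times)
median_before = statistics.median(times)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)
```

One long survivor moves the mean from 5 months to roughly 24 months, while the median only shifts from 5 to 5.5.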

In practice, if there are no censored observations, the survivorship function is estimated as the proportion of patients surviving longer than t:

$$\hat{S}(t) = \frac{\text{number of patients surviving longer than } t}{\text{total number of patients}} \tag{2.1.3}$$

where the circumflex denotes an estimate of the function. When censored observations are present, the numerator of (2.1.3) cannot always be determined. For example, consider the following set of survival data: 4, 6, 6+, 10+, 15, 20.

Figure 2.1 Two examples of survival curves.

Using (2.1.3), we can compute Ŝ(5) = 5/6 = 0.833. However, we cannot obtain Ŝ(11) since the exact number of patients surviving longer than 11 is unknown. Either the third or the fourth patient (6+ and 10+) could survive longer than or less than 11. Thus, when censored observations are present, (2.1.3) is no longer appropriate for estimating S(t). Nonparametric methods of estimating S(t) for censored data are discussed in Chapter 4.
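The difficulty with censored data can be made concrete by counting who surely, and who only possibly, survives longer than t. The sketch below is not one of the book's estimators (those come in Chapter 4); it simply brackets the numerator of the proportion, and `survival_bounds` is an illustrative name.

```python
from fractions import Fraction


def survival_bounds(data, t):
    """Lower and upper bounds on the proportion surviving longer than t.

    data: list of (time, censored) pairs.
    """
    n = len(data)
    # Anyone with a recorded time beyond t survived longer than t.
    surely = sum(1 for time, _ in data if time > t)
    # A censored time at or before t may or may not exceed t.
    maybe = sum(1 for time, censored in data if censored and time <= t)
    return Fraction(surely, n), Fraction(surely + maybe, n)


# Survival data 4, 6, 6+, 10+, 15, 20 from the example.
data = [(4, False), (6, False), (6, True), (10, True), (15, False), (20, False)]
s5 = survival_bounds(data, 5)
s11 = survival_bounds(data, 11)
```

At t = 5 both bounds equal 5/6, so the estimate is determined; at t = 11 the bounds are 1/3 and 2/3, which is exactly the ambiguity created by the observations 6+ and 10+.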

Probability Density Function (or Density Function)

Like any other continuous random variable, the survival time T has a probability density function defined as the limit of the probability that an individual fails in the short interval t to t + Δt per unit width Δt, or simply the probability of failure in a small interval per unit time. It can be expressed as

$$f(t) = \lim_{\Delta t \to 0} \frac{P[\text{an individual dying in the interval } (t, t + \Delta t)]}{\Delta t} \tag{2.1.4}$$

The graph of f(t) is called the density curve. Figure 2.2a and b give two examples of the density curve. The density function has the following two properties:

1. f(t) is a nonnegative function: f(t) ≥ 0 for all t ≥ 0, and f(t) = 0 for t < 0.

2. The area between the density curve and the t axis is equal to 1.

In practice, if there are no censored observations, the probability density function f(t) is estimated as the proportion of patients dying in an interval per unit width:

$$\hat{f}(t) = \frac{\text{number of patients dying in the interval beginning at time } t}{(\text{total number of patients})(\text{interval width})} \tag{2.1.5}$$

Figure 2.2 Two examples of density curves.

Similar to the estimation of S(t), when censored observations are present, (2.1.5) is not applicable. We discuss an appropriate method in Chapter 4.
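For complete data, the estimate in (2.1.5) is a count divided by the product of the sample size and the interval width. The following sketch uses hypothetical death times and an assumed convention that each interval is closed on the left and open on the right; the function name is illustrative.

```python
def density_estimate(death_times, start, width):
    """Estimate f(t) on [start, start + width) for uncensored data, per (2.1.5)."""
    deaths = sum(1 for d in death_times if start <= d < start + width)
    return deaths / (len(death_times) * width)


# Hypothetical complete death times (months) for 10 patients.
deaths = [1, 2, 2, 3, 5, 6, 8, 9, 12, 15]
f_0 = density_estimate(deaths, start=0, width=5)    # 4 deaths in [0, 5)
f_5 = density_estimate(deaths, start=5, width=5)    # 4 deaths in [5, 10)
f_10 = density_estimate(deaths, start=10, width=5)  # 1 death in [10, 15)
```

Multiplying each estimate by the interval width gives back the proportion of patients dying in that interval, which is the area interpretation of the density curve.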

The proportion of individuals that fail in any time interval and the peaks of high frequency of failure can be found from the density function. The density curve in Figure 2.2a gives a pattern of high failure rate at the beginning of the study and decreasing failure rate as time increases. In Figure 2.2b, the peak of high failure frequency occurs at approximately 1.7 units of time. The proportion of individuals that fail between 1 and 2 units of time is equal to the shaded area between the density curve and the axis. The density function is also known as the unconditional failure rate.

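Hazard Function

The hazard function h(t) of survival time T gives the conditional failure rate. This is defined as the probability of failure during a very small time interval, assuming that the individual has survived to the beginning of the interval, or as the limit of the probability that an individual fails in a very short interval, t to t + Δt, per unit time, given that the individual has survived to time t:

$$h(t) = \lim_{\Delta t \to 0} \frac{P[\text{an individual fails in the time interval } (t, t + \Delta t) \text{ given that the individual has survived to } t]}{\Delta t} \tag{2.1.6}$$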

The hazard function can also be defined in terms of the cumulative distribution function F(t) and the probability density function f(t):

$$h(t) = \frac{f(t)}{1 - F(t)} \tag{2.1.7}$$

The hazard function is also known as the instantaneous failure rate, force of mortality, conditional mortality rate, and age-specific failure rate. If t in (2.1.6) is age, it is a measure of the proneness to failure as a function of the age of the individual in the sense that the quantity Δt h(t) is the expected proportion of age t individuals who will fail in the short time interval t to t + Δt. The hazard function thus gives the risk of failure per unit time during the aging process. It plays an important role in survival data analysis.
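The relation between the hazard, the density, and the cumulative distribution function can be checked numerically with the standard identity h(t) = f(t) / [1 - F(t)]. The sketch below uses an exponential distribution with a hypothetical rate of 0.5, for which the hazard is known to be constant and equal to the rate; the function names are illustrative.

```python
import math


def hazard(pdf, cdf, t):
    """Hazard from density and distribution function: f(t) / (1 - F(t))."""
    return pdf(t) / (1.0 - cdf(t))


RATE = 0.5  # hypothetical rate parameter of an exponential distribution


def exp_pdf(t):
    return RATE * math.exp(-RATE * t)


def exp_cdf(t):
    return 1.0 - math.exp(-RATE * t)


hazards = [hazard(exp_pdf, exp_cdf, t) for t in (0.0, 1.0, 5.0)]
```

Every entry of `hazards` equals the rate up to floating-point error, so under this model the risk of failure per unit time does not change with age.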

In practice, when there are no censored observations the hazard function is estimated as the proportion of patients dying in an interval per unit time, given that they have survived to the beginning of the interval:

$$\hat{h}(t) = \frac{\text{number of patients dying in the interval beginning at time } t}{(\text{number of patients surviving at } t)(\text{interval width})} = \frac{\text{number of patients dying per unit time in the interval}}{\text{number of patients surviving at } t} \tag{2.1.8}$$

Actuaries usually use the average hazard rate of the interval in which the number of patients dying per unit time in the interval is divided by the average number of survivors at the midpoint of the interval:

$$\hat{h}(t) = \frac{\text{number of patients dying per unit time in the interval}}{(\text{number of patients surviving at } t) - (\text{number of deaths in the interval})/2} \tag{2.1.9}$$

The actuarial estimate in (2.1.9) gives a higher hazard rate than (2.1.8) and thus a more conservative estimate.
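The two hazard estimates can be compared on a single interval. The sketch below assumes that the denominator of (2.1.9) is the number surviving at t minus half the deaths in the interval (the average number of survivors at the midpoint); the counts are hypothetical and the function names are illustrative.

```python
def hazard_simple(deaths, at_risk, width):
    """Estimate (2.1.8): deaths per unit time divided by survivors at t."""
    return (deaths / width) / at_risk


def hazard_actuarial(deaths, at_risk, width):
    """Estimate (2.1.9): denominator reduced by half the deaths in the interval."""
    return (deaths / width) / (at_risk - deaths / 2)


# Hypothetical interval: 100 patients alive at t, 20 deaths over 1 year.
h_simple = hazard_simple(20, 100, 1.0)
h_actuarial = hazard_actuarial(20, 100, 1.0)
```

With these counts the simple estimate is 0.20 per year and the actuarial estimate is about 0.22 per year; the actuarial value is higher whenever at least one death occurs, because its denominator is smaller.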

The hazard function may increase, decrease, remain constant, or indicate a more complicated process. Figure 2.3 is a plot of several kinds of hazard function. For example, patients with acute leukemia who do not respond to treatment have an increasing hazard rate, h_1(t). h_2(t) is a decreasing hazard function that, for example, indicates the risk of soldiers wounded by bullets who undergo surgery. The main danger is the operation itself and this danger decreases if the surgery is successful. An example of a constant hazard function, h_3(t), is the risk of healthy persons between 18 and 40 years of age whose main risks of death are accidents. The bathtub curve, h_4(t), describes the process of

Figure 2.3 Examples of the hazard function.
