Exact Statistical Inference for Categorical Data
Guogen Shan
University of Nevada, Las Vegas, NV, USA
2.2 Type I error rate for the five exact approaches when K = 3 and a sample size of 30 per group at the significance level of α = 0.05, the left plot with the score value d = (0, 1, 2), and the right plot with the score value d
2.3 Power comparison among the five exact approaches when K = 4, a sample size of 30 per group, and the score value
2.4 Power comparisons among the C approach, the M approach, and the C+M approach, using the χ² test with total sample sizes of 25, 50, 100, and 300 from the first row to the fourth row
Table 1.1 A 2 × 2 Contingency Table
rate between the two groups. The following Phase III study is used to illustrate the setting of a 2 × 2 comparative study.
Example 1.1. A randomized, placebo-controlled two-arm Phase III clinical trial was conducted to evaluate oral lubiprostone for constipation associated with non-methadone opioids in patients with chronic noncancer-related pain [3]. Patients were randomized into either the treatment group treated with lubiprostone or the placebo group, and they were followed for 12 weeks in the study. The data are displayed in Table 1.2. At the end of the study, x1 = 58 responders out of n1 = 214 were recorded from the treatment group, while x2 = 41 out of n2 = 217 patients were observed from the placebo group.
The response rates, the primary endpoint, are estimated to be 27.1% and 18.9% for the treatment group and the placebo group, respectively. To compare the response difference between two groups, Pearson's χ² test statistic
Table 1.2 A Randomized Placebo-Control Two-Arm Study for Patients with Chronic Noncancer-Related Pain
Source: From Jamal et al. [3], with permission.
Ha : p1 ≠ p2.
This test can be found in the function prop.test from the statistical software R, and the freq procedure from SAS to compare two independent proportions. It should be noted that the χ² test statistic is equivalent to the Z test statistic with a pooled variance estimate, which is given as
). It is obvious that the Z test statistic can be applied to a one-sided problem, but the χ² test statistic Tχ² is only used for a two-sided problem.
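The equivalence between the two statistics can be checked numerically. The sketch below is illustrative only: it assumes the standard pooled form Z = (p̂1 − p̂2) / sqrt(p̂(1 − p̂)(1/n1 + 1/n2)) with p̂ = (x1 + x2)/(n1 + n2), and Pearson's χ² statistic without a continuity correction. The function names are hypothetical and the counts are those of Example 1.1.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Z statistic for two independent proportions with a pooled variance estimate."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

def pearson_chi2(x1, n1, x2, n2):
    """Pearson's chi-square statistic for the 2 x 2 table, no continuity correction."""
    total = n1 + n2
    successes = x1 + x2
    failures = total - successes
    observed = [x1, n1 - x1, x2, n2 - x2]
    expected = [n1 * successes / total, n1 * failures / total,
                n2 * successes / total, n2 * failures / total]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Data from Example 1.1: 58/214 responders (treatment) and 41/217 (placebo).
z = pooled_z(58, 214, 41, 217)
chi2 = pearson_chi2(58, 214, 41, 217)

# Two-sided asymptotic p-value from the standard normal limiting distribution.
two_sided_p = math.erfc(abs(z) / math.sqrt(2))

print(f"Z = {z:.4f}, Z^2 = {z * z:.4f}, chi-square = {chi2:.4f}")
print(f"two-sided asymptotic p-value = {two_sided_p:.4f}")
```

The squared Z statistic and the χ² statistic agree up to floating-point error, which is why the χ² form carries no information about the direction of the difference.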
The asymptotic limiting distributions of the χ² test and the Z test are often used for statistical inference, and they are appropriate for use in practice only when cell frequencies are large enough. The χ² test is not recommended for use when the lowest expected frequency among the four cells is less than 5 [4, 5]. However, Cochran [6] argued that the cut point value 5 is chosen arbitrarily, and this cut point may be modified when new evidence from data becomes available. For data with small cell frequencies, exact approaches (e.g., Fisher's exact conditional approach) are generally recommended [2, 7, 8]. Several exact approaches [2, 8–15] will be discussed later in Section 1.1.
Double Dichotomy Study
In a double dichotomy study, only the total sum is fixed, which is common in a cross-sectional study for testing an association between two dichotomous variables. A sample of size N is drawn from a population, and each member of the sample is classified according to the two dichotomous variables, A and B. For such studies, the row and column totals are not fixed in advance; only the total sum is fixed. One typical example is a cross-sectional study to test the association between smoking and cancer.
Example 1.2. Krishnatreya et al. [16] reported a retrospective study of upper aerodigestive tract (UADT) cancer patients from 2010 to 2011 from the hospital cancer registry. For the N = 56 patients documented with the occurrence or presence of synchronous primaries, each patient was asked about his/her smoking status, and was tested whether or not UADT appears at both index and synchronous. Data from this study are presented in Table 1.3. One of the main research questions from this study was to test the association between the smoking history and the presence of UADT synchronous cancers.
Table 1.4 Airway
Source: From Bentur et al. [18], with permission.
In addition to a matched-pairs study where each subject is measured twice, it could be a study in which each subject is matched with an equivalent from another study. In practice, data from another experiment often already exist or are easy to obtain. Such designs can be used to reduce the influence of possible confounding factors. Traditionally, the χ² test and the likelihood ratio test are used for testing the association between two dichotomous variables.
Let pij = nij/N be the probability for the i-th level of the factor A and the j-th level of the factor B. Suppose p1 = p11 + p21 and p2 = p11 + p12 are the marginal probabilities. The difference between these two proportions is often the parameter of interest, p1 − p2, or equivalently p21 − p12. To make a statistical inference for this parameter, the most commonly used test statistic is the McNemar test [19]:
$$T_{MC} = \frac{(n_{21} - n_{12})^2}{n_{21} + n_{12}}.$$
It should be noted that only the off-diagonal numbers, n12 and n21, from a 2 × 2 table are used in the test statistic, and the diagonal values, n11 and n22, have no influence on computing the test statistic and the p-value calculation. There has been a long-term debate over whether all values should be used in the test statistic.
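The role of the off-diagonal cells can be illustrated with a few lines of code. This is a minimal sketch, not an implementation from any package: the function name and the counts are hypothetical, and the table is assumed to be laid out as [[n11, n12], [n21, n22]].

```python
def mcnemar_statistic(table):
    """McNemar statistic (n21 - n12)^2 / (n21 + n12) for a 2 x 2 matched-pairs table.

    `table` is assumed to be [[n11, n12], [n21, n22]].
    """
    n12 = table[0][1]
    n21 = table[1][0]
    discordant = n12 + n21
    if discordant == 0:
        raise ValueError("the statistic is undefined when n12 + n21 = 0")
    return (n21 - n12) ** 2 / discordant

# Hypothetical counts: the two tables differ only in the diagonal cells.
table_a = [[10, 5], [15, 20]]
table_b = [[40, 5], [15, 2]]

print(mcnemar_statistic(table_a))
print(mcnemar_statistic(table_b))
```

Both calls return the same value because n11 and n22 never enter the computation.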
1.1 EXACT TESTING PROCEDURES
When sample size in a study is increased from small to large, asymptotic approaches are traditionally used for data analysis. However, the significance value they provide is only an approximation, because the sampling distribution of the test statistic is only approximately equal to the theoretical limiting distribution, for example, a χ² distribution or a standard normal distribution. The approximation is inadequate in cases where the total sample size is small, or the expected values for cells in the table are low.
In discrete data analysis, unsatisfactory type I error control from traditionally used asymptotic approaches has been observed in many statistical problems. In a comparative binomial study, Pearson's χ² test is often associated with an inflated type I error rate, while the χ² test based on Yates' correction [20] is always conservative, with the actual type I error rate being less than the nominal level, and often less than half of the nominal level [7, 11, 21–23]. An uncontrolled type I error rate in a study could lead to either an under- or overestimated sample size calculation. Several modified χ² test statistics were proposed to improve the performance of Pearson's χ² test, for example, the uncorrected χ² test [24] and the re-corrected χ² test [25]. An uncontrolled type I error occurs not only in a 2 × 2 table, but also in other types of data. For example, in a dose-response study that tests for a trend in a 2 × K table, the traditionally used test statistic, the Cochran-Armitage test [4, 26], always has an inflated type I error rate as the total sample size goes to infinity [27].
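The actual type I error rate of an asymptotic test can be computed exactly by enumerating the sample space. The sketch below does this for Pearson's χ² test in a comparative binomial study. It is illustrative only: the function names, the sample sizes, and the grid of p values are hypothetical choices, the critical value is the approximate upper 5% point of a χ² distribution with one degree of freedom, and tables with no successes or no failures are assigned a statistic of 0 by convention.

```python
import math

CRITICAL_VALUE = 3.841459  # approximate upper 5% point of chi-square with 1 df

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def chi2_stat(x1, n1, x2, n2):
    """Pearson's chi-square statistic (squared pooled Z) for two proportions."""
    successes = x1 + x2
    total = n1 + n2
    if successes == 0 or successes == total:
        return 0.0  # assumed convention for degenerate tables
    p_pool = successes / total
    diff = x1 / n1 - x2 / n2
    return diff ** 2 / (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

def actual_size(n1, n2, p, critical=CRITICAL_VALUE):
    """Exact rejection probability under H0: p1 = p2 = p."""
    size = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            if chi2_stat(x1, n1, x2, n2) >= critical:
                size += binom_pmf(x1, n1, p) * binom_pmf(x2, n2, p)
    return size

# Hypothetical design: 10 subjects per group, common response rate on a coarse grid.
for p in (0.1, 0.3, 0.5):
    print(f"p = {p}: actual type I error rate = {actual_size(10, 10, p):.4f}")
```

Whether and where the rate exceeds the nominal level depends on the sample sizes and on the value of the nuisance parameter, so a single value of p is not enough to judge a test.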
In light of the problems of type I error control, procedures based on exact probability calculations may be considered in order to preserve the nominal level of a test. Two basic exact approaches, the conditional approach and the unconditional approach, will be introduced first, followed by another three exact unconditional approaches. To avoid too many mathematical notations and symbols, we use a comparative binomial study to explain the computation of these five exact approaches.
1.1.1 Conditional Approach
For the cases where asymptotic approaches are not adequate (e.g., the total sample size is small, or the expected sample size for some cells is too small), exact approaches should be considered to make proper statistical inference. Fisher [2] was among the first to propose an exact approach by fixing both marginal totals to control for any nuisance parameter in the tail probability. This is an exact conditional approach and is referred to as the C approach. This approach was originally developed for analyzing a 2 × 2 table, but it can be applied to a general R × C contingency table. Mehta and Patel [28] developed a network-based algorithm to implement Fisher's exact approach for different types of categorical data. However, the main application of Fisher's approach lies in a simple 2 × 2 table.
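For a 2 × 2 table, fixing both margins leaves a hypergeometric distribution with no unknown parameter, so the C p-value can be computed directly. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the two-sided p-value is taken as the sum of the probabilities of all tables that are no more likely than the observed one, which is one common convention rather than the only one.

```python
import math

def fisher_exact_two_sided(x1, n1, x2, n2):
    """Two-sided conditional p-value for x1/n1 versus x2/n2 with both margins fixed."""
    m = x1 + x2                       # fixed total number of responders
    lo, hi = max(0, m - n2), min(n1, m)

    def weight(k):
        # Unnormalized hypergeometric probability, kept as an exact integer.
        return math.comb(n1, k) * math.comb(n2, m - k)

    observed = weight(x1)
    tail = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return tail / math.comb(n1 + n2, m)

# Data from Example 1.1.
print(fisher_exact_two_sided(58, 214, 41, 217))

# Small check: 3/4 versus 1/4 gives 34/70.
print(fisher_exact_two_sided(3, 4, 1, 4))
```

Working with integer weights avoids floating-point ties when deciding which tables belong to the tail.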
1.1.2 Unconditional Approach Based on Maximization
The exact conditional approach is an alternative to traditional asymptotic approaches when asymptotic approaches do not control for the type I error rate. However, the conditional approach is often criticized for being conservative from an unconditional framework, which is often based on results from studies with small sample sizes. The conservativeness of the exact conditional approach has been discussed in many research articles. Andrés and Tejedor [32] compared the conditional approach and the unconditional approach for binomial comparative studies under one-sided and two-sided alternatives with various sample size ratios between two groups. They found that the loss of power from the C approach, as compared with the exact unconditional approach, is often slight. Later, Crans and Shuster [33] continued the debate on which exact approach is the most powerful and the most appropriate for use in binomial comparative studies. The results indicated that the C approach is indeed conservative, as the actual significance level is less than 0.035 for a sample size up to 50 at the nominal level of 0.05.
To address the conservativeness of the exact conditional approach due to the small size of the sample space, Barnard [34] was the first to propose an unconditional approach where only the column totals (n1 and n2 as in Table 1.1) are fixed, for testing the hypotheses H0 : p1 = p2 against Ha : p1 ≠ p2. This study belongs to the comparative study mentioned before. The null hypothesis states that both groups have the same response rate, for example, p1 = p2 = p, which is a nuisance parameter in the table probability, specifically,
$$P(x \mid p) = b(n_{11}, n_1, p)\, b(n_{12}, n_2, p),$$
where $b(x, y, z) = \binom{y}{x} z^x (1 - z)^{y - x}$ is the probability mass function of a binomial distribution.
Before computing the exact unconditional p-value, the tail area needs to be determined, and a test statistic is often used in this procedure for ordering the sample space. For a given test statistic T, such as the χ² test, the likelihood ratio test, or the score test, the tail area is calculated as {x : T(x) ≥ T(x∗)}, where x = (n11, n12, n21, n22) is a data point, and x∗ is the observed data. In a binomial comparative study, x is equivalent to (n11, n12) as the column totals n1 and n2 are given. The unconditional p-value is then computed as
Figure 1.1 Tail probability plots for a binomial comparative study based on three test statistics (panels: proportion difference, Z-unpooled, Z-pooled).
). The monotonicity property is satisfied for all three test statistics [36]. Therefore, the unconditional p-value based on the M approach is obtained from the boundary of the null space, which is the common probability of the two groups, p.
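The computation behind the M approach can be sketched in three steps: order the sample space with a test statistic, collect the tail area, and maximize the tail probability over the nuisance parameter p. The code below is an illustration under stated assumptions, not the algorithm of any particular package: it uses the absolute pooled Z statistic for a two-sided problem, searches an equally spaced grid for p (so it only approximates the maximum), and uses hypothetical function names and hypothetical counts.

```python
import math

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def abs_z_pooled(x1, n1, x2, n2):
    """Absolute pooled Z statistic; degenerate tables get 0 by assumed convention."""
    successes, total = x1 + x2, n1 + n2
    if successes == 0 or successes == total:
        return 0.0
    p_pool = successes / total
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return abs(x1 / n1 - x2 / n2) / se

def tail_area(x1_obs, n1, x2_obs, n2):
    """All tables whose statistic is at least as large as the observed one."""
    t_obs = abs_z_pooled(x1_obs, n1, x2_obs, n2)
    return [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)
            if abs_z_pooled(a, n1, b, n2) >= t_obs - 1e-12]

def tail_probability(tail, n1, n2, p):
    """Probability of the tail area when both groups share response rate p."""
    pmf1 = [binom_pmf(a, n1, p) for a in range(n1 + 1)]
    pmf2 = [binom_pmf(b, n2, p) for b in range(n2 + 1)]
    return sum(pmf1[a] * pmf2[b] for a, b in tail)

def m_p_value(x1_obs, n1, x2_obs, n2, grid_size=1000):
    """Grid approximation of the maximum tail probability over p in (0, 1)."""
    tail = tail_area(x1_obs, n1, x2_obs, n2)
    best_prob, best_p = 0.0, None
    for i in range(1, grid_size):
        p = i / grid_size
        prob = tail_probability(tail, n1, n2, p)
        if prob > best_prob:
            best_prob, best_p = prob, p
    return best_prob, best_p

# Hypothetical data: 7 of 10 responders versus 2 of 10.
p_value, p_at_max = m_p_value(7, 10, 2, 10)
print(f"M p-value (grid) = {p_value:.4f}, attained near p = {p_at_max}")
```

Because the tail probability curve can have local spikes, a grid that is too coarse may miss the maximum; a finer grid or a local refinement around the best grid point is a natural extension.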
In the M approach, the first step is to determine the tail area for the given data based on a test statistic. Different test statistics may lead to a different tail area. The tail probability curve is drawn as a function of the nuisance parameter, p. Figure 1.1 presents the three tail probability plots as a function of p based on the three test statistics. It can be seen that the plot based on TPD is very different from those based on the two Z test statistics. The final p-value is computed as the maximum of each curve: 0.044, 0.022, and 0.022 based on the test statistics TPD, TZuP, and TZP, respectively. These maximum values are found when p = 0.588, 0.803, and 0.602, respectively, and they are marked in the figure with big dots. The R package Exact can be used to compute these M p-values. It is noticeable that the tail probability curve is not smooth, with multiple local spikes as seen in the plots. The traditional
(x|
), where {x : PC(x) ≤ PC(x∗)} is the tail area, and PC is the C p-value.
When λ = α, P(PC ≤ λ) ≤ α is always true as from the C approach. It follows that λ ≥ α. Therefore, the C+M approach is at least as powerful as the C approach.
The C+M approach has not been widely used in practice, possibly because of the confusion that comes from using the C p-value as the test statistic. The C p-value is often used as the p-value for statistical inference. The C+M approach was shown to be uniformly more powerful than the C approach in a binomial comparative study [40], and it was recommended for use. For the binomial comparative Example 1.1, the p-value based on the C+M approach is 0.023, which leads to the same conclusion as others. The C+M p-value may be computed from the R package Exact.
1.1.5 Unconditional Approach Based on Estimation and Maximization
The exact unconditional M approach could be computationally intensive when multiple nuisance parameters are present. For this reason, Liddell [48] was the first to propose an approach by computing the exact distribution of the proportion difference of two independent binomial distributions at a single point, the maximum likelihood estimate (MLE) for the common proportion under the null hypothesis. In Liddell's approach, one only needs to find the exact distribution at one point instead of the whole parameter space as in the M approach. It is computationally easy to obtain the p-value. Later, Storer and Kim [49] extended Liddell's approach to other commonly used test statistics. This approach is often called the approximate unconditional approach. Since the nuisance parameter in the table probability is replaced by an estimate of the parameter, this approach is referred to as the E approach.
If the null hypothesis is rejected for a large test statistic, then the tail area based on the test statistic, T, is computed as {x : T(x) ≥ T(x∗)}. A test statistic often has a closed-form formula, so the tail area is computationally easy to determine. For each data point in the tail area, its probability is a function of the nuisance parameter. In a binomial comparative study, the E p-value is computed as
where p̂ = (n11 + n12)/(n1 + n2) is the MLE of the common proportion under the null hypothesis. The tail area does not depend on the MLE value. The E approach is a general approach, and it has been applied to other studies [50, 51].
Computing the E p-value only requires evaluating the tail probability at a single point. For this reason, this approach was attractive in the days when computational resources were a problem for most practitioners. The E approach guarantees the test size at a single estimated value, but not for all the possible values of the nuisance parameter. For this reason, the E approach is not exact.
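The E approach can be sketched by evaluating the tail probability once, at the MLE of the common proportion. The example below is a minimal illustration under assumptions that are not taken from the text: it uses the absolute pooled Z statistic to order the sample space, and the function names and counts are hypothetical.

```python
import math

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def abs_z_pooled(x1, n1, x2, n2):
    """Absolute pooled Z statistic; degenerate tables get 0 by assumed convention."""
    successes, total = x1 + x2, n1 + n2
    if successes == 0 or successes == total:
        return 0.0
    p_pool = successes / total
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return abs(x1 / n1 - x2 / n2) / se

def e_p_value(x1_obs, n1, x2_obs, n2):
    """Tail probability evaluated at the MLE of the common proportion."""
    p_hat = (x1_obs + x2_obs) / (n1 + n2)
    t_obs = abs_z_pooled(x1_obs, n1, x2_obs, n2)
    pmf1 = [binom_pmf(a, n1, p_hat) for a in range(n1 + 1)]
    pmf2 = [binom_pmf(b, n2, p_hat) for b in range(n2 + 1)]
    return sum(pmf1[a] * pmf2[b]
               for a in range(n1 + 1) for b in range(n2 + 1)
               if abs_z_pooled(a, n1, b, n2) >= t_obs - 1e-12)

# Hypothetical data: 7 of 10 responders versus 2 of 10.
print(f"E p-value = {e_p_value(7, 10, 2, 10):.4f}")
```

Only one pass over the sample space is needed, which is the source of the computational advantage over a search across all values of p.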
Lloyd [13, 31, 50, 52–55] proposed a new approach for the p-value calculation based on estimation and maximization. The estimation step is used to obtain a flatter p-value plot, and the maximization step is used to guarantee the nominal level. The p-value plot based on a test statistic in the M approach is generally erratic, and it is computationally difficult to search for the global maximum, especially for the case with multiple spikes. It was shown in Lloyd [52] that the p-value plot by using the E p-value as the test statistic tends to have a much flatter plot. This important step may allow one to avoid the situation where the maximum of the tail probability is obtained from unlikely values of the nuisance parameter, such as the values outside of a confidence interval. Although the E p-value is only approximately valid, the following maximization step makes the approach exact with the type I error rate guaranteed. The approach is referred to as the E+M approach. The E+M approach has been applied to many important statistical problems [15, 45, 50, 56–61].
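The two steps can be sketched for a small binomial comparative study: the estimation step assigns an E p-value to every table in the sample space, and the maximization step uses those E p-values as the ordering statistic. This is an illustrative sketch only. It assumes the absolute pooled Z statistic inside the estimation step and a plain grid search in the maximization step; the function names, the grid size, and the counts are hypothetical.

```python
import math

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def abs_z_pooled(x1, n1, x2, n2):
    """Absolute pooled Z statistic; degenerate tables get 0 by assumed convention."""
    successes, total = x1 + x2, n1 + n2
    if successes == 0 or successes == total:
        return 0.0
    p_pool = successes / total
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return abs(x1 / n1 - x2 / n2) / se

def e_p_values(n1, n2):
    """Estimation step: E p-value of every table (x1, x2) in the sample space."""
    tables = [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)]
    stat = {t: abs_z_pooled(t[0], n1, t[1], n2) for t in tables}
    result = {}
    for obs in tables:
        p_hat = (obs[0] + obs[1]) / (n1 + n2)
        pmf1 = [binom_pmf(a, n1, p_hat) for a in range(n1 + 1)]
        pmf2 = [binom_pmf(b, n2, p_hat) for b in range(n2 + 1)]
        result[obs] = sum(pmf1[a] * pmf2[b] for a, b in tables
                          if stat[(a, b)] >= stat[obs] - 1e-12)
    return result

def em_p_value(x1_obs, n1, x2_obs, n2, grid_size=200):
    """Maximization step: maximize the probability of the E-ordered tail over p."""
    e_vals = e_p_values(n1, n2)
    e_obs = e_vals[(x1_obs, x2_obs)]
    # Small E p-values indicate stronger evidence, so the tail uses <=.
    tail = [t for t, value in e_vals.items() if value <= e_obs + 1e-12]
    best = 0.0
    for i in range(1, grid_size):
        p = i / grid_size
        pmf1 = [binom_pmf(a, n1, p) for a in range(n1 + 1)]
        pmf2 = [binom_pmf(b, n2, p) for b in range(n2 + 1)]
        best = max(best, sum(pmf1[a] * pmf2[b] for a, b in tail))
    return best

# Hypothetical data: 7 of 10 responders versus 2 of 10.
print(f"E+M p-value (grid) = {em_p_value(7, 10, 2, 10):.4f}")
```

The estimation step dominates the cost because it visits the whole sample space once per table, which is why it is the natural candidate for parallel computation in larger studies.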
The estimation step could be computationally difficult with a large sample space, since the E p-value for each data point needs to be computed. Parallel computing is a useful tool to reduce the computational time significantly by computing the E p-values at the same time for all tables. Several packages have been developed in R to conduct parallel computing, for example, multicore and parallel. For a study with a small to moderate sample size, a personal computer may be sufficient to serve this purpose.
The E p-value is used as a test statistic in the E+M approach to find the tail area including the tables whose E p-values are less than or equal to that
Figure 1.3, Continued. One-sided problem: type I error rate (0.00 to 0.06) plotted against p (0.0 to 1.0).