USING STATISTICS IN THE SOCIAL AND HEALTH SCIENCES WITH SPSS® AND EXCEL®
Between and Within Research Designs
Using Different T Tests
Independent T Test: The Procedure
Creating the Sampling Distribution of Differences
The Nature of the Sampling Distribution of Differences
Calculating the Estimated Standard Error of Difference with Equal Sample Size
Using Unequal Sample Sizes
The Independent T Ratio
Independent T Test Example
Hypothesis Test Elements for the Example
Before–After Convention with the Independent T Test
Confidence Intervals for the Independent T Test
Effect Size
The Assumptions for the Independent T Test
SPSS® Explore for Checking the Normal Distribution Assumption
Excel Procedures for Checking the Equal Variance Assumption
SPSS® Procedure for Checking the Equal Variance Assumption
Using SPSS® and Excel with the Independent T Test
SPSS® Procedures for the Independent T Test
Excel Procedures for the Independent T Test
Effect Size for the Independent T Test Example
Parting Comments
Nonparametric Statistics: The Mann–Whitney U Test
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
Graphics in the Data Summary
9 ANALYSIS OF VARIANCE
A Hypothetical Example of ANOVA
The Nature of ANOVA
The Components of Variance
The Process of ANOVA
Calculating ANOVA
Effect Size
Post Hoc Analyses
Assumptions of ANOVA
Additional Considerations with ANOVA
The Hypothesis Test: Interpreting ANOVA Results
Are the Assumptions Met?
Using SPSS® and Excel with One-Way ANOVA
The Need for Diagnostics
Non-Parametric ANOVA Tests: The Kruskal–Wallis Test
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
10 FACTORIAL ANOVA
Extensions of ANOVA
ANCOVA
MANOVA
MANCOVA
Factorial ANOVA
Interaction Effects
Simple Effects
2XANOVA: An Example
Calculating Factorial ANOVA
The Hypotheses Test: Interpreting Factorial ANOVA Results
Effect Size for 2XANOVA: Partial η²
Discussing the Results
Using SPSS® to Analyze 2XANOVA
Summary Chart for 2XANOVA Procedures
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
11 CORRELATION
The Nature of Correlation
The Correlation Design
Pearson’s Correlation Coefficient
Plotting the Correlation: The Scattergram
Using SPSS® to Create Scattergrams
Using Excel to Create Scattergrams
Calculating Pearson’s r
The Z Score Method
The Computation Method
The Hypothesis Test for Pearson’s r
Effect Size: the Coefficient of Determination
Diagnostics: Correlation Problems
Correlation Using SPSS® and Excel
Nonparametric Statistics: Spearman’s Rank Order Correlation (r_s)
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
12 BIVARIATE REGRESSION
The Nature of Regression
The Regression Line
Calculating Regression
Effect Size of Regression
The Z Score Formula for Regression
Testing the Regression Hypotheses
The Standard Error of Estimate
Confidence Interval
Explaining Variance Through Regression
A Numerical Example of Partitioning the Variation
Using Excel and SPSS® with Bivariate Regression
The SPSS® Regression Output
The Excel Regression Output
Complete Example of Bivariate Linear Regression
Assumptions of Bivariate Regression
The Omnibus Test Results
Effect Size
The Model Summary
The Regression Equation and Individual Predictor Test of Significance
Advanced Regression Procedures
Detecting Problems in Bivariate Linear Regression
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
13 INTRODUCTION TO MULTIPLE LINEAR REGRESSION
The Elements of Multiple Linear Regression
Same Process as Bivariate Regression
Some Differences between Bivariate Linear Regression and Multiple Linear Regression
Stuff not Covered
Assumptions of Multiple Linear Regression
Analyzing Residuals to Check MLR Assumptions
Diagnostics for MLR: Cleaning and Checking Data
Extreme Scores
Distance Statistics
Influence Statistics
MLR Extended Example Data
Assumptions Met?
Analyzing Residuals: Are Assumptions Met?
Interpreting the SPSS® Findings for MLR
Entering Predictors Together as a Block
Entering Predictors Separately
Additional Entry Methods for MLR Analyses
Example Study Conclusion
Terms and Concepts
Data Lab and Example (with Solution)
Data Lab: Solution
14 CHI-SQUARE AND CONTINGENCY TABLE ANALYSIS
Contingency Tables
The Chi-square Procedure and Research Design
Chi-square Design One: Goodness of Fit
A Hypothetical Example: Goodness of Fit
Effect Size: Goodness of Fit
Chi-square Design Two: The Test of Independence
A Hypothetical Example: Test of Independence
Special 2 × 2 Chi-square
Effect Size in 2 × 2 Tables: PHI
Cramer’s V: Effect Size for the Chi-square Test of Independence
Repeated Measures Chi-square: McNemar Test
Using SPSS® and Excel with Chi-square
Using SPSS® for the Chi-square Test of Independence
Using Excel for Chi-square Analyses
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
15 REPEATED MEASURES PROCEDURES: T_dep AND ANOVA_WS
Independent and Dependent Samples in Research Designs
Using Different T Tests
The Dependent T Test Calculation: The “Long” Formula
Example: The Long Formula
The Dependent T Test Calculation: The “Difference” Formula
T_dep and Power
Conducting The T_dep Analysis Using SPSS®
Conducting The T_dep Analysis Using Excel
Within-Subject ANOVA (ANOVA_WS)
Experimental Designs
Post Facto Designs
Within-Subject Example
Using SPSS® for Within-Subject Data
The SPSS® Procedure
The SPSS® Output
Nonparametric Statistics
Terms and Concepts
APPENDICES
Appendix A SPSS® BASICS
Using SPSS®
General Features
Management Functions
Additional Management Functions
Appendix B EXCEL BASICS
Data Management
The Excel Menus
Using Statistical Functions
Data Analysis Procedures
Missing Values and “0” Values in Excel Analyses
Using Excel with “Real Data”
Appendix C STATISTICAL TABLES
Table C.1: Z-Score Table (Values Shown are Percentages – %)
Table C.2: Exclusion Values for the T-Distribution
Table C.3: Critical (Exclusion) Values for the Distribution of F
Table C.4: Tukey’s Range Test (Upper 5% Points)
Table C.5: Critical (Exclusion) Values for Pearson’s Correlation Coefficient, r
Table C.6: Critical Values of the χ² (Chi-Square) Distribution
REFERENCES
Index
PREFACE
The study of statistics is gaining recognition in a great many fields. In particular, researchers in the social and health sciences note its importance for problem solving and its practical importance in their areas. Statistics has always been important, for example, among those hoping to enter careers in medicine but more so now due to the increasing emphasis on “Scientific Inquiry & Reasoning Skills” as preparation for the Medical College Admission Test (MCAT). Sociology, always relying on statistics and research for its core emphases, is now included in the MCAT as well.
This book focuses squarely on the procedures important to an essential understanding of statistics and how it is used in the real world for problem solving. Moreover, my discussion in the book repeatedly ties statistical methodology with research design (see the “companion” volume my colleague and I wrote to emphasize research and design skills in social science; Abbott and McKinney, 2013).
I emphasize applied statistical analyses and as such will use examples throughout the book drawn from my own research as well as from national databases like the GSS and the Behavioral Risk Factor Surveillance System (BRFSS). Using data from these sources allows students the opportunity to see how statistical procedures apply to research in their fields as well as to examine “real data.” A central feature of the book is my discussion and use of SPSS® and Microsoft Excel® to analyze data for problem solving.
Throughout my teaching and research career, I have developed an approach to helping students understand difficult statistical concepts in a new way. I find that the great majority of students are visual learners, so I developed diagrams and figures over the years that help create a conceptual picture of the statistical procedures that are often problematic to students (like sampling distributions!).
Another reason for writing this book was to give students a way to understand statistical computing without having to rely on comprehensive and expensive statistical software programs. Since most students have access to Microsoft Excel, I developed a step-by-step approach to using the powerful statistical procedures in Excel to analyze data and conduct research in each of the statistical topics I cover in the book.¹
I also wanted to make those comprehensive statistical programs more approachable to statistics students, so I have also included a “hands-on” guide to SPSS in parallel with the Excel examples. In some cases, SPSS has the only means to perform some statistical procedures, but in most cases, both Excel and SPSS can be used.
Here are some of the features of the book:
1. Emphasis on the interpretation of findings.
2. Use of clear examples from my existing and former research projects and large databases to illustrate statistical procedures. “Real-world” data can be cumbersome, so I introduce straightforward procedures and examples in order to help students focus more on interpretation of findings.
3. Inclusion of a data lab section in each chapter that provides relevant, clear examples.
4. Introduction to advanced statistical procedures in chapter sections (e.g., regression diagnostics) and separate chapters (e.g., multiple linear regression) for greater relevance to real-world research needs.
5. Strengthening of the connection between statistical application and research designs.
6. Inclusion of detailed sections in each chapter explaining applications from Excel and SPSS.
I use SPSS² (versions 22 and 23) screenshots of menus and tables by permission from the IBM® Company. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml. Microsoft Excel references and screenshots in this book are used with permission from Microsoft. I use Microsoft Excel® 2013 in this book.³
I use GSS (2014) data and codebook for examples in this book.⁴ The BRFSS Survey Questionnaire and Data are used with permission from the CDC.⁵
¹ One limitation to teaching statistics procedures with Excel is that the data analysis features are different depending on whether the user is a “Mac” user or a “PC” user. I am using the PC version, which features a “Data Analysis” suite of statistical tools. This feature may no longer be included in the Mac version of Excel.
² SPSS screen reprints throughout the book are used courtesy of International Business Machines Corporation, © International Business Machines Corporation. SPSS was acquired by IBM in October 2009.
³ Excel references and screenshots in this book are used with permission from Microsoft®.
⁴ Smith, Tom W., Peter Marsden, Michael Hout, and Jibum Kim. General Social Surveys, 1972–2012 [machine-readable data file] / Principal Investigator, Tom W. Smith; Coprincipal Investigator, Peter V. Marsden; Coprincipal Investigator, Michael Hout; Sponsored by National Science Foundation. NORC ed. Chicago: National Opinion Research Center [producer]; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut [distributor], 2013. 1 data file (57,061 logical records) + 1 codebook (3432 pp.). (National Data Program for the Social Sciences, No. 21).
⁵ Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Questionnaire. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013 and Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Data. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013.
ACKNOWLEDGMENTS
I wish to thank my daughter Kristin Hovaguimian for her outstanding work on the Index to this book (and all the others!) – not an easy task with a book of this nature.
I thank my wife Kathleen Abbott for her dedication and amazing contributions to the editing process.
I thank my son Matthew Abbott for the inspiration he has always provided in matters statistical and philosophical.
Thank you Jon Gurstelle and the team at Wiley for your continuing support of this project.
1 INTRODUCTION
The world suddenly has become awash in data! A great many popular books have been written recently that extol “big data” and the information derived for decision makers. These data are considered “big” because a certain “catalog” of data may be so large that traditional ways of managing and analyzing such information cannot easily accommodate it. The data originate from you and me whenever we use certain social media, or make purchases online, or have information derived from us through radio frequency identification (RFID) readers attached to clothing and cars, even implanted in animals, and so on. The result is a massive avalanche of information that exists for business leaders, decision makers, and researchers to use for predicting related behaviors and attitudes.
BIG DATA ANALYSIS
Decision makers are trying to figure out how to manage and use the information available. Typical computer software used for statistical decision making is currently limited to a number of cases far below that which is available for consideration of big data. A traditional approach to address this issue is known as “data mining” in which a number of techniques, including statistics, are used to discover patterns in a large set of data.
Using Statistics in the Social and Health Sciences with SPSS® and Excel®, First Edition. Martin Lee Abbott. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.
Researchers may be overjoyed with the availability of such rich data, but it provides both opportunities and challenges. On the opportunity side, never before have such large amounts of information been available to help researchers and policy makers understand widespread public thinking and behavior. On the challenge side, however, are several difficult questions:
• How are such data to be examined?
• Do current social science methods and processes provide guidance to examining data sets that surpass historical data-gathering capacity?
• Are big data representative?
• Do data sets so large obviate the need for probability-based research analyses?
• Do decision makers understand how to use social science methodology to assist in their analyses of emerging data?
• Will the decisions emerging from big data be used ethically, within the context of social science research guidelines?
• Will effect size considerations overshadow questions of significance testing?
Social scientists can rely on existing statistical methods to manage and analyze big data, but the way in which the analyses are used for decision making will change. One trend is that prediction may be hailed as a more prominent method for understanding the data than traditional hypothesis testing. We will have more to say about this distinction later in the book, but it is important at this point to see that researchers will need to adapt statistical approaches for analyzing big data.
VISUAL DATA ANALYSIS
Another emerging trend for understanding and managing the swell of data is the use of visuals. Of course, visual descriptions of data have been used for centuries. It is commonly acknowledged that the first “pie chart” was published by Playfair (1801). Playfair’s example in Figure 1.1 compares the dynamics of nations over time.
Figure 1.1 compares nations using size, color, and orientation over time. Using this method for comparing information has been useful for viewing the patterns in data not readily observable from numerical analysis.
As with numerical methods, however, there are opportunities and challenges in the use of visual analyses:
• Can visual means be used to convey complex meaning?
• Are there “rules” that will help to ensure a standard way of creating, analyzing, and interpreting such visual information?
• Will visual analyses become divorced from numerical analysis so that observers have no way of objectively confirming the meaning of the images?
Several visual data software analysis programs have appeared over the last several years. Simply running an online search will yield several possibilities including many that offer free (initial) programs for cataloging and presenting data from the user. I offer one very important caveat (see the final bullet point earlier), which is that it is important to perform visual data analysis in concert with numerical analysis. As we will see later in the book, it is easy to intentionally or unintentionally mislead readers using visual presentations when these are divorced from numerical statistical means that discuss the “significance” and “meaningfulness” of the visual data.
Figure 1.1 William Playfair’s pie chart. Source: https://commons.wikimedia.org/wiki/File:Playfair_piecharts.jpg. Public domain.
IMPORTANCE OF STATISTICS FOR THE SOCIAL AND HEALTH SCIENCES AND MEDICINE
The presence of so much rich information presents meaningful opportunities for understanding many of the processes that affect the social world. While much of the time big data analyses are used for understanding business dynamics and economic trends, it is also important to focus on those data patterns that can affect the social sphere beyond these indicators: social and psychological behavior and attitudes, changes in understanding health and medicine, and educational progress. These social indicators have been the subject of a great deal of analysis over the decades, and the study of them may now make significant advances depending on how big data are analyzed and managed. On a related note, the social sciences (especially sociology and psychology) are now areas included in the new Medical College Admission Test (MCAT), which also includes greater emphasis upon “Scientific Inquiry & Reasoning Skills.” The material we will learn from this book will help to support study in these areas for aspiring health and medical professionals.
In this book, I intend to focus on how to use and analyze data of all sizes and shapes. While we will be limited in our ability to dive into the world of big data fully, we can study the basics of how to recognize, generate, interpret, and critique analyses of data for decision making. One of the first lessons is that data can be understood both numerically and visually. When we describe information, we are attempting to see and convey underlying meaning in the numbers and visual expressions. If I have a collection of data, I cannot recognize its meaning by simply looking at it. However, if I apply certain numerical and visual methods to organize the data, I can see what patterns lie below the surface.
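As a small illustration of organizing data both numerically and visually, the sketch below tallies a set of responses into a frequency table and then draws a plain-text bar for each value. It is written in Python purely for illustration (the book itself works in SPSS and Excel), and the response values and the function names `frequency_table` and `text_histogram` are hypothetical, invented for this example rather than taken from any data set used in the book.

```python
from collections import Counter

# Hypothetical responses to a 1-5 survey item (invented for illustration only).
responses = [3, 4, 2, 5, 4, 3, 3, 1, 4, 4, 2, 3, 5, 4, 3]


def frequency_table(values):
    """Numerical summary: (value, count) pairs sorted by value."""
    return sorted(Counter(values).items())


def text_histogram(table):
    """Visual summary: one row of asterisks per value, one asterisk per case."""
    return [f"{value}: {'*' * count}" for value, count in table]


table = frequency_table(responses)
for row in text_histogram(table):
    print(row)
```

Read as a raw list, the fifteen numbers say little; once they are counted and drawn, it is easier to see where the responses cluster.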
HISTORICAL NOTES: EARLY USE OF STATISTICS
Statistics as a field has had a long and colorful history. Students will recognize some prominent names as the field developed its mathematical identity: Pearson, Fisher, Bayes, Laplace, and others. But it is important to note that some of the earliest statistical studies were based in solving social and political problems.
One of the earliest of such studies was developed by John Graunt who compiled information from Bills of Mortality to detect, among other things, the impact and origins of deaths by plague. Parish records documented christenings, weddings, and burials at the time, so Graunt’s study tracked the number of deaths in the parishes as a way to understand the dynamics of the plague. His broader goal was to predict the population of London using extant data from the parish records.
Figure 1.2 John Snow’s map showing deaths in the London cholera epidemic of 1854. Source: https://commons.wikimedia.org/wiki/File:Snow-cholera-map-1.jpg. Public domain.
Another early use of statistics was Dr John Snow’s map showing deaths in the houses of London’s Soho District during the 1854 cholera epidemic, as popularized by Johnson’s book, The Ghost Map (2006). In order to investigate the reasons for the spread of cholera other than odor (“miasma theory”), Snow created a map showing each death as a black line outside each household, along with features of the neighborhood including the water sources located throughout the district. The map created a visual picture of the concentration of deaths across the district and led to hypotheses about cholera spreading by waterborne contamination rather than smell. (If you were to walk across the same London district today, you would see that the great social theorist Karl Marx lived just a few streets away from the center of the cholera deaths.)
Figure 1.2 shows Snow’s map. You can see that near the center of the map is the “Broad Street Pump” which Snow determined to be the source for the spread of cholera. (At the time, Karl Marx lived on Dean Street, just to the east of the Broad Street Pump.) Notice that the houses nearest this pump recorded the highest numbers of deaths.
The example in Figure 1.2 not only shows how descriptive statistics underscored the use of visual means of representing data, but it also helped to clarify possible reasons for an epidemic. Graunt’s tables based on the Bills of Mortality were rudimentary visuals, but Snow’s map was a more effective means of portraying complex data by visual means. A still later statistician made even greater advancements in using visual information to communicate trends in data.
Figure 1.3 Florence Nightingale’s polar chart comparing battlefield and nonbattlefield deaths. Source: https://en.wikipedia.org/wiki/Pie_chart#/media/File:Nightingale-mortality.jpg. Public domain.
Nightingale (1858) is most often remembered as the founder of modern nursing. She is often represented in paintings as “the lady with the lamp,” since she was known to walk among the bedsides checking on the sick and wounded of the war. But Nightingale was also an astute statistician who used statistics to capture the dramatic need in hospitals during the Crimean War. She is credited as being one of the first to use a “pie chart” (more accurately, a “polar chart”). Figure 1.3 shows comparisons in her original polar chart of differences between soldiers who died of battlefield wounds (“red” wedges near the center) and those who died from other causes (“blue” wedges measured from the center of the graph) over time. The relationship between these groups fueled Nightingale’s efforts to obtain further funding for sanitary hospital conditions since those who died of infections were greater in number than those dying of battlefield wounds.
APPROACH OF THE BOOK
Many students and researchers are intimidated by statistical procedures, which may be due to fear of math, problematic math teachers in earlier education, or the lack of exposure to a “discovery” method for understanding difficult procedures. This book is an introduction to understanding statistics in a way that allows students to discover patterns in data and develop skill at making interpretations from data analyses. I describe how to use statistical programs (SPSS and Excel) to make the study more understandable and to teach students how to approach problem solving. Ordinarily, a first course in statistics leads students through the worlds of descriptive and inferential statistics by highlighting the formulas and sequential procedures that lead to statistical decision making. We will do all this in this book, but I place a good deal more attention on conceptual understanding. Thus, rather than memorizing a specific formula and using it in a specific way to solve a problem, I want to make sure the student first understands the nature of the problem, why a specific formula is needed, and how it will result in the appropriate information for decision making.
By using statistical software, we can place more attention on understanding how to interpret findings. Statistics courses taught in mathematics departments, and in some social science departments, often place primary emphases on the formulas/processes themselves. In the extreme, this can limit the usefulness of the analyses to the practitioner. My approach encourages students to focus more on how to understand and make applications of the results of statistical analyses. SPSS and other statistical programs are much more efficient at performing the analyses; the key issue in my approach is how to interpret the results in the context of the research question.
Beginning with my first undergraduate course teaching statistics with conventional textbooks, I have spent countless hours demonstrating how to conduct statistical tests manually and teaching students to do likewise. This is not always a bad strategy; performing the analysis manually can lead the student to understand how formulas treat data and yield valuable information. However, it is often the case that the student gravitates to memorizing the formula or the steps in an analysis. Again, there is nothing wrong with this approach as long as the student does not stop there. The outcome of the analysis is more important than memorizing the steps to the outcome. Examining the appropriate output derived from statistical software shifts the attention from the nuances of a formula to the wealth of information obtained by using it.
It is important to understand that I do indeed teach the student the nuances of formulas, understanding why, when, how, and under what conditions they are used. But in my experience, forcing the student to scrutinize statistical output files accomplishes this and teaches them the appropriate use and limitations of the information derived.
Students in my classes are always surprised (ecstatic) to realize they can use their textbooks and notes on my exams. But they quickly find that, unless they really understand the principles and how they are applied and interpreted, an open book is not going to help them. Over time, they come to realize that the analyses and the outcomes of statistical procedures are simply the ingredients for what comes next: building solutions to research problems. Therefore, their role is more detective and constructor than number juggler.
This approach mirrors the recent national and international debate about math pedagogy. In our recent book, Winning the Math Wars (2010), my colleagues and I addressed these issues in great detail, suggesting that, while traditional ways of teaching math are useful and important, the emphases of reform approaches are not to be dismissed. Understanding and memorizing detail are crucial, but problem solving requires a different approach to learning.
CASES FROM CURRENT RESEARCH
I focus on using real-world data in this book. There are several reasons for doing so, primarily because students need to be grounded in approaches for using data from the real world with all their problems and “grittiness.” When people respond to surveys or interviews, they inevitably fill out information in ways not asked by interviewers (e.g., respondents may choose two possible answers when one is required, etc.). Moreover, transferring data to electronic form may result in miscoded responses or categorization problems. Researchers always confront these issues, and I believe it is important for students to leave the classroom aware of the range of possible problems with real-world data and prepared for dealing with them. Of course, much of the data we will examine will already have been put in standard forms, but other research issues will arise (e.g., how do I recategorize data, assign missing cases, compute new variables, etc.?).
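The three housekeeping tasks named above (recategorizing data, assigning missing cases, and computing new variables) can be sketched in a few lines. The example below uses Python only as a neutral illustration; the book carries out such steps in SPSS and Excel. The records, the variable names (`age`, `q1`, `q2`, `scale_mean`), the age bands, and the missing-value code `-9` are all hypothetical choices made for this sketch.

```python
# Hypothetical survey records; None marks an answer the respondent skipped.
records = [
    {"age": 23, "q1": 4, "q2": 5},
    {"age": 41, "q1": None, "q2": 3},
    {"age": 67, "q1": 2, "q2": 2},
]

MISSING = -9  # hypothetical missing-value code, chosen only for this sketch


def age_group(age):
    """Recategorize a numeric age into three illustrative bands."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60 and over"


def prepare(record):
    """Return a copy with a recoded age, coded missing values, and a new scale score."""
    answers = [record["q1"], record["q2"]]
    prepared = dict(record)
    prepared["age_group"] = age_group(record["age"])
    # Compute the new variable only when every item was answered.
    if None in answers:
        prepared["scale_mean"] = None
    else:
        prepared["scale_mean"] = sum(answers) / len(answers)
    # Replace skipped answers with the explicit missing-value code.
    for item in ("q1", "q2"):
        if prepared[item] is None:
            prepared[item] = MISSING
    return prepared


prepared_records = [prepare(r) for r in records]
for row in prepared_records:
    print(row)
```

One design choice worth noting: the scale score is left empty for any respondent with a skipped item, so that the missing-value code is never averaged in as if it were a real answer.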
Another reason I use real-world data is to familiarize students with contemporary research questions in the social and health science fields. Classroom data often are contrived to make a certain point or show a specific procedure, which are both helpful. But I believe it is important to draw the focus away from the procedure per se and understand how the procedure will help the researcher resolve a research question. The research questions are important. Policy reflects the available information on a research topic, to some extent, so it is important for students to be able to generate that information as well as to understand it. This is an “active” rather than “passive” learning approach to understanding statistics.
Data Labs are a very important part of this course since they allow students to take charge of their learning. This is the heart of discovery learning. Understanding a statistical procedure in the confines of a classroom is necessary and helpful. However, learning that lasts is best accomplished by students directly engaging the processes with actual data and observing what patterns emerge in the findings that can be applied to real research problems.
Some practice problems may use data created for classroom use, but real-world data from actual research databases will enable a deepening of understanding. In addition to national databases, I use results from my own research for classroom learning. In every case, researchers know that they will discover knotty problems and unusual, sometimes idiosyncratic, information in their data. If students are not exposed to this real-world aspect of research, it will be confusing when they engage in actual research beyond the confines of the classroom.
In this course, we will have several occasions to complete Data Labs that pose research problems with actual data. Students take what they learn from the book material and conduct a statistical investigation using SPSS and Excel. Then, they have the opportunity to examine the results, write research summaries, and compare findings with the solutions presented at the end of the book.
The project labs also introduce students to two software approaches for solving statistical problems. These are quite different in many regards, as we will see in the chapters that follow. SPSS provides additional advanced procedures educational researchers utilize for more complex and extensive research questions. Excel is widely accessible and provides a wealth of information to researchers about many statistical processes they encounter in actual research. The Data Labs provide solutions in both formats so the student can learn the capabilities and approaches of each.
This book makes use of publicly available research data. The General Social Survey or GSS¹ is a nationally representative survey designed to be part of a program of social research to monitor changes in Americans’ social characteristics and attitudes. Funded through the National Science Foundation and administered by the National Opinion Research Center (NORC), the GSS has been administered annually or biennially since 1972. As a general survey, the GSS asks a variety of questions on a series of topics designed to track the opinions of Americans over the last four decades. Other databases we will use in the book include the following:
• The Centers for Disease Control and Prevention (CDC) conducts the Behavioral Risk Factor Surveillance System (BRFSS) as a health-related telephone survey to measure American residents’ health conditions, health behaviors, and use of preventative services.²
• The Association of Religion Data Archives (ARDA) presents a series of databases on a variety of religion topics from the sociological perspective. In addition to other databases, the ARDA presents GSS databases on special modules (sets of questions) relevant to religion. By visiting the ARDA (www.thearda.com), you can peruse the codebook for the latest GSS file (www.thearda.com/Archive/GSS.asp) to get a fuller sense of the types of questions a general survey asks. You can also visit the ARDA’s “Learning Center” to take a survey that allows you to compare yourself to a larger national profile. The “Compare Yourself to the Nation” survey allows you to see how you compare to others based on the results from the 2005 Baylor Religion Survey (addressing religious identity, beliefs, experiences, paranormal views, etc.).
¹ Tom W. Smith, Peter Marsden, Michael Hout, and Jibum Kim. General Social Surveys, 1972–2012 [machine-readable data file] / Principal Investigator, Tom W. Smith; Coprincipal Investigator, Peter V. Marsden; Coprincipal Investigator, Michael Hout; Sponsored by National Science Foundation. NORC ed. Chicago: National Opinion Research Center [producer]; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut [distributor], 2013. 1 data file (57,061 logical records) + 1 codebook (3432 pp.). (National Data Program for the Social Sciences, No. 21).
² Centers for Disease Control and Prevention (CDC) (2013). Behavioral Risk Factor Surveillance System Survey Data. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention.
RESEARCHDESIGN
Researcherswhowritestatisticsbookshaveadilemmawithrespecttoresearch design.Typically,statisticsandresearchdesignaretaughtseparatelyinorderfor studentstounderstandeachingreaterdepth.Thedifficultywiththisapproachisthat thestudentisleftontheirowntosynthesizetheinformation;thisisoftennotdone successfully.
Colleges and universities attempt to manage this problem differently. Some require statistics as a prerequisite for a research design course or vice versa. Others attempt to synthesize the information into one course, which is difficult to do given the eventual complexity of both “sets” of information. Adding somewhat to the problem is the approach of multiple courses in both domains.
I do not offer a perfect solution to this dilemma. My approach focuses on an in-depth understanding of statistical procedures for actual research problems. What this means is that I cannot devote a great deal of attention in this book to research design apart from the statistical procedures which are an integral part of it. (You may wish to consult a separate book on research design I authored with my colleague Jennifer McKinney, Understanding and Applying Research Design, 2013.)
I try to address the problem in two ways. First, wherever possible, I connect statistics with specific research designs. This provides an additional context in which students can focus on using statistics to answer research questions. The research question drives the decision about which statistical procedures to use; it also calls for discussion of appropriate design in which to use the statistical procedures. We will cover essential information about research design in order to show how these might be used.
Second, I have an online course in research design that can be accessed to continue your exploration from this book. In addition to databases and other research resources, you can follow the web address in the preface to gain access to the online course as additional preparation in research design.
FOCUS ON INTERPRETATION
I call attention to problem solving and interpretation as the important elements of statistical analysis. It is tempting for students to focus so much on using statistical procedures to create meaningful results (a critical matter!) that they do not focus on what the results mean for the research question. They stop after they use a formula and decide whether or not a finding is statistically significant. I strongly encourage students to think about the findings in the context and words of the research question. This is not an easy thing to do because the meaning of the results is not always cut and dried. It requires students to think beyond the formula.
Statisticians and practitioners have devised rules to help researchers with this dilemma by creating criteria for decision making. For example, as we will see in Chapter 11, squaring a correlation yields the “coefficient of determination,” which represents the amount of variance in one variable that is accounted for by the other variable (this is known as “effect size,” a topic which we will spend a great deal of time with in this book). But the next question is, how much of the “accounted for variance” is meaningful? This consideration is key to understanding how to use and make decisions on the basis of big data.
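To make the squaring step concrete, here is a minimal Python sketch. The book itself works in SPSS and Excel, so the language, the function name `pearson_r`, and the five data points are all assumptions made only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (not from the book): hours of study and quiz scores
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
r_squared = r ** 2  # coefficient of determination

print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

With these made-up values the correlation is about 0.775 and its square is 0.600, so 60% of the variance in one variable is accounted for by the other. Whether 60% is meaningful is the interpretive question the text raises.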
In many ways, interpretation of results is an art undergirded by the canons of science. Much of the ability to develop expertise in interpretation comes by long hours of tutelage with researchers who have done it for many years. We cannot hope to emerge from our study with this expertise, but through constant focus on interpretation, we can become aware of the acceptable ways of understanding and using statistical results.
Statisticians have suggested different ways of helping with interpretation. For example, when dealing with the “accounting of variance” example presented earlier, statisticians have created criteria under which 0.01 (1%) of the variance accounted for is considered “small,” while 0.05 (5%) is “medium,” and so forth. (And, much to the dismay of many students, there is more than one set of these criteria.) Therefore, if we determine that the correlation between two variables reaches these criteria levels, we can feel secure in sticking to good interpretation guidelines. Problems exist, however, in how to view these statistical results within the context of the research problem.
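A rule of this kind can be written as a small lookup. The sketch below is only an illustration: the function name and the label strings are hypothetical, and the default cutoffs are the two values quoted in the paragraph above. Because more than one set of criteria exists, the cutoffs are parameters rather than fixed constants.

```python
def label_variance_explained(r_squared, small=0.01, medium=0.05):
    """Label a proportion of variance accounted for.

    The defaults (0.01 and 0.05) are the illustrative cutoffs quoted in
    the text; other published criteria differ, so pass your own values.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    if r_squared < small:
        return "below small"
    if r_squared < medium:
        return "small"
    return "medium or larger"

# Hypothetical values chosen to fall on both sides of each cutoff
for value in (0.004, 0.01, 0.03, 0.05, 0.20):
    print(f"{value:.3f} -> {label_variance_explained(value)}")
```

A label produced this way is only a starting point; it says nothing about the research context in which the result has to be judged.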
For example, if a research question is, “Does class size affect math achievement?” and the results suggest that class size accounts for 1% of the variance in math achievement, many researchers might agree the results represent a small and perhaps even inconsequential impact. However, if a research question is, “Does drug X affect Ebola survival rates?,” researchers might consider 1% of the variance to be much more consequential than “small!” This is not to say that math achievement is any less important than Ebola survival rates (although that is another of those debatable questions researchers face), but the researcher must consider a range of factors in determining meaningfulness: the intractability of the research problem, the discovery of new dimensions of the research focus, whether or not the findings represent life and death, and so on. The material point is that statistical criteria are important for establishing meaningfulness of results, but overall interpretation involves the larger context within which the research takes place.
I have found that students have the most difficult time with these matters. For many students, using a formula to create numerical results is much preferable to working out what the results mean in the context of the research question. Students have been conditioned to stop after they get the right numerical answer. They typically do not get to the difficult work of determining what the right answer means because it isn’t always apparent.
I emphasize “practical significance” (effect size) in this book as well as statistical significance. In many ways, this is a more comprehensive approach to uncertainty, since effect size is a measure of “impact” in the research evaluation. It is important to measure the likelihood of chance findings (statistical significance), but the extent of influence represented in the analyses affords the researcher another vantage point to determine the relationship among the research variables.
Coverage of Statistical Procedures
The statistical applications we will discuss in this book are “workhorses.” This is an introductory treatment, so we need to spend time discussing the nature of statistics and basic procedures that allow you to use more sophisticated procedures. We will not be able to examine advanced procedures in much detail. I will provide some references for students who wish to continue their learning in these areas. Hopefully, as you learn the capability of SPSS and Excel, you can explore more advanced procedures on your own, beyond the end of our discussions.
Some readers may have taken statistics coursework previously. If so, my hope is that they are able to enrich what they previously learned and develop a more nuanced understanding of how to address problems in educational research through the use of SPSS and Excel. Whether readers are new to the study or experienced practitioners, my hope is that statistics becomes meaningful as a way of examining problems and debunking prevailing assumptions in the social and health sciences.
Often, well-intentioned people can, through ignorance of appropriate processes, promote ideas that may not be true. Further, policies that would have a negative impact might be offered even though they were not based on sound statistical analyses. Statistics are tools that can be misused and influenced by the value perspective of the wielder. However, policies are often generated in the absence of compelling research. Students need to become “research literate” in order to recognize when statistical processes should be used and when they are being used incorrectly.
2
DESCRIPTIVE STATISTICS: CENTRAL TENDENCY
When I teach statistics, I typically begin by offering a series of questions that emphasize the importance of statistics for solving real research problems. Statistical formulas and procedures are logical and crucial, but the primary function for statistical analyses (at least, in my mind) is to bring clarity and understanding to a research question. As I discussed in a recent book dealing with statistics for program evaluation (Abbott, 2010), statistical procedures are best used to discover patterns in the data that are not directly observable. Bringing light to these patterns allows the student and the researcher to understand and engage in problem solving.
WHAT IS THE WHOLE TRUTH? RESEARCH APPLICATIONS (SPURIOUSNESS)
Finding the “truth” is a laudable goal and one that should inform all research efforts. However, in statistics, it is not likely that we will ever really discover ultimate truth. The nature of statistics is that we strive to observe as fully as possible what relationships exist among variables so that we can understand likely causal linkages. Does poverty “cause” crime? Is longevity affected by access to health care? These questions intimate valid relationships between the research variables. However, one of the first lessons in statistics and research is that valid and meaningful relationships are not always easily visible. Certainly most realities in contemporary life are much more complex than can be explained by two variables. We therefore must be able to “see” patterns among data using both numerical and visual means that underlie seemingly simple relationships.
Using Statistics in the Social and Health Sciences with SPSS® and Excel®, First Edition. Martin Lee Abbott. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.
As we will discuss in Chapter 11, there is a big difference between “correlation” and “causation.” This statistical adage helps to point out the complexity of understanding the patterns among variables. Just because two variables are strongly statistically related does not mean that there is a causal relationship between them. Causality is difficult to prove. In order to understand the apparent causal relationship more fully, we must look at other variables that might have a meaningful but “hidden” relationship with both “visible” variables. Researchers use the term “spuriousness” to describe the case in which an apparent relationship between two variables might instead reflect the influence of variables not in the analysis. An example of spuriousness is the relationship between ice cream consumption and crime.¹
There is a positive relationship between rates of ice cream consumption and crime; when one increases, so does the other. Should we conclude then that ice cream consumption leads to criminal behavior in a causal way? Spuriousness means that there may not be a true or genuine relationship between factors even if it looks like there is. Some unobserved or unnoticed variable may be related to both of the variables we can “see” (in this example ice cream consumption and crime), which may make it appear that the “visible” variables have a cause–effect relationship.
In this example, ice cream consumption increases as crime increases; and, conversely, when crime increases, so does the consumption of ice cream. These two variables appear to be consistently related to each other. They probably do not have a causal relationship, however, since both ice cream consumption and crime are related to a third factor: temperature. When temperatures rise, ice cream consumption increases (people eat more ice cream in the summer than winter). Also, when temperatures rise, crime increases. If we include these additional relationships in our study, then we can see that the apparent causal relationship between ice cream consumption and crime is probably really more an issue of the weather; both of the variables are “linked” by temperature.
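One way to see this numerically is a small simulation, sketched here in Python as an assumption-laden illustration rather than the author's method. The data are simulated, not real: temperature drives both outcomes and neither outcome influences the other. The first-order partial correlation formula is the standard one; the function names, coefficients, and noise levels are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation of x and y after removing the linear influence of z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 2000

# Simulated data: temperature is the only shared cause of both variables
temperature = [rng.uniform(0, 35) for _ in range(n)]
ice_cream = [2 * t + rng.gauss(0, 10) for t in temperature]
crime = [3 * t + rng.gauss(0, 15) for t in temperature]

raw = pearson_r(ice_cream, crime)
controlled = partial_r(ice_cream, crime, temperature)

print(f"ice cream vs. crime, raw correlation:       {raw:.2f}")
print(f"ice cream vs. crime, temperature held fixed: {controlled:.2f}")
```

Under these assumptions the raw correlation is strong, while the partial correlation is close to zero: the pattern a spurious relationship would produce.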
Without considering spuriousness, some might be tempted to explain why there is a causal relationship between ice cream consumption and crime. For example, does ice cream lead to feelings of grandeur or a propensity for aggression, which causes people to commit crime? Or is it that good ice cream is so expensive that people commit crimes in order to support their ice cream habit? Which makes most sense? Although we could come up with several reasons (mostly fanciful) why one of these variables might be causally related to the other, we need to be cautious.
This situation leads to one of the most profound lessons in social science: objectivity is necessary to pursue knowledge dispassionately. If we assume there is a relationship between things without using objective means of assessing the truth of the situation, then we are simply imposing a subjective understanding of the situation that is not “anchored” in science. Some call this the “Procrustean exercise,” referencing the mythological figure who forced people onto an iron bed, either stretching them to fit or cutting off the excess. Thus, by not taking an objective stance, we may have a tendency to make apparent reality “fit” our mental picture or subjective assumptions.
Figure 2.1 The possible spurious relationship between ice cream consumption and crime.
1 This example and explanation are discussed in Abbott and McKinney (2013).
Figure 2.1 shows the possible relationships among ice cream consumption, crime, and temperature. The top panel shows the apparent relationship between ice cream consumption and crime, with a two-way line connecting the variables indicating that the two are highly related to one another. The bottom panel shows that, when the third variable (temperature) is introduced, the apparent relationship between ice cream consumption and crime disappears, as indicated by the absence of a line connecting them.
Identifying potentially spurious relationships is often quite difficult and comes only after extended research. The researcher must know their data intimately in order to make the discovery. An example of this is a study of industrial democracy I conducted several years ago. It was generally accepted in industry at the time that, if workers were given the ability to participate in decision making, they would have higher job satisfaction (JS). This was a reasonable assumption, given similar findings in the research literature. However, the more I examined my own data from workers in an electronics industry, the more I questioned this assumption and decided to explore the matter further.
I noticed from interviews that many workers did not want to participate in decision making, even though they had the opportunity to do so. I therefore analyzed the original “participation–job satisfaction” relationship again but this time added variables that measured workers’ attitudes toward their work and a desire for management. Through a series of analyses, I found a number of surprising results that “modified” the original assumption of a direct (and causal) relationship between participation and JS. One of these findings was that a worker’s attitude toward management had a lot to do with their eventual satisfaction levels. Those workers who participated in decision making and who had a positive view of management showed stronger satisfaction than those workers who did not have such a positive view of management. Thus, a third variable (view of management) that was not originally included in the simple relationship (participation–satisfaction) had an impact on the findings. This subsequent analysis discovered a pattern in the data that was not “visible” at the outset.
[Figure 2.1 diagram: both panels show the nodes “Ice Cream Consumption” and “Crime.”]
The popular press often presents research findings that are somewhat bombastic but might possibly be spurious. Is student achievement really just a matter of ethnicity, or are there other factors involved (e.g., family income)? Do lifestyle choices directly impact longevity, or are there other considerations that need to be taken into account (e.g., social class)? The value of statistics is that it equips the student and researcher with the skills necessary to debunk simplistic findings.
DESCRIPTIVE AND INFERENTIAL STATISTICS
Statistics, like other courses of study, is multifaceted. It includes “divisions” that are each important in understanding the whole. Two major divisions are descriptive and inferential statistics. Descriptive statistics are methods to summarize and “boil down” the essence of a set of information so that it can be understood more readily and from different vantage points. We live in a world rich with data; descriptive statistical techniques are ways of making sense of it. Using these straightforward methods allows the researcher to detect numerical and visual patterns in data that are not immediately apparent.
Inferential statistics are a different matter altogether. These methods allow you to make predictions about attitudes, behaviors, and patterns on a large scale based on small sets of “sample” values. In real life, we are presented with situations that cannot provide us with certainty: Would a national training method improve patients’ satisfaction ratings of their physicians? Can we predict workers’ health scores or longevity in a variety of industries based on their job positions? Inferential statistics allow us to infer or make an observation about an unknown value from sample values that are known. Obviously, we cannot do this with absolute certainty – we do not live in a totally predictable world. But we can do it within certain bounds of probability. Hopefully, statistical procedures will allow us to get closer to certainty than we could get without them.
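As one hedged illustration of inferring an unknown value “within certain bounds of probability,” the Python sketch below estimates a population mean from a small sample and attaches a confidence interval. The sample values and the function name are hypothetical, and the normal approximation is a simplification: with a sample this small, an interval based on the t distribution would be somewhat wider.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    # z value that leaves (1 - confidence) / 2 in each tail
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_error, mean + z * std_error

# Hypothetical sample: days of good physical health last month, 12 workers
days_good_health = [22, 25, 30, 18, 27, 29, 24, 26, 30, 21, 28, 23]

low, high = mean_confidence_interval(days_good_health)
sample_mean = statistics.fmean(days_good_health)

print(f"sample mean = {sample_mean:.2f}")
print(f"approximate 95% interval for the population mean: {low:.2f} to {high:.2f}")
```

The interval does not deliver certainty about the population mean; it expresses how far the sample estimate could plausibly be from it, and asking for more confidence widens the interval.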
THE NATURE OF DATA: SCALES OF MEASUREMENT
The first step in understanding complex relationships like the ones I described earlier is to be able to understand and describe the nature of what data are available to a researcher. We often jump into a research analysis without truly understanding the features of the data we are using. Understanding the data is a very important step because it can reveal hidden patterns and it can suggest custom-made statistical procedures that will result in the strongest findings.
One of the first realizations by researchers is that data come in a variety of sizes and shapes. That is, researchers have to work with available information to make statistical decisions, and that information takes many forms. For example, students are identified as either “qualified” or “not qualified” for free or reduced lunches. Other examples include the following:
1. Workers either “desire participation” or “do not desire participation.”
2. Job satisfaction is measured by worker responses to several questionnaire items asking them to “Agree Strongly,” “Agree,” “Neither Agree nor Disagree,” “Disagree,” or “Disagree Strongly.”
3. Medical researchers measure workers’ physical health by how many days during the last month their physical health was good.
Nominal Data
The first example shows that data can be “either–or” in the sense that they represent mutually exclusive categories. If a worker indicates that they “desire participation” on a survey instrument, for example, they would not fit the “do not desire participation” category. Other examples of “categorical” data are sex (male and female) and experimental groups (treatment or control).
This type of data, called “nominal,” does not represent a continuum with intermediate values. Each value is a separate category, related to the others only by the fact that they are categories of some larger value (e.g., male and female are both values of sex). These data are called nominal since the root of the word indicates “names” of categories. They are also appropriately called “categorical” data.
The examples of nominal data just mentioned can also be classified as “dichotomous” since they are nominal data that have only two categories. Nominal data also include variables with more than two categories such as schooling (e.g., public, private, homeschooling). We will discuss later that dichotomous data can come in a variety of forms also, like “true dichotomies” in which the categories naturally occur like sex, and “dichotomized variables” that have been created by the researcher from some different kind of data (like satisfied and not satisfied workers). In all cases, nominal data represent mutually exclusive categories. Educators typically confront nominal data in classifying students by gender or race, or, if they are conducting research, they classify groups as “treatment” and “control.”
In order to quantify the variables, researchers assign numerical values to the categories. For example, “treatment groups” might be assigned a value of “1” and “control groups” might be assigned a value of “2.” In these cases, the numbers are only categories; they do not represent actual measurements. Thus, a control group is not twice a treatment group. The numbers are only a convenient way of identifying the different categories.
Because nominal data are categorical, we cannot use the mathematical operations of addition, subtraction, multiplication, and division. It would make no sense to divide the number of Jeeps in a parking lot (one category) by the number of Teslas in the same parking lot (second category) to get a single measure of the automobiles. In order to get an idea of the automobiles in the parking lot, researchers would need to identify the categories of automobiles and find the percentage of each category in the parking lot. Thus, we might say that there are 15% Jeeps, 2% Teslas, 29% Toyotas, and so on in the parking lot.
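The percentage summary described above can be sketched in a few lines of Python. The parking lot contents and the function name are hypothetical, chosen only so the arithmetic is easy to follow; they do not reproduce the percentages in the text.

```python
from collections import Counter

def category_percentages(values):
    """Percentage of observations falling in each nominal category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}

# Hypothetical parking lot of 20 automobiles
lot = ["Jeep"] * 3 + ["Tesla"] * 1 + ["Toyota"] * 6 + ["Other"] * 10

percentages = category_percentages(lot)
for make, pct in sorted(percentages.items()):
    print(f"{make}: {pct:.0f}%")
```

The categories are counted, never averaged or divided into one another, which is the only kind of summary that makes sense when the values are names rather than measurements.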
Ordinal Data
The second example listed in the previous section (THE NATURE OF DATA: SCALES OF MEASUREMENT) indicates another kind of data: ordinal data. These are data with a second characteristic of meaning, position. These data are also categories, as in nominal data, but with the categories related by “more than” and