

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP,

Second Edition Bhisham C. Gupta


STATISTICS AND PROBABILITY WITH APPLICATIONS FOR ENGINEERS AND SCIENTISTS USING MINITAB, R AND JMP

Second Edition

Bhisham C. Gupta

Professor Emeritus of Statistics

University of Southern Maine

Portland, ME

Irwin Guttman

Professor Emeritus of Statistics

SUNY at Buffalo and University of Toronto, Canada

Kalanka P. Jayalath

Assistant Professor of Statistics

University of Houston–Clear Lake

Houston, TX

This second edition first published 2020

© 2020 John Wiley & Sons, Inc.

Edition History

John Wiley & Sons, Inc. (1e, 2013)

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Bhisham C. Gupta, Irwin Guttman, Kalanka P. Jayalath to be identified as the author of this work has been asserted in accordance with law.

Registered Office

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office

111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: Gupta, Bhisham C., 1942- author. | Guttman, Irwin, author. | Jayalath, Kalanka P., author.

Title: Statistics and probability with applications for engineers and scientists using Minitab, R and JMP / Bhisham C. Gupta, Irwin Guttman, Kalanka P. Jayalath.

Other titles: Statistics and probability with applications for engineers and scientists

Description: Second edition. | Hoboken: Wiley, 2020. | Revision of: Statistics and probability with applications for engineers and scientists. 2013. | Includes bibliographical references and index.

Identifiers: LCCN 2019035384 (print) | LCCN 2019035385 (ebook) | ISBN 9781119516637 (cloth) | ISBN 9781119516644 (adobe pdf) | ISBN 9781119516620 (epub)

Subjects: LCSH: Probabilities. | Mathematical statistics.

Classification: LCC QA273 .G85 2020 (print) | LCC QA273 (ebook) | DDC 519.5–dc23

LC record available at https://lccn.loc.gov/2019035384

LC ebook record available at https://lccn.loc.gov/2019035385

Cover Design: Wiley

Cover Image: © ioat/Shutterstock

Set in 10/12pt Computer Modern by SPi Global, Chennai, India

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

In the loving memory of my parents, Roshan Lal and Sodhan Devi – Bhisham

In the loving memory of my parents, Anna and Samuel Guttman – Irwin

In the loving memory of my parents, Premadasa Jayalath and Chandra Unanthanna – Kalanka

1.1.1 Motivation for the Study

1.1.2 Investigation

1.1.3 Changing Criteria

1.1.4 A Summary of the Various Phases of the Investigation

1.2 A Survey

1.3 An Observational Study

1.4 A Set of Historical Data

1.5 A Brief Description of What is Covered in this Book

PART I FUNDAMENTALS OF PROBABILITY AND STATISTICS

2 DESCRIBING DATA GRAPHICALLY AND NUMERICALLY

2.1 Getting Started with Statistics

2.1.1 What Is Statistics?

2.1.2 Population and Sample in a Statistical Study

2.2 Classification of Various Types of Data

2.2.1 Nominal Data

2.2.2 Ordinal Data

2.2.3 Interval Data

2.2.4 Ratio Data

2.3 Frequency Distribution Tables for Qualitative and Quantitative Data

2.3.1 Qualitative Data

2.3.2 Quantitative Data

2.4 Graphical Description of Qualitative and Quantitative Data

2.4.1 Dot Plot

2.4.2 Pie Chart

2.4.3 Bar Chart

2.4.4 Histograms

2.4.5 Line Graph

2.4.6 Stem-and-Leaf Plot

2.5 Numerical Measures of Quantitative Data

2.5.1 Measures of Centrality

2.5.2 Measures of Dispersion

2.6 Numerical Measures of Grouped Data

2.6.1 Mean of a Grouped Data

2.6.2 Median of a Grouped Data

2.6.3 Mode of a Grouped Data

2.6.4 Variance of a Grouped Data

2.7 Measures of Relative Position

2.7.1 Percentiles

2.7.2 Quartiles

2.7.3 Interquartile Range (IQR)

2.7.4 Coefficient of Variation

2.8 Box-Whisker Plot

2.8.1 Construction of a Box Plot

2.8.2 How to Use the Box Plot

2.9 Measures of Association

2.10 Case Studies

2.10.1 About St. Luke’s Hospital

2.11 Using JMP

3 ELEMENTS OF PROBABILITY

3.1 Introduction

3.2 Random Experiments, Sample Spaces, and Events

3.2.1 Random Experiments and Sample Spaces

3.2.2 Events

3.3 Concepts of Probability

3.4 Techniques of Counting Sample Points

3.4.1 Tree Diagram

3.4.2 Permutations

3.4.3 Combinations

3.4.4 Arrangements of n Objects Involving Several Kinds of Objects

3.5 Conditional Probability

3.6 Bayes’s Theorem

3.7 Introducing Random Variables

Review Practice Problems

4 DISCRETE RANDOM VARIABLES AND SOME IMPORTANT DISCRETE PROBABILITY DISTRIBUTIONS

4.1 Graphical Descriptions of Discrete Distributions

4.2 Mean and Variance of a Discrete Random Variable

4.2.1 Expected Value of Discrete Random Variables and Their Functions

4.2.2 The Moment-Generating Function – Expected Value of a Special Function of X

4.3 The Discrete Uniform Distribution

4.4 The Hypergeometric Distribution

4.5 The Bernoulli Distribution

4.6 The Binomial Distribution

4.7 The Multinomial Distribution

4.8 The Poisson Distribution

4.8.1 Definition and Properties of the Poisson Distribution

4.8.2 Poisson Process

4.8.3 Poisson Distribution as a Limiting Form of the Binomial

4.9 The Negative Binomial Distribution

4.10 Some Derivations and Proofs (Optional)

4.11 A Case Study

4.12 Using JMP

Review Practice Problems

5 CONTINUOUS RANDOM VARIABLES AND SOME IMPORTANT CONTINUOUS PROBABILITY DISTRIBUTIONS

5.1 Continuous Random Variables

5.2 Mean and Variance of Continuous Random Variables

5.2.1 Expected Value of Continuous Random Variables and Their Functions

5.2.2 The Moment-Generating Function and Expected Value of a Special Function of X

5.3 Chebyshev’s Inequality

5.4 The Uniform Distribution

5.4.1 Definition and Properties

5.4.2 Mean and Standard Deviation of the Uniform Distribution

5.5 The Normal Distribution

5.5.1 Definition and Properties

5.5.2 The Standard Normal Distribution

5.5.3 The Moment-Generating Function of the Normal Distribution

5.6 Distribution of Linear Combination of Independent Normal Variables

5.7 Approximation of the Binomial and Poisson Distributions by the Normal Distribution

5.7.1 Approximation of the Binomial Distribution by the Normal Distribution

5.7.2 Approximation of the Poisson Distribution by the Normal Distribution

5.8 A Test of Normality

5.9 Probability Models Commonly Used in Reliability Theory

5.9.1 The Lognormal Distribution

5.9.2 The Exponential Distribution

5.9.3 The Gamma Distribution

5.9.4 The Weibull Distribution

5.10 A Case Study

5.11 Using JMP

Review Practice Problems

6 DISTRIBUTION OF FUNCTIONS OF RANDOM VARIABLES

6.1 Introduction

6.2 Distribution Functions of Two Random Variables

6.2.1 Case of Two Discrete Random Variables

6.2.2 Case of Two Continuous Random Variables

6.2.3 The Mean Value and Variance of Functions of Two Random Variables

6.2.4 Conditional Distributions

6.2.5 Correlation between Two Random Variables

6.2.6 Bivariate Normal Distribution

6.3 Extension to Several Random Variables

6.4 The Moment-Generating Function Revisited

Review Practice Problems

7 SAMPLING DISTRIBUTIONS

7.1 Random Sampling

7.1.1 Random Sampling from an Infinite Population

7.1.2 Random Sampling from a Finite Population

7.2 The Sampling Distribution of the Sample Mean

7.2.1 Normal Sampled Population

7.2.2 Nonnormal Sampled Population

7.2.3 The Central Limit Theorem

7.3 Sampling from a Normal Population

7.3.1 The Chi-Square Distribution

7.3.2 The Student t-Distribution

7.3.3 Snedecor’s F-Distribution

7.4 Order Statistics

7.4.1 Distribution of the Largest Element in a Sample

7.4.2 Distribution of the Smallest Element in a Sample

7.4.3 Distribution of the Median of a Sample and of the kth Order Statistic

7.4.4 Other Uses of Order Statistics

7.5 Using JMP

Review Practice Problems

8 ESTIMATION OF POPULATION PARAMETERS

8.1 Introduction

8.2 Point Estimators for the Population Mean and Variance

8.2.1 Properties of Point Estimators

8.2.2 Methods of Finding Point Estimators

8.3 Interval Estimators for the Mean μ of a Normal Population

8.3.1 σ²

8.3.2 σ²

8.3.3 Sample Size Is Large

8.4 Interval Estimators for the Difference of Means of Two Normal Populations

8.4.1 Variances Are Known

8.4.2 Variances Are Unknown

8.5 Interval Estimators for the Variance of a Normal Population

8.6 Interval Estimator for the Ratio of Variances of Two Normal Populations

8.7 Point and Interval Estimators for the Parameters of Binomial Populations

8.7.1 One Binomial Population

8.7.2 Two Binomial Populations

8.8 Determination of Sample Size

8.8.1 One Population Mean

8.8.2 Difference of Two Population Means

8.8.3 One Population Proportion

8.8.4 Difference of Two Population Proportions

8.9 Some Supplemental Information

8.10 A Case Study

8.11 Using JMP

Review Practice Problems

9 HYPOTHESIS TESTING

9.1 Introduction

9.2 Basic Concepts of Testing a Statistical Hypothesis

9.2.1 Hypothesis Formulation

9.2.2 Risk Assessment

9.3 Tests Concerning the Mean of a Normal Population Having Known Variance

9.3.1 Case of a One-Tail (Left-Sided) Test

9.3.2 Case of a One-Tail (Right-Sided) Test

9.3.3 Case of a Two-Tail Test

9.4 Tests Concerning the Mean of a Normal Population Having Unknown Variance

9.4.1 Case of a Left-Tail Test

9.4.2 Case of a Right-Tail Test

9.4.3 The Two-Tail Case

9.5 Large Sample Theory

9.6 Tests Concerning the Difference of Means of Two Populations Having Distributions with Known Variances

9.6.1 The Left-Tail Test

9.6.2 The Right-Tail Test

9.6.3 The Two-Tail Test

9.7 Tests Concerning the Difference of Means of Two Populations Having Normal Distributions with Unknown Variances

9.7.1 Two Population Variances are Equal

9.7.2 Two Population Variances are Unequal

9.7.3 The Paired t-Test

9.8 Testing Population Proportions

9.8.1 Test Concerning One Population Proportion

9.8.2 Test Concerning the Difference Between Two Population Proportions

9.9 Tests Concerning the Variance of a Normal Population

9.10 Tests Concerning the Ratio of Variances of Two Normal Populations

9.11 Testing of Statistical Hypotheses using Confidence Intervals

9.12 Sequential Tests of Hypotheses

9.12.1 A One-Tail Sequential Testing Procedure

9.12.2 A Two-Tail Sequential Testing Procedure

9.13 Case Studies

9.14 Using JMP

PART II STATISTICS IN ACTIONS

10.1.1 The Hazard Rate Function

10.4 Estimation: Weibull Distribution

10.5 Case Studies

11 ON DATA MINING

11.1 Introduction

11.2 What is Data Mining?

11.2.1 Big Data

11.3 Data Reduction

11.4 Data Visualization

11.5 Data Preparation

11.5.1 Missing Data

11.5.2 Outlier Detection and Remedial Measures

11.6.1 Evaluating a Classification Model

11.7 Decision Trees

11.7.1 Classification and Regression Trees (CART)

11.7.2 Further Reading

11.8 Case Studies

11.9 Using JMP

12.1 Introduction

12.2 Similarity Measures

12.2.1 Common Similarity Coefficients

12.3 Hierarchical Clustering Methods

12.3.1 Single Linkage

12.3.2 Complete Linkage

12.3.3 Average Linkage

12.3.4 Ward’s Hierarchical Clustering

12.4 Nonhierarchical Clustering Methods

12.4.1 K

12.5 Density-Based Clustering

12.7 A Case Study

12.8 Using JMP

13 ANALYSIS OF CATEGORICAL DATA

13.1 Introduction

13.2 The Chi-Square Goodness-of-Fit Test

13.3 Contingency Tables

13.3.1 The 2 × 2 Case with Known Parameters

13.3.2 The 2 × 2 Case with Unknown Parameters

13.3.3 The r × s Contingency Table

13.4 Chi-Square Test for Homogeneity

13.5 Comments on the Distribution of the Lack-of-Fit Statistics

13.6 Case Studies

13.7 Using JMP

14 NONPARAMETRIC TESTS

14.1 Introduction

14.2 The Sign Test

14.2.1 One-Sample Test

14.2.2 The Wilcoxon Signed-Rank Test

14.2.3 Two-Sample Test

14.3 Mann–Whitney (Wilcoxon) W Test for Two Samples

14.4 Runs Test

14.4.1 Runs above and below the Median

14.4.2 The Wald–Wolfowitz Run Test

14.5 Spearman Rank Correlation

14.6 Using JMP

Review Practice Problems

15 SIMPLE LINEAR REGRESSION ANALYSIS

15.1 Introduction

15.2 Fitting the Simple Linear Regression Model

15.2.1 Simple Linear Regression Model

15.2.2 Fitting a Straight Line by Least Squares

15.2.3 Sampling Distribution of the Estimators of Regression Coefficients

15.3 Unbiased Estimator of σ²

15.4 Further Inferences Concerning Regression Coefficients (β₀, β₁), E(Y), and Y

15.4.1 Confidence Interval for β₁ with Confidence Coefficient (1 − α)

15.4.2 Confidence Interval for β₀ with Confidence Coefficient (1 − α)

15.4.3 Confidence Interval for E(Y|X) with Confidence Coefficient (1 − α)

15.4.4 Prediction Interval for a Future Observation Y with Confidence Coefficient (1 − α)

15.5 Tests of Hypotheses for β₀ and β₁

15.5.1 Test of Hypotheses for β₁

15.5.2 Test of Hypotheses for β₀

15.6 Analysis of Variance Approach to Simple Linear Regression Analysis

15.7 Residual Analysis

15.8 Transformations

15.9 Inference About ρ

15.10 A Case Study

15.11 Using JMP

16 MULTIPLE LINEAR REGRESSION ANALYSIS

16.1 Introduction

16.2 Multiple Linear Regression Models

16.3 Estimation of Regression Coefficients

16.3.1 Estimation of Regression Coefficients Using Matrix Notation

16.3.2 Properties of the Least-Squares Estimators

16.3.3 The Analysis of Variance Table

16.3.4 More Inferences about Regression Coefficients

16.4 Multiple Linear Regression Model Using Quantitative and Qualitative Predictor Variables

16.4.1 Single Qualitative Variable with Two Categories

16.4.2 Single Qualitative Variable with Three or More Categories

16.5 Standardized Regression Coefficients

16.5.1 Multicollinearity

16.5.2 Consequences of Multicollinearity

16.6 Building Regression Type Prediction Models

16.6.1 First Variable to Enter into the Model

16.7 Residual Analysis and Certain Criteria for Model Selection

16.7.1 Residual Analysis

16.7.2 Certain Criteria for Model Selection

16.8 Logistic Regression

16.9 Case Studies

17.2.1 Estimable Parameters

17.2.2 Estimable Functions

17.3 One-Way Experimental Layouts

17.3.1 The Model and Its Analysis

17.3.2 Confidence Intervals for Treatment Means

17.3.3 Multiple Comparisons

17.3.4 Determination of Sample Size

17.3.5 The Kruskal–Wallis Test for One-Way Layouts (Nonparametric Method)

17.4 Randomized Complete Block (RCB) Designs

17.4.1 The Friedman Fr-Test for Randomized Complete Block Design (Nonparametric Method)

17.4.2 Experiments with One Missing Observation in an RCB-Design Experiment

17.4.3 Experiments with Several Missing Observations in an RCB-Design Experiment

17.5 Two-Way Experimental Layouts

17.5.1 Two-Way Experimental Layouts with One Observation per Cell

17.5.2 Two-Way Experimental Layouts with r > 1 Observations per Cell

17.5.3 Blocking in Two-Way Experimental Layouts

17.5.4 Extending Two-Way Experimental Designs to n-Way Experimental Layouts

17.6 Latin Square Designs

17.7 Random-Effects and Mixed-Effects Models

17.7.1 Random-Effects Model

17.7.2 Mixed-Effects Model

17.7.3 Nested (Hierarchical) Designs

18.1 Introduction

18.2 The Factorial Designs

18.3 The 2^k Factorial Designs

18.4 Unreplicated 2^k Factorial Designs

18.5 Blocking in the 2^k Factorial Design

18.5.1 Confounding in the 2^k Factorial Design

18.5.2 Yates’s Algorithm for the 2^k Factorial Designs

18.6 The 2^k Fractional Factorial Designs

18.6.1 One-half Replicate of a 2^k Factorial Design

18.6.2 One-quarter Replicate of a 2^k Factorial Design

18.7 Case Studies

18.8 Using JMP

19.1 Introduction

19.1.1 Basic Concepts of Response Surface Methodology

19.2 First-Order Designs

19.3 Second-Order Designs

19.3.1 Central Composite Designs (CCDs)

19.3.2 Some Other First-Order and Second-Order Designs

19.4 Determination of Optimum or Near-Optimum Point

19.4.1 The Method of Steepest Ascent

19.4.2 Analysis of a Fitted Second-Order Response Surface

19.5 Anova Table for a Second-Order Model

19.6 Case Studies

19.7 Using JMP

Modern statisticians are familiar with the notion that any finite body of data contains only a limited amount of information on any point under examination; that this limit is set by the nature of the data themselves, and cannot be increased by any amount of ingenuity expended in their statistical examination: that the statistician’s task, in fact, is limited to the extraction of the whole of the available information on any particular issue.

Preface

AUDIENCE

This is an introductory textbook in applied statistics and probability for undergraduate students in engineering and the natural sciences. It begins at a level suitable for those with no previous exposure to probability and statistics and carries the reader through to a level of proficiency in various techniques of statistics. This text is divided into two parts: Part I discusses descriptive statistics, concepts of probability, probability distributions, sampling distributions, estimation, and testing of hypotheses, and Part II discusses various topics of applied statistics, including some reliability theory, data mining, cluster analysis, some nonparametric techniques, categorical data analysis, simple and multiple linear regression analysis, design and analysis of variance with emphasis on 2^k factorial designs, response surface methodology, and statistical quality control charts of phase I and phase II.

This text is suitable for a one- or two-semester undergraduate course sequence. The presentation of material gives instructors a lot of flexibility to pick and choose the topics they feel should make up the coverage of material for their courses. However, we feel that in the first course for engineers and science majors, one may cover Chapters 1 and 2, a brief discussion of probability in Chapter 3, selected discrete and continuous distributions from Chapters 4 and 5 with more emphasis on the normal distribution, Chapters 7–9, and a couple of topics from Part II that meet the needs and interests of the particular group of students. For example, some discussion of the material on regression analysis and design of experiments in Chapters 15 and 17 may serve well. Chapters 11 and 12 may be adequate to motivate students’ interest in data science and data analytics. A two-semester course may cover the entire book. The only prerequisite is a first course in calculus, which all engineering and science students are required to take. Because of space considerations, some proofs and derivations and certain advanced-level topics of interest, including Chapters 20 and 21 on statistical quality control charts of phase I and phase II, are not included in the text but are available for download via the book’s website: www.wiley.com/college/gupta/statistics2e.

MOTIVATION

Students encounter data-analysis problems in many areas of engineering or natural science curricula. Engineers and scientists in their professional lives often encounter situations requiring analysis of data arising from their areas of practice. Very often, they have to plan the investigation that generates data (an activity euphemistically called the design of experiments), analyze the data obtained, and interpret the results. Other problems and investigations may pertain to the maintenance of quality of existing products or the development of new products or to a desired outcome in an investigation of the underlying mechanisms governing a certain process. Knowing how to “design” a particular investigation to obtain reliable data must be coupled with knowledge of descriptive and inferential statistical tools to analyze properly and interpret such data. The intent of this textbook is to expose the uninitiated to statistical methods that deal with the generation of data for different (but frequently met) types of investigations and to discuss how to analyze and interpret the generated data.

HISTORY

This text has its roots in the three editions of Introductory Engineering Statistics, first co-authored by Irwin Guttman and the late, great Samuel Wilks. Professor J. Stuart Hunter (Princeton University), one of the finest expositors in the statistics profession, a noted researcher, and a colleague of Professor Wilks, joined Professor Guttman to produce editions two and three. All editions were published by John Wiley & Sons, with the third edition appearing in 1982. The first edition of the current text was published in 2013.

APPROACH

In this text, we emphasize both descriptive and inferential statistics. We first give details of descriptive statistics and then continue with an elementary discussion of the fundamentals of probability theory underlying many of the statistical techniques discussed in this text. We next cover a wide range of statistical techniques such as statistical estimation, regression methods, nonparametric methods, elements of reliability theory, statistical quality control (with emphasis on phase I and phase II control charts), process capability indices, and the like. A feature of these discussions is that all statistical concepts are supported by a large number of examples using data encountered in real-life situations. We also illustrate how the statistical packages MINITAB® Version 18, R® Version 3.5.1, and JMP® Version 9 may be used to aid in the analysis of various data sets.

Another feature of this text is the coverage, at an adequate and understandable level, of the design of experiments. This includes a discussion of randomized block designs, one- and two-way designs, Latin square designs, 2^k factorial designs, and response surface designs, among others. The latest version of this text covers materials on supervised and unsupervised learning techniques used in data mining and cluster analysis, with extensive exposure to statistical computing using R software and MINITAB. As previously indicated, all this is illustrated with real-life situations and accompanying data sets, supported by MINITAB, R, and JMP. We know of no other book on the market that covers all these software packages.

WHAT IS NEW IN THIS EDITION

After a careful investigation of the current technological advancement in statistical software and related applications, as well as the feedback received from the current users of the text, we have incorporated many changes in this new edition.

• R software exhibits along with their R code are included.

• Additional R software help for beginners is included in Appendix D.

• MINITAB software instructions and contents are updated to its latest edition.

• JMP software instructions and contents are updated to its latest edition.

• New chapters on Data Mining and Cluster Analysis are included.

• An improved chapter on Response Surface Design has been brought back to the printed copy from the book website.

• The p-value approach is emphasized, and related practical interpretations are included.

• The theorems and definitions are more visible and better formatted.

• Graphical exhibits are provided to improve the visualizations.

Software Integration

As previously indicated, we incorporate MINITAB and R throughout the text, and complete R exhibits with their outputs (Appendix D) and associated JMP exhibits are available on the book’s website: www.wiley.com/college/gupta/statistics2e. Our step-by-step approach to the use of the software packages means no prior knowledge of their use is required. After completing a course that uses this text, students will be able to use these software packages to analyze statistical data in their fields of interest.

Breadth of Coverage

Besides the coverage of many popular statistical techniques, we include discussion of certain aspects of sampling distributions, nonparametric tests, reliability theory, data mining, cluster analysis, analysis of categorical data, simple and multiple linear regression, design of experiments, response surface methodology, and phase I and phase II control charts.

Design of experiments, response surface methodology, and regression analysis are treated in sufficient breadth and depth to be appropriate for a two-course sequence in engineering statistics that includes probability and the design of experiments.

Real data in examples and homework problems illustrate the importance of statistics and probability as a tool for engineers and scientists in their professional lives. All the data sets with 20 or more data points are available on the website in three formats: MINITAB, Microsoft Excel, and JMP.

Case studies in most chapters further illustrate the importance of statistical techniques in professional practice.

STUDENT RESOURCES

Data sets for all examples and homework exercises from the text are available to students on the website in MINITAB, Microsoft Excel, and JMP format. The sample data sets were generated using well-known statistical sampling procedures, ensuring that we are dealing with random samples. An inkling of what this may entail is given throughout the text (see, for example, Section 7.1.2). The field of sampling is an active topic among research statisticians and practitioners, and references to sampling techniques are widely available in books and journal articles. Some of these references are included in the bibliography section.

Otherresourcesonthebookwebsitewww.wiley.com/college/gupta/statistics2e availablefordownloadinclude:

SolutionsManual toalloddnumberedhomeworkexercisesinthetext.

INSTRUCTOR RESOURCES

The following resources are available to adopting instructors on the textbook website: www.wiley.com/college/gupta/statistics2e.

Solutions Manual to all homework exercises in the text.

Lecture slides to aid instructors preparing for lectures.

Data sets for all examples and homework exercises from the book, in three formats: Minitab, Microsoft Excel, and JMP.

Errata We have thoroughly reviewed the text to make sure it is as error-free as possible. However, any errors discovered will be listed on the textbook website. If you encounter any errors as you are using the book, please send them directly to the authors at bcgupta@maine.edu, so that the errors can be corrected in a timely manner on the website and in future editions. We also welcome any suggestions for improvement you may have, and thank you in advance for helping us improve the book for future readers.

Acknowledgments

We are grateful to the following reviewers and colleagues whose comments and suggestions were invaluable in improving the text:

Zaid Abdo, University of Idaho

Erin Baker, University of Massachusetts

Bob Barnet, University of Wisconsin-Platteville

Raj Chhikara, University of Houston, Clear Lake

Prem Goel, Ohio State University

Gary D. Boetticher, University of Houston, Clear Lake

Mark Gebert, University of Kentucky

Subir Ghosh, University of California, Riverside

Ramesh Gupta, University of Maine

Rameshwar Gupta, University of New Brunswick, Canada

Xiaochun Jiang, North Carolina Agricultural and Technical State University

Dennis Johnston, Baylor University

Gerald Keller, Joseph L. Rotman School of Management, University of Toronto

Kyungduk Ko, Boise State University

Paul Kvam, Georgia Institute of Technology

Bin Li, Louisiana State University

Thunshun Liao, Louisiana State University

Jye-Chyi Lu, Georgia Institute of Technology

Sumona Mondal, Clarkson University

Janbiao Pan, California Poly State University

Anastassios Perakis, University of Michigan

David Powers, Clarkson University

Ali Touran, Northeastern University

Leigh Williams, Virginia Polytechnic and State University

Tian Zheng, Columbia University

Jingyi Zhu, University of Utah

We thank William Belcher, Darwin Davis, Julie Ellis, Pushpa Gupta, Mohamad Ibourk, James Lucas, Mary McShane-Vaughn, Louis Neveux, and Phil Ramsey, who helped find suitable data sets for the case studies. We also thank Laurie McDermott for her help in typing some parts of this manuscript. Special thanks are due to Eric Laflamme for helping write JMP/Excel procedures and creating PowerPoint presentations, George Bernier for helping write Excel workbooks and macros, and Patricia Miller and Brenda Townsend for editing PowerPoint slides and some parts of the manuscript. We appreciate Terry Scott and Chamila Meetiyagoda for reviewing the R code and the new edition of the manuscript. We acknowledge Minitab Inc. and SAS Institute Inc. for permitting us to print MINITAB and JMP screenshots in this book. We also acknowledge the R Core team for allowing us to use open access R software.

Portions of the text are reproduced with permission from the American Society for Quality (ASQ), Applied Statistics for the Six Sigma Green Belt and Statistical Quality Control for the Six Sigma Green Belt by Gupta and Fred Walker (2005, 2007).

We would also like to express our thanks and appreciation to the individuals at John Wiley, for their support, confidence, and guidance as we have worked together to develop this project.

The authors would like to gratefully thank their families. Bhisham acknowledges the patience and support of his wife, Swarn; daughters, Anita and Anjali; son, Shiva; sons-in-law, Prajay and Mark; daughter-in-law, Aditi; and wonderful grandchildren, Priya, Kaviya, Ayush, Amari, Sanvi, Avni and Dylan. For their patience and support, Irwin is grateful to his wife, Mary; son, Daniel; daughters, Karen and Shaun; wonderful grandchildren, Liam, Teia, and Sebastian; brothers and their better halves, Alvin and Rita, and Stanley and Gloria. Kalanka appreciates the support of his wife, Chamila; daughters, Nesandi and Minudi.

BHISHAM GUPTA
IRWIN GUTTMAN
KALANKA JAYALATH

About The Companion Site

This book is accompanied by a companion website:

www.wiley.com/college/gupta/statistics2e

The website includes materials for students and instructors:

Instructors

Chapters 20 and 21

Data sets

PowerPoint presentations

Complete solutions manual

Certain proofs and derivations

Some statistical tables

JMP files

R exhibits

Students

Chapters 20 and 21

Data sets

Partial solutions manual

Certain proofs and derivations

Some statistical tables

JMP files

R exhibits

Chapter 1 INTRODUCTION

Statistics, the discipline, is the study of the scientific method. In pursuing this discipline, statisticians have developed a set of techniques that are extensively used to solve problems in any field of scientific endeavor, such as in the engineering sciences, biological sciences, and the chemical, pharmaceutical, and social sciences.

This book is concerned with discussing these techniques and their applications for certain experimental situations. It begins at a level suitable for those with no previous exposure to probability and statistics and carries the reader through to a level of proficiency in various techniques of statistics.

In all scientific areas, whether engineering, biological sciences, medicine, chemical, pharmaceutical, or social sciences, scientists are inevitably confronted with problems that need to be investigated. Consider some examples:

• An engineer wants to determine the role of an electronic component needed to detect the malfunction of the engine of a plane.

• A biologist wants to study various aspects of wildlife, the origin of a disease, or the genetic aspects of a wild animal.

• A medical researcher is interested in determining the cause of a certain type of cancer.

• A manufacturer of lenses wants to study the quality of the finishing on intraocular lenses.

• A chemist is interested in determining the effect of a catalyst in the production of low-density polyethylene.

• A pharmaceutical company is interested in developing a vaccination for swine flu.

• A social scientist is interested in exploring a particular aspect of human society.

In all of the examples, the first and foremost work is to define clearly the objective of the study and precisely formulate the problem. The next important step is to gather information to help determine what key factors are affecting the problem. Remember that to determine these factors successfully, you should understand not merely statistical methodology but relevant nonstatistical knowledge as well. Once the problem is formulated and the key factors of the problem are identified, the next step is to collect the data. There are various methods of data collecting. Four basic methods of statistical data collecting are as follows:

• A designed experiment

• A survey

• An observational study

• A set of historical data, that is, data collected by an organization or an individual in an earlier study

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP, Second Edition. Bhisham C. Gupta, Irwin Guttman, and Kalanka P. Jayalath. © 2020 John Wiley & Sons, Inc. Published 2020 by John Wiley & Sons, Inc. Companion website: www.wiley.com/college/gupta/statistics2e

1.1 DESIGNED EXPERIMENT

We discuss the concept of a designed experiment with an example, “Development of Screening Facility for Storm Water Overflows” (taken from Box et al., 1978, and used with permission). The example illustrates how a sequence of experiments can enable scientists to gain knowledge of the various important factors affecting the problem and give insight into the objectives of the investigation. It also indicates how unexpected features of the problem can become dominant, and how experimental difficulties can occur so that certain planned experiments cannot be run at all. Most of all, this example shows the importance of common sense in the conduct of any experimental investigation. The reader may rightly conclude from this example that the course of a real investigation, like that of true love, seldom runs smoothly, although the eventual outcome may be satisfactory.

1.1.1 Motivation for the Study

During heavy rainstorms, the total flow coming to a sewage treatment plant may exceed its capacity, making it necessary to bypass the excess flow around the treatment plant, as shown in Figure 1.1.1a. Unfortunately, the storm overflow of untreated sewage causes pollution of the receiving body of water. A possible alternative, sketched in Figure 1.1.1b, is to screen most of the solids out of the overflow in some way and return them to the plant for treatment. Only the less objectionable screened overflow is discharged directly to the river.

To determine whether it was economical to construct and operate such a screening facility, the Federal Water Pollution Control Administration of the Department of the Interior sponsored a research project at the Sullivan Gulch pump station in Portland, Oregon. Usually, the flow to the pump station was 20 million gallons per day (mgd), but during a storm, the flow could exceed 50 mgd.

Figure 1.1.2a shows the original version of the experimental screening unit, which could handle approximately 1000 gallons per minute (gpm). Figure 1.1.2a is a perspective view, and Figure 1.1.2b is a simplified schematic diagram. A single unit was about seven ft high and seven ft in diameter. The flow of raw sewage struck a rotating collar screen at a velocity of five to 15 ft/s. This speed was a function of the flow rate into the unit and hence a function of the diameter of the influent pipe. Depending on the speed of the rotation of this screen and its fineness, up to 90% of the feed penetrated the collar screen. The rest of the feed dropped to the horizontal screen, which vibrated to remove excess water. The solids concentrate, which passed through neither screen, was sent to the sewage treatment plant. Unfortunately, during operation, the screens became clogged with solid matter, not only sewage but also oil, paint, and fish-packing wastes. Backwash sprays were therefore installed for both screens to permit cleaning during operation.

Figure 1.1.1 Operation of the sewage treatment plant: (a) standard mode of operation and (b) modified mode of operation, with screening facility; F = flow, S = settleable solids. Diagram labels: sewage (F0, S0), screening facility, screened overflow (F1, S1), solids concentrate, treatment plant, river.

1.1.2 Investigation

The objective of the investigation was to determine good operating conditions.

1.1.3 Changing Criteria

What are good operating conditions? Initially, it was believed they were those resulting in the highest possible removal of solids. Referring to Figures 1.1.1b and 1.1.2a, settleable solids in the influent are denoted by $S_0$ and the settleable solids in the effluent by $S_1$. The percent solids removed by the screen is therefore $y = 100(S_0 - S_1)/S_0$. Thus, initially, it was believed that good operation meant achieving a high value for y. However, it became evident after the first set of experiments were made, that the percentage of the flow retreated (flow returned to treatment plant), which we denote by z, also had to be taken into account. Referring to Figures 1.1.1b and 1.1.2a, influent flow to the screens is denoted by $F_0$ and effluent flow from the screens to the river by $F_1$. Thus, $z = 100(F_0 - F_1)/F_0$.
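As a small illustration of the two criteria, the following Python sketch evaluates y and z directly from their definitions. The function names and the numerical readings are hypothetical and are not taken from the study; the readings were chosen only so that the results are easy to check by hand.

```python
def percent_solids_removed(s0: float, s1: float) -> float:
    """Percent solids removed, y = 100 (S0 - S1) / S0."""
    if s0 <= 0:
        raise ValueError("S0 (influent settleable solids) must be positive")
    return 100.0 * (s0 - s1) / s0


def percent_flow_retreated(f0: float, f1: float) -> float:
    """Percent of flow retreated, z = 100 (F0 - F1) / F0."""
    if f0 <= 0:
        raise ValueError("F0 (influent flow) must be positive")
    return 100.0 * (f0 - f1) / f0


# Hypothetical readings (assumed for illustration, not data from the study).
y = percent_solids_removed(s0=200.0, s1=22.0)
z = percent_flow_retreated(f0=1000.0, f1=920.0)
print(f"y = {y:.1f}% solids removed, z = {z:.1f}% flow retreated")
```

A high y together with a moderate z is what the investigators eventually looked for; either quantity on its own gives an incomplete picture of how well the unit operates.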

Figure 1.1.2 Original version of the screening unit: (a) detailed diagram and (b) simplified diagram. Diagram labels: raw sewage influent; rotating collar screen; vibrating horizontal screen; screened effluent; unscreened effluent; screened bypass stream to effluent (passed through one screen); solids concentrate to sewage treatment plant (passed through neither screen).

1.1.4 A Summary of the Various Phases of the Investigation

Phase a

In this initial phase, an experiment was run in which the roles of three variables were studied: collar screen mesh size (fine, coarse), horizontal screen mesh size (fine, coarse), and flow rate (gpm). At this stage,

1. The experimenters were encouraged by the generally high values achieved for y.

2. Highest values for y were apparently achieved by using a horizontal screen with a coarse mesh and a collar screen with fine mesh.

3. Contrary to expectation, flow rate did not show up as an important variable affecting y.

4. Most important, the experiment was unexpectedly dominated by the z values, which measure the flow retreated. These were uniformly very low, with about 0.01% of the flow being returned to the treatment plant and 99.9% leaving the screen for discharge into the river. Although it was desirable that the retreated flow be small, the z values were embarrassingly low. As the experimenters remarked, “[T]he horizontal screen produced a solid concentrate dry enough to shovel. This represented a waste of effort of concentrating because the concentrated solids were intended to flow from the units.”

Phase b

It was now clear (i) that z as well as y were important and (ii) that z was too low. It was conjectured that the matters might be improved by removing the horizontal screen altogether. Another experiment was therefore performed with no horizontal screen. The speed of rotation of the collar screen was introduced as a new variable.

Unfortunately, after only two runs of this experiment, this particular phase had to be terminated because of the excessive tearing of the cloth screens. From the scanty results obtained it appeared, however, that with no horizontal screen high solid removal could be achieved with a higher portion of the flow retreated. It was therefore decided to repeat these runs with screens made of stainless steel instead of cloth.

Phase c

A third experiment, using stainless steel collar screens of two mesh sizes, similar to that attempted in phase b, was performed with the same collar screen mesh size, collar screen speed (rpm), and flow rate (gpm) used before.

In this phase, with a stainless steel collar screen, high removal rates y were possible for eight sets of conditions for the factors just mentioned. However, these high y values were obtained with retreated flow z at undesirably high values (before, they had been too low). The objective was to get reasonably small values for z, but not so small as to make shoveling necessary; values between 5% and 20% were desirable. It was believed that by varying flow rate and speed of rotation of the collar screen, this objective could be achieved without sacrificing solid removal.

Phase d

Again, using a stainless steel collar screen, another experiment, with two factors, namely collar screen speed (rpm) and flow rate (gpm), set at two levels each, was run. This time, high values of solid removal were maintained, but unfortunately, flow retreated values were even higher than before.
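The run layout of a two-factor experiment with two levels per factor, such as the one described in phase d, can be enumerated as in the following Python sketch. The low and high settings shown are placeholders assumed purely for illustration, since the actual levels used in the study are not given here.

```python
from itertools import product

# Placeholder (low, high) settings; the real levels are not stated in the text.
factors = {
    "collar_screen_speed_rpm": (30, 60),
    "flow_rate_gpm": (700, 1000),
}


def full_factorial(factor_levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factor_levels)
    level_sets = [factor_levels[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_sets)]


runs = full_factorial(factors)
for run_number, settings in enumerate(runs, start=1):
    print(run_number, settings)
```

With two factors at two levels each, the layout contains 2 × 2 = 4 distinct runs; in practice the run order would usually be randomized before the experiment is carried out.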

Phase e

It was now conjectured that intermittent back washing could overcome the difficulties. This procedure was now introduced with influent flow rate and collar screen mesh varied.

The results of this experiment led to a removal efficiency of y = 89% with a retreated flow of only z = 8%. This was regarded as a satisfactory and practical solution, and the investigation was terminated at that point.

For detailed analysis of this experiment, the reader should refer to Box et al. (1978, p. 354). Of course, these types of experiments and their analyses are discussed in this text (see Chapter 18).

1.2 A SURVEY

The purpose of a sample survey is to make inferences about certain characteristics of a population from which samples are drawn. The inferences to be made for a population usually entail the estimation of population parameters, such as the population total, the mean, or the population proportion of a certain characteristic of interest. In any sample survey, a clear statement of its objective is very important. Without a clear statement about the objectives, it is very easy to miss pertinent information while planning the survey, which can cause difficulties at the end of the study.

In any sample survey, only relevant information should be collected. Sometimes trying to collect too much information may become very confusing and consequently hinder the determination of the final goal. Moreover, collecting information in sample surveys costs money, so that the interested party must determine which and how much information should be obtained. For example, it is important to describe how much precision in the final results is desired. Too little information may prevent obtaining good estimates with desired precision, while too much information may not be needed and may unnecessarily cost too much money. One way to avoid such problems is to select an appropriate method of sampling the population. In other words, the sample survey needs to be appropriately designed. A brief discussion of such designs is given in Chapter 2. For more details on these designs, the reader may refer to Cochran (1977), Sukhatme and Sukhatme (1970), or Scheaffer et al. (2006).

1.3 AN OBSERVATIONAL STUDY

An observational study is one that does not involve any experimental studies. Consequently, observational studies do not control any variables. For example, a realtor wishes to appraise a house value. All the data used for this purpose are observational data. Many psychiatric studies involve observational data.


Frequently, in fitting a regression model (see Chapters 15 and 16), we use observational data. Similarly, in quality control (see Chapters 20 and 21), most of the data used in studying control charts for attributes are observational data. Note that control charts for attributes usually do not provide any cause-and-effect relationships. This is because observational data give us very limited information about cause-and-effect relationships.

As another example, many psychiatric studies involve observational data, and such data do not provide the cause of a patient’s psychiatric problems. An advantage of observational studies is that they are usually more cost-effective than experimental studies. The disadvantage of observational studies is that the data may not be as informative as experimental data.

1.4 A SET OF HISTORICAL DATA

Historical data are not collected by the experimenter. The data are made available to him/her.

Many fields of study, such as the many branches of business studies, use historical data. A financial advisor for planning purposes uses sets of historical data. Many investment services provide financial data on a company-by-company basis.

1.5 A BRIEF DESCRIPTION OF WHAT IS COVERED IN THIS BOOK

Data collection is very important since it can greatly influence the final outcome of subsequent data analyses. After collection of the data, it is important to organize, summarize, present the preliminary outcomes, and interpret them. Various types of tables and graphs that summarize the data are presented in Chapter 2. Also in that chapter, we give some methods used to determine certain quantities, called statistics, which are used to summarize some of the key properties of the data.

The basic principles of probability are necessary to study various probability distributions. We present the basic principles of elementary probability theory in Chapter 3. Probability distributions are fundamental in the development of the various techniques of statistical inference. The concept of random variables is also discussed in Chapter 3.

Chapters 4 and 5 are devoted to some of the important discrete distributions, continuous distributions, and their moment-generating functions. In addition, we study in Chapter 5 some special distributions that are used in reliability theory.

In Chapter 6, we study joint distributions of two or more discrete and continuous random variables and their moment-generating functions. Included in Chapter 6 is the study of the bivariate normal distribution.

Chapter 7 is devoted to the probability distributions of some sample statistics, such as the sample mean, sample proportions, and sample variance. In this chapter, we also study a fundamental result of probability theory, known as the Central Limit Theorem. This theorem can be used to approximate the probability distribution of the sample mean when the sample size is large. In this chapter, we also study some sampling distributions of some sample statistics for the special case in which the population distribution is the so-called normal distribution. In addition, we present probability distributions of various “order statistics,” such as the largest element in a sample, smallest element in a sample, and sample median.
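The approximation provided by the Central Limit Theorem can be illustrated with a small simulation. The Python sketch below is only an illustration under assumed settings (an exponential population with mean 1 and standard deviation 1, samples of size 50, 2000 replicates, and a fixed seed); none of these choices comes from the text.

```python
import random
import statistics


def simulate_sample_means(sample_size, replicates, seed=0):
    """Draw repeated samples from an exponential population (mean 1)
    and return the mean of each sample."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


n = 50
means = simulate_sample_means(sample_size=n, replicates=2000)

# The theorem suggests the sample mean is roughly normal with
# mean 1 and standard deviation 1 / sqrt(n), even though the
# population itself is strongly skewed.
print("average of sample means:", round(statistics.fmean(means), 3))
print("std. dev. of sample means:", round(statistics.stdev(means), 3))
print("theoretical std. dev.:", round(1 / n ** 0.5, 3))
```

A histogram of `means` would look approximately bell-shaped, and the simulated standard deviation should be close to the theoretical value of about 0.141.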

Chapter 8 discusses the use of sample data for estimating the unknown population parameters of interest, such as the population mean, population variance, and population proportion. Chapter 8 also discusses the methods of estimating the difference of two population means, the difference of two population proportions, and the ratio of two population variances and standard deviations. Two types of estimators are included, namely point estimators and interval estimators (confidence intervals).

Chapter 9 deals with the important topic of statistical tests of hypotheses and discusses test procedures when concerned with the population means, population variance, and population proportion for one and two populations. Methods of testing hypotheses using the confidence intervals studied in Chapter 8 are also presented.

Chapter 10 gives an introduction to the theory of reliability. Methods of estimation and hypothesis testing using the exponential and Weibull distributions are presented.

In Chapter 11, we introduce the topic of data mining. It includes concepts of big data and starting steps in data mining. Classification, machine learning, and inference versus prediction are also discussed.

In Chapter 12, we introduce the topic of cluster analysis. Clustering concepts and similarity measures are introduced. The hierarchical and nonhierarchical clustering techniques and model-based clustering methods are discussed in detail.

Chapter 13 is concerned with the chi-square goodness-of-fit test, which is used to test whether a set of sample data support the hypothesis that the sampled population follows some specified probability model. In addition, we apply the chi-square goodness-of-fit test for testing hypotheses of independence and homogeneity. These tests involve methods of comparing observed frequencies with those that are expected if a certain hypothesis is true.
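The comparison of observed and expected frequencies can be sketched in Python as follows. The die-roll counts are hypothetical and serve only as an illustration; the function name is our own, and the statistic computed is the usual Pearson form, the sum of (observed − expected)²/expected over the categories.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts from 120 rolls of a die (assumed data).
observed = [18, 22, 20, 20, 25, 15]
total = sum(observed)

# Under the hypothesis of a fair die, each face is expected total / 6 times.
expected = [total / 6] * 6

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1
print(f"chi-square = {statistic:.2f} on {degrees_of_freedom} degrees of freedom")
```

For these assumed counts the statistic is about 2.9, well below 11.07, the tabulated upper 5% point of the chi-square distribution with 5 degrees of freedom, so the data would not contradict the fair-die hypothesis at that level.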

Chapter 14 gives a brief look at tests known as “nonparametric tests,” which are used when the assumption about the underlying distribution having some specified parametric form cannot be made.

Chapter 15 introduces an important topic of applied statistics: simple linear regression analysis. Linear regression analysis is frequently used by engineers, social scientists, health researchers, and biological scientists. This statistical technique explores the relation between two variables so that one variable can be predicted from the other. In this chapter, we discuss the least squares method for estimating the simple linear regression model, called the fitting of this regression model. Also, we discuss how to perform a residual analysis, which is used to check the adequacy of the regression model, and study certain transformations that are used when the model is not adequate.
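To give a flavor of the least squares method, the following Python sketch fits a straight line to a handful of points and computes the residuals. The data values and function names are hypothetical; the slope and intercept come from the standard closed-form least squares expressions, slope = Sxy/Sxx and intercept = ȳ − slope · x̄.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least squares line y = b0 + b1 * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted values, used to check model adequacy."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data (assumed for illustration only).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_linear_regression(x, y)
res = residuals(x, y, b0, b1)
print(f"fitted line: y = {b0:.2f} + {b1:.2f} x")
print("residuals:", [round(r, 2) for r in res])
```

When the model includes an intercept, the least squares residuals always sum to zero (up to rounding), so a residual analysis looks instead at their pattern, for example by plotting them against the fitted values.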

Chapter 16 extends the results of Chapter 15 to multiple linear regressions. Similar to the simple linear regression model, multiple linear regression analysis is widely used. It provides statistical techniques that explore the relations among more than two variables, so that one variable can be predicted from the use of the other variables. In this chapter, we give a discussion of multiple linear regression, including the matrix approach. Finally, a brief discussion of logistic regression is given.

In Chapter 17, we introduce the design and analysis of experiments using one, two, or more factors. Designs for eliminating the effects of one or two nuisance variables along with a method of estimating one or more missing observations are given. We include two nonparametric tests, the Kruskal–Wallis and the Friedman test, for analyzing one-way and randomized complete block designs. Finally, models with fixed effects, mixed effects, and random effects are also discussed.

Chapter 18 introduces a special class of designs, the so-called $2^k$ factorial designs. These designs are widely used in various industrial and scientific applications. An extensive discussion of unreplicated $2^k$ factorial designs, blocking of $2^k$ factorial designs, confounding in the $2^k$ factorial designs, and Yates’s algorithm for the $2^k$ factorial designs is also included. We also devote a section to fractional factorial designs, discussing one-half and one-quarter replications of $2^k$ factorial designs.

In Chapter 19, we introduce the topic of response surface methodology (RSM). First-order and second-order designs used in RSM are discussed. Methods of determining optimum or near optimum points using the “method of steepest ascent” and the analysis of a fitted second-order response surface are also presented.

Chapters 20 and 21 are devoted to control charts for variables and attributes used in phase I and phase II of a process. “Phase I” refers to the initial stage of a new process, and “phase II” refers to a matured process. Control charts are used to determine whether a process involving manufacturing or service is “under statistical control” on the basis of information contained in a sequence of small samples of items of interest. Due to lack of space, these two chapters are not included in the text but are available for download from the book website: www.wiley.com/college/gupta/statistics2e.

All the chapters are supported by three popular statistical software packages, MINITAB, R, and JMP. MINITAB and R are fully integrated into the text of each chapter, whereas JMP is given in an independent section, which is not included in the text but is available for download from the book website: www.wiley.com/college/gupta/statistics2e. Frequently, we use the same examples for the discussion of JMP as are used in the discussion of MINITAB and R. For the use of each of these software packages, no prior knowledge is assumed, since we give each step, from entering the data to the final analysis of such data under investigation. Finally, a section of case studies is included in almost all the chapters.
