Applied Statistics for Environmental Science with R, 1st Edition, Abbas F. M. Alkarkhi


APPLIED STATISTICS FOR ENVIRONMENTAL SCIENCE WITH R

ABBAS F.M. ALKARKHI
WASIN A.A. ALQARAGHULI

Elsevier

Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

© 2020 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

ISBN 978-0-12-818622-0

For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Joe Hayton

Acquisition Editor: Marisa LaFleur

Editorial Project Manager: Redding Morse

Production Project Manager: Omer Mukthar

Cover Designer: Greg Harris

Typeset by SPi Global, India

Dedication

Abbas

To the memory of my parents (deceased)
To my children Atheer, Hibah, and Farah

Wasin

To the memory of my father (deceased)
To my mother

Preface

Applied Statistics for Environmental Science with R was written in an easy style to introduce statistical techniques that are useful to students and researchers in environmental science and environmental engineering, helping them choose the appropriate statistical technique for analyzing their data and drawing sound conclusions. The R output is explained step by step, in an easy and clear style, so that non-statisticians can understand it and use it in their research.

A step-by-step procedure is employed to perform the analysis and to interpret the results by relating them to the field of study in which the data were obtained. The book focuses on the applications of univariate and multivariate statistical techniques in the field of environmental science. Furthermore, real data obtained from research over more than fifteen years of work in environmental science were employed to illustrate the concepts and analysis.

The book uses R statistical software to analyze the data and generate the required results. R is open source and provides facilities for feedback and for producing high-resolution plots. Furthermore, it is easy to get online assistance from various communities. R is available over the internet under the General Public License (GPL) for the Windows, Macintosh, and Linux operating systems.

Finally, we wish to thank our families, friends, and colleagues for their continuous support. We would like to extend our thanks to the R software community and the R family (R users and contributors to R) for providing the software for free. This book would not have been possible without the information provided online, which is easy to obtain. We thank the University of Kuala Lumpur (UniKL MICET) for its support.

Abbas
Wasin

1

Multivariate Data

LEARNING OBJECTIVES

After careful consideration of this chapter, you should be able:

• To describe the concept of environmental statistics.

• To understand the importance of environmental statistics for making intelligent conclusions.

• To describe the concept of multivariate analysis.

• To know the advantages of employing multivariate methods for analyzing environmental data.

• To organize the outcomes of multivariate data to be prepared for analysis.

• To explain the distinction between univariate and multivariate notions.

• To explain how and where to employ multivariate data.

• To understand the univariate normal distribution.

• To understand the multivariate normal distribution.

1.1 THE CONCEPT OF ENVIRONMENTAL STATISTICS

Environmental statistics is the application of statistical methods, procedures, and techniques in the field of environmental science and environmental engineering, covering areas such as weather, air, water quality, climate, soil, fisheries, and other environmental activities. Statistical methods are used in designing environmental projects and in the analysis and interpretation of environmental data to help and guide scientists in drawing useful and meaningful conclusions about various aspects of the environment. Environmental statistics can help in describing environmental problems in terms of mathematical modeling, to understand the impact of the chosen variables under study and to show the direction of change (increase or decrease) or the nature of the relationship (positive or negative). Furthermore, statistical techniques can identify the general trend and untangle hidden relationships, which helps scientists understand the process and obtain a clear picture of all relationships, so as to avoid risk and guide management in properly planning environmental projects.

Environmental statistics can help in understanding the importance of variability and oscillation in the data, employing various measures and methods to show the influence of variability and leading scientists to search for scientific explanations. Thus, scientists should at least learn and understand basic statistics, which helps them appreciate the importance of the results and of the analysis that guides informative conclusions.

1.2 THE CONCEPT OF MULTIVARIATE ANALYSIS

Univariate statistical analysis covers statistical techniques for testing a data set with one variable. However, most research projects need to measure several variables for each research unit or individual (sampling units or experimental units) in one or more samples.

For example, consider assessing the water quality of a river by monitoring certain parameters such as pH, dissolved oxygen (DO), electrical conductivity (EC), turbidity, biochemical oxygen demand (BOD), chemical oxygen demand (COD), and total suspended solids (TSS), to make a decision about the pollution status of the river. In this case, there are seven variables to be measured for each sample, and these are generally highly correlated. If the results of multivariate data are analyzed one variable at a time, the relationships between the variables are ignored, and a distorted picture emerges of the true behavior of the chosen parameters (variables) in the presence of the other parameters (variables). Thus, we should use a method that takes into account the correlation between the chosen variables, to untie the overlapping input (information) carried by the correlated variables and to understand the behavior of the chosen variables properly.

Therefore, data sets with several variables can be analyzed by employing multivariate methods that consider the relationship between the chosen variables. Multivariate methods are a collection of techniques that can serve several purposes in the field of environmental science and engineering. They include cluster analysis, for recognizing groups of similar observations (e.g., individuals, objects); principal components analysis and factor analysis, as data reduction methods that reduce the number of variables to a smaller number of uncorrelated dimensions, called components (factors), without losing valuable information; discriminant analysis, which is applied to separating the data into various groups based on the measured variables; multivariate analysis of variance (MANOVA), employed to perform statistical hypothesis testing based on multivariate data (several variables); and multivariate multiple regression analysis, which is employed for making predictions based on the relationship among the variables.

1.3 CONFIGURATION OF MULTIVARIATE DATA

We can organize multivariate data for various variables measured from a number of samples (items) in a table. The number of samples (items) specifies the number of rows, and the number of variables specifies the number of columns of the table.

In general, Table 1.1 shows the configuration of n samples (representing the rows) and k variables (representing the columns) measured for each sample.

Note: The total number of variables measured for each sample (k) is usually smaller than the total number of samples, n.
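The layout described above can be sketched in R, the software used throughout this book. This is only a minimal illustration, not the authors' code: the sizes n and k and the labels Sample1, Y1, and so on are assumptions chosen for the example, and the cells are left empty (NA) rather than filled with invented values.

```r
# Hypothetical sizes: n samples (rows) and k variables (columns)
n <- 5
k <- 3

# Empty n-by-k layout; the row and column labels are illustrative only
Y <- matrix(NA_real_, nrow = n, ncol = k,
            dimnames = list(paste0("Sample", 1:n), paste0("Y", 1:k)))

dim(Y)    # number of rows (samples) and number of columns (variables)
print(Y)  # one row per sample, one column per variable
```

Each row holds all k measurements taken on one sample, which is the arrangement most multivariate functions in R expect.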

1.4 EXAMPLES OF MULTIVARIATE DATA

We can illustrate the concept of multivariate data by providing some real examples of multivariate data from environmental science, which may be useful in more easily explaining the associated computations. Furthermore, the interpretation of these examples will be given, including the scenario of each case.

TABLE 1.1 The Configuration of n Samples and k Variables in Multivariate Form

EXAMPLE 1.1 ASSESSMENT OF SURFACE WATER

An assessment of water quality in the Juru and Jejawi Rivers in the Penang State of Malaysia was conducted by monitoring 10 physiochemical parameters. The electrical conductivity (EC) and temperature were measured employing a HACH portable pH meter, and dissolved oxygen (DO) was measured with a YSI 1000 DO meter. The chemical oxygen demand (COD), biochemical oxygen demand (BOD), total phosphate, and total nitrate concentrations were analyzed employing a spectrophotometer (HACH/2010). The turbidity was measured employing a nephelometer. The total suspended solids (TSS) were analyzed gravimetrically in the laboratory. The APHA Standard Methods for the Examination of Water and Wastewater were applied to analyze the concentrations of the above-mentioned parameters. The data obtained from 10 different sites on each river are given in Table 1.2.

TABLE 1.2 Physiochemical Parameters for the Juru and Jejawi Rivers

Location  Temperature  pH    DO    BOD    COD      TSS     EC     Turbidity  Total phosphate  Total nitrate
Juru      28.15        7.88  6.73  10.56  1248.00  473.33  42.45  13.05      0.88             12.45
Juru      28.20

In this example, 20 samples (rows) are collected from the Juru and Jejawi Rivers (10 different sites each), and 10 physiochemical parameters (columns) are measured from each sample. It is very difficult for researchers to make a decision about the pollution status of the rivers, or about the behavior of the different parameters under study, from the raw values alone: there are many different values, and the differences (fluctuations) between the values of the parameters from sample to sample can mislead the researcher. Thus, these data should be analyzed employing an appropriate multivariate method that meets the objective of the study, to untangle the information and to understand the phenomenon as a whole. The objectives of this research were to determine the range of similarity among the sampling sites, to recognize the variables responsible for spatial differences in river water quality, to determine the unobserved factors while demonstrating the framework of the database, and to quantify the effect of possible natural and anthropogenic sources on the chosen water parameters of the rivers.

EXAMPLE 1.2 ASSESSMENT OF LANDFILL LEACHATE TREATMENT

A researcher wishes to investigate the concentrations of 12 heavy metals (magnesium (Mg), calcium (Ca), sodium (Na), iron (Fe), zinc (Zn), copper (Cu), chromium (Cr), cadmium (Cd), lead (Pb), arsenic (As), cobalt (Co), and manganese (Mn)) and eight physiochemical parameters (chemical oxygen demand (COD), biochemical oxygen demand (BOD), total dissolved solids (TDS), total suspended solids (TSS), electrical conductivity (EC), pH, ammoniacal-N (NH3-N), and dissolved oxygen (DO)) in three ponds (collection, aeration, and stabilization) of a landfill leachate. The data for the three ponds are given in Table 1.3.

TABLE 1.3 The Results of Physiochemical and Heavy Metals Obtained From Three Ponds of Landfill Leachate

Pond              pH       EC      TDS     TSS      COD    BOD   NH3-N    DO     Mg     Ca
Collection      7.19  6242.00  4530.00   43.26   944.67   9.57   81.66  6.90  12.58  14.28
Collection      7.36  6062.00  4392.00   29.45   845.33   8.77   61.33  6.69  17.91  13.17
Collection      7.41  6164.00  4476.00   31.52   897.00   9.21   71.00  6.48  14.44  15.54
Collection      7.92  6710.33  4710.00   39.36   810.00   8.66   52.43  6.85  19.99  88.25
Collection      7.84  6690.00  4709.83   36.46   686.67   9.66   55.80  6.85  20.50  81.78
Collection      7.83  6600.33  4654.83   37.00  1365.00  10.33   52.80  7.23  21.18  85.18
Collection      8.46  6893.00  4849.00   47.33   692.00   8.47  110.67  2.34  27.23  24.94
Collection      8.47  6850.00  4860.00   50.67   967.00   9.27   93.00  2.82  19.01  14.38
Collection      8.48  6873.00  4859.00   49.00   833.00   8.87  103.00  2.67  24.66  19.99
Aeration        8.98  2501.00  1792.00  101.67   502.67  76.33    3.27  7.88  10.98  17.04
Aeration        8.97  2438.00  1750.00   89.00   543.67  93.67    8.90  7.62  10.99   9.84
Aeration        9.07  2422.00  1762.00   92.00   526.67  71.00   10.80  8.31  10.98  15.00
Aeration        8.21  2255.00  1594.00  148.83   566.67   4.66    4.30  7.23  18.51  54.30
Aeration        8.38  2092.83  1577.83  151.00   618.33  15.33    3.80  7.37  20.14  54.08
Aeration        8.32  2240.67  1578.83  162.67   573.33  10.00    4.40  7.55  21.77  55.30
Aeration        9.97  1513.00  1065.00   51.00   339.00   9.10   20.08  8.08   7.96   5.25
Aeration       10.18  1497.00  1049.00   42.00   322.00   9.32   23.27  8.01   6.85   4.44
Aeration       10.21  1532.00  1084.00   38.00   324.00   8.95   23.26  8.12   8.53   6.78
Stabilization   8.67  1491.00  1087.00  115.00   521.33  17.33    1.37  6.95  20.82  19.99
Stabilization   8.60  1521.00  1104.00  105.33   504.00  29.67    7.82  6.42  20.50  23.10
Stabilization   8.58  1545.00  1123.00  108.33   532.00  47.33    6.00  5.71  20.48  27.13
Stabilization   8.82  1321.83   932.67   98.00   253.67  13.66    6.60  7.55  11.92  66.96
Stabilization   8.92  1352.88   965.00   89.39   520.00  15.00    6.20  7.23  12.46  65.28
Stabilization   8.93  1314.83   942.83  101.61   395.00  15.00    6.20  7.23  12.34  62.13
Stabilization   9.09   652.00   462.00   57.00   248.00   4.98    2.60  6.74   4.47   1.96
Stabilization   9.19   669.00   476.00   51.00   257.00   5.61    2.59  6.71   4.11   2.31
Stabilization   9.43   680.00   480.00   47.00   273.00   4.92    1.93  6.66   3.87   2.30

TABLE 1.3 The Results of Physiochemical and Heavy Metals Obtained From Three Ponds of Landfill Leachate (cont’d)

Pond               Na    Fe      Zn     Cu     Cr    Cd    Pb     As     Co      Mn
Collection     978.37  0.35   14.46   9.80  31.50  0.31  2.71  32.28   6.34    5.44
Collection     978.37  0.23   17.89   9.03  22.37  0.32  3.35  20.55   5.14    9.55
Collection     978.37  0.62   11.50   9.59  10.70  0.01  2.16   9.75   3.95   16.60
Collection     284.44  0.56   23.80  57.10  18.12  0.85  8.03  11.52   6.26   52.72
Collection     273.47  0.56   24.00  57.40  17.04  1.73  7.60  14.41   7.43   51.82
Collection     283.80  0.56   23.10  56.80  17.58  1.71  7.04  10.09   6.61   50.92
Collection     673.45  1.34   61.01  64.92  42.36  0.04  0.70  25.21  21.30  109.34
Collection     633.35  1.98   49.15  53.45  59.06  0.56  3.91  14.81  18.78  133.59
Collection     665.61  0.79   27.48  58.75  52.27  0.06  2.08  17.07  23.67  122.59
Aeration       980.00  0.25  151.10   0.43  47.82  0.01  2.82  10.72   6.74   22.94
Aeration       975.55  0.73  259.08   2.83   3.67  0.39  2.72  11.12   2.61   32.44
Aeration       970.55  1.21  203.88   5.23  91.97  0.08  2.62  10.92  10.81   41.94
Aeration       525.57  0.87  425.46  69.42  86.18  0.56  7.17  21.66  16.18    4.97
Aeration       516.21  0.93  424.43  66.68  89.81  0.35  7.57  21.36  15.99    5.32
Aeration       517.13  0.85  423.45  63.63  85.56  0.16  7.37  20.68  15.78    5.96
Aeration       676.98  0.77   28.05  13.66  27.88  0.32  2.58   6.73   5.73    8.52
Aeration       630.31  0.94   18.69  15.97  37.11  0.08  4.90  10.12   5.61   10.34
Aeration       665.12  0.57   34.44  16.75  30.65  0.11  3.86   7.66   6.75    9.52
Stabilization  975.37  1.33  117.47   3.90  55.15  0.30  3.80  22.23  15.01   11.50
Stabilization  978.37  0.81  278.47   5.21  70.29  0.01  4.12  18.10  20.84   21.61
Stabilization  973.37  0.29  175.47   6.51  54.57  0.00  2.90  12.15  17.93   30.51
Stabilization  411.60  1.37  229.90  23.93  48.47  0.28  6.45  16.55   5.19    6.44
Stabilization  439.04  1.43  227.39  25.23  49.26  0.61  5.25  15.75   4.75    6.02
Stabilization  494.26  1.23  221.89  25.03  48.83  0.01  5.89  17.35   4.62    6.82
Stabilization  141.00  0.33  149.11   6.42  12.88  0.02  2.48   5.71   1.28    1.63
Stabilization  103.81  0.87  240.77   7.91   8.76  0.14  4.63   3.32   1.54    1.93
Stabilization   81.20  0.51  190.68   6.64  11.00  0.04  3.78   2.37   1.62    1.66

The leachate samples were collected from the collection, aeration, and stabilization ponds in the ATLS leachate collection system. The leachate samples were collected three times during the period between August 2017 and January 2018, with three sampling points at each pond. The samples were manually gathered and placed in 500 mL polyethene containers. The samples were immediately transported to the laboratory and cooled to 4°C to reduce biological and chemical reactions (Japan International Cooperation Agency (JICA)).

In this example, 27 samples (rows) were collected from the three ponds (collection, aeration, and stabilization), and 20 parameters (columns) were measured from each sample. It is not easy for the scientist to make a decision about the treatment process of the landfill or about the behavior of the different parameters under study because there are many different values (27 samples × 20 parameters = 540 values). Thus, the relationship among the different chosen parameters should be investigated and studied properly to understand the differences (fluctuations) in the parameters from one pond to another. These data should be analyzed employing an appropriate technique to achieve the objective of the project. The first objective was to assess whether the treatment process of the landfill leachate worked properly; the relationship among the chosen parameters should be investigated for more information on the behavior of each variable in the presence of the other chosen variables, which would help to identify the source of the variation. The second objective was to assess the effect of the landfill on the groundwater and surface water in the chosen area (the data for groundwater and surface water are not presented, to save space). Furthermore, the contribution of each chosen variable (parameter) to the total variation in the collected data was identified employing multivariate methods. This research may help in estimating the impact of the landfill on groundwater and surface water in the chosen area.
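To make the size and structure of this data set concrete, the following R sketch enters a small subset of Table 1.3 (the first three rows of each pond, for pH, EC, and TDS only) as a data frame and computes the pairwise correlations. The object and variable names (leachate, n_samples, n_parameters) are assumptions made for this illustration, and the subset is far too small to support any conclusion about the landfill.

```r
# First three rows of each pond from Table 1.3 (pH, EC, and TDS only)
leachate <- data.frame(
  Pond = rep(c("Collection", "Aeration", "Stabilization"), each = 3),
  pH   = c(7.19, 7.36, 7.41, 8.98, 8.97, 9.07, 8.67, 8.60, 8.58),
  EC   = c(6242, 6062, 6164, 2501, 2438, 2422, 1491, 1521, 1545),
  TDS  = c(4530, 4392, 4476, 1792, 1750, 1762, 1087, 1104, 1123)
)

# Size of the full data set described in the text
n_samples    <- 27
n_parameters <- 20
n_samples * n_parameters   # total number of measured values

# Pairwise correlations among the three selected parameters
round(cor(leachate[, c("pH", "EC", "TDS")]), 2)
```

Even in this subset, EC and TDS move together very closely, which is the kind of overlap between variables that a one-variable-at-a-time analysis would ignore.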

EXAMPLE 1.3 INORGANIC ELEMENTS IN THE PARTICULATE MATTER IN THE AIR

A researcher wishes to investigate the concentrations of nine inorganic elements in the particulate matter (PM10) in the air of an equatorial urban coastal location. In 2009, air pollution levels were studied during the summer and winter monsoon seasons employing high-volume sampling techniques. Atomic absorption spectrophotometry was employed to collect PM10 samples, with an average time of 24 h. The parameters were the particulate matter PM10, aluminum (Al), zinc (Zn), iron (Fe), copper (Cu), calcium (Ca), sodium (Na), manganese (Mn), nickel (Ni), and cadmium (Cd). The data are given in Table 1.4.

TABLE 1.4 The Results of Inorganic Elements in Particulate Matter in the Air (μg/m³)

Season    PM10    Al    Zn    Fe    Cu    Ca     Na    Mn    Ni    Cd
Summer   37.71  0.01  2.32  0.97  0.00  1.37   7.94  0.05  0.01  0.01
Summer   48.03  0.01  2.05  0.75  0.00  1.56   8.46  0.05  0.01  0.04
Summer   67.87  0.01  3.43  0.44  0.00  1.41   9.98  0.06  0.10  0.05
Summer   39.01  0.01  4.24  0.45  0.00  1.12  12.93  0.01  0.07  0.00
Summer   38.33  0.01  3.51  0.50  0.02  1.60   9.92  0.02  0.00  0.05
Summer   29.70  0.01  2.78  0.58  0.02  1.38  12.13  0.04  0.06  0.02
Summer   53.66  0.01  2.40  0.73  0.02  1.59   7.81  0.23  0.11  0.02
Summer  132.28  0.01  2.34  0.55  0.02  1.59   9.01  0.03  0.00  0.01
Summer   66.31  0.01  2.13  0.45  0.02  1.23   8.76  0.02  0.00  0.05
Summer   69.20  0.01  2.13  0.50  0.03  1.95   8.66  0.03  0.08  0.05
Summer   78.17  0.01  1.96  0.56  0.04  1.35   8.88  0.03  0.02  0.07
Summer   31.63  0.01  2.21  0.42  0.04  1.24   9.02  0.02  0.08  0.05
Summer   66.73  0.01  1.46  0.51  0.03  1.33   8.53  0.02  0.13  0.03
Summer  113.56  0.01  2.07  0.60  0.03  1.51   7.65  0.02  0.06  0.05
Summer  123.40  0.01  1.75  0.57  0.03  1.75   7.45  0.02  0.12  0.04
Summer   72.39  0.01  1.61  0.57  0.03  1.54   8.40  0.02  0.07  0.02
Summer   51.85  0.01  1.16  0.51  0.18  1.72   7.44  0.01  0.09  0.00
Summer   77.59  0.01  1.66  0.81  0.04  2.04   7.89  0.03  0.08  0.03
Summer   30.30  0.01  1.09  0.61  0.03  1.80   6.28  0.02  0.05  0.03
Summer  100.40  0.01  1.82  0.47  0.04  1.70   7.05  0.02  0.05  0.02
Summer  132.98  0.01  1.86  0.52  0.03  1.74   7.69  0.01  0.10  0.03
Summer  126.38  0.01  0.34  0.02  0.00  0.01   1.36  0.01  0.18  0.00
Summer   31.82  0.01  1.41  0.52  0.03  1.93   7.04  0.02  0.04  0.00
Summer  110.53  0.01  1.25  0.67  0.04  2.29   7.39  0.02  0.03  0.00
Summer   38.41  0.01  1.55  0.79  0.12  2.21   7.74  0.02  0.06  0.03
Summer  124.26  0.01  2.24  0.00  0.00  1.56   9.65  0.07  0.06  0.01
Summer   53.62  0.01  1.97  0.01  0.00  1.39   7.07  0.03  0.04  0.03
Summer   23.30  0.01  1.28  0.02  0.00  1.80   6.39  0.03  0.00  0.00
Summer   67.34  0.01  0.91  0.02  0.00  0.80   7.14  0.00  0.08  0.00
Summer   22.58  0.03  1.29  0.02  0.01  0.92   7.21  0.02  0.05  0.02
Summer   54.52  0.03  1.37  0.04  0.01  1.48   7.98  0.01  0.03  0.03
Summer  112.83  0.01  1.88  0.03  0.00  1.46   8.44  0.01  0.02  0.00
Summer  175.28  0.01  1.68  0.03  0.00  1.38   8.13  0.00  0.03  0.00
Summer   47.30  0.01  2.20  0.03  0.00  1.13  11.39  0.01  0.03  0.01
Summer   57.08  0.01  1.69  0.04  0.00  1.04   8.34  0.02  0.07  0.00
Summer   15.16  0.01  1.58  0.04  0.01  1.77   6.69  0.05  0.08  0.02
Summer  272.98  0.01  1.88  0.04  0.01  1.60   7.05  0.04  0.08  0.00
Summer  101.82  0.01  1.86  0.02  0.01  1.79   6.02  0.05  0.13  0.00
Summer   59.90  0.01  1.73  0.02  0.01  1.70   7.25  0.05  0.00  0.02
Summer   31.07  0.01  1.92  0.02  0.02  1.40   4.92  0.05  0.03  0.04
Summer  107.43  0.01  2.03  0.02  0.02  1.48   5.19  0.04  0.06  0.00
Summer   30.51  0.01  2.42  0.03  0.03  2.08   5.70  0.07  0.00  0.04
Summer   84.02  0.01  2.10  0.22  0.01  1.61   5.85  0.03  0.00  0.00
Summer    7.37  0.01  2.63  0.35  0.02  1.24   3.90  0.03  0.00  0.00
Summer    7.65  0.01  1.38  0.34  0.01  1.11   4.20  0.04  0.00  0.00
Summer   15.05  0.01  1.56  0.21  0.02  0.35   2.47  0.03  0.00  0.00
Summer   84.13  0.01  1.89  0.04  0.02  0.41   4.44  0.02  0.00  0.02
Winter   82.87  0.01  1.66  0.15  0.02  0.64   4.79  0.02  0.06  0.00
Winter  138.25  0.01  2.53  0.30  0.03  0.42   3.02  0.02  0.00  0.03
Winter   82.67  0.01  2.66  0.31  0.04  0.03   3.16  0.03  0.00  0.00
Winter   46.08  0.01  3.39  0.52  0.02  0.21   2.34  0.03  0.00  0.00
Winter   15.33  0.01  2.63  0.23  0.02  0.06   4.38  0.03  0.05  0.00
Winter   30.78  0.01  1.98  0.17  0.01  0.40   3.68  0.01  0.00  0.00
Winter  155.00  0.01  3.72  0.15  0.02  1.21   4.24  0.02  0.00  0.00
Winter   61.75  0.01  2.93  0.00  0.03  0.92   4.29  0.01  0.00  0.00
Winter   88.90  0.01  4.91  0.11  0.02  0.93   6.65  0.02  0.00  0.00
Winter   38.47  0.01  3.46  0.15  0.02  0.73   4.25  0.01  0.01  0.00
Winter   24.46  0.01  1.97  0.19  0.01  0.75   1.96  0.01  0.00  0.00
Winter   14.29  0.01  2.76  0.11  0.01  0.67   9.55  0.03  0.00  0.00
Winter   65.52  0.01  2.56  0.32  0.01  0.54   2.70  0.04  0.00  0.00
Winter   63.33  0.01  1.21  0.16  0.02  0.53   3.40  0.04  0.03  0.00
Winter   62.28  0.01  3.52  0.43  0.01  0.80   7.78  0.02  0.11  0.00
Winter   65.87  0.01  3.48  0.25  0.02  0.99  11.06  0.01  0.03  0.00
Winter  122.83  0.01  3.24  0.41  0.03  1.09  11.47  0.01  0.06  0.00
Winter   50.18  0.00  2.40  0.49  0.02  1.16   7.90  0.02  0.11  0.00
Winter   60.19  0.01  3.02  0.38  0.02  0.84   9.51  0.02  0.06  0.01
Winter   28.28  0.01  3.68  0.48  0.06  1.18  12.21  0.01  0.09  0.00
Winter   42.50  0.01  1.86  0.46  0.06  1.34  10.51  0.03  0.07  0.00
Winter   49.68  0.01  2.29  0.32  0.04  1.57  11.02  0.02  0.00  0.04
Winter   39.91  0.01  3.30  0.45  0.05  1.46   8.61  0.03  0.08  0.02
Winter   37.94  0.01  3.57  0.31  0.08  1.59   9.47  0.04  0.00  0.05
Winter   27.97  0.01  3.14  0.09  0.06  1.46  12.08  0.03  0.00  0.05
Winter   45.34  0.01  2.59  0.26  0.06  1.59  11.43  0.03  0.06  0.00
Winter   58.45  0.01  3.46  0.12  0.06  1.66   8.98  0.04  0.10  0.00
Winter   43.30  0.01  1.58  0.09  0.05  1.46   7.59  0.03  0.00  0.00

Table 1.4 provides a large and complex data set. It is easy to see that the raw table, by itself, does not readily provide helpful information for making a decision. The objective of the study was to assess the air quality of Penang, Malaysia, in terms of PM10 and inorganic elements, and to recognize the main sources of PM10 and inorganic elements, whether crustal or noncrustal. The goals included investigating the relationship between the different chosen inorganic elements and PM10 during the summer and winter monsoons and determining the similarities between the chosen parameters.

EXAMPLE 1.4 HEAVY METALS IN SEDIMENT (mg/L)

A researcher wishes to investigate the concentrations of eight heavy metals (cadmium (Cd), iron (Fe), copper (Cu), zinc (Zn), chromium (Cr), mercury (Hg), manganese (Mn), and lead (Pb)) in sediments obtained from two sites. The two sites, with 10 sampling points at each site, were Kuala Juru (the Juru River) and Bukit Tambun (the Jejawi River) in the Penang State of Malaysia. The sediment samples were gathered at low tide with an Eijkelkamp gouge auger from each of the 20 chosen sampling points (10 sampling points from each river estuary). A flame atomic absorption spectrometer (FAAS; Perkin Elmer HGA-600) was employed for the analysis of Cu, Pb, Zn, Cd, Mn, Fe, and Cr, and a cold vapor atomic absorption spectrometer (CV-AAS) method was used for Hg analysis after sample digestion in an acid solution. The data are given in Table 1.5.

TABLE 1.5 The Concentration of Heavy Metals in Sediment (mg/L) for Juru and Jejawi Rivers

In this study, the researcher wanted to investigate and understand the interrelationship between the chosen parameters; to extract information about the resemblance or differences between the sampling sites; to identify the variables (heavy metals) accountable for the spatial differences between the river estuaries; and to assess the effect of the possible sources (natural and anthropogenic) on the chosen heavy metals of the two river estuaries.

1.5 MULTIVARIATE NORMAL DISTRIBUTION

The normal distribution is the most important distribution in statistics, because most of the tests used in statistics require that the assumption of normality be met; that is, that the data are gathered from a normally distributed population.

A brief explanation of the univariate and multivariate normal distributions is given below.

1.5.1 Univariate Normal Distribution

Suppose Y is a random variable that follows the normal distribution. Then, the probability density function of the univariate normal distribution is given in Eq. (1.1).

\[ f(y) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, \exp\!\left[-\frac{(y-\mu)^{2}}{2\sigma^{2}}\right], \qquad -\infty < y < \infty \tag{1.1} \]

where μ is the mean and σ² is the variance.

The univariate normal distribution can be written as Y ~ N(μ, σ²).

The normal distribution (bell-shaped) curve is presented in Fig. 1.1. R statistical software was used to generate the curve. The commands and built-in functions for creating a normal distribution curve are presented in the Appendix.
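As a rough illustration of how such a curve can be produced, the sketch below writes the standard normal density formula by hand, checks it against R's built-in dnorm(), and draws the bell-shaped curve. It is not the Appendix code: the parameter values (μ = 0, σ = 1), the helper name normal_pdf, and the plotting choices are assumptions made for this example.

```r
# Illustrative parameter values (assumptions, not taken from the book)
mu    <- 0
sigma <- 1

# Standard univariate normal density, written out by hand
normal_pdf <- function(y, mu, sigma) {
  1 / sqrt(2 * pi * sigma^2) * exp(-(y - mu)^2 / (2 * sigma^2))
}

# Grid of values covering four standard deviations on each side of the mean
y <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 200)

# The hand-written density should agree with the built-in dnorm()
all.equal(normal_pdf(y, mu, sigma), dnorm(y, mean = mu, sd = sigma))

# Bell-shaped curve
plot(y, dnorm(y, mean = mu, sd = sigma), type = "l",
     xlab = "y", ylab = "Density", main = "Normal distribution")
```

Changing mu shifts the curve along the horizontal axis, while changing sigma makes it wider or narrower.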

1.5.2 Multivariate Normal Distribution

Suppose Y is a random vector of k random variables that follows the multivariate normal distribution. Then, the multivariate normal density is given in Eq. (1.2).

\[ f(Y) = \frac{1}{(2\pi)^{k/2}\,|\Sigma|^{1/2}}\, \exp\!\left[-\frac{1}{2}\,(Y-\mu)'\,\Sigma^{-1}\,(Y-\mu)\right] \tag{1.2} \]

where k is the number of variables; μ is the mean vector; Σ is the covariance matrix; and (Y − μ)′ Σ⁻¹ (Y − μ) is the squared Mahalanobis distance (statistical distance).

The multivariate normal distribution can be denoted as Y ~ N_k(μ, Σ).

Note:

• A suitable transformation of the data should be used if the normality assumption is violated by one or more variables under investigation, as when the data are highly skewed with several outlier (extreme) values (high or low) or repeated values.

• If all the individual variables follow a normal distribution, then it is commonly supposed that the combined (joint) distribution is a multivariate normal distribution, although normality of each variable on its own does not strictly guarantee joint normality.

• In practice, real data never follow a multivariate normal distribution completely; however, the normal density can be employed as an approximation of the true population distribution.
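The quadratic form in the multivariate normal density can be checked numerically. The sketch below assumes the usual form of that density, with a mean vector and covariance matrix; the values of mu, Sigma, and y are illustrative assumptions for k = 2 variables and are not taken from the book's data. It computes the quadratic form by hand and with base R's mahalanobis() function, which returns the squared distance.

```r
# Illustrative mean vector and covariance matrix for k = 2 variables
# (assumed values, not taken from the book's data)
mu    <- c(0, 0)
Sigma <- matrix(c(1.0, 0.5,
                  0.5, 2.0), nrow = 2, byrow = TRUE)
y     <- c(1, 1)
k     <- length(mu)

# Quadratic form (y - mu)' Sigma^{-1} (y - mu), computed by hand
d2_manual <- as.numeric(t(y - mu) %*% solve(Sigma) %*% (y - mu))

# Base R's mahalanobis() returns the same squared distance
d2_builtin <- as.numeric(mahalanobis(y, center = mu, cov = Sigma))

# Multivariate normal density evaluated at y
density_y <- exp(-0.5 * d2_manual) / ((2 * pi)^(k / 2) * sqrt(det(Sigma)))

c(manual = d2_manual, builtin = d2_builtin, density = density_y)
```

Points with a larger squared distance from the mean vector receive a smaller density, which is why this distance is often used to flag unusual multivariate observations.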

Further Reading

Alkarkhi, A.F.M., Alqaraghuli, W.A.A., 2019. Easy Statistics for Food Science with R, first ed. Academic Press.

Alkarkhi, A.F.M., Ismail, N., Ahmed, A., Easa, A.M., 2009. Analysis of heavy metal concentrations in sediments of selected estuaries of Malaysia: a statistical assessment. Environ Monit Assess 153, 179–185.

Banch, T.J.H., Hanafiah, M.M., Alkarkhi, A.F.M., Amr, S.S.A., 2018. Statistical evaluation of landfill leachate system and its impact on groundwater and surface water in Malaysia.

Blogger, 2011. R graph gallery: A collection [Online]. Available: http://rgraphgallery.blogspot.my/2013/04/shaded-normal-curve.html

Daniel & Hocking, 2013. Blog Archives, High Resolution Figures in R [Online]. R-bloggers. Available: https://www.r-bloggers.com/author/daniel-hocking/

Johnson, R.A., Wichern, D.W., 2002. Applied Multivariate Statistical Analysis. Prentice Hall, New Jersey.

Manly, B.F.J., 1991. Multivariate Statistical Methods: A Primer. Chapman & Hall, Great Britain.

Rencher, A.C., 2002. Methods of Multivariate Analysis. J. Wiley, New York.

Yusup, Y., Alkarkhi, A.F.M., 2011. Cluster analysis of inorganic elements in particulate matter in the air environment of an equatorial urban coastal location. Chemistry and Ecology 27, 273–286.

FIG. 1.1 Normal distribution plot.

2

R Statistical Software

LEARNING OBJECTIVES

After careful consideration of this chapter, you should be able:

• To know R software.

• To know RStudio.

• To describe how to set up R packages.

• To know how to write variables, vectors, sequences, and matrices in the R language.

• To use R commands and built-in functions for environmental data.

• To understand how to call stored data files into the R environment.

• To know how to write down simple scripts.

• To choose suitable R commands to write the scripts.

• To know how to generate high-resolution plots in R.

• To understand the concept of working directory and how to set a new working directory.

• To describe R output and extract a useful report.

2.1 INTRODUCTION

R is a programming language and software environment for statistical computing and graphics. The R software has been employed by professionals at various organizations, colleges, survey institutions, and others. Many statistical packages are available for statistical analysis; however, researchers, scientists, and others who are concerned with data analysis, design, modeling, and producing attractive, high-resolution plots often prefer R. R is offered for free as open-source software under the terms of the GNU General Public License: “R is an official part of the Free Software Foundation’s GNU project, and the R Foundation has similar objectives to other open-source software foundations like the Apache Foundation or the GNOME Foundation.” The R language is similar to the S language and environment that was developed by Bell Laboratories. R is employed by millions of researchers around the world, and the number of R users continues to increase. R has become a major and widely adopted statistical software for the following reasons:

• R is free (open-source) software that includes many packages, and many sources around the world permit downloading and installing the software, regardless of your position and institution, or whether you are affiliated with a public or private organization.

• R offers many built-in functions that make the steps of the analysis simple and easy. We can carry out data analysis in R by writing scripts that define the required variables and call built-in functions to carry out the required process, such as computing the correlation, average, variance, or other statistical values.

• Clear, high-resolution, and unique graphs can be produced by R that meet specific standards or reflect a particular view of the task (convey thoughts through the plot).

• R can easily be used by researchers without programming proficiency.

• People can download and install R for various operating systems such as Windows, Linux, and Mac OS.

• Many statistical and graphical packages are provided by the R library for data analysis, for different computations, and for graphical applications that generate high-quality plots.

• It is easy to interact with the online community around the world, exchange thoughts, and receive assistance.

• R codes, commands, and functions are available online for free; moreover, many sources offer free demonstrations of R, including courses, material, and responses to inquiries, which other software packages do not offer.

• More facilities are offered by RStudio to operate R, and it is simpler and friendlier to use than R alone.
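The point about built-in functions can be illustrated with a few lines of R. This is a minimal sketch rather than code from the book: it uses the pH and dissolved oxygen (DO) values of the nine collection-pond samples listed in Table 1.3, and the variable names pH and DO are assumptions made for the example.

```r
# pH and dissolved oxygen (DO) for the nine collection-pond samples in Table 1.3
pH <- c(7.19, 7.36, 7.41, 7.92, 7.84, 7.83, 8.46, 8.47, 8.48)
DO <- c(6.90, 6.69, 6.48, 6.85, 6.85, 7.23, 2.34, 2.82, 2.67)

mean(pH)      # average
var(pH)       # sample variance
cor(pH, DO)   # Pearson correlation between the two variables
```

Each statistic is obtained with a single function call, so a script only needs to define the variables and name the function that carries out the computation.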

The concepts and terms related to the R statistical package are addressed to provide a starting point for readers who are new to the R language and environment. Beginners will be guided on how to download and install the software (R and RStudio), understand some of the ideas and related concepts employed in R, and write simple and easy scripts in R. We have tried, as far as we could, to make the procedure and instructions simple and comprehensible to everybody. Many examples are provided to lead researchers step by step and to make the procedure interesting.

We can download and install the R statistical software in a few simple steps, after which the required packages can be installed. R provides many packages to carry out various statistical methods. After installing R, we can download and install RStudio, which can be employed to operate R effectively and in a friendlier way than R alone.

2.2INSTALLINGR

ConsiderwehavenotinstalledRstatisticalsoftwareyet.Thesoftwarecanbeinstalledanddownloadedforfreeby employingthesixstepsbelow.

1. The reader can type https://cran.r-project.org/ in a web browser and then press “Enter”; “The Comprehensive R Archive Network” will appear, as shown in Fig. 2.1.

2. The screen for “The Comprehensive R Archive Network” offers three choices for downloading R, based on the operating system of the computer:
(1) Download R for Linux
(2) Download R for (Mac) OS X
(3) Download R for Windows
If we choose R for Windows, click “install R for the first time” (or “base”), as presented in Fig. 2.2.

FIG. 2.1 Comprehensive R Archive Network.

FIG. 2.2 Showing the instructions for installing R for Windows.

FIG. 2.3 The screen to download R-3.5.1 for Windows (32/64 bit).

3. We can click on the available version. The newest available version of R shown on the screen is R-3.5.1; other versions may also be listed. Click on “Download R-3.5.1 for Windows (62 megabytes, 32/64 bit)”, as shown in Fig. 2.3.

4. After clicking on “Download R-3.5.1 for Windows (62 megabytes, 32/64 bit)”, a message “R-3.5.1-win.exe” appears in the bottom-left corner (Fig. 2.4); this indicates that the file has started downloading to the computer.

5. Once the download of R has finished, click on the downloaded file to open another screen, and then click on “run”, as shown in Fig. 2.5. The next step is to follow the guidance given to finish the installation.

6. R is installed on the computer and ready to be used. We can start using R by double-clicking on the R icon.

2.2.1 R Material

It is highly useful to have a handbook or manuals to guide readers using new software, particularly novices who are new to R. This service is provided for free; we can download manuals and notes from websites such as https://cran.r-project.org/ or other authorized R sources. The handbooks, manuals, and notes can also be obtained offline by employing the Help button in the upper row of the R environment, as presented in Fig. 2.6; the Help button provides many options, one of which is Manuals (PDF).

FIG. 2.4 Showing the place of the downloaded file.

FIG. 2.5 Showing the directives to download the software.

We have found that the handbooks and associated notes are helpful and offer clear direction, particularly for novice readers. Some documents have been produced in various languages, such as Russian, Chinese, and German.

2.2.2 R Packages

R offers various statistical packages, and some of them are built-in (standard/base packages, which are loaded once the R installation is finished). Users can download other packages from the “Packages” menu in the upper row of the R Console. The packages can be downloaded by clicking “Install package(s)” and then selecting the site from which you wish to download. A list of packages is then provided, from which you can select the package you need to install. The search() function displays the packages that are loaded on your computer when R starts.

search()

The function search() is used to show the loaded packages.

> search()

[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "package:methods"   "Autoloads"         "package:base"
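Packages can also be installed and attached from the Console instead of the menu. The sketch below is only an illustration: the installation line assumes an internet connection and is therefore left commented out, and the package name ggplot2 is an example that is not mentioned in the text.

```r
# install.packages("ggplot2")   # example package name; needs internet access

library(stats)                  # attach a package (stats ships with R)
"package:stats" %in% search()   # TRUE once the package is attached
```

After library() is called for a newly installed package, its name appears in the output of search().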

2.3 THE R CONSOLE

The screen where we place the commands and then run the scripts to perform the analysis or any desired computation is called the R Console. By default, a symbol called the “command prompt”, >, is placed at the end of the Console. R scripts should be written after the command prompt. For example, 8 + 3 is typed and then followed by “Enter” to obtain the output (11), as presented in Fig. 2.7.

The buttons for File, Edit, View, Misc, Packages, Windows, and Help are shown in the upper row of the screen; each button has choices to carry out a particular job.

Note:

• Each output line produced by running a script is preceded by a bracketed index such as [1], which gives the position of the first value printed on that line.

• Built-in functions are called to perform the required job. As we have seen in the preceding example for 8 + 3, pressing “Enter” invokes the built-in function for addition to perform the requested task. A built-in function in R is called by its name followed by its arguments in parentheses; pressing “Enter” is then the instruction to perform the requested job. Usually, the built-in functions are available in the memory of the computer; for instance, to quit R, we write the built-in function q().

> q()

• Asking for assistance is achieved by employing the function help(), which opens a new screen concerning the requested subject. For example, the help(quit) command shows the documentation available in the R library for quit (terminate an R session), containing a description of the function, its usage, arguments, references, and examples.

> help(quit)

FIG. 2.6 Showing the steps to download notes and manuals.

The question mark (?) is another built-in function for requesting assistance, which may be used as a shortcut, as in ?quit. The two calls help(quit) and ?quit are equivalent. The screen for requesting assistance with either function is presented in Fig. 2.8.

• Writing the name of a variable (parameter) as a command works similarly to the built-in function print(); for example, the outcome of B = 6 + 4 can be printed (displayed on screen) either by writing B or by writing print(B), as presented below.

> B = 6 + 4

> B   # Print the value of B

[1] 10

> print(B)   # Print the value of B

[1] 10

• We can transfer the output to a file by employing R built-in functions such as write.table(); the simplest method, however, is to copy and paste from the R Console to a Word document.
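A minimal sketch of the write.table() route mentioned in the last note is given below. The data frame layout and the use of tempfile() for a throwaway path are our own choices for illustration; in practice the path would be replaced by a real file name.

```r
B <- 6 + 4
results <- data.frame(name = "B", value = B)

# tempfile() gives a throwaway path; replace it with a real file name
out_file <- tempfile(fileext = ".txt")
write.table(results, file = out_file, row.names = FALSE)

read.table(out_file, header = TRUE)   # read the file back to check it
```

Reading the file back with read.table() is a quick way to confirm that the output was saved as intended.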

2.4 EXPRESSION AND ASSIGNMENT IN R

This section covers expression and assignment, including arithmetic operators, mathematical functions, and relational operators.

1. Arithmetic operators refers to the standard arithmetic operators, which are ^, +, -, *, and /; every operator performs a specific action. The job of each operator is shown below:

• ^ or **: exponentiation operation

• +: addition operation

• -: subtraction operation

• *: multiplication operation

• /: division operation

FIG. 2.7 Showing R commands placed after the command prompt.

The priority of arithmetic operators in R follows the standard precedence; i.e., ^ is the highest, and addition and subtraction are the lowest. Parentheses are employed to control the order in which the arithmetic operators are applied.
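The precedence rules can be checked directly at the command prompt. The expressions below are our own small illustrations and do not come from the text.

```r
2 + 3 * 4     # multiplication before addition: 14
(2 + 3) * 4   # parentheses are evaluated first: 20
2 * 3^2       # exponentiation before multiplication: 18
-2^2          # ^ binds tighter than the unary minus: -4
```

The last line is a common source of surprise; writing (-2)^2 gives 4 instead.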

EXAMPLE 2.1 EMPLOYING ÷, ^ AND + FUNCTIONS

Compute 9 ÷ 3, 3^2, and 5 + 3 employing R.

R operates as a calculator with arithmetic operators to perform the requested jobs. The output of employing R built-in functions for performing division, exponentiation, and addition is shown below:

> 9/3

[1] 3

> 3^2

[1] 9

> 5+3

[1] 8

2. R also deals with mathematical functions such as exp (exponential), sqrt (square root), and log (logarithm); more built-in functions are offered by R in various packages.

EXAMPLE 2.2 EMPLOY LOG FUNCTION

Compute log(3/(1 + 0.5)). The function log() is employed to compute the value of the (natural) logarithm.

> log(3/(1+.5))

[1] 0.6931472
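A few more calls to the mathematical functions named above are sketched here. The input values are chosen by us so that the results are easy to verify by hand; log10() and the base argument of log() are standard parts of base R that the text does not discuss.

```r
sqrt(16)           # square root: 4
exp(0)             # e raised to the power 0: 1
log(exp(1))        # natural logarithm of e: 1
log10(1000)        # base-10 logarithm: 3
log(8, base = 2)   # logarithm to base 2: 3
```

By default, log() computes the natural logarithm, which is why the result in Example 2.2 equals log(2).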

FIG. 2.8 Showing help command.

3. Relational operators are also offered by the R statistical software, such as >=, <=, <, >, and !=. The task of every operator is shown below:

• <=: less than or equal to,

• >=: greater than or equal to,

• <: less than,

• >: greater than, and

• !=: not equal to.
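The relational operators listed above return the logical values TRUE or FALSE. The short sketch below uses a value Y = 6 that we picked for illustration.

```r
Y <- 6

Y <= 6   # TRUE
Y >= 7   # FALSE
Y < 10   # TRUE
Y > 6    # FALSE
Y != 6   # FALSE

c(3, 6, 9) > Y   # applied element by element: FALSE FALSE TRUE
```

When one side is a vector, the comparison is carried out for every value, which is the basis for selecting values by condition.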

2.5 VARIABLES AND VECTORS IN R

R handles single values as well as vectors. Variables and vectors (and matrices) can be defined in R by employing the assignment operator <- or the equals sign (=); for instance, Y <- 6 and Y = 6 are equivalent and carry exactly the same meaning (assign the value 6 to Y). We can employ the function c() to define a series of values in the form of a vector in R.

> y <- c(dataframe)

where y refers to the variable name, and dataframe represents the series of values that should be placed between the two parentheses of the function c().

EXAMPLE 2.3 PRODUCE A VECTOR

Place the values 10, 7, 14, 8, 11, 12 in a vector called Y. The function c() is employed to represent the given values in vector form.

> Y <- c(10, 7, 14, 8, 11, 12)

Note:

• The assignment operator <- acts as a “variable-defining” operator and is equivalent to the operator “=”.

• A comma is employed to separate the data values in the vector (?, ?, ...).

• Any command can be performed by pressing “Enter.”

• X and x represent two different variables in R, because R is case-sensitive.

• If X and Y are two vectors of the same length, then X + Y (X - Y) produces a new vector whose values are the sums (differences) of the corresponding values of X and Y.

EXAMPLE 2.4 TWO VECTORS OF THE SAME LENGTH

Consider X = 5, 3, 2 and Y = 4, 2, 6; then X + Y equals 9, 5, 8 and X - Y equals 1, 1, -4, both of the same length as X and Y. The R commands below are employed to compute X + Y and X - Y.

> X <- c(5, 3, 2)

> X

[1] 5 3 2

> Y <- c(4, 2, 6)

> Y

[1] 4 2 6

> X + Y

[1] 9 5 8

> X - Y

[1]  1  1 -4

If X and Y are two vectors of different lengths, then X + Y (X - Y) still produces a new vector, repeating the values of the shorter vector as needed. The number of values (observations) in the resulting vector is equal to the length of the longer vector.

EXAMPLE 2.5 TWO VECTORS OF DIFFERENT LENGTH

Consider X = 5, 3 and Y = 4, 2, 6; then X - Y equals 1, 1, -1. The R commands and built-in functions below are employed to compute X - Y.

> X <- c(5, 3)

> Y <- c(4, 2, 6)

> X - Y

[1]  1  1 -1

Warning message:

In X - Y : longer object length is not a multiple of shorter object length

One can observe that the value 5 of the vector X is employed twice, in (5 - 4) and (5 - 6), while the value 3 is employed only once, in (3 - 2), so that X is extended to the same length as Y.
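The repetition of the shorter vector can be made explicit with the built-in function rep(), which is not used in the text. The sketch below stretches X to the length of Y by hand (the name X_long is ours) and shows that the subtraction then gives the same values without the warning.

```r
X <- c(5, 3)
Y <- c(4, 2, 6)

# Stretch X to the length of Y by repeating its values in order
X_long <- rep(X, length.out = length(Y))

X_long       # 5 3 5
X_long - Y   # 1 1 -1, the same values as X - Y
```

The warning in Example 2.5 appears only because 3 is not a multiple of 2; the computed values are the same in both cases.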

EXAMPLE 2.6 CREATE A VECTOR WITH A ZERO IN THE CENTER

Form a vector with 2n + 1 values representing two copies of X with a zero in the center, where n is the number of values in X. The four values 3, 6, 4, and 8 represent the variable X. The commands employed to create a vector with a zero in the middle are:

> X <- c(3, 6, 4, 8)

> z <- c(X, 0, X)

> z

[1] 3 6 4 8 0 3 6 4 8
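As a quick check of the 2n + 1 count, the sketch below uses the built-in function length(), which the text has not introduced; here n is taken to be the number of values in X.

```r
X <- c(3, 6, 4, 8)
z <- c(X, 0, X)

n <- length(X)           # n = 4
length(z) == 2 * n + 1   # TRUE: z holds 9 values
z[n + 1]                 # the middle value, 0
```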

Note: A single value can be called by using the form vectorname[] to locate its position.

> vectorname[position of the element]

EXAMPLE 2.7 EXTRACT A VALUE

Extract the second value, and then extract the fourth value, of a vector y = 10, 5, 8, 4. The form y[] in R is employed to extract the requested values.

> y <- c(10, 5, 8, 4)

> y[2]

[1] 5

> y[4]

[1] 4

R offers specific functions to extract successive values of a vector. Successive values of the vector can be extracted by using the colon (:) operator.

vectorname[A:B]

where A refers to the starting position and B refers to the ending position.

EXAMPLE 2.8 EXTRACT SUCCESSIVE VALUES

Extract the last two values of the vector y presented in Example 2.7. The command for extracting the two successive values is y[3:4].

> y[3:4]

[1] 8 4

Excluding a value from a vector can be achieved by using a negative subscript, as given below.

vectorname[-position of the value]

The value at that position will be excluded from the vector. Excluding sequential values of the vector requires giving the starting and ending positions inside parentheses, as shown below:

vectorname[-(A:B)]

where A refers to the starting position and B refers to the ending position.

EXAMPLE 2.9 ELIMINATE A VALUE OR VALUES

Employ the dataset presented in Example 2.7 to eliminate the third data value and then to eliminate the first and second data values. The form y[-3] is employed to eliminate the third value, and the form y[-(1:2)] is employed to eliminate the first and second values.

> y <- c(10, 5, 8, 4)

> y[-3]

[1] 10  5  4

> y[-(1:2)]

[1] 8 4
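Positions that are not next to each other can be excluded in the same way by combining them with c(). This is our own extension of Example 2.9 and is offered only as a sketch.

```r
y <- c(10, 5, 8, 4)

y[-c(1, 4)]   # drop the first and fourth values: 5 8
y             # the original vector is unchanged: 10 5 8 4
```

Note that a negative subscript only changes what is displayed; to keep the shortened vector, it must be assigned to a name.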

The form y[y < value] or y[y > value] can be employed to call the values that are less than or more than a given value.

y[y < specific value] or y[y > specific value]

EXAMPLE 2.10 LESS THAN AND MORE THAN

Locate the values of y that are less than 7, less than 10, and more than 15, respectively, where y = 8, 14, 13, 7, 10, 15, 18, 20. The forms y[y < value] and y[y > value] are employed to locate the values that are less than or more than a given value, as shown below.

> y <- c(8, 14, 13, 7, 10, 15, 18, 20)

> y[y < 7]

numeric(0)

> y[y < 10]

[1] 8 7

> y[y > 15]

[1] 18 20
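It may help to see what happens inside the brackets. The condition y < 10 is itself a vector of TRUE and FALSE values, and only the positions marked TRUE are returned. The functions which() and the operator & used below are standard parts of base R but are not covered in the text, so this is only a sketch.

```r
y <- c(8, 14, 13, 7, 10, 15, 18, 20)

y < 10               # logical vector, TRUE where the condition holds
which(y < 10)        # positions of those values: 1 4
y[y > 10 & y < 18]   # two conditions combined with &: 14 13 15
```

The result numeric(0) for y[y < 7] in Example 2.10 simply means that no position was marked TRUE.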

2.5.1 Matrix in R

Multivariate data for various variables can be presented in a table of rows and columns, called a matrix. So far, we have worked with scalars and seen how to form a vector employing built-in functions in R. The next section shows how to employ R commands to generate a matrix from available data.
