APPLIED STATISTICS FOR ENVIRONMENTAL SCIENCE WITH R
ABBAS F.M. ALKARKHI
WASIN A.A. ALQARAGHULI
Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
© 2020 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN 978-0-12-818622-0
For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals
Publisher: Joe Hayton
Acquisition Editor: Marisa LaFleur
Editorial Project Manager: Redding Morse
Production Project Manager: Omer Mukthar
Cover Designer: Greg Harris
Typeset by SPi Global, India
Dedication
Abbas
To the memory of my parents (deceased)
To my children Atheer, Hibah, and Farah
Wasin
To the memory of my father (deceased)
To my mother
Preface
Applied Statistics for Environmental Science with R was written in an easy style to introduce statistical techniques that help students and researchers in environmental science and environmental engineering choose the appropriate statistical technique for analyzing their data and drawing sound conclusions. The explanation of the R output is carried out step by step, in an easy and clear style, to enable non-statisticians to understand it and use it in their research.
A step-by-step procedure is employed to perform the analysis and to interpret the results by relating them to the field of study in which the data were obtained. The book focuses on the applications of univariate and multivariate statistical techniques in the field of environmental science. Furthermore, real data obtained from more than fifteen years of research in environmental science are employed to illustrate the concepts and analysis.
The book uses R statistical software to analyze the data and generate the required results. R is open source and provides facilities for feedback and for producing high-resolution plots. Furthermore, it is easy to get online assistance from various communities. R is available over the internet under the General Public License (GPL) for the Windows, Macintosh, and Linux operating systems.
Finally, we wish to thank our families, friends, and colleagues for their continuous support. We would like to extend our thanks to the R software community and R family (R users and contributors to R) for providing the software for free. This book would not have been possible without the information provided online, which is easy to obtain. We thank the University of Kuala Lumpur (UniKL MICET) for its support.
Abbas
Wasin
1 Multivariate Data
LEARNING OBJECTIVES
After careful consideration of this chapter, you should be able:
• To describe the concept of environmental statistics.
• To understand the importance of environmental statistics for making intelligent conclusions.
• To describe the concept of multivariate analysis.
• To know the advantages of employing multivariate methods for analyzing environmental data.
• To organize the outcomes of multivariate data to be prepared for analysis.
• To explain the distinction between univariate and multivariate notions.
• To explain how and where to employ multivariate data.
• To understand the univariate normal distribution.
• To understand the multivariate normal distribution.
1.1 THE CONCEPT OF ENVIRONMENTAL STATISTICS
Environmental statistics is the application of statistical methods, procedures, and techniques in the field of environmental science and environmental engineering, covering areas such as weather, air, water quality, climate, soil, fisheries, and other environmental activities. Statistical methods are used in designing environmental projects and in the analysis and interpretation of environmental data to help scientists draw useful and meaningful conclusions about various aspects of the environment. Environmental statistics can help in describing environmental problems in terms of mathematical models, in order to understand the impact of the chosen variables under study and to show the direction of change (increase or decrease) or the nature of a relationship (positive or negative). Furthermore, statistical techniques can identify the general trend and untangle hidden relationships, which helps scientists understand the process, obtain a clear picture of all relationships, avoid risk, and guide management in planning environmental projects properly.
Environmental statistics can also help in understanding the importance of variability and oscillation in the data, employing various measures and methods to show the influence of variability and leading scientists to search for scientific explanations. Thus, scientists should at least learn and understand basic statistics, which helps them appreciate the importance of the results and of the analysis that leads to informative conclusions.
1.2 THE CONCEPT OF MULTIVARIATE ANALYSIS
Univariate statistical analysis covers statistical techniques for analyzing a data set with one variable. However, most research projects need to measure several variables for each research unit or individual (sampling units or experimental units) in one or more samples.
For example, consider assessing the water quality of a river by monitoring parameters such as pH, dissolved oxygen (DO), electrical conductivity (EC), turbidity, biological oxygen demand (BOD), chemical oxygen demand (COD), and total suspended solids (TSS), in order to make a decision about the pollution status of the river. In this case, there are seven variables to be measured for each sample, and these variables are generally highly correlated. If multivariate data are analyzed one variable at a time, the relationships between the variables are ignored, and the results give a distorted picture of the true behavior of each parameter (variable) in the presence of the other parameters (variables). Thus, we should use a method that takes into account the correlation between the chosen variables, so that the overlapping information carried by correlated variables is untied and the behavior of the chosen variables is understood properly.
Therefore, data sets with several variables can be analyzed employing multivariate methods that consider the relationships between the chosen variables. Multivariate methods are a collection of techniques that serve several purposes in the field of environmental science and engineering. They include cluster analysis, for recognizing groups of similar observations (e.g., individuals, objects); principal components analysis and factor analysis, data reduction methods that reduce the number of variables to a smaller number of uncorrelated dimensions, called components (factors), without losing valuable information; discriminant analysis, which is applied to separate the data into various groups based on the measured variables; multivariate analysis of variance (MANOVA), employed to perform statistical hypothesis testing based on multivariate data (several variables); and multivariate multiple regression analysis, which is employed for making predictions based on the relationships among the variables.
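As a rough orientation only, several of the techniques listed above correspond to functions in R's standard stats package. The sketch below runs four of them on a small simulated data set; the data frame `wq`, its column names, the three sites, and every value are hypothetical and were invented for illustration. Factor analysis (`factanal()` in stats) and discriminant analysis (`lda()` in the MASS package) are omitted to keep the sketch short.

```r
# Simulated stand-in for a water-quality table: 30 samples, 4 correlated
# variables, 3 sites. All names and values are hypothetical.
set.seed(1)
n <- 30
base <- rnorm(n)  # shared component that makes the variables correlated
wq <- data.frame(
  site = factor(rep(c("A", "B", "C"), each = 10)),
  pH   = 7  + 0.3 * base + rnorm(n, sd = 0.2),
  DO   = 6  + 0.5 * base + rnorm(n, sd = 0.3),
  BOD  = 10 - 1.0 * base + rnorm(n, sd = 0.5),
  COD  = 40 - 3.0 * base + rnorm(n, sd = 2.0)
)
vars <- c("pH", "DO", "BOD", "COD")

# Cluster analysis: group similar samples (on standardized variables)
hc <- hclust(dist(scale(wq[, vars])), method = "average")
groups <- cutree(hc, k = 3)

# Principal components analysis: reduce the variables to uncorrelated components
pca <- prcomp(wq[, vars], scale. = TRUE)
print(summary(pca))

# MANOVA: test for differences among sites on all variables jointly
fit_manova <- manova(cbind(pH, DO, BOD, COD) ~ site, data = wq)
print(summary(fit_manova))

# Multivariate multiple regression: two responses, two predictors
fit_mmr <- lm(cbind(BOD, COD) ~ pH + DO, data = wq)
print(coef(fit_mmr))
```

Each call takes the whole set of variables at once, which is the point of the multivariate approach: the correlations among pH, DO, BOD, and COD enter the analysis instead of being ignored.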
1.3 CONFIGURATION OF MULTIVARIATE DATA
We can organize multivariate data for various variables measured from a number of samples (items) in a table. The number of samples (items) specifies the number of rows, and the number of variables specifies the number of columns of the table.
In general, Table 1.1 shows the configuration of n samples (representing the rows) and k variables (representing the columns) measured for each sample.
Note: The total number of variables measured for each sample (k) is usually smaller than the total number of samples, n.
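The row-by-column layout described above can be sketched in R as a matrix (or data frame) with one row per sample and one column per variable. The object name `Y`, the labels, and the numbers below are made up purely to show the shape of the data.

```r
# n = 5 samples (rows) and k = 3 variables (columns); values are hypothetical
Y <- matrix(
  c(7.1, 6.5, 10.2,
    7.4, 6.1, 11.0,
    7.9, 5.8, 12.4,
    8.0, 6.9,  9.7,
    7.6, 6.3, 10.9),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("sample", 1:5), c("Y1", "Y2", "Y3"))
)

n <- nrow(Y)        # number of samples
k <- ncol(Y)        # number of variables
print(dim(Y))       # rows, then columns
print(Y[2, "Y3"])   # value of variable Y3 measured on sample 2
print(colMeans(Y))  # one mean per variable
```

Here k = 3 is smaller than n = 5, in line with the note above.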
1.4 EXAMPLES OF MULTIVARIATE DATA
We can illustrate the concept of multivariate data with some real examples from environmental science, which also make the associated computations easier to explain. The interpretation of these examples will be given later, including the scenario of each case.
TABLE 1.1 The Configuration of n Samples and k Variables in Multivariate Form
EXAMPLE 1.1 ASSESSMENT OF SURFACE WATER
An assessment of water quality in the Juru and Jejawi Rivers in the Penang State of Malaysia was conducted by monitoring 10 physiochemical parameters. The electrical conductivity (EC) and temperature were measured employing an HACH portable pH meter, and dissolved oxygen (DO) was measured with a YSI 1000 DO meter. The chemical oxygen demand (COD), biochemical oxygen demand (BOD), total phosphate concentration, and total nitrate were analyzed employing a spectrophotometer (HACH/2010). The turbidity was measured employing a nephelometer. The total suspended solids (TSS) were analyzed gravimetrically in the laboratory. The APHA Standard Methods for the Examination of Water and Wastewater were applied to analyze the concentrations of the above-mentioned parameters. The data obtained from 10 different sites on each river are given in Table 1.2.
TABLE 1.2 Physiochemical Parameters for the Juru and Jejawi Rivers
Location  Temperature  pH  DO  BOD  COD  TSS  EC  Turbidity  Total phosphate  Total nitrate
Juru  28.15  7.88  6.73  10.56  1248.00  473.33  42.45  13.05  0.88  12.45
Juru  28.20
In this example, 20 samples (rows) are collected from the Juru and Jejawi Rivers (10 different sites each), and 10 physiochemical parameters (columns) are measured from each sample. It is very difficult for the researchers to make a decision about the pollution status of the rivers, or about the behavior of the different parameters under study, by inspecting the raw values: there are many different values, and the differences (fluctuations) between the values of the parameters from sample to sample can mislead the researcher. Thus, these data should be analyzed employing an appropriate multivariate method that meets the objective of the study, in order to untangle the information and to understand the phenomenon as a whole. The objectives of this research were to determine the range of similarity among the sampling sites, to recognize the variables responsible for spatial differences in river water quality, to determine the unobserved factors that explain the structure of the data set, and to quantify the effect of possible natural and anthropogenic sources on the chosen water parameters of the rivers.
EXAMPLE 1.2 ASSESSMENT OF LANDFILL LEACHATE TREATMENT
A researcher wishes to investigate the concentrations of 12 heavy metals (magnesium (Mg), calcium (Ca), sodium (Na), iron (Fe), zinc (Zn), copper (Cu), chromium (Cr), cadmium (Cd), lead (Pb), arsenic (As), cobalt (Co), and manganese (Mn)) and eight physiochemical parameters (chemical oxygen demand (COD), biochemical oxygen demand (BOD), total dissolved solids (TDS), total suspended solids (TSS), electrical conductivity (EC), pH, ammoniacal-N (NH3N), and dissolved oxygen (DO)) in three ponds (collection, aeration, and stabilization) of a landfill leachate. The data for the three ponds are given in Table 1.3.
TABLE 1.3 The Results of Physiochemical and Heavy Metals Obtained From Three Ponds of Landfill Leachate
Pond  pH  EC  TDS  TSS  COD  BOD  NH3N  DO  Mg  Ca
Collection  7.19  6242.00  4530.00  43.26  944.67  9.57  81.66  6.90  12.58  14.28
Collection  7.36  6062.00  4392.00  29.45  845.33  8.77  61.33  6.69  17.91  13.17
Collection  7.41  6164.00  4476.00  31.52  897.00  9.21  71.00  6.48  14.44  15.54
Collection  7.92  6710.33  4710.00  39.36  810.00  8.66  52.43  6.85  19.99  88.25
Collection  7.84  6690.00  4709.83  36.46  686.67  9.66  55.80  6.85  20.50  81.78
Collection  7.83  6600.33  4654.83  37.00  1365.00  10.33  52.80  7.23  21.18  85.18
Collection  8.46  6893.00  4849.00  47.33  692.00  8.47  110.67  2.34  27.23  24.94
Collection  8.47  6850.00  4860.00  50.67  967.00  9.27  93.00  2.82  19.01  14.38
Collection  8.48  6873.00  4859.00  49.00  833.00  8.87  103.00  2.67  24.66  19.99
Aeration  8.98  2501.00  1792.00  101.67  502.67  76.33  3.27  7.88  10.98  17.04
Aeration  8.97  2438.00  1750.00  89.00  543.67  93.67  8.90  7.62  10.99  9.84
Aeration  9.07  2422.00  1762.00  92.00  526.67  71.00  10.80  8.31  10.98  15.00
Aeration  8.21  2255.00  1594.00  148.83  566.67  4.66  4.30  7.23  18.51  54.30
Aeration  8.38  2092.83  1577.83  151.00  618.33  15.33  3.80  7.37  20.14  54.08
Aeration  8.32  2240.67  1578.83  162.67  573.33  10.00  4.40  7.55  21.77  55.30
Aeration  9.97  1513.00  1065.00  51.00  339.00  9.10  20.08  8.08  7.96  5.25
Aeration  10.18  1497.00  1049.00  42.00  322.00  9.32  23.27  8.01  6.85  4.44
Aeration  10.21  1532.00  1084.00  38.00  324.00  8.95  23.26  8.12  8.53  6.78
Stabilization  8.67  1491.00  1087.00  115.00  521.33  17.33  1.37  6.95  20.82  19.99
Stabilization  8.60  1521.00  1104.00  105.33  504.00  29.67  7.82  6.42  20.50  23.10
Stabilization  8.58  1545.00  1123.00  108.33  532.00  47.33  6.00  5.71  20.48  27.13
Stabilization  8.82  1321.83  932.67  98.00  253.67  13.66  6.60  7.55  11.92  66.96
Stabilization  8.92  1352.88  965.00  89.39  520.00  15.00  6.20  7.23  12.46  65.28
Stabilization  8.93  1314.83  942.83  101.61  395.00  15.00  6.20  7.23  12.34  62.13
Stabilization  9.09  652.00  462.00  57.00  248.00  4.98  2.60  6.74  4.47  1.96
Stabilization  9.19  669.00  476.00  51.00  257.00  5.61  2.59  6.71  4.11  2.31
Stabilization  9.43  680.00  480.00  47.00  273.00  4.92  1.93  6.66  3.87  2.30
TABLE 1.3 The Results of Physiochemical and Heavy Metals Obtained From Three Ponds of Landfill Leachate (continued)
Pond  Na  Fe  Zn  Cu  Cr  Cd  Pb  As  Co  Mn
Collection  978.37  0.35  14.46  9.80  31.50  0.31  2.71  32.28  6.34  5.44
Collection  978.37  0.23  17.89  9.03  22.37  0.32  3.35  20.55  5.14  9.55
Collection  978.37  0.62  11.50  9.59  10.70  0.01  2.16  9.75  3.95  16.60
Collection  284.44  0.56  23.80  57.10  18.12  0.85  8.03  11.52  6.26  52.72
Collection  273.47  0.56  24.00  57.40  17.04  1.73  7.60  14.41  7.43  51.82
Collection  283.80  0.56  23.10  56.80  17.58  1.71  7.04  10.09  6.61  50.92
Collection  673.45  1.34  61.01  64.92  42.36  0.04  0.70  25.21  21.30  109.34
Collection  633.35  1.98  49.15  53.45  59.06  0.56  3.91  14.81  18.78  133.59
Collection  665.61  0.79  27.48  58.75  52.27  0.06  2.08  17.07  23.67  122.59
Aeration  980.00  0.25  151.10  0.43  47.82  0.01  2.82  10.72  6.74  22.94
Aeration  975.55  0.73  259.08  2.83  3.67  0.39  2.72  11.12  2.61  32.44
Aeration  970.55  1.21  203.88  5.23  91.97  0.08  2.62  10.92  10.81  41.94
Aeration  525.57  0.87  425.46  69.42  86.18  0.56  7.17  21.66  16.18  4.97
Aeration  516.21  0.93  424.43  66.68  89.81  0.35  7.57  21.36  15.99  5.32
Aeration  517.13  0.85  423.45  63.63  85.56  0.16  7.37  20.68  15.78  5.96
Aeration  676.98  0.77  28.05  13.66  27.88  0.32  2.58  6.73  5.73  8.52
Aeration  630.31  0.94  18.69  15.97  37.11  0.08  4.90  10.12  5.61  10.34
Aeration  665.12  0.57  34.44  16.75  30.65  0.11  3.86  7.66  6.75  9.52
Stabilization  975.37  1.33  117.47  3.90  55.15  0.30  3.80  22.23  15.01  11.50
Stabilization  978.37  0.81  278.47  5.21  70.29  0.01  4.12  18.10  20.84  21.61
Stabilization  973.37  0.29  175.47  6.51  54.57  0.00  2.90  12.15  17.93  30.51
Stabilization  411.60  1.37  229.90  23.93  48.47  0.28  6.45  16.55  5.19  6.44
Stabilization  439.04  1.43  227.39  25.23  49.26  0.61  5.25  15.75  4.75  6.02
Stabilization  494.26  1.23  221.89  25.03  48.83  0.01  5.89  17.35  4.62  6.82
Stabilization  141.00  0.33  149.11  6.42  12.88  0.02  2.48  5.71  1.28  1.63
Stabilization  103.81  0.87  240.77  7.91  8.76  0.14  4.63  3.32  1.54  1.93
Stabilization  81.20  0.51  190.68  6.64  11.00  0.04  3.78  2.37  1.62  1.66
The leachate samples were collected from the collection, aeration, and stabilization ponds in the ATLS leachate collection system. The samples were collected three times during the period between August 2017 and January 2018, with three sampling points at each pond. The samples were manually gathered and placed in 500 ml polyethene containers. The samples were immediately transported to the laboratory and cooled to 4°C to reduce biological and chemical reactions (Japan International Cooperation Agency (JICA)).
In this example, 27 samples (rows) were collected from the three ponds (collection, aeration, and stabilization), and 20 parameters (columns) were measured from each sample. It is not easy for the scientist to make a decision about the treatment process of the landfill or about the behavior of the different parameters under study because there are many different values (27 samples × 20 parameters = 540 values). Thus, the relationships among the chosen parameters should be investigated and studied properly to understand the differences (fluctuations) in the parameters from one pond to another. These data should be analyzed employing an appropriate technique to achieve the objectives of the project. The first objective was to assess whether the treatment process of the landfill leachate worked properly; the relationships among the chosen parameters should be investigated for more information on the behavior of each variable in the presence of the other chosen variables, which would help to identify the source of the variation. The second objective was to assess the effect of the landfill on the groundwater and surface water in the chosen area (the data for groundwater and surface water are not presented to save space). Furthermore, the contribution of each chosen variable (parameter) in explaining the total variation in the collected data was identified employing multivariate methods. This research may help in estimating the impact of the landfill on groundwater and surface water in the chosen area.
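To make the size of the problem concrete, the following sketch enters a small excerpt of Table 1.3 (pH, EC, and TDS for the first three samples of each pond) into an R data frame and summarizes it by pond. The object names `leachate` and `pond_means` are assumptions made for this illustration, and the excerpt is far too small to support any conclusion about the treatment process.

```r
# Excerpt of Table 1.3: first three samples per pond, three of the 20 parameters
leachate <- data.frame(
  Pond = factor(rep(c("Collection", "Aeration", "Stabilization"), each = 3),
                levels = c("Collection", "Aeration", "Stabilization")),
  pH  = c(7.19, 7.36, 7.41, 8.98, 8.97, 9.07, 8.67, 8.60, 8.58),
  EC  = c(6242, 6062, 6164, 2501, 2438, 2422, 1491, 1521, 1545),
  TDS = c(4530, 4392, 4476, 1792, 1750, 1762, 1087, 1104, 1123)
)

# Size of the full table described in the text
total_values <- 27 * 20
print(total_values)

# Mean of each parameter within each pond
pond_means <- aggregate(cbind(pH, EC, TDS) ~ Pond, data = leachate, FUN = mean)
print(pond_means)

# Pairwise correlations among the three parameters in this excerpt
print(round(cor(leachate[, c("pH", "EC", "TDS")]), 3))
```

Even with three parameters the pond means and the correlation matrix are easier to read than the raw rows, which is the motivation for the multivariate summaries used with all 20 parameters.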
EXAMPLE 1.3 INORGANIC ELEMENTS IN THE PARTICULATE MATTER IN THE AIR
A researcher wishes to investigate the concentrations of nine inorganic elements in the particulate matter (PM10) in the air of an equatorial urban coastal location. In 2009, air pollution levels were studied during the summer and winter monsoon seasons employing high-volume sampling techniques. Atomic absorption spectrophotometry was employed to collect PM10 samples, with an average time of 24 h. The parameters were the particulate matter PM10, aluminum (Al), zinc (Zn), iron (Fe), copper (Cu), calcium (Ca), sodium (Na), manganese (Mn), nickel (Ni), and cadmium (Cd). The data are given in Table 1.4.
TABLE 1.4 The Results of Inorganic Elements in Particulate Matter in the Air (μg/m3)
Season  PM10  Al  Zn  Fe  Cu  Ca  Na  Mn  Ni  Cd
Summer  37.71  0.01  2.32  0.97  0.00  1.37  7.94  0.05  0.01  0.01
Summer  48.03  0.01  2.05  0.75  0.00  1.56  8.46  0.05  0.01  0.04
Summer  67.87  0.01  3.43  0.44  0.00  1.41  9.98  0.06  0.10  0.05
Summer  39.01  0.01  4.24  0.45  0.00  1.12  12.93  0.01  0.07  0.00
Summer  38.33  0.01  3.51  0.50  0.02  1.60  9.92  0.02  0.00  0.05
Summer  29.70  0.01  2.78  0.58  0.02  1.38  12.13  0.04  0.06  0.02
Summer  53.66  0.01  2.40  0.73  0.02  1.59  7.81  0.23  0.11  0.02
Summer  132.28  0.01  2.34  0.55  0.02  1.59  9.01  0.03  0.00  0.01
Summer  66.31  0.01  2.13  0.45  0.02  1.23  8.76  0.02  0.00  0.05
Summer  69.20  0.01  2.13  0.50  0.03  1.95  8.66  0.03  0.08  0.05
Summer  78.17  0.01  1.96  0.56  0.04  1.35  8.88  0.03  0.02  0.07
Summer  31.63  0.01  2.21  0.42  0.04  1.24  9.02  0.02  0.08  0.05
Summer  66.73  0.01  1.46  0.51  0.03  1.33  8.53  0.02  0.13  0.03
Summer  113.56  0.01  2.07  0.60  0.03  1.51  7.65  0.02  0.06  0.05
Summer  123.40  0.01  1.75  0.57  0.03  1.75  7.45  0.02  0.12  0.04
Summer  72.39  0.01  1.61  0.57  0.03  1.54  8.40  0.02  0.07  0.02
Summer  51.85  0.01  1.16  0.51  0.18  1.72  7.44  0.01  0.09  0.00
Summer  77.59  0.01  1.66  0.81  0.04  2.04  7.89  0.03  0.08  0.03
Summer  30.30  0.01  1.09  0.61  0.03  1.80  6.28  0.02  0.05  0.03
Summer  100.40  0.01  1.82  0.47  0.04  1.70  7.05  0.02  0.05  0.02
Summer  132.98  0.01  1.86  0.52  0.03  1.74  7.69  0.01  0.10  0.03
Summer  126.38  0.01  0.34  0.02  0.00  0.01  1.36  0.01  0.18  0.00
Summer  31.82  0.01  1.41  0.52  0.03  1.93  7.04  0.02  0.04  0.00
Summer  110.53  0.01  1.25  0.67  0.04  2.29  7.39  0.02  0.03  0.00
Summer  38.41  0.01  1.55  0.79  0.12  2.21  7.74  0.02  0.06  0.03
Summer  124.26  0.01  2.24  0.00  0.00  1.56  9.65  0.07  0.06  0.01
Summer  53.62  0.01  1.97  0.01  0.00  1.39  7.07  0.03  0.04  0.03
Summer  23.30  0.01  1.28  0.02  0.00  1.80  6.39  0.03  0.00  0.00
Summer  67.34  0.01  0.91  0.02  0.00  0.80  7.14  0.00  0.08  0.00
Summer  22.58  0.03  1.29  0.02  0.01  0.92  7.21  0.02  0.05  0.02
Summer  54.52  0.03  1.37  0.04  0.01  1.48  7.98  0.01  0.03  0.03
Summer  112.83  0.01  1.88  0.03  0.00  1.46  8.44  0.01  0.02  0.00
Summer  175.28  0.01  1.68  0.03  0.00  1.38  8.13  0.00  0.03  0.00
Summer  47.30  0.01  2.20  0.03  0.00  1.13  11.39  0.01  0.03  0.01
Summer  57.08  0.01  1.69  0.04  0.00  1.04  8.34  0.02  0.07  0.00
Summer  15.16  0.01  1.58  0.04  0.01  1.77  6.69  0.05  0.08  0.02
Summer  272.98  0.01  1.88  0.04  0.01  1.60  7.05  0.04  0.08  0.00
Summer  101.82  0.01  1.86  0.02  0.01  1.79  6.02  0.05  0.13  0.00
Summer  59.90  0.01  1.73  0.02  0.01  1.70  7.25  0.05  0.00  0.02
Summer  31.07  0.01  1.92  0.02  0.02  1.40  4.92  0.05  0.03  0.04
Summer  107.43  0.01  2.03  0.02  0.02  1.48  5.19  0.04  0.06  0.00
Summer  30.51  0.01  2.42  0.03  0.03  2.08  5.70  0.07  0.00  0.04
Summer  84.02  0.01  2.10  0.22  0.01  1.61  5.85  0.03  0.00  0.00
Summer  7.37  0.01  2.63  0.35  0.02  1.24  3.90  0.03  0.00  0.00
Summer  7.65  0.01  1.38  0.34  0.01  1.11  4.20  0.04  0.00  0.00
Summer  15.05  0.01  1.56  0.21  0.02  0.35  2.47  0.03  0.00  0.00
Summer  84.13  0.01  1.89  0.04  0.02  0.41  4.44  0.02  0.00  0.02
Winter  82.87  0.01  1.66  0.15  0.02  0.64  4.79  0.02  0.06  0.00
Winter  138.25  0.01  2.53  0.30  0.03  0.42  3.02  0.02  0.00  0.03
Winter  82.67  0.01  2.66  0.31  0.04  0.03  3.16  0.03  0.00  0.00
Winter  46.08  0.01  3.39  0.52  0.02  0.21  2.34  0.03  0.00  0.00
Winter  15.33  0.01  2.63  0.23  0.02  0.06  4.38  0.03  0.05  0.00
Winter  30.78  0.01  1.98  0.17  0.01  0.40  3.68  0.01  0.00  0.00
Winter  155.00  0.01  3.72  0.15  0.02  1.21  4.24  0.02  0.00  0.00
Winter  61.75  0.01  2.93  0.00  0.03  0.92  4.29  0.01  0.00  0.00
Winter  88.90  0.01  4.91  0.11  0.02  0.93  6.65  0.02  0.00  0.00
Winter  38.47  0.01  3.46  0.15  0.02  0.73  4.25  0.01  0.01  0.00
Winter  24.46  0.01  1.97  0.19  0.01  0.75  1.96  0.01  0.00  0.00
Winter  14.29  0.01  2.76  0.11  0.01  0.67  9.55  0.03  0.00  0.00
Winter  65.52  0.01  2.56  0.32  0.01  0.54  2.70  0.04  0.00  0.00
Winter  63.33  0.01  1.21  0.16  0.02  0.53  3.40  0.04  0.03  0.00
Winter  62.28  0.01  3.52  0.43  0.01  0.80  7.78  0.02  0.11  0.00
Winter  65.87  0.01  3.48  0.25  0.02  0.99  11.06  0.01  0.03  0.00
Winter  122.83  0.01  3.24  0.41  0.03  1.09  11.47  0.01  0.06  0.00
Winter  50.18  0.00  2.40  0.49  0.02  1.16  7.90  0.02  0.11  0.00
Winter  60.19  0.01  3.02  0.38  0.02  0.84  9.51  0.02  0.06  0.01
Winter  28.28  0.01  3.68  0.48  0.06  1.18  12.21  0.01  0.09  0.00
Winter  42.50  0.01  1.86  0.46  0.06  1.34  10.51  0.03  0.07  0.00
Winter  49.68  0.01  2.29  0.32  0.04  1.57  11.02  0.02  0.00  0.04
Winter  39.91  0.01  3.30  0.45  0.05  1.46  8.61  0.03  0.08  0.02
Winter  37.94  0.01  3.57  0.31  0.08  1.59  9.47  0.04  0.00  0.05
Winter  27.97  0.01  3.14  0.09  0.06  1.46  12.08  0.03  0.00  0.05
Winter  45.34  0.01  2.59  0.26  0.06  1.59  11.43  0.03  0.06  0.00
Winter  58.45  0.01  3.46  0.12  0.06  1.66  8.98  0.04  0.10  0.00
Winter  43.30  0.01  1.58  0.09  0.05  1.46  7.59  0.03  0.00  0.00
Table 1.4 provides a large and complex data set. It can easily be observed that the raw table alone does not provide helpful information for making a decision. The objective of the study was to assess the air quality of Penang, Malaysia, in terms of PM10 and inorganic elements, and to recognize the main sources of PM10 and inorganic elements, whether crustal or noncrustal. The goals included investigating the relationships between the chosen inorganic elements and PM10 during the summer and winter monsoons and determining the similarities between the chosen parameters.
EXAMPLE 1.4 HEAVY METALS IN SEDIMENT (mg/L)
A researcher wishes to investigate the concentrations of eight heavy metals (cadmium (Cd), iron (Fe), copper (Cu), zinc (Zn), chromium (Cr), mercury (Hg), manganese (Mn), and lead (Pb)) in sediments obtained from two sites. The two sites, with 10 sampling points at each site, were Kuala Juru (the Juru River) and Bukit Tambun (the Jejawi River) in the Penang State of Malaysia. The sediment samples were gathered at low tide with an Eijkelkamp gouge auger from each of the 20 chosen sampling points (10 sampling points from each river estuary). A flame atomic absorption spectrometer (FAAS; Perkin Elmer HGA-600) was employed for the analysis of Cu, Pb, Zn, Cd, Mn, Fe, and Cr, and a cold vapor atomic absorption spectrometer (CV-AAS) method was used for Hg analysis after sample digestion in an acid solution. The data are given in Table 1.5.
TABLE 1.5 The Concentration of Heavy Metals in Sediment (mg/L) for Juru and Jejawi Rivers
In this study, the researcher wanted to investigate and understand the interrelationships between the chosen parameters, to extract information about the similarities or differences between the sampling sites, to identify the variables (heavy metals) accountable for the spatial differences between the river estuaries, and to assess the effect of the possible sources (natural and anthropogenic) on the chosen heavy metals of the two river estuaries.
1.5 MULTIVARIATE NORMAL DISTRIBUTION
The normal distribution is the most important distribution in statistics because many of the tests used in statistics require that the assumption of normality be met; that is, that the data are gathered from a normally distributed population.
A brief explanation of the univariate and multivariate normal distributions is given below.
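1.5.1 Univariate Normal Distribution
Suppose Y is a random variable that follows the normal distribution. Then, the probability density function of the univariate normal distribution is given in Eq. (1.1).
$$ f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(y-\mu)^{2}}{2\sigma^{2}} \right), \qquad -\infty < y < \infty \tag{1.1} $$
where
μ is the mean; and
σ² is the variance.
The univariate normal distribution can be written as Y ~ N(μ, σ²).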
The normal distribution (bell-shaped) curve is presented in Fig. 1.1. R statistical software was used to generate the curve. The commands and built-in functions for creating a normal distribution curve are presented in the Appendix.
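The book's own plotting commands are in its Appendix and are not reproduced here. As a minimal independent sketch, the standard normal density f(y) = exp(-(y - μ)² / (2σ²)) / (σ√(2π)) can be coded by hand, checked against R's built-in `dnorm()`, and drawn with `curve()`. The helper name `normal_pdf` is hypothetical.

```r
# Hand-coded univariate normal density (hypothetical helper, for illustration)
normal_pdf <- function(y, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(y - mu)^2 / (2 * sigma^2))
}

# The hand-coded version should agree with the built-in dnorm()
y <- seq(-4, 4, by = 0.01)
agree <- isTRUE(all.equal(normal_pdf(y), dnorm(y)))
print(agree)

# Bell-shaped curve for the standard normal distribution
curve(dnorm(x, mean = 0, sd = 1), from = -4, to = 4,
      xlab = "y", ylab = "f(y)", main = "Normal distribution")
```

Note that `dnorm()` is parameterized by the standard deviation `sd` (σ), not by the variance σ².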
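1.5.2 Multivariate Normal Distribution
Suppose Y is a random vector of k random variables that follows the multivariate normal distribution. Then, the multivariate normal density is given in Eq. (1.2).
$$ f(\mathbf{Y}) = \frac{1}{(2\pi)^{k/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\left( -\frac{1}{2} (\mathbf{Y}-\boldsymbol{\mu})' \boldsymbol{\Sigma}^{-1} (\mathbf{Y}-\boldsymbol{\mu}) \right) \tag{1.2} $$
where k is the number of variables; μ is the mean vector; Σ is the covariance matrix; and (Y − μ)′Σ⁻¹(Y − μ) is the squared Mahalanobis distance (statistical distance).
The multivariate normal distribution can be denoted as Y ~ N_k(μ, Σ).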
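The multivariate normal density can be sketched in base R using the built-in `mahalanobis()` function, which returns the squared distance (y − μ)′Σ⁻¹(y − μ). The helper `mvn_pdf`, the mean vector, and the covariance matrices below are all hypothetical values chosen for illustration.

```r
# Hand-coded multivariate normal density (hypothetical helper)
mvn_pdf <- function(y, mu, Sigma) {
  k  <- length(mu)
  d2 <- mahalanobis(y, center = mu, cov = Sigma)  # squared statistical distance
  exp(-d2 / 2) / ((2 * pi)^(k / 2) * sqrt(det(Sigma)))
}

mu    <- c(0, 0)
Sigma <- matrix(c(1.0, 0.5,
                  0.5, 2.0), nrow = 2, byrow = TRUE)
y     <- c(1, -1)

d2   <- mahalanobis(y, center = mu, cov = Sigma)
dens <- mvn_pdf(y, mu, Sigma)
print(d2)
print(dens)

# Sanity check: with a diagonal covariance matrix the variables are
# independent, so the joint density is the product of univariate densities
Sigma_diag <- diag(c(1, 4))
joint      <- mvn_pdf(y, mu, Sigma_diag)
product    <- dnorm(1, mean = 0, sd = 1) * dnorm(-1, mean = 0, sd = 2)
print(isTRUE(all.equal(joint, product)))
```

For k = 1 the same function reduces to the univariate normal density, with Σ playing the role of σ².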
Note:
• A suitable transformation of the data should be used if the normality assumption is violated by one or more variables under investigation, as when the data are highly skewed with several outlying (extreme) values (high or low) or repeated values.
• If all the individual variables follow a normal distribution, the combined (joint) distribution is commonly assumed to be multivariate normal, although normality of each variable separately does not by itself guarantee joint normality.
• In practice, real data never follow a multivariate normal distribution completely; however, the normal density can be employed as an approximation of the true population distribution.
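One common way to act on the first note is to check a skewed variable before and after a transformation. The sketch below uses simulated right-skewed values (not data from the book) with a log transformation and the Shapiro-Wilk test from base R; the choice of transformation and of test is an assumption made for illustration, and other choices may suit a given data set better.

```r
# Simulated right-skewed "concentrations" (hypothetical data)
set.seed(42)
conc <- rlnorm(50, meanlog = 1, sdlog = 0.8)

# Shapiro-Wilk normality test on the raw values
p_raw <- shapiro.test(conc)$p.value
print(p_raw)

# Log-transform, then test again
log_conc <- log(conc)
p_log <- shapiro.test(log_conc)$p.value
print(p_log)

# Visual check of the transformed values against a normal distribution
qqnorm(log_conc)
qqline(log_conc)
```

A small p-value suggests departure from normality; comparing the two p-values and inspecting the Q-Q plot indicates whether the transformation helped.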
FIG. 1.1 Normal distribution plot.
2 R Statistical Software
LEARNING OBJECTIVES
After careful consideration of this chapter, you should be able:
• To know R software.
• To know RStudio.
• To describe how to set up R packages.
• To know how to write variables, vectors, sequences, and matrices in the R language.
• To use R commands and built-in functions for environmental data.
• To understand how to call stored data files into the R environment.
• To know how to write simple scripts.
• To choose suitable R commands to write the scripts.
• To know how to generate high-resolution plots in R.
• To understand the concept of the working directory and how to set a new working directory.
• To describe R output and extract a useful report.
2.1 INTRODUCTION
R is a programming language and software environment for statistical computing and graphics. R has been employed by professionals at various organizations, colleges, survey institutions, and others. Many statistical packages are available for statistical analysis; however, researchers, scientists, and others who are involved in data analysis, design, modeling, and producing attractive, high-resolution plots often prefer R. R is offered for free as open-source software under the terms of the GNU General Public License: “R is an official part of the Free Software Foundation’s GNU project, and the R Foundation has similar objectives to other open-source software foundations like the Apache Foundation or the GNOME Foundation.” The R language is similar to the S language and environment that was developed at Bell Laboratories. R is employed by millions of researchers around the world, and the number of R users continues to increase. R has become a substantial and engaging statistical software for the following reasons:
• R is free (open-source) software that includes many packages, and many sources around the world permit downloading and installing it, regardless of your position, the institution you work with, or whether you are affiliated with a public or private organization.
• R offers many built-in functions that make the steps of the analysis simple and easy. We can carry out data analysis in R by writing scripts that define the required variables and calling built-in functions to carry out the required process, such as computing the correlation, average, variance, or other statistical values.
• R can produce clear, high-resolution, and unique graphs that meet specific standards or reflect a particular view of the task (convey thoughts through the plot).
• R can easily be used by researchers without programming proficiency.
• R can be downloaded and installed for various operating systems such as Windows, Linux, and Mac OS.
• The R library provides many statistical and graphical packages for data analysis, computation, and graphical applications that generate high-quality plots.
• It is easy to interact with the online community around the world, exchange ideas, and receive assistance.
• R code, commands, and functions are available online for free; moreover, many sources offer free demonstrations of R, including courses, material, and responses to inquiries, which other software packages do not offer.
• RStudio offers additional facilities for working with R and is simpler and friendlier to use than the plain R console.
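As a small illustration of the built-in functions mentioned in the list above, the following sketch computes an average, a variance, and a correlation. The pH and DO values are taken from the first three collection-pond rows of Table 1.3; the variable names `ph` and `do_mg` are assumptions made for this example.

```r
# pH and dissolved oxygen (DO) for the first three collection-pond samples
ph    <- c(7.19, 7.36, 7.41)
do_mg <- c(6.90, 6.69, 6.48)

ph_mean <- mean(ph)        # average
ph_var  <- var(ph)         # sample variance (divisor n - 1)
r       <- cor(ph, do_mg)  # Pearson correlation between pH and DO

print(ph_mean)
print(ph_var)
print(r)
```

Each statistic is a single function call, so a script only needs to define the variables and name the function to apply.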
The concepts and terms related to the R statistical package are addressed here to provide a starting point for readers who are new to the R language and environment. Beginners will be guided on how to download and install the software (R and RStudio), understand some ideas and related concepts employed in R, and write simple scripts in R. We have attempted, as much as we could, to make the procedures and instructions simple and comprehensible to everybody. Many examples are provided to lead researchers step by step and to make the procedure interesting.
We can download and install R statistical software in a few simple steps, and then install the packages that are required in addition to the base software. R provides many packages for carrying out various statistical methods. After installing R, we can download and install RStudio, which can be employed to operate R effectively and in a friendlier way than the plain R console.
2.2 INSTALLING R
Suppose that we have not installed R statistical software yet. The software can be downloaded and installed for free by following the six steps below.
1. Type https://cran.r-project.org/ into a web browser and press “Enter”; “The Comprehensive R Archive Network” will appear as shown in Fig. 2.1.
2. The screen for “The Comprehensive R Archive Network” offers three choices to download R, based on the operating system of the computer, as presented below:
(1) Download R for Linux
(2) Download R for (Mac) OS X
(3) Download R for Windows
If we choose R for Windows, click “install R for the first time” (or “base”) as presented in Fig. 2.2.
FIG. 2.1 Comprehensive R Archive Network.
FIG. 2.2 Showing the instructions for installing R for Windows.
FIG. 2.3 The screen to download R-3.5.1 for Windows (32/64 bit).
3. We can click on the available version. The newest version of R shown on the screen is R-3.5.1, or there may be other versions. Click on “Download R-3.5.1 for Windows (62 megabytes, 32/64 bit)” as shown in Fig. 2.3.
4. After clicking on “Download R-3.5.1 for Windows (62 megabytes, 32/64 bit)”, a message “R-3.5.1-win.exe” appears in the bottom-left corner (Fig. 2.4); this indicates that the file has started downloading to the computer.
5. Once the download of R is finished, click on the downloaded file to open another screen, and then click on “Run” as shown in Fig. 2.5. The next step is to follow the guidance given to finish the installation.
6. R is installed on the computer and ready to be used. We can start using R by double-clicking on the R icon.
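Once the installation finishes, one quick way to confirm that R runs is to ask it for its own version. The short sketch below uses only objects that ship with base R (R.version.string, R.version, and .Platform); the variable names are chosen here for illustration, and the version reported depends on what was installed, so it need not be 3.5.1.

```r
# Report the version of R that is currently running.
version_text <- R.version.string
print(version_text)

# The major and minor parts are also available separately as character strings.
print(R.version$major)
print(R.version$minor)

# Broad type of operating system R was built for ("windows" or "unix").
os_type <- .Platform$OS.type
print(os_type)
```

If these lines print without an error, the installation is working and the Console is ready for the commands that follow.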
2.2.1 R Material
It is highly useful to have a handbook or manuals to guide readers using new software, particularly novices who are new to R. This service is provided for free; we can use online websites such as https://cran.r-project.org/, or other authorized R sources, to download manuals and notes. The handbooks, manuals, and notes can also be obtained offline by using the Help button in the upper row of the R environment, as presented in Fig. 2.6; the Help button provides many options, one of which is Manuals (in PDF).
FIG. 2.4 Showing the location of the downloaded file.
FIG. 2.5 Showing the directions to download the software.
We have found that the handbooks and associated notes are helpful and offer clear direction, particularly for novice readers. Some documents have been produced in various languages, such as Russian, Chinese, and German.
2.2.2 R Packages
R offers various statistical packages, and some of them are built in (standard/base packages, which are loaded once the R installation is finished). Users can download other packages from the “Packages” menu in the upper row of the R Console. The packages can be downloaded by clicking “Install package(s)” and then selecting the site from which you wish to download. A list of packages is provided in R, so you can select the package you need to install. The search() function is used to display the packages that are loaded on your computer when R starts.
search()
The function search() is used to show the loaded packages. Note that R is case-sensitive, so the function name must be typed in lowercase.
> search()
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "package:methods"   "Autoloads"         "package:base"
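The menu steps for installing packages also have Console equivalents. The sketch below is a hedged illustration: install.packages() and library() are standard R functions, but the package name "ggplot2" is only a placeholder chosen here, and the installation line is commented out because it needs an internet connection.

```r
# Install a contributed package from CRAN (commented out: requires internet access).
# "ggplot2" is only an illustrative package name, not one required by the text.
# install.packages("ggplot2")

# Attach a package that is already installed; 'stats' ships with every R installation.
library(stats)

# search() returns the attached packages and environments as a character vector.
attached <- search()
print(attached)

# Check whether a particular package is currently attached.
stats_attached <- "package:stats" %in% attached
print(stats_attached)
```

Because search() returns an ordinary character vector, its result can be stored and tested like any other data, which is convenient when a script must confirm that a package is available before using it.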
2.3 THE R CONSOLE
The screen where we type commands and run scripts to perform an analysis or any desired computation is called the R Console. By default there is a symbol called the “command prompt”, >, which is placed at the end of the Console. R commands should be written after the command prompt. For example, 8 + 3 is typed and then followed by “Enter” to obtain the output (11), as presented in Fig. 2.7.
The buttons for File, Edit, View, Misc, Packages, Windows, and Help are shown in the upper row of the screen; each button has choices to carry out a particular job.
Note:
• Output lines are preceded by an index in square brackets, such as [1], which gives the position of the first value printed on that line.
• Built-in functions are called to perform the required job, as we have seen in the preceding example for 8 + 3; pressing “Enter” invokes the built-in function for addition to perform the requested task. Built-in functions in R are called by name, followed by the arguments in parentheses; pressing “Enter” then instructs R to perform the requested job. Usually, the built-in functions are available in the memory of the computer; for instance, to quit R, we write the built-in function q().
> q()
• Asking for assistance is achieved by using the function help(), which opens a new screen concerning the requested subject. For example, the help(quit) command displays the documentation for quit (terminate an R session) in the R library, containing a description of the function, its usage, arguments, references, and examples.
> help(quit)
FIG. 2.6 Showing the steps to download notes and manuals.
The question mark (?) is another built-in function used for requesting assistance, which may be used as a shortcut, as in ?quit. The two commands help(quit) and ?quit are equivalent. The screen for calling assistance using both functions is presented in Fig. 2.8.
• Writing the name of a variable as a command works similarly to the built-in function print(); for example, the outcome of B = 6 + 4 can be printed (displayed on screen) either by writing B or by writing print(B), as presented below.
> B = 6 + 4
> B  # Print the value of B
[1] 10
> print(B)  # Print the value of B
[1] 10
• We can transfer the output to a file by using R built-in functions such as write.table(), or the simplest method is to copy and paste from the R Console into a Word document.
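As a minimal sketch of saving output to a file, the lines below write a small result with write.table() and then read it back. The object names (result, out_file, console_file) and the use of tempfile() for a throwaway path are assumptions made for illustration; in practice a real file name would be supplied. capture.output(), also part of base R, is shown as one way to save exactly what the Console would display.

```r
# Store a small result and place it in a data frame so it can be written as a table.
B <- 6 + 4
result <- data.frame(name = "B", value = B)

# tempfile() gives a throwaway path; replace it with a real file name as needed.
out_file <- tempfile(fileext = ".txt")
write.table(result, file = out_file, row.names = FALSE, quote = FALSE)

# Read the file back to confirm what was written.
check <- read.table(out_file, header = TRUE)
print(check)

# capture.output() collects the text that print() would show in the Console.
console_file <- tempfile(fileext = ".txt")
writeLines(capture.output(print(B)), console_file)
print(readLines(console_file))
```

Writing to a file in this way keeps a reproducible record of the result, whereas copy and paste must be repeated by hand each time the analysis changes.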
2.4 EXPRESSION AND ASSIGNMENT IN R
This section covers expression and assignment, including arithmetic operators, mathematical functions, and relational operators.
1. Arithmetic operators refer to the standard arithmetic operators, which are ^, +, -, *, and /; every operator performs a specific action. The job of each operator is shown below:
• ^ or **: exponentiation
• +: addition
• -: subtraction
• *: multiplication
• /: division
FIG. 2.7 Showing R commands placed after the command prompt.
The priority of arithmetic operators in R follows the standard precedence; i.e., ^ is the highest, and addition and subtraction are the lowest. Parentheses are used to control the order of the arithmetic operations.
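The effect of precedence and parentheses can be checked directly. The following sketch uses made-up numbers and illustrative variable names (a, b, c1, c2) to compare expressions with and without parentheses.

```r
# Multiplication is evaluated before addition.
a <- 2 + 3 * 4      # evaluated as 2 + 12
print(a)

# Parentheses force the addition to be done first.
b <- (2 + 3) * 4    # evaluated as 5 * 4
print(b)

# Exponentiation is evaluated before the unary minus ...
c1 <- -2^2          # evaluated as -(2^2)
print(c1)

# ... unless parentheses say otherwise.
c2 <- (-2)^2
print(c2)
```

When in doubt about the order in which R will evaluate an expression, adding parentheses costs nothing and makes the intent explicit.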
EXAMPLE 2.1 EMPLOYING ÷, ^ AND + FUNCTIONS
Compute 9 ÷ 3, 3^2, and 5 + 3 using R.
R operates as a calculator with arithmetic operators to perform the requested jobs. The output of using the R built-in operators for division, exponentiation, and addition is shown below:
> 9 / 3
[1] 3
> 3^2
[1] 9
> 5 + 3
[1] 8
2. R also deals with mathematical functions such as exp (exponential), sqrt (square root), and log (logarithm); more built-in functions are offered by R in various packages.
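EXAMPLE 2.2 EMPLOY LOG FUNCTION
Compute log(3 / (1 + 0.5)). The function log() is used to compute the logarithm; by default it returns the natural logarithm. Because R is case-sensitive, the function name must be written in lowercase.
> log(3 / (1 + 0.5))
[1] 0.6931472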
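Since log() returns the natural logarithm unless told otherwise, it may help to see how other bases and the related functions exp() and sqrt() are called. The sketch below uses only base R functions; the input numbers and variable names are chosen for illustration.

```r
x <- 3 / (1 + 0.5)                    # the quantity from the example, equal to 2

natural_log <- log(x)                 # natural logarithm (base e)
log_base10  <- log(100, base = 10)    # logarithm to base 10
log_base2   <- log2(8)                # shortcut function for base 2
root        <- sqrt(16)               # square root
e_value     <- exp(1)                 # e raised to the power 1

print(natural_log)
print(log_base10)
print(log_base2)
print(root)
print(e_value)

# exp() undoes log(): this recovers the original value of x.
print(exp(natural_log))
```

The base argument of log() accepts any positive base, while log10() and log2() are convenience functions for the two most common cases.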
FIG. 2.8 Showing the help command.
3. Relational operators are also offered by the R statistical software, such as >=, <=, <, >, and !=. The task of every operator is shown below:
• <=: less than or equal,
• >=: greater than or equal,
• <: less than,
• >: greater than, and
• !=: not equal.
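Each relational operator returns a logical value, TRUE or FALSE. The sketch below tries the operators listed above on two made-up numbers; the equality operator ==, which the list does not mention, is included for completeness, and all variable names are illustrative.

```r
x <- 5
y <- 7

print(x <= y)   # is x less than or equal to y?
print(x >= y)   # is x greater than or equal to y?
print(x < y)    # is x less than y?
print(x > y)    # is x greater than y?
print(x != y)   # is x different from y?
print(x == y)   # equality test, written with two equals signs

# Applied to a vector, the comparison is made value by value.
v <- c(3, 7, 9)
v_result <- v >= 7
print(v_result)
```

A single equals sign assigns a value, so a test for equality must always be written with ==.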
2.5 VARIABLES AND VECTORS IN R
R handles a single value as well as vectors. Variables and vectors (matrices) can be defined in R by using the assignment operator <- or the equals sign (=); for instance, Y <- 6 and Y = 6 are equivalent and have exactly the same meaning (assign the value 6 to Y). We can use the function c() to define a series of values in the form of a vector in R.
> y <- c(value1, value2, ...)
where y refers to the variable name, and value1, value2, ... represent the series of values that should be placed between the two parentheses of the function c().
EXAMPLE 2.3 PRODUCE A VECTOR
Place the values 10, 7, 14, 8, 11, 12 in a vector called Y. The function c() is used to represent the given values in vector form.
> Y <- c(10, 7, 14, 8, 11, 12)
Note:
• The assignment operator <- consists of the two characters < and - typed side by side; it acts as a “variable-defining” operator and is equivalent to the operator “=”.
• A comma is used to separate the data values in the vector (?, ?, ...).
• Any command is executed by pressing “Enter.”
• X and x represent two different variables in R, because R is case-sensitive.
• Suppose X and Y are two vectors of the same length; then a new vector will be produced by X + Y (X - Y), with values representing the sum (difference) of the corresponding values of X and Y.
EXAMPLE 2.4 TWO VECTORS OF THE SAME LENGTH
Consider X = 5, 3, 2 and Y = 4, 2, 6; then X + Y equals 9, 5, 8 and X - Y equals 1, 1, -4, each of the same length. The R commands below are used to compute X + Y and X - Y.
> X <- c(5, 3, 2)
> X
[1] 5 3 2
> Y <- c(4, 2, 6)
> Y
[1] 4 2 6
> X + Y
[1] 9 5 8
> X - Y
[1]  1  1 -4
Suppose X and Y are two vectors of different lengths; then a new vector will be produced by X + Y (X - Y), repeating the shorter vector as needed. The number of values (observations) in the resulting X + Y (X - Y) vector is equal to the length of the longer vector.
EXAMPLE 2.5 TWO VECTORS OF DIFFERENT LENGTH
Consider X = 5, 3 and Y = 4, 2, 6; then X - Y equals 1, 1, -1. The R commands below are used to compute X - Y.
> X <- c(5, 3)
> Y <- c(4, 2, 6)
> X - Y
[1]  1  1 -1
Warning message:
In X - Y : longer object length is not a multiple of shorter object length
One can observe that the value 5 of the vector X is used two times, in (5 - 4) and (5 - 6), while the value 3 is used only once, to make the length of X the same as that of Y.
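The repetition of the shorter vector can be made explicit. In the sketch below, the base R function rep_len() is used to repeat X by hand to the length of Y; this is an illustration of what the subtraction does, not a claim about how R implements it internally. The vector W and the other new names are hypothetical and introduced only for this example.

```r
X <- c(5, 3)
Y <- c(4, 2, 6)

# Repeat X by hand until it has as many values as Y: 5, 3, 5.
X_repeated <- rep_len(X, length(Y))
print(X_repeated)

# Subtracting the hand-repeated vector gives the same values as X - Y, without a warning.
manual <- X_repeated - Y
print(manual)

# When the longer length is a multiple of the shorter one, no warning is issued.
W <- c(4, 2, 6, 1)
silent <- X - W      # X is used as 5, 3, 5, 3
print(silent)
```

The warning is therefore a hint that the lengths do not fit evenly, which often signals a data-entry mistake rather than an intended calculation.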
EXAMPLE 2.6 CREATE A VECTOR WITH A ZERO IN THE CENTER
Form a vector with 2n + 1 values representing two copies of X with a zero in the center. The four values 3, 6, 4, and 8 represent the variable X. The commands used to create a vector with a zero in the middle are:
> X <- c(3, 6, 4, 8)
> z <- c(X, 0, X)
> z
[1] 3 6 4 8 0 3 6 4 8
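The count of 2n + 1 values can be verified with length(), a base R function. The sketch below repeats the construction and checks the length and the position of the zero; the names n and middle are introduced here for illustration.

```r
X <- c(3, 6, 4, 8)
n <- length(X)            # number of values in X

z <- c(X, 0, X)           # two copies of X with a zero between them

print(length(z))          # number of values in z
print(2 * n + 1)          # the count stated in the example

# The zero sits immediately after the first copy of X.
middle <- n + 1
print(z[middle])
```

Because c() simply joins its arguments in order, the same pattern works for a vector X of any length.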
Note: A single value can be called by placing its position in square brackets after the vector name, vectorname[].
> vectorname[position of the element]
EXAMPLE 2.7 EXTRACT A VALUE
Extract the second value, and then extract the fourth value, of a vector y = 10, 5, 8, 4. The subscript y[] in R is used to extract the requested values.
> y <- c(10, 5, 8, 4)
> y[2]
[1] 5
> y[4]
[1] 4
R offers specific operators to extract successive values of a vector. Successive values of the vector can be extracted by using the colon (:) operator.
vectorname[A:B]
where A refers to the starting position and B refers to the ending position.
EXAMPLE 2.8 EXTRACT SUCCESSIVE VALUES
Extract the last two values of the vector y presented in Example 2.7. The command for extracting the two successive values is y[3:4].
> y[3:4]
[1] 8 4
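The colon operator is not tied to subscripts; on its own it builds a sequence of integers, which is then used as the set of positions. The following sketch, with illustrative variable names, shows this and two ways of picking the last two values without hard-coding 3 and 4. tail() is a standard function from the utils package that ships with R.

```r
y <- c(10, 5, 8, 4)

# The colon operator on its own builds a sequence of integers.
positions <- 3:4
print(positions)

# Using that sequence as a subscript extracts the matching values.
print(y[positions])

# Computing the positions from the length avoids hard-coding 3 and 4.
n <- length(y)
last_two <- y[(n - 1):n]
print(last_two)

# tail() is a built-in shortcut for the last values of a vector.
print(tail(y, 2))
```

The parentheses in (n - 1):n are needed because the colon is evaluated before subtraction; without them, n - 1:n would subtract the whole sequence 1:n from n.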
Excluding a value from a vector can be achieved by using a negative subscript, as given below.
vectorname[-position of the value]
The value at that position will be excluded from the vector. Excluding sequential values of the vector requires specifying the starting and ending positions and placing them in parentheses, as shown below:
vectorname[-(A:B)]
where A refers to the starting position and B refers to the ending position.
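Negative subscripts can be tried on the small vector from Example 2.7. The sketch below also shows two points that the syntax alone does not make obvious: positions that are not adjacent can be excluded by negating a vector built with c(), and the original vector is left unchanged unless the result is assigned. The names dropped and y_short are illustrative.

```r
y <- c(10, 5, 8, 4)

# Exclude several non-adjacent positions by negating a vector of positions.
dropped <- y[-c(1, 3)]
print(dropped)

# The original vector is unchanged until the result is assigned to a name.
print(y)
y_short <- y[-(1:2)]
print(length(y_short))

# The parentheses matter: -1:2 is the sequence -1, 0, 1, 2, not "minus positions 1 to 2".
print(-1:2)
```

Writing -(A:B) with parentheses ensures that the whole range of positions is negated rather than only its first value.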
EXAMPLE 2.9 ELIMINATE A VALUE OR VALUES
Use the data set presented in Example 2.7 to eliminate the third data value and then to eliminate the first and second data values. The command y[-3] is used to eliminate the third value, and the command y[-(1:2)] is used to eliminate the first and second values.
> y <- c(10, 5, 8, 4)
> y[-3]
[1] 10  5  4
> y[-(1:2)]
[1] 8 4
A condition placed inside the square brackets, y[y < value] or y[y > value], can be used to call the values that are less than or more than a given value.
y[y < specific value] or y[y > specific value]
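Selecting by a condition works because the condition itself produces a vector of TRUE and FALSE values, one per element, and only the TRUE positions are kept. The sketch below uses a hypothetical vector v and illustrative names; which(), sum(), and the operators & and | are all part of base R.

```r
v <- c(4, 9, 12, 6, 15)    # made-up values for illustration

# The condition itself is a logical vector, one TRUE/FALSE per value.
is_small <- v < 10
print(is_small)

# which() reports the positions where the condition holds.
print(which(is_small))

# Two conditions can be combined with & (and) or | (or).
between <- v[v >= 6 & v <= 12]
print(between)

# sum() on a logical vector counts the TRUE values.
print(sum(v > 10))
```

Looking at the logical vector first is a useful check when a selection returns more or fewer values than expected.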
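EXAMPLE 2.10 LESS THAN AND MORE THAN
Locate the values of y that are less than 7, less than 10, and more than 15, respectively, where y = 8, 14, 13, 7, 10, 15, 18, 20. A condition inside the square brackets is used to locate the values that are less than or more than a given value, as shown below. Because no value of y is less than 7, the first command returns an empty vector, numeric(0).
> y <- c(8, 14, 13, 7, 10, 15, 18, 20)
> y[y < 7]
numeric(0)
> y[y < 10]
[1] 8 7
> y[y > 15]
[1] 18 20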
2.5.1 Matrix in R
Multivariate data for various variables can be presented in a table of rows and columns, called a matrix. So far, we have experimented with scalars and with how to form a vector using built-in functions in R. The next section shows how to use R commands to generate a matrix from available data.