Assessment of Variation in Concentration of Air Pollutants Within Monitoring Stations in Mumbai
Keisha Mehta, Ram Chirapuram, Ritvik Saraf, Sakshi Bokil, Shachi HardiStudent, Narsee Monjee Institute of Management Studies, Mumbai
Abstract - This research paper presents a study on the variationofconcentrationofairpollutantsinvariousareasof Mumbai,inordertodeterminetheredundancyofmaintaining separate Air Quality monitoring stations in those areas. The pollution concentrationof7 pollutants, namelyPM2.5, PM10, NO2,NH3,SO2,COandOzone, ismonitoredacross12stations in Mumbai. A statistical approach was conducted on monthly data for 2020 and 2021 collected from the Central Pollution Control Board (CPCB), India. The presence or absence of significant difference in the means of concentrations of a particular pollutant among monitoring stations is identified using ANOVA Analysis. Tukey’s Honest Significant Difference (HSD) Post-Hoc Test is performed to ascertain the stations whichshowasignificant differenceinpollutantconcentration. Thosepairsofstationswhichshownosignificantdifferencein concentration of all 7 pollutants are identified. Recommendationsare madebasedontheseobservationsand the straight-line distance between the pairs of stations. This analysis can form the basis for identifying redundant air qualitymonitoringstationsinMumbai.Thispaperalsoapplies amultinomiallogisticregressionmodelinordertopredictthe air quality class based on Tree Cover, Population Density, Petrol Price, Temperature, Humidity, Wind Speed, Air Pressure,Elevation,CoastalLocation,LatitudeandLongitude.

Key Words: AirPollutant,AQI,ParticulateMatter,ANOVA, MonitoringStation,LogisticRegression.
1.INTRODUCTION
Air pollution is a serious problem in many developing countries. Atmospheric concentrations of fine particulate matter, which is one of the worst air contaminants, are severalfoldshigherinadevelopingcountryascomparedto adevelopedcountry[1].Aerosolsareairbornegases,solids, and liquids that contaminate the atmosphere. They are presentinawiderangeofsizes.Therearethreerangesin particulatematter,PM10,PM2.5andPM1,withupperlimits of 10 μm, 2.5 μm and 1 μm, respectively. For regulatory reasons, they are most commonly used to define aerosol fractions[2].Theincreasingrateofpollutantconcentrations is on account of the growing population of humans and
vehicles, as well as industries [1]. Indian megacities are among the most polluted in the world. The pollutant concentrations in India are much higher than the level of pollutantconcentrationsrecommendedbytheWorldHealth Organisation[3].
In 1981, the Government of India introduced the Air (PreventionandControlofPollution)Actinordertoarrest thedeteriorationinairquality.TheCentralPollutionControl Board (CPCB) comprises State Pollution Control Boards (SPCB) that are required to perform a number of duties undertheAct.Concernsoverairqualityhavegrownoverthe past few years as evidence of harmful effects on health, productivity, and the economy has increased. As a result, between2015and2020,therehasbeenasignificantrisein the number of monitoring stations [4]. The National Air QualityMonitoringProgram(NAQMP),launchedin1985and runbytheCPCBalongwiththeSPCBs,isthemostextensive monitoringnetworkinthecountry.Asof2022,thisprogram consistsofanetworkof804operatingmanualandreal-time monitoring stations across India. The monitoring of pollutantslikePM2.5,PM10,SO2andNO2,isdonetwicea week for a duration of 24 hours resulting in 104 observationsperyear.Thesestationsuseanestimatecalled theAirQualityIndex(AQI)totrackthedailyairquality.The greatertheAQIvalue,themoreseverethelevelofpollution. Launchedin2019bytheMinistryofEnvironment,Forests andClimateChange,theNationalCleanAirProgram(NCAP) aims at the reduction of 20-30% of the concentration of PM10andPM2.5by2024with2017asthebaseyear[5].
Mumbai is transforming into a city primarily focused on commercial activities, rather than being recognised as an industrialcityinthepast.Thetransportsectoristhemain contributortopollution,followedbypowerplants,industrial unitsandwasteincineration[6].InMumbai,theairquality monitoring stations are operated by the Maharashtra Pollution Control Board (MPCB). Some stations are collocatedinpockets,leavingalargeareaunmonitored.
Theabovetable(Table -1)indicatesthefivecategoriesofAir Quality Index, showing the ranges of AQI and their correspondinglevels of risk.GreatertheAQI,higher is the risktohealth Level1indicatessatisfactoryairqualitywhich poseslittletonorisktohealth.Level2pollutionmayposea riskonlytocertainhigh-riskmembersofthepopulation.Itis considered to be unhealthy for those people exposed to certaindiseasessuchaslungdiseasesandheartdiseases.Itis not particularly harmful to those members of the general populationwithoutanyunderlyingillnesses.Level3issaidto beunhealthy.Thistypeofpollutionmayhavesomenegative healtheffectsonsomemembersofthegeneralpublic,while sensitivepopulationsmayfacemoreseverehealthproblems. Level4,‘veryunhealthy’,isthepointwhereahealthalertis issued,astheriskofadversehealtheffectsforthepopulation hasgreatlyincreased.Lastly,Level5,issaidtobehazardous astheentirepopulationwouldbeaffectedbythehighlevels ofpollution[7].
Thisstudyaimstoinvestigatetheneedforalargenumberof monitoringstationsinaparticulararea.Theaimofthestudy istoanalysewhetherthereexistsanysignificantdifference inconcentrationofpollutantsbetweenselectedmonitoring stationsinMumbai,India.Therearevariousfactorsthatcan affectair quality, includingtreecover, population density, petrol price, temperature, humidity, wind pressure, air pressure,elevation,andcoastallocation.Understandingthe relationships between these factors and air quality is imperative for developing effective plans to improve air quality. This paper also applies a multinomial logistic regressionmodelinordertopredicttheairqualityclasson thebasisoftheaforementionedfactors.
2. LITERATURE REVIEW

A study of industrial clusters with the majority of the industriesinthestudyregiononamoderatetolargescale rendered it vulnerable to poor air quality. At 16 research locations,12pollutants,includingPM10,PM2.5,SO2,NO2, CO, O3, NH3, and heavy metals (Cu, Mn, Ni, Pb, Zn), were
monitoredthroughoutthewinter.Multivariateanalysiswas usedtoevaluatetheairqualitydatafurther,andtheresults weredisplayedusinghistograms,boxplots,clusteranalysis, PCA,analysisofvariance(ANOVA),andairqualityindex[8]. It was revealed that while the industrial areas experience deterioratingairquality,theyalsoinfluencetheincreasing concentrationoftoxicmetalsintheneighbouringregionsdue to the increase in suspended particles originating in contaminated or eroded soil. Thereby, since 2010, the concentration of Ni & Cu particles in the air has increased everyyearandisahugecauseforconcern[9].
Trends in air pollution and potential sources of emission were identified by the variations in the means of the sites sinceANOVAfindingswerestatisticallydifferent.Thiswas followedbyPCAthatidentifiedthesourcesofairpollution [10]. Following this, the Tukey Kramer test can be implemented to identify the pair of stations which are significantlydifferentwithrespecttopollutantconcentration [11].
Northern India experiences PM10 and PM2.5 exceeding NationalAmbientAirQualityStandards(NAAQS)by150% and100%respectively.ForSouthIndia,thefiguresare50% and40%respectively.Ontheotherhand,SO2,NO2andO3 meettheresidentialNAAQSalloverIndia.Theprogressof ContinuousaswellasManualmonitoringNetworksinIndia was compared against US State Department Air-Now Network.Whilethefrequencyofobservationsandqualityof data definitely improved, gaps remained in spatial and temporal coverage, thus indicating a requirement for additionalmonitoringstations[4].
Astudyemploysvariousmachinelearningmodelstopredict PM2.5levelsonthebasisofdailyatmosphericconditionsof aspecificcity.Thepapermakesuseofalogisticregression modeltodetectandpredictwhetheracityispollutedornot pollutedbasedontemperature,windspeed,dewpointand pressure[12].
3. DATA AND METHODOLOGY
Thedataset(Table -2)containspollutionconcentrationsof 7pollutants,PM2.5,PM10,NO2,NH3,SO2,CO,andOzone, across12stationsinMumbai,operatedbytheMaharashtra Pollution Control Board. These stations are located in industrialaswellasresidentialareas. Thestudywascarried outondatafortheyears2020and2021.Exploratorydata analysis was performed on the raw data to determine patterns and relationships. Furthermore, descriptive
statistics were obtained to summarise the data set. The missing values of pollutant concentrations were treated using interpolation. The research identifies redundancy amongstmonitoringstationsinMumbaibyanalysingsimilar pollutantconcentrations.

Aftercheckingthedatafornormalityandhomoscedasticity, one-wayANOVAwasperformedindividuallyforeachofthe pollutant to identify whether there was a significant difference in the means between stations [8]. For ANOVA, thenullhypothesisstatesthatthatallthegroupmeansare equalwhilethealternatehypothesisstatesthatatleastone pairofmeansissignificantlydifferent.Ifthecalculatedvalue of F is greater than the tabulated value, then the null hypothesis,whichclaimsthatthemeans
acomparisonofallpairsofmeansinordertoidentifywhich differences are significant [10]. The null hypothesis of Tukey's HSD test states that the two group means do not differsignificantly.
Anotherdataset(Table -3)wascreatedtohelpinbuilding thepredictivelogisticregressionmodelforAirQualitylevels based on various predictor variables like tree cover, population density, petrol price, temperature, humidity, windpressure,airpressure,coastallocationandelevation. Thedatawascollectedfromsourcesavailablepublicly.The study was carried out for 3rd March 2022. Trees play a significant role in absorbing pollutants and improving air quality,whilehighpopulationdensitycanleadtoincreased pollutionfromhumanactivities.
Source:https://cpcb.nic.in/ ofallgroupshavenosignificantdifference,mustberejected [13] When the Null Hypothesis is rejected in a One-way ANOVA, it is concluded that not all means are equal. The results of ANOVA, however, do not specify which specific differencesbetweenmeanpairsaresignificant[10].Thus, Tukey’sHonestSignificantDifference(HSD)PostHocTest wasperformedas
Higherpetrolpricescanresultinreducedfuelconsumption andlowervehicleemissions,buttherelationshipbetween petrol prices and air quality is complex. Temperature and humidity can affect air quality by influencing chemical reactionsandatmosphericprocesses,andwindpressurecan disperseairpollutants.Changesinairpressureandelevation can also impact the concentration of pollutants near the
ground, while coastal locations can experience better air qualityduetonaturalairfiltrationprovidedbythemarine environment.However,coastalareascanalsobesusceptible topoorairqualityfromhumanactivities.
Inthecontextofthismodel,therearefiveordinalairquality classes. When fitting the multinomial logistic regression model,oneoftheseclassesischosenasthereferenceclass
(Moderate).Themodelthenestimatesthelogoddsofeachof theotherclassesrelativetothereferenceclass.Thechoiceof the reference class can influence the interpretation of the coefficients, but it does not affect the model's overall predictive performance or the predicted probabilities for each observation.
Source:https://www.indiastat.com/
4. DATA ANALYSIS
One-Way ANOVAforNO2pollutantconcentrationhadthe followinghypotheses:
H0: ThereisnosignificantdifferenceinthemeansofNO2 concentrationbetweenanymonitoringstations.
H1: There is a significant difference in the means of NO2 concentration between at least one pair of monitoring stations.
Since the p-value was less than α = 0.05 (α = level of significance),therewassufficientevidencetorejectthenull hypothesis.Hence,therewasasignificantdifferenceinthe concentration of NO2 between at least one pair of monitoringstations.
Further,toidentifytheexactpairofstationswhichshowed nosignificantdifferenceinmeans,Tukey’sHSDPost-Hoc
Test was performed. Tukey’s HSD test had the following hypotheses:
H0: ThereisnosignificantdifferenceinthemeansofNO2 concentrationbetweenthetwomonitoringstations.
H1: There is a significant difference in the means of NO2 concentrationbetweenthetwomonitoringstations.

Allstationsexceptthefollowing14stationpairings(Table4)showednosignificantdifferenceinNO2concentrationat levelofsignificanceα=0.05.
Table -4: Stationpairingswithnosignificantdifferencein NO2concentration.
Airport
Kurla
Vile
Worli
Mahape
Mahape
Vasai
Vasai
Mahape-Vasai
Source:Authors
Asimilaranalysiswascarriedoutfortheother6pollutants. ExceptPM2.5,allpollutantsdepictedasignificantdifference in the mean concentration between at least two AQI monitoringstations.
TofurtherunderstandtherelationshipsbetweenAirQuality and various factors affecting it, a multinomial logistic regression model was fitted to predict air quality levels based on the predictor variables. The model used the 'multinom'functionfromthe'nnet'packageinRandhadfive ordinalclasses.

5. RESULTS
After performing ANOVA and Tukey’s HSD test on 12 differentmonitoringstationswithinMumbai,andanalysing the7pollutantconcentrations,25pairsofstationsshowed no significant difference in any pollution concentrations (Table -5).
Table -5: Stationpairingswithnosignificantdifferencein concentrationsofanypollutants.
Stations
Sion-Powai
Sion-Airport
Sion-Borivali
Powai-Kurla
Powai-Airport
NerulKurla
Nerul-Colaba
Nerul-Airport
Kurla-Colaba
Kurla-Airport
Colaba-Airport
Mahape-Kurla
Mahape-Colaba
Mahape-Airport
Mahape-Worli
Mahape-VileParle
Worli-Sion
Worli-Powai
Worli-Nerul
Worli-Colaba
Worli–Airport
VileParle-Nerul
VileParle-Kurla
VileParle-Colaba
VileParle-Airport
Source:Authors
Among these 25, the closest stations were measured in termsofthestraight-linedistance.Thestationpairingswith theleastdistancewerefoundtobeKurla-AirportandVile Parle-Airport(Table -6).
Table -6: Stationpairingswiththeleaststraight-line distance.
Kurla-Airport 3.14km
VileParle-Airport 3.22km
Source:https://distancebetween2.com/
Table -6 shows the station pairings that were the closest amongallwithinMumbaianddidnotshowanysignificant variation in any pollution concentrations. The research proposesthatthenumberofstationsmonitoringairquality in the above-mentioned areas can be reduced and the resourcescanbeutilisedelsewhere.
Table -7: Straightlinedistancebetweenmonitoring
Worli-Sion 7.37km
Source:https://distancebetween2.com/
The next least straight-line distance was found to be betweenWorliandSion(Table -7).Similartotheprevious case, these stations did not significantly differ in any pollutionconcentrations.Therefore,theresearchproposes thatonlyonemonitoringstationisenoughtorepresentthe area.
Theaccuracyofthemultinomial logistic regressionmodel was found to be 52% (Table -8), which is substantially better than random guessing. The interpretation of the modelcoefficients(Fig -1)revealedthatanincreaseintree cover was associated with a decrease in the log odds of belonging to the "Good" air quality class compared to the reference class (Moderate), holding all other variables constant.Conversely,ahigherpopulationdensitywasfound tobepositivelycorrelatedwithhigherairpollutionlevels. The relationship between petrol price and air quality was found to be complex, as other factors such as public transportationavailabilityandvehicleefficiencyalsoplaya role.Highertemperatureswereassociatedwithanincrease inground-levelozoneandotherpollutants.Highhumidity wasfoundtoincreasethe formation ofsecondaryorganic aerosolsandotherpollutants.Strongerwindswerefoundto disperse air pollutants, reducing their concentration in a givenarea,whileweakorstagnantwindscontributedtoair pollutionbuild-up.Lowerairpressurewasassociatedwith enhanced vertical mixing and improved air quality, while higher air pressure trapped pollutants near the surface. Higher elevations generally experience cleaner air due to reducedpollutantemissionsandenhanceddispersion,but complextopographycancreatelocalisedareasofpoorair quality.Coastalareasexperiencebetterairqualityduetothe dispersion of pollutants by sea breezes and natural air filtrationprovidedbythemarineenvironment,butshipping emissions,industrialactivities,andotherhumaninfluences canalsocausepoorairquality.

Accuracy 52.35%
Source:Authors
6. CONCLUSIONS
InurbanpartsofIndia,airpollutionisaseriousissueanda source of growing worry. Particulate matter pollution is a major issue for the entire nation. Urban life quality is severelyimpactedbyairpollution,thusfindingandfunding effectivemitigationandreductionstrategiesisacrucialfirst step. This paper provides an analysis on the variation of concentrationofairpollutantsinvariousareasofMumbaito determinetheredundantAirQualitymonitoringstationsto offer decision-makers a clearly defined path to begin addressingtheissue.

Onthebasisoftheseobservations,recommendationswere madeforthosestationpairsthatdidnotsignificantlydiffer intheconcentrationofanyofthesevenpollutants,basedon the straight-line distance between the stations. Excessive and redundant air quality monitoring stations in Mumbai can be identified by use of this research. Thus, these resourcescanbeplacedwheretherearenoAQImonitoring stationspresentallacrossIndia.
TheANOVAanalysisinthisstudyisrestrictedtothecityof Mumbai, Maharashtra since other cities in India do not containahighdensityofmonitoringstationsandthusare not useful for the purposes of this analysis. Further, this study can be extended to find redundant stations across IndiausingANOVAanalysis,giventhestraight-linedistance betweeneachpairofstations.
Thelogisticregressionmodelgivesa52%classificationrate withfiveclassesimplyingthatthemodelcorrectlyclassifies 52% of the instances in the dataset. This is substantially
better than random guessing, which would achieve an accuracy of 20% for a five-class problem. The model revealedseveralimportantrelationshipsbetweenpredictor variables and air quality levels. Policies that focus on increasing tree cover, reducing population density, and implementingmeasurestoreduceairpollutionfromtraffic, industrial activities, and other human sources can help to improveairquality.Additionally,policiesthataimtotackle the effects of climate change, such as cutting down greenhouse gas emissions, can also help to improve air quality.
REFERENCES
[1] Gwilliam,K.M.,Johnson,T.M.,&Kojima,M. (2004). Reducingairpollutionfromurbantransport : Main report (English). World Bank Group. http://documents.worldbank.org/curated/en/9897 11468328204490/Main-report
[2] Nazarenko,Y.,Pal,D.,&Ariya,P.A.(2020). Air quality standards for the concentration of particulatematter2.5,globaldescriptiveanalysis. Bulletin of the World Health Organization, 99(2), 125-137D.https://doi.org/10.2471/blt.19.245704
[3] Gurjar,B.R.,Ravindra,K.,&Nagpure,A.S. (2016).AirpollutiontrendsoverIndianmegacities andtheirlocal-to-globalimplications. Atmospheric Environment, 142, 475–495. https://doi.org/10.1016/j.atmosenv.2016.06.030
[4] Sharma,D.,&Mauzerall,D.(2022).Analysis of air pollution data in India between 2015 and 2019. Aerosol and Air Quality Research, 22(2). https://doi.org/10.4209/aaqr.210204
[5] MinistryofHousing&UrbanAffairs.(2020, September 16). Long-Term, Time-Bound, National LevelStrategytoTackleAirPollution-NationalClean Air Programme (NCAP).PressInformationBureau. https://pib.gov.in/PressReleasePage.aspx?PRID=16 55203
[6] Shah, J. J., & Nagpal, T. (1997). Urban air quality management strategy in Asia : Greater Mumbai report (English). World Bank Group. http://documents.worldbank.org/curated/en/5878 31468772744677/Urban-air-quality-managementstrategy-in-Asia-Greater-Mumbai-report
[7] Wambebe, N. M., & Duan, X. (2020). Air quality levels and health risk assessment of particulatemattersinAbujamunicipalarea,Nigeria. Atmosphere, 11(8), 817. https://doi.org/10.3390/atmos11080817
[8] Yadav, M., Singh, N. K., Sahu, S. P., & Padhiyar,H.(2022).Investigationsonairqualityof a critically polluted industrial city using multivariate statistical methods: Wayforwardfor future sustainability. Chemosphere, 291. https://doi.org/10.1016/j.chemosphere.2021.1330 24
[9] Aenab, A. M., Singh, S. K., & Lafta, A. J. (2013). Critical assessment of air pollution by ANOVAtestandhumanhealtheffects. Atmospheric Environment, 71, 84–91. https://doi.org/10.1016/j.atmosenv.2013.01.039
[10] Mohamad,N.D.,Ash’aari,Z.H.,&Othman, M.(2015).Preliminaryassessmentofairpollutant sources identification at selected monitoring stations in Klang Valley, Malaysia. Procedia Environmental Sciences, 30, 121–126. https://doi.org/10.1016/j.proenv.2015.10.021
[11] Anil, S., & P, S. (2017). Assessment of Variation of Air Pollutant Concentration within Monitoring Stations in Ernakulam. International Journal of Innovative Research in Science, Engineering and Technology, 6(5).
[12] CR,A.,Deshmukh,C.R.,DK,N.,Gandhi,P., &Astu,V. (2018). Detection andPredictionofAir Pollution using Machine Learning Models. International Journal of Engineering Trends and Technology, 59(4), 204–207. https://doi.org/10.14445/22315381/ijett-v59p238
[13] Mishra,P.,Singh,U.,Pandey,C.,&Pandey, G.(2019).Applicationofstudent’st-test,analysisof variance, and covariance. Annals of Cardiac Anaesthesia, 22(4), 407. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6 813708/
[14] Angelevska,B.,Atanasova,V.,&Andreevski, I. (2021). Urban air quality guidance based on

measures categorization in road transport. Civil Engineering Journal, 7(2), 253–267.
https://doi.org/10.28991/cej-2021-03091651
[15] Briggs, D. J., de Hoogh, K., Morris, C., Gulliver,J.,Wills,J.,Elliott,P.,&Kingham,S.(2008). Effect of living close to a main road on asthma, allergy, and lung function in UK primary school children.Occupationalandenvironmentalmedicine, 65(5), 293-297.
https://doi.org/10.1136/oem.2007.033878
[16] Jacob,D.J.,&Winner,D.A.(2009).Effectof climate change on air quality. Atmospheric Environment, 43(1), 51-63.
https://doi.org/10.1016/j.atmosenv.2008.09.051
[17] Kroll, J. H., Donahue, N. M., Jimenez, J. L., Kessler,S.H.,Canagaratna,M.R.,Wilson,K.R.,...& Worsnop,D.R.(2011).Carbonoxidationstateasa metricfordescribingthechemistryofatmospheric organicaerosol.NatureChemistry,3(2),133-139.
https://doi.org/10.1038/nchem.948
[18] Li,X.,Huang,S.,Huang,X.,Lu,Y.,&Zhang,Y. (2019).Spatial-temporalvariationsandinfluencing factors of PM2.5 pollution in China from 2015 to 2017.Ecotoxicologyandenvironmentalsafety,183, 109507.
https://doi.org/10.1016/j.ecoenv.2019.109507
[19] Méndez-Lázaro, P., Muller-Karger, F. E., Otis, D., McCarthy, M. J., Rodríguez, E., & GarcíaMontiel,D.C.(2018).Coastaloceanandpopulation dynamics of the neglected Puerto Rico trench. Frontiers in Marine Science, 5, 231. https://doi.org/10.3389/fmars.2018.00231
[20] Morawska,L.,Thai,P.K.,Liu,X.,AsumaduSakyi,A.,Ayoko,G.,Bartonova,A.,Bedini,A.,Chai,F., Christensen, B., Dunbabin, M., Gao, J., Hagler, G. S. W.,Jayaratne,R.,Kumar,P.,Lau,A.K.H.,Louie,P.K. K.,Mazaheri,M.,Ning,Z.,Motta,N.,…Williams,R. (2018). Applications of low-cost sensing technologies for air quality monitoring and exposure assessment: How far have they gone? Environment International, 116, 286–299. https://doi.org/10.1016/j.envint.2018.04.018
[21] Seinfeld, J. H., & Pandis, S. N. (2016). Atmospheric chemistry and physics: from air pollution to climate change. John Wiley & Sons. https://doi.org/10.1002/9781118947401
[22] Small, K. A. (2012). Energy policy and gasoline taxation. Annual Review of Resource Economics, 4(1), 345-360. https://doi.org/10.1146/annurev-resource110811-114540
[23] Tiwary,A.,&Colls,J.(2010).Airpollution: measurement,modellingandmitigation.CRCPress. https://doi.org/10.1201/9780203863561
[24] Yang, J., McBride, J., Zhou, J., & Sun, Z. (2015).TheurbanforestinBeijinganditsroleinair pollution reduction. Urban forestry & urban greening, 14(3), 857-864. https://doi.org/10.1016/j.ufug.2015.06.010
