International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 12 | Dec 2025 www.irjet.net p-ISSN: 2395-0072

Deep Learning-Based Automated Detection of Pneumonia from Chest X-Ray Using Convolutional Neural Network

Dept. of Electrical and Electronic Engineering, Jatiya Kabi Kazi Nazrul Islam University, Trishal, Mymensingh-2224, Bangladesh

Abstract - Pneumonia is a leading cause of childhood mortality worldwide, particularly in resource-limited settings where early diagnosis is challenging. Chest X-ray (CXR) imaging is the most widely used diagnostic tool, but its interpretation is prone to variability and error. To address this, we introduce a computer-aided diagnosis (CAD) system that leverages convolutional neural networks (CNNs) with batch normalization and transfer learning for automated pneumonia detection. Using the publicly available Kermany dataset, we pre-processed and resized CXR images, applied model checkpoints to limit overfitting, and evaluated performance with a confusion matrix. The proposed methodology achieved an accuracy of 85%, indicating that deep learning can provide reliable, scalable support for radiologists in distinguishing pneumonic from healthy lungs and in enabling an early response. This work underscores the potential of CNN-based CAD systems to improve diagnostic consistency and accessibility in clinical practice.

Key Words: Convolutional Neural Network, confusion matrix, Keras, recall, hyperparameters, pneumonia detection.

1. INTRODUCTION

Pneumonia is a serious infectious disease of the respiratory system that arises from bacterial, viral, or fungal pathogens. The infection primarily affects the lungs by inducing inflammation in the alveoli and may lead to the accumulation of fluid, a condition commonly referred to as pleural effusion. Globally, pneumonia remains one of the leading causes of mortality among children below five years of age, contributing to more than fifteen percent of deaths in this age group [1]. The burden of the disease is particularly severe in low- and middle-income countries, where factors such as dense population, environmental pollution, poor hygiene, and limited access to healthcare facilities significantly increase vulnerability. Consequently, timely identification and appropriate treatment are essential to reduce the risk of fatal outcomes.

Medical imaging techniques, including computed tomography (CT), magnetic resonance imaging (MRI), and chest radiography, are routinely employed to assist in the diagnosis of pneumonia. Among these modalities, chest X-ray imaging is widely preferred due to its non-invasive nature, affordability, and accessibility. Figure 1 illustrates representative chest X-ray images of healthy and pneumonic lungs. Infected lungs typically exhibit abnormal opacities, known as infiltrates, which appear as white regions and serve as key indicators of pneumonia. Despite their clinical usefulness, chest X-ray interpretations are often influenced by the experience and subjective judgment of radiologists, which can lead to diagnostic inconsistencies [2]. This limitation highlights the need for reliable automated approaches to support pneumonia detection.

To address this challenge, the present study proposes a computer-aided diagnosis (CAD) framework based on an ensemble of deep transfer learning models for accurate classification of chest X-ray images. Such systems have the potential to assist clinicians by providing consistent and objective diagnostic support.

Deep learning has emerged as a powerful branch of artificial intelligence, demonstrating remarkable performance in a wide range of computer vision applications [3]. In particular, convolutional neural networks (CNNs) have achieved significant success in image classification tasks across multiple domains [4]. However, the effectiveness of CNN-based models is highly dependent on the availability of large, well-annotated datasets. In the biomedical domain, obtaining extensive labeled image collections is challenging, as expert annotation by medical professionals is both costly and time-consuming. Transfer learning offers a practical solution to this limitation by enabling the reuse of knowledge from models trained on large-scale datasets. In this approach, pre-trained CNNs, commonly trained on datasets such as ImageNet, which contains millions of natural images, are adapted to medical imaging tasks, thereby improving performance even when only limited training data are available.

Figure 1: Examples of two X-ray plates that display (a) a healthy lung and (b) a pneumonic lung. The red arrows in (b) indicate white infiltrates, a distinguishing feature of pneumonia. The images were taken from the Kermany dataset.

2. RELATED WORKS

Automatic detection of pneumonia from chest X-ray images has remained a challenging research problem for an extended period, largely due to the limited availability of large, publicly accessible medical imaging datasets. Early studies primarily relied on conventional machine learning techniques, where domain-specific features were manually designed and subsequently used for classification. For example, Chandra et al. performed lung region segmentation on chest radiographs and extracted a set of statistical features to characterize the images. These features were evaluated using multiple classical classifiers, including multilayer perceptron (MLP), random forest, sequential minimal optimization, classification via regression, and logistic regression. Their experiments, conducted on a dataset of 412 images, demonstrated that the MLP classifier achieved the highest classification accuracy of 95.39%.

Similarly, Kuo et al. investigated pneumonia detection in a cohort of 185 patients by extracting eleven handcrafted features from medical images and applying a wide range of regression and classification algorithms, such as decision trees, support vector machines, and logistic regression. Among the tested models, the decision tree classifier delivered the best performance, achieving an accuracy of 94.5%, while the remaining models showed comparatively weaker results. In another study, Yue et al. utilized six manually engineered features to identify pneumonia from chest CT images collected from 52 patients and reported a maximum area under the curve (AUC) value of 97%. Although these approaches yielded promising outcomes, their applicability is limited due to small dataset sizes and the reliance on handcrafted features, which restrict generalization to unseen data.

Unlike traditional machine learning pipelines that require explicit feature extraction and selection, deep learning-based approaches enable end-to-end learning, where discriminative features are automatically learned directly from raw input images. Convolutional neural networks (CNNs), in particular, have become the dominant choice for image-based medical diagnosis because of their ability to capture spatially invariant patterns through convolutional operations. Owing to these characteristics, CNN-based models have consistently demonstrated superior performance over conventional image processing and machine learning techniques in various image classification tasks.

Several studies have explored CNN architectures for pneumonia classification using chest X-ray images. Sharma et al. and Stephen et al. developed relatively shallow CNN models and employed data augmentation strategies to mitigate the challenge of limited training data. Using the dataset released by Kermany et al., both studies reported encouraging results, with classification accuracies of 90.68% and 93.73%, respectively [5]. However, while data augmentation increases the apparent size of training datasets, it introduces limited new semantic information, which may constrain further improvements in model performance. Rajpurkar et al. applied a DenseNet-121 architecture for pneumonia detection and achieved an F1-score of 76.8% [6]. The authors attributed this modest performance, in part, to the absence of complementary clinical information, such as patient history, which also affected the diagnostic accuracy of human radiologists in their comparative analysis.

To improve model robustness and generalization, Janizek et al. proposed an adversarial optimization-based framework designed to reduce dependency on dataset-specific characteristics. Their approach achieved AUC scores of 74.7% in the source domain and 73.9% in the target domain, indicating improved cross-domain consistency [7]. Zhang et al. introduced a confidence-aware mechanism for lung X-ray anomaly detection by formulating the task as a one-class classification problem focused solely on identifying abnormal samples. Their method achieved an AUC score of 83.61%. In contrast, Tuncer et al. adopted a hybrid machine learning approach in which fuzzy tree transformation and exemplar-based partitioning were first applied to the images, followed by feature extraction using multi-kernel local binary patterns. The extracted features were then classified using conventional classifiers. Their method, evaluated on a small dataset containing COVID-19 and pneumonia cases, achieved an accuracy of 97.01% [8].

3. METHODOLOGY

The methodology is described step by step below. First, we pre-processed the data, cropping all X-ray images to dimensions suited to the available computation. Next, we loaded the data into memory. We then created a callback function and a model checkpoint. After that, we reshaped the data and built the CNN model. Finally, we fitted the model and obtained the accuracy and validation loss, using a confusion matrix for evaluation.

Pre-processing: In this step, we resized the images so that operations on them run efficiently. The images are originally 1000 pixels per dimension and were resized to a size better suited to computation. We also created a function that assigns label = 0 to pneumonia files in the training dataset, label = 1 to normal files, and label = 2 otherwise. Two arrays, ‘X’ and ‘Y’, are created: ‘X’ stores the pre-processed (resized) image data and ‘Y’ stores the labels. We use the function ‘resize’ and save the results to a new HDF5 file through h5py, whose main purpose is to store the data in binary format; that is, the chest X-ray images are converted to arrays of numbers, as in figure 8.

Uploading data in memory: With the data pre-processed, we loaded it into memory. We created a function for this purpose that loads the training dataset and the test dataset along with their labels. The shapes of the training and test datasets are then obtained, and X-rays with and without pneumonia are displayed using “matplotlib”.
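The labelling and resizing step can be sketched as follows. This is a minimal illustration rather than the code used in the study: the helper names `label_for`, `resize_nearest` and `build_arrays` are hypothetical, nearest-neighbour sampling is assumed in place of whichever ‘resize’ routine was actually used, the 150-pixel target is taken from the reshape call described later, and synthetic arrays stand in for the X-ray files.

```python
import numpy as np

def label_for(name: str) -> int:
    """Label scheme from the text: pneumonia -> 0, normal -> 1, anything else -> 2."""
    upper = name.upper()
    if "PNEUMONIA" in upper:
        return 0
    if "NORMAL" in upper:
        return 1
    return 2

def resize_nearest(image: np.ndarray, size: int = 150) -> np.ndarray:
    """Nearest-neighbour resize of an (H, W, C) array to (size, size, C)."""
    rows = np.arange(size) * image.shape[0] // size   # source row for each output row
    cols = np.arange(size) * image.shape[1] // size   # source column for each output column
    return image[rows][:, cols]

def build_arrays(samples):
    """samples: list of (folder_name, image_array) pairs -> (X, Y)."""
    X = np.stack([resize_nearest(img) for _, img in samples])
    Y = np.array([label_for(name) for name, _ in samples])
    return X, Y

# Synthetic stand-ins for two 1000 x 1000 X-rays (no real data is loaded here).
rng = np.random.default_rng(0)
samples = [
    ("PNEUMONIA", rng.integers(0, 256, size=(1000, 1000, 3), dtype=np.uint8)),
    ("NORMAL", rng.integers(0, 256, size=(1000, 1000, 3), dtype=np.uint8)),
]
X, Y = build_arrays(samples)
print(X.shape, Y.tolist())
```

In practice the resulting ‘X’ and ‘Y’ arrays would then be written to the binary file described above; that step is omitted here to keep the sketch free of file I/O.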

Callbacks and model checkpoint: We imported “ModelCheckpoint” from the “callbacks” module of “keras”. The callbacks are used to reduce the learning rate over time based on a monitored quantity. The ModelCheckpoint callback is used together with training via the model.fit() function to save the model or its weights (in a checkpoint file) at some interval. The model or weights can then be loaded later to continue from the previously saved state. Checkpointing keeps the best-performing model seen so far and avoids a later drop in validation accuracy due to overfitting.

Reshaping and model creation: We imported various models from the “keras” library to create our model according to the dataset. We reshaped “X_train” and “X_test” with “.reshape(5216, 3, 150, 150)” and “.reshape(624, 3, 150, 150)”, respectively.

Model fitting and finding the accuracy: The final step is fitting the model to the training dataset using the .fit() function, whose arguments are X_train and y_train, the callback function that reduces the learning rate, and the number of epochs (set to 10 in our project).
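The effect of the two callbacks can be illustrated without Keras by replaying a list of per-epoch validation accuracies. The sketch below is an assumption-laden simplification, not the Keras implementation: the function `train_with_callbacks`, the `patience` and `factor` values, the starting learning rate, and the accuracy numbers are all hypothetical. It mirrors the general behaviour of a checkpoint that keeps only the best model and of a learning-rate reduction on plateau.

```python
def train_with_callbacks(val_accuracies, lr=1e-3, patience=2, factor=0.5):
    """Replay per-epoch validation accuracies and apply two callback-style rules.

    Checkpoint rule: remember the epoch with the best validation accuracy.
    Plateau rule: multiply the learning rate by `factor` after `patience`
    consecutive epochs without improvement.
    """
    best_acc, best_epoch, wait = float("-inf"), None, 0
    history = []
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            # A real checkpoint callback would write the model to disk here.
            best_acc, best_epoch, wait = acc, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor
                wait = 0
        history.append((epoch, acc, lr))
    return best_epoch, best_acc, lr, history

# Hypothetical validation accuracies for 10 epochs (illustrative numbers only).
val_acc = [0.60, 0.70, 0.75, 0.80, 0.79, 0.78, 0.79, 0.77, 0.76, 0.75]
best_epoch, best_acc, final_lr, history = train_with_callbacks(val_acc)
print(best_epoch, best_acc, final_lr)
```

With these made-up numbers the best model is the one from epoch 4; the later epochs, where validation accuracy drifts down, never overwrite it, which is the sense in which checkpointing guards against overfitting.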

4. RESULTS & DISCUSSION

The quality and adequacy of the dataset play a decisive role in determining the effectiveness of any machine learning project. A well-curated and sufficiently large dataset is essential for developing a reliable image classification model. In this work, the training process was carried out using the CreateML framework, which provides a streamlined and user-friendly environment by abstracting many of the underlying technical complexities. Unlike conventional development tools that require extensive expertise, CreateML enables model training through an intuitive interface while still allowing users to adjust key parameters. Furthermore, the framework supports multiple training runs under different configurations and automatically generates performance visualizations, which facilitate informed decision-making during model optimization, as illustrated in Figure 3.

Image classification models consist of multiple interconnected layers, each designed to perform a specific operation and pass its output to subsequent layers for progressive refinement of the learned representation. Initial layers process the raw pixel intensities and extract low-level visual patterns, while deeper layers gradually capture more abstract and discriminative features. Through this hierarchical feature learning process, the model becomes capable of differentiating between normal and pneumonia-affected chest X-ray images. Various architectural designs exist to define how these layers are structured and how information flows among them, with the choice of architecture depending on the nature of the classification task. In the CreateML framework, these internal operations are handled automatically, allowing model training without requiring explicit intervention in the underlying layer organization.

Figure 2: Flowchart of CNN model for pneumonia detection

After carrying out several training runs with different configurations and parameter variations, we obtained a model with an accuracy of ∼85%, integrated into an iPhone application that is very easy to use and distribute. Since the model was trained on data restricted to two categories, NORMAL (Healthy) and PNEUMONIA, any image passed as input to the model will be categorized into one of these two groups.
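For a two-class model of this kind, accuracy and recall follow directly from the confusion matrix. The sketch below shows the arithmetic on a toy set of labels; the helper names and the label values are hypothetical and do not reproduce the study's predictions. Pneumonia is treated as the positive class, using the label convention 0 = pneumonia, 1 = normal from the pre-processing step.

```python
def confusion_counts(y_true, y_pred, positive=0):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

def summarize(tp, fp, fn, tn):
    """Derive accuracy, recall and precision from the four counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }

# Toy labels only (0 = pneumonia, 1 = normal); not results from the paper.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = summarize(*counts)
print(counts, metrics)
```

Recall is worth reporting alongside accuracy in this setting, because a missed pneumonia case (a false negative) is usually the more costly error.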

5. CONCLUSIONS

This study presented a deep learning-based computer-aided diagnosis system for the automated detection of pneumonia from chest X-ray images. By leveraging convolutional neural networks and transfer learning, the proposed approach demonstrated effective feature extraction and reliable classification performance on the publicly available Kermany dataset. The experimental results achieved an accuracy of approximately 85%, indicating the potential of CNN-based models to assist radiologists in distinguishing between normal and pneumonia-affected lungs. The simplicity and scalability of the proposed framework suggest its suitability for deployment in resource-constrained clinical environments. Future work may focus on incorporating larger and more diverse datasets, integrating clinical metadata, and further optimizing model architecture to enhance robustness and diagnostic accuracy.

Figure 3: Training vs. Validation Loss
Figure 4: Feature Extraction

ACKNOWLEDGEMENT

The author acknowledges the use of generative AI tools such as ChatGPT and Grammarly solely for grammatical correction and to improve readability.

REFERENCES

[1] Chan H. P, Sahiner B, Hadjiyski L, Zhou C and Petrick N 2005 Lung nodule detection and classification U.S. patent application no. 10/504,197

[2] Neuman M., Lee E., Bixby S., Diperna S., Hellinger J., Markowitz R., et al. Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children. Journal Of Hospital Medicine. 7, 294–298 (2012) pmid:2200985

[3] Lal S., Rehman S., Shah J., Meraj T., Rauf H., Damaševičius R., et al. Adversarial Attack and Defence through Adversarial Training and Feature Fusion for Diabetic Retinopathy Recognition. Sensors. 21, 3922 (2021) pmid:34200216

[4] Albawi S, Mohammed T. A and Al-Zawi S 2017 Understanding of a convolutional neural network International Conference on Engineering and Technology (ICET) (Antalya: IEEE) pp 1-6

[5] Kermany D., Zhang K. & Goldbaum M. Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification. (Mendeley, 2018)

[6] https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

[7] Krizhevsky A, Sutskever I and Hinton G. E 2012 ImageNet classification with deep convolutional neural networks Advances in Neural Information Processing Systems (NIPS 2012) ed Pereira F et al pp 1097-1105

[8] Rahman T, Chowdhury M. E. H, Khandakar A, Islam K. R, Islam K. F, Mahbub Z. B, Kadir M. A and Kashem A 2020 Transfer learning with deep convolutional neural network (CNN) for pneumonia detection using chest X-ray Applied Sciences vol. 10 (MDPI) p 3233

2025, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 1268
