International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 05 | May 2025 www.irjet.net p-ISSN: 2395-0072

CNN-based Architecture with Attention Mechanisms for Enhanced Diabetic Retinopathy Detection and Classification

1MTech student, Dept of Computer Engineering and IT, VJTI college Mumbai, Maharashtra, India

2Associate Professor, Dept of Computer Engineering and IT, VJTI college Mumbai, Maharashtra, India

Abstract - Diabetic retinopathy is one of the leading causes of visual loss, and early detection is critical to effective treatment. This research discusses in detail the current deep learning practices, particularly convolutional neural networks and attention mechanisms that are applied to improve the detection and grading of DR. The study considers different algorithms and technologies employed in this area, assesses their performance, and examines the way attention mechanisms increase CNN performance. We conclude by highlighting key developments and identifying future research directions.

Key Words: Diabetic Retinopathy, Convolutional Neural Networks (CNN), Attention Mechanisms, Medical Image Classification, Retinal Images.

1. INTRODUCTION

1.1 Diabetic Retinopathy

Diabetic Retinopathy (DR) is a severe complication of diabetes, characterized by damage to the blood vessels of the retina and progressive visual impairment that can lead to blindness. Early detection of this complication is very important to prevent DR from progressing to an advanced stage, such as NPDR and PDR. Traditionally, DR detection has been performed manually by analyzing retinal images. This is a very time-consuming process that is prone to human error; therefore, an automated detection system is required.

1.2 Convolutional Neural Networks (CNNs)

Convolutional neural networks have become powerful tools for image classification, including medical imaging of diabetic retinopathy. They inherently learn the spatial hierarchies of features from images without requiring manual feature extraction. Thus, they are quite efficient in the diagnosis of DR from fundus images. However, CNNs sometimes fail to capture the subtle yet critical features of fundus images that are vital for diagnosing DR in its early stages, which limits their effectiveness.

1.3 Attention Mechanisms

Attention mechanisms are added to such CNNs to guide the network toward the parts of the image on which it should focus. Spatial attention directs the model to significant regions, usually those containing hemorrhages and exudates, whereas channel attention helps the model emphasize the feature maps most relevant to later processing. Integrating an attention mechanism sharpens the focus of CNN models on important areas, which improves classification precision and makes the results easier to interpret.

1.4 Algorithms for DR Detection

Various approaches have been developed for DR detection:

• Traditional machine learning methods: These techniques depend on manually extracted features and then perform the classification, typically using algorithms such as support vector machines or random forests.

• CNN-based methods: DR detection performance has improved significantly due to architectures such as ResNet and Inception, which learn features directly from retinal images.

• Hybrid approaches: Some techniques involve multi-model architectures, such as combining CNNs with other models for data augmentation or with attention mechanisms to enhance feature focus.


2. LITERATURE REVIEW

[1] A CNN model trained on the Kaggle dataset was presented by Kokane et al. (2021), who obtained 74.8% accuracy for DR detection. The authors hypothesized that attention mechanisms may improve performance by enabling the model to focus more on the salient characteristics of lesions.

[2] An enhanced ResNet architecture optimized for DR classification was presented by Jiao and Tao (2023). Their network’s ability to capture intricate retinal features was further enhanced by hierarchical residual-like connections, which led to an improvement in performance compared to conventional CNNs.

[3] To capture the subtle changes in retinal lesions, Pan and Yang (2023) created an encoder-decoder model with a dual-branch structure that uses LGE and attention mechanisms. In application, it demonstrated notable improvements in accuracy for identifying early-stage DR using the Messidor dataset.

[4] To overcome the problem of data scarcity in DR detection, Wietse ten Dam et al. (2023) suggested a method using Generative Adversarial Networks (GANs) to generate synthetic retinal images. Even though CNNs were not directly evaluated in this work, it is anticipated that using the generated data for model training may enhance CNN performance.

[5] Hybrid CNN Model (2022): To enhance feature extraction for the categorization of DR, this model merged the ResNet and DenseNet architectures. With the Messidor dataset, the hybrid model achieved a high accuracy of 96.22%, demonstrating the benefits of combining several network topologies.

[6] The study by Muhammad Mohsin Butt and colleagues (2022) focuses on improving the automatic detection of Diabetic Retinopathy (DR) using a hybrid deep learning approach. In their review of past work, the authors discuss two main categories of DR detection methods: traditional machine learning techniques and modern deep learning models.

[7] For automatic DR classification, the Hybrid CNN Approach (2020) integrated CNN layers with conventional machine learning classifiers. This model shows the potential of hybrid techniques by achieving good performance on the APTOS and Messidor datasets.

[8] In Deep Learning Models for DR Detection (2023), GANs were used in conjunction with CNNs to create artificial retinal images. This method seeks to improve CNNs’ capacity to identify DR, particularly in light of the scarcity of medical data.

[9] P. K. Darabi focuses on the early diagnosis of diabetic retinopathy (DR) to prevent vision loss. The study reviews traditional diagnostic tools such as fundus photography and OCT, emphasizes the growing role of AI, especially deep learning models, in automating DR detection, and highlights the benefits of these tools in improving the accessibility and accuracy of screening.

[10] The 2024 study by Meenal Katole and Prof. Pramila M. Chawan explores using transfer learning with ensemble learning for early detection of diabetic retinopathy. It utilizes pre-trained CNNs to extract features from retinal images, enhancing classification accuracy. The approach addresses limited data challenges and aims to support early diagnosis.

3. PROPOSED SYSTEM

3.1 Problem Statement

To develop a Convolutional Neural Network (CNN) and Attention Mechanisms architecture to improve the detection and classification of Diabetic Retinopathy in retinal images.

The ultimate goal is to develop an architecture that integrates CNNs with attention mechanisms, yielding better detection and classification of diabetic retinopathy in retinal images.

3.2 Problem Elaboration

Diabetic retinopathy is one of the most common complications associated with diabetes, affecting millions globally. The condition progresses through several stages, from mild non-proliferative to severe proliferative DR, and can eventually lead to blindness. Early detection of DR can prevent vision loss, but manual detection by ophthalmologists is time-consuming and subjective. Automated systems using deep learning have shown promise in detecting DR from retinal images; however, standard CNN architectures cannot consistently focus on important areas within an image, leading to suboptimal detection accuracy, especially in early-stage DR. In this project, we propose using CNNs enhanced with attention mechanisms, which can guide the model to focus on regions that are most relevant for DR detection, improving sensitivity to small lesions and enhancing overall classification accuracy.

3.3 Dataset Description

The Kaggle Diabetic Retinopathy Detection Dataset is used for this research. This dataset contains retinal fundus images labeled according to the severity of diabetic retinopathy. The images are categorized into two classes:

Class 0: Diabetic Retinopathy

Class 1: No Diabetic Retinopathy

The dataset consists of:

- Training Set: 35,126 images

Each image in the dataset is captured under various conditions, including varying image resolution and lighting. Preprocessing techniques such as resizing, normalization, and data augmentation (e.g., rotation, flipping, and scaling) are applied to enhance the quality and diversity of the training data.

3.4 Data Preprocessing

The dataset comprises retinal images from individuals across diverse age groups, captured under various lighting conditions, leading to inconsistent pixel intensity distributions. To mitigate these variations, color normalization techniques are applied to standardize image appearance. Given the high resolution of the original images and their substantial memory requirements, the RGB images are first transformed into grayscale format to reduce computational load. The images are then uniformly resized to 224x224 pixels, which helps retain the relevant retinal features while optimizing memory usage and training efficiency. As the architecture is based on a Convolutional Neural Network, extensive preprocessing is not necessary beyond these essential steps.
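The grayscale conversion, resizing, and normalization steps described above can be sketched in plain Python on a tiny in-memory image. This is only an illustration under stated assumptions, not the pipeline used in this work: the ITU-R BT.601 luminance weights, nearest-neighbour resizing, and scaling to [0, 1] are assumed choices, and all function names are hypothetical. In practice a library such as PIL would perform these operations on real fundus images.

```python
def to_grayscale(rgb):
    """Convert an H x W image of (R, G, B) tuples to grayscale values.

    Uses the ITU-R BT.601 luminance weights (an assumption; the paper
    does not state which conversion was used).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def resize_nearest(img, out_h, out_w):
    """Resize a 2-D image with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]


def normalize(img, max_value=255.0):
    """Scale pixel intensities from [0, max_value] to [0, 1]."""
    return [[p / max_value for p in row] for row in img]


def preprocess(rgb, size=224):
    """Grayscale, then resize to size x size, then normalize to [0, 1]."""
    return normalize(resize_nearest(to_grayscale(rgb), size, size))


# Hypothetical 2 x 2 RGB image standing in for a fundus photograph.
tiny = [[(255, 255, 255), (0, 0, 0)],
        [(255, 0, 0), (0, 0, 255)]]
out = preprocess(tiny)
print(len(out), len(out[0]))  # 224 224
```

Converting to a single channel before resizing keeps the intermediate image three times smaller than the RGB original, which matches the memory motivation given above.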

3.5 Model Description

To effectively detect and classify diabetic retinopathy (DR) from retinal fundus images, we propose a deep learning architecture that integrates Convolutional Neural Networks (CNNs) with attention mechanisms. This hybrid approach enables the model to not only learn hierarchical features from the images but also focus on the most diagnostically relevant regions, such as microaneurysms, hemorrhages, and exudates.

Fig -1: DR System Architecture

The CNN backbone is responsible for feature extraction. It consists of multiple convolutional layers with ReLU activation, interleaved with max pooling layers to reduce spatial dimensions while retaining key features. These layers capture both low-level textures and high-level patterns critical for distinguishing between DR severity levels.

To enhance interpretability and performance, an attention module is introduced after the convolutional layers. This mechanism dynamically weighs feature maps, allowing the model to attend to regions most likely to contribute to the final classification. This mimics a clinician’s behavior of focusing on abnormal retinal zones during diagnosis.
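One way to weigh feature maps dynamically is squeeze-and-excitation (SE) style channel attention. The sketch below shows the arithmetic in plain Python; it is an illustration only and not the module used in this work. The function name, the hand-picked weight matrices (which would normally be learned), and the omission of biases are all assumptions.

```python
import math


def se_channel_attention(feature_maps, w1, w2):
    """Apply squeeze-and-excitation style channel attention.

    feature_maps: list of C channels, each an H x W list of floats.
    w1: (C // r) x C weight matrix of the reduction layer.
    w2: C x (C // r) weight matrix of the expansion layer.
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]

    # Excitation: bottleneck layer with ReLU, then sigmoid gating.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]

    # Scale: reweight every channel by its gate, a value in (0, 1).
    scaled = [[[g * v for v in row] for row in ch]
              for ch, g in zip(feature_maps, gates)]
    return scaled, gates


# Hypothetical example: 2 channels of size 2 x 2, reduction ratio r = 2.
maps = [[[1.0, 1.0], [1.0, 1.0]],   # channel 0: mean 1.0
        [[0.0, 0.0], [0.0, 0.0]]]   # channel 1: mean 0.0
w1 = [[1.0, 1.0]]                   # 1 x 2 reduction weights (made up)
w2 = [[2.0], [-2.0]]                # 2 x 1 expansion weights (made up)
scaled, gates = se_channel_attention(maps, w1, w2)
print([round(g, 3) for g in gates])  # [0.881, 0.119]
```

The gates are per-channel multipliers, so a channel whose gate is close to 1 passes through almost unchanged while one with a gate close to 0 is suppressed.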

The network concludes with fully connected layers followed by a Softmax classifier, predicting one of the two standard DR classes (Diabetic Retinopathy, No Diabetic Retinopathy). Dropout layers are used to prevent overfitting, and the model is trained using cross-entropy loss and an adaptive optimizer such as Adam.
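As a small worked example of the Softmax classifier and cross-entropy loss mentioned above, the following sketch computes both for a single sample. The logits are made-up numbers and the function names are hypothetical; a deep learning framework would normally provide these operations.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy loss for one sample with an integer class label."""
    return -math.log(softmax(logits)[target])


# Hypothetical logits for the two classes
# (index 0: Diabetic Retinopathy, index 1: No Diabetic Retinopathy).
logits = [2.0, 0.0]
probs = softmax(logits)
print([round(p, 3) for p in probs])         # [0.881, 0.119]
print(round(cross_entropy(logits, 0), 3))   # 0.127
print(round(cross_entropy(logits, 1), 3))   # 2.127
```

The loss is small when the predicted probability of the true class is high and grows quickly when the model is confidently wrong, which is the signal the optimizer uses to update the weights.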

The use of attention mechanisms alongside CNN not only boosts classification accuracy but also helps localize critical lesions, making the model more robust and clinically relevant.

Key Architectural Parameters:

Input size: 224x224 pixels (preprocessed grayscale image)

Convolutional filters: 32, 64, 128 (increasing with depth)

Kernel size: 3x3

Pooling: Max pooling (2x2)

Fully connected layers: 1–2 layers with ReLU

Output classes: 2

Activation function (final layer): Softmax

Batch size: 64

Optimizer: Adam

Attention mechanism: Channel and/or spatial attention (e.g., CBAM or SE block)

4. IMPLEMENTATION
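The parameters listed above determine how the feature maps shrink through the network. The sketch below works through that arithmetic under assumptions the paper does not state: one convolution per block, stride 1, padding 1, and a 2x2 max pooling layer with stride 2 after every convolution. The helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max pooling layer along one dimension."""
    return (size - kernel) // stride + 1


def conv_params(in_ch, out_ch, kernel=3):
    """Number of weights plus biases in one convolutional layer."""
    return in_ch * out_ch * kernel * kernel + out_ch


size, in_ch = 224, 1          # 224 x 224 grayscale input
summary = []
for out_ch in (32, 64, 128):  # filter counts from the parameter list
    size = pool_out(conv_out(size))
    summary.append((out_ch, size, conv_params(in_ch, out_ch)))
    in_ch = out_ch

for out_ch, side, params in summary:
    print(f"{out_ch:>3} filters: {side} x {side} map, {params} parameters")

flattened = summary[-1][0] * summary[-1][1] ** 2
print("features entering the fully connected layers:", flattened)
```

Under these assumptions the maps shrink from 224 to 112, 56, and 28 pixels per side, so 128 x 28 x 28 = 100,352 features reach the fully connected layers; a different padding or block layout would change these numbers.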

The proposed system is implemented on a Windows 11 machine equipped with an Intel® Core™ i7 processor and NVIDIA RTX 3060 GPU to facilitate efficient training of deep learning models. Python is used as the programming language, and the Jupyter Notebook environment is chosen for ease of visualization and iterative development.

The following Python libraries are employed:

NumPy: For numerical computations and image data manipulation.

Pandas: To manage and preprocess metadata and label information.

PIL (Python Imaging Library): For image reading, resizing, normalization, and augmentation.

TensorFlow and Keras: Initially used for model prototyping and experimentation.

PyTorch: Used for final implementation of the CNN with attention mechanism due to its dynamic computation graph and ease of customization.

5. RESULTS

The performance of the proposed deep learning model, built using PyTorch, was evaluated on a dataset of retinal fundus images labeled for diabetic retinopathy severity. The architecture combined convolutional layers with an attention mechanism to improve the network’s focus on disease-relevant features.

Chart -1: Accuracy of Model

Fig -2: Defined Architecture for Retinopathy Model


During training, data augmentation techniques such as rotation, flipping, and brightness adjustment were applied to improve the model’s generalization and reduce overfitting. The model achieved an accuracy of 94%.
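The three augmentations named above can be illustrated on a small grayscale image held as nested lists. This is a sketch under assumptions: the paper does not give rotation angles or brightness ranges, so the 90-degree rotation steps, the 0.8 to 1.2 brightness range, and the function names are all hypothetical.

```python
import random


def hflip(img):
    """Mirror a 2-D image left to right."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate a 2-D image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def adjust_brightness(img, factor):
    """Multiply intensities by `factor`, clipping to the range [0, 1]."""
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in img]


def augment(img, rng):
    """Apply each transform at random; `rng` is a random.Random instance."""
    if rng.random() < 0.5:
        img = hflip(img)
    for _ in range(rng.randrange(4)):      # 0, 90, 180 or 270 degrees
        img = rotate90(img)
    return adjust_brightness(img, rng.uniform(0.8, 1.2))


image = [[0.1, 0.2],
         [0.3, 0.4]]
print(hflip(image))      # [[0.2, 0.1], [0.4, 0.3]]
print(rotate90(image))   # [[0.3, 0.1], [0.4, 0.2]]
augmented = augment(image, random.Random(0))
```

Passing a seeded `random.Random` instance makes a given augmentation reproducible, which helps when comparing training runs.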

Fig -3: Classification Report for Retinopathy Classification Model based on Test Set
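A classification report usually lists precision, recall, and F1-score alongside accuracy. The sketch below shows how these quantities are computed from true and predicted labels. The labels are made up for illustration and are not the results of this work, and the function name is hypothetical.

```python
def classification_report(y_true, y_pred, positive=0):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only; these are not the paper's results.
# Class 0 (Diabetic Retinopathy) is treated as the positive class.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]
report = classification_report(y_true, y_pred)
print({k: round(v, 3) for k, v in report.items()})
# {'accuracy': 0.8, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

For a screening task, recall on the disease class is often watched more closely than overall accuracy, because a missed case is typically costlier than a false alarm.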

6. CONCLUSION

The proposed system utilizes the strengths of CNNs and attention mechanisms to enhance diabetic retinopathy detection and classification. By guiding the model to focus on the critical areas in retinal images, the architecture overcomes the limitation of traditional CNNs in detecting early DR. The model’s improved ability to distinguish between the stages of DR will lead to timelier and more accurate diagnosis, enabling scalability for clinical use. Furthermore, the inclusion of attention mechanisms makes the model more interpretable: it is not only accurate but also focuses on the features that a medical professional would examine when analyzing the images by hand.

Chart -2: Loss of Model

REFERENCES

[1] Aryan Kokane, Gourhari Sharma, Akash Raina, Shubham Narole, and Prof. PM Chawan, “Detection of Diabetic Retinopathy using Neural Networks,” International Research Journal of Engineering and Technology (IRJET), Vol 8, Issue 3, 2021.

[2] Jiao X, Tao*, “Classification of Diabetic Retinopathy based on Improved ResNet Model,” In Proceedings of the ACM Conference on Information Technology, 2023.

[3] Anning Pan, Jingzong Yang, “Research on Deep Learning-based Detection of changes in Diabetic Retinopathy Lesions,” In Proceedings of the ACM Conference on Medical Imaging Applications, 2023.

[4] Wietse ten Dam, Meike Grol, Zelda Zeegers, Alireza Dehghani, and Huib Aldewereld, “Representative Data Generation of Diabetic Retinopathy Synthetic Retinal Images,” In Human-Centered AI Education and Practice Conference 2023 (HCAIep ’23), ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3633083.3633175

[5] Yasashvini R., Vergin Raja Sarobin M., Rukmani Panjanathan, Graceline Jasmine S. and Jani Anbarasi L, “Diabetic Retinopathy Classification using CNN and Hybrid Deep Convolutional Neural Networks,” Symmetry, MDPI, 2022, https://doi.org/10.3390/sym14091932

[6] Muhammad Mohsin Butt, D N F Awang Iskandar, Sherif E Abdelhamid, Ghazanfar Latif, Runna Alghazo, “Diabetic Retinopathy Detection from Fundus Images of the Eye Using Hybrid Deep Learning Features,” MDPI, National Library of Medicine, 2022.

[7] Wietse ten Dam, Meike Grol, Zelda Zeegers, Alireza Dehghani, and Huib Aldewereld, “Deep Learning Models for Diabetic Retinopathy Detection Using GANs and CNNs,” ACM Transactions on Medical Applications, 2023.

[8] Ali Ghulam, Aqsa Dastgir, Muhammad Waseem Iqbal, Muhammad Anwar, “A Hybrid Convolutional Neural Network Model for Automatic Diabetic Retinopathy Classification from Fundus Images,” IEEE Journal of Translational Engineering in Health and Medicine, 2023.

[9] P. K. Darabi, “Diagnosis of Diabetic Retinopathy,” ResearchGate, 2024, [Online], Available: https://doi.org/10.13140/RG.2.2.13037.19688

[10] Meenal Katole, Prof. Pramila M Chawan, “Transfer Learning model with Ensemble Learning to detect Diabetic Retinopathy from retinal images, enhancing early diagnosis: A Survey,” International Research Journal of Engineering and Technology (IRJET), Vol 11, Issue 11, 2024.

BIOGRAPHIES

Meenal Katole is presently pursuing her MTech in Software Engineering from Veermata Jijabai Technological Institute, Mumbai. She has done her B.E. in Computer Science and Engineering.

Prof. Pramila M. Chawan is an Associate Professor at Veermata Jijabai Technological Institute (VJTI), Mumbai. She has done her B.E. (Computer Engineering) & M.E. (Computer Engineering) from VJTI College, Mumbai, with over 31 years of teaching experience. Prof. Chawan has published an outstanding amount of 191 papers across international journals, conferences, and symposiums. She has been actively involved in guiding students in their research, with over 100 M.E./MTech. projects and 150 B.E./B.Tech. projects under her supervision. She has also worked on the editorial boards of many international scientific journals. She has been awarded with the ‘Innovative and Dedicated Educationalist Award Specialization: Computer Engineering and I.T.’ by The Society of Innovative Educationalist and Scientific Research Professional, Chennai (SIESRP).
