
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 12 Issue: 01 | Jan 2025 www.irjet.net p-ISSN: 2395-0072
Aiswarya Rani¹, Sagar¹, Shivangi Mishra¹
¹Student, Delhi Pharmaceutical Sciences and Research University, New Delhi, India
Abstract - This study developed two models for Polycystic Ovary Syndrome (PCOS) detection using clinical and imaging data. A stacking ensemble model achieved 93% accuracy (Area Under the Curve [AUC] 0.99), identifying key predictors such as follicle counts, Body Mass Index (BMI), and hormonal markers. A Convolutional Neural Network (CNN) achieved 92.72% accuracy (AUC 0.89) with zero misclassifications, ensuring reliable ultrasound image classification. Advanced feature selection methods and a Streamlit-based interface enable real-time diagnostics, supporting early detection and improved clinical outcomes.
Key Words: Polycystic Ovary Syndrome (PCOS), Diagnostic Models, Machine Learning, Prevalence, Reproductive Health
1. INTRODUCTION
Polycystic Ovary Syndrome (PCOS) is a multifaceted endocrine disorder that affects 4% to 20% of women of reproductive age globally, with prevalence varying based on diagnostic criteria and geographic location [1, 2]. First described in 1935 by Stein and Leventhal, it is characterized by a wide spectrum of reproductive, metabolic, and psychological symptoms, including infertility, hyperandrogenism, menstrual irregularities, insulin resistance, obesity, and mood disorders such as anxiety and depression [3, 4].
Long-term complications of PCOS include type 2 diabetes, cardiovascular disease, endometrial cancer, and obstructive sleep apnea, underscoring its public health significance [4, 5]. Despite these health concerns, up to 70% of cases remain undiagnosed globally due to the disorder's heterogeneous presentation and overlapping symptoms with other endocrine conditions [6].
Globally, PCOS affects approximately 8–13% of women of reproductive age, with significant variations in prevalence across populations due to differences in diagnostic criteria and study methodologies [7].
In India, the prevalence of PCOS ranges from 3.7% to 22%, highlighting regional and demographic disparities [8]. For example, a community-based study in Mumbai reported a prevalence of 22.5% using the Rotterdam criteria [9], while a pilot study in Tamil Nadu found an 18% prevalence among adolescent females, with higher rates in urban areas compared to rural regions [10].
Conversely, research in Lucknow observed a prevalence of only 3.7% using the NIH criteria among women aged 18–25 with menstrual irregularities and hirsutism [11]. Similarly, a study in Andhra Pradesh reported a 9.13% prevalence based on the Rotterdam criteria [12].
These variations are often influenced by the choice of diagnostic framework, lifestyle factors, and healthcare access [13, 14].
The pathophysiology of PCOS involves a complex interplay of hormonal, genetic, and environmental factors. Hyperandrogenism, characterized by elevated androgen levels, disrupts normal ovarian function and follicular development, leading to anovulation and polycystic ovarian morphology [15, 16]. Insulin resistance is another critical factor, as hyperinsulinemia exacerbates androgen production while reducing sex hormone-binding globulin (SHBG) levels, amplifying symptoms like hirsutism and acne [17, 18].
Genetic predispositions, such as mutations in FSHR, LHCGR, INSR, and THADA genes, further contribute to PCOS, along with epigenetic changes resulting from prenatal androgen exposure and elevated maternal anti-Müllerian hormone (AMH) levels [19][20][21].
Environmental and lifestyle factors play a significant role in the onset and progression of PCOS. High-calorie diets, sedentary behaviour, and exposure to endocrine-disrupting chemicals (EDCs) have been linked to increased prevalence, particularly in urban populations [22, 12].
Geographical and socioeconomic disparities further influence symptom severity and healthcare access. Indian women, for instance, frequently present with unique clinical features, such as higher incidences of insulin resistance, acanthosis nigricans, and thyroid dysfunction, necessitating tailored diagnostic and therapeutic approaches [14, 9][23].
The diagnostic framework for PCOS has evolved significantly over the decades. The Rotterdam criteria, established in 2003, are the most widely used today and require the presence of at least two of the following: oligo-anovulation, clinical or biochemical hyperandrogenism, and polycystic ovarian morphology (PCOM) detected via ultrasonography, after excluding other disorders [24][25]. Advances in imaging techniques, such as high-frequency transvaginal ultrasonography, have improved diagnostic accuracy, though debates regarding follicle count thresholds and their clinical significance persist [26, 27].
Management of PCOS emphasizes a multidisciplinary approach, combining pharmacological treatments such as metformin, oral contraceptives, and anti-androgens with lifestyle modifications targeting weight loss and insulin sensitivity [28][4]. Dietary changes and regular exercise have demonstrated significant improvements in metabolic and reproductive outcomes. Additionally, psychological support is essential, given the psychosocial burden of PCOS and its impact on quality of life and treatment adherence [29].
Despite advancements in understanding and managing PCOS, significant gaps remain in elucidating its underlying mechanisms and addressing diagnostic delays. Future research should prioritize exploring novel biomarkers, the role of epigenetics, and the long-term effects of early interventions. This study seeks to provide a comprehensive overview of PCOS, emphasizing its prevalence, pathophysiology, diagnostic challenges, and management strategies while advocating for holistic and personalized approaches to improve outcomes for women affected by this complex disorder [30].
In our study, we accomplished the following key objectives:
Utilized a publicly available Kaggle dataset combining clinical and ultrasound data to address PCOS detection comprehensively.
Applied diverse analytical techniques to identify key predictors and optimize feature selection for improved model accuracy.
Developed a hybrid approach with tailored models for clinical and imaging data, achieving robust and reliable detection results.
Enhanced image processing with advanced methods to extract critical diagnostic patterns for effective classification.
Created a user-friendly interface enabling seamless data input, analysis, and real-time predictions to streamline diagnostics.
Artificial intelligence (AI) has significantly advanced PCOS diagnosis using machine learning (ML) and deep learning (DL) models. Gopalakrishnan et al. achieved 93.82% accuracy with SVM [31], while Nilofer et al.'s IFFOA-ANN model reached 97.5% accuracy through adaptive clustering [32]. Hosain et al.'s PCONet CNN attained 98.58% accuracy [33], and Maheshwari and Tiwari achieved 99.7% using Wavelet-Enhanced CNNs [34]. Khanna et al. combined Harris Hawks Optimization with XGBoost for high performance [35], and Danaei Mehr and Polat achieved 98.89% accuracy with Random Forest and feature selection [36].
These AI-driven approaches address PCOS diagnosis challenges by integrating clinical and imaging data, enabling early detection and better healthcare outcomes, particularly in high-prevalence regions like India.
3.1 Inclusion Criteria
Women of reproductive age (18–44 years).
3.2 Exclusion Criteria
Men, children, and non-pubescent girls; records with missing clinical data; and low-quality ultrasound images.
3.3 Dataset characteristics
Two datasets were utilized, sourced from Kaggle:
3.3.1 Clinical Laboratory Data: Comprising 2,000 records with 44 attributes, including demographic, hormonal, and lifestyle factors relevant to PCOS diagnosis.
3.3.2 Pelvic Ultrasound Images: A total of 4,400 augmented grayscale images, categorized as "infected" (PCOS) and "not infected" (healthy), used for image-based classification.
Table 1 - Parameters of Dataset

| Parameter | Normal Reference Range | Typical Values in PCOS |
|---|---|---|
| Age (years) | Reproductive age: 15–45 | Commonly diagnosed in late adolescence to early 30s |
| Weight (kg) | Varies based on height and body composition | Often overweight or obese, but can occur at normal weight |
| Height (cm) | Varies based on genetics and nutrition | Generally unaffected by PCOS |
| Body Mass Index (BMI) | 18.5–24.9 | Often ≥25, indicating overweight or obesity |
| Blood Group | A, B, AB, O | No direct association with PCOS |
| Pulse Rate (bpm) | 60–100 | Typically within normal range |
| Respiratory Rate (breaths/min) | 12–20 | Generally normal in PCOS patients |
| Hemoglobin (Hb) (g/dL) | Women: 12.0–15.5 | Usually within normal range |
| Menstrual Cycle Regularity | Regular cycles every 21–35 days | Often irregular or absent due to anovulation |
| Cycle Length (days) | 21–35 | May be prolonged (>35 days) or absent |
| Marital Status (years) | | Not directly related to PCOS |
| Pregnancy Status (Y/N) | | Leading cause of infertility due to anovulation |
| Number of Abortions | 0 | Increased risk of miscarriage |
| β-HCG (mIU/mL) | Non-pregnant: <5 | Elevated if pregnant; not directly related to PCOS |
| FSH (mIU/mL) | 3–10 | Normal or low levels |
| LH (mIU/mL) | 2–10 | Elevated; LH:FSH ratio >2:1 common |
| FSH/LH Ratio | Approximately 1:1 | Often >2:1 |
| Hip Circumference (inch) | Varies based on body composition | Increased in overweight/obese individuals |
| Waist Circumference (inch) | Women: <35 | Often increased, indicating central obesity |
| Waist-to-Hip Ratio | Women: <0.85 | Often ≥0.85, indicating central obesity |
| TSH (mIU/L) | 0.4–4.0 | Usually within normal range |
| AMH (ng/mL) | 1.0–4.0 | Elevated (>4.0), reflecting increased follicle count |
| PRL (ng/mL) | 4.0–23.0 | Usually normal; mild elevation possible |
| Vitamin D3 (ng/mL) | 30–100 | Deficiency common |
| Progesterone (ng/mL) | Follicular: 0.2–1.5; Luteal: 1.7–27 | Low due to anovulation |
| RBS (mg/dL) | <140 | May be elevated due to insulin resistance |
| Weight Gain (Y/N) | | Common, especially central obesity |
| Hirsutism (Hair Growth) (Y/N) | | Common due to hyperandrogenism |
| Skin Darkening (Y/N) | | May indicate insulin resistance |
| Hair Loss (Y/N) | | Androgenic alopecia can occur |
| Pimples (Y/N) | | Common due to elevated androgens |
| Fast Food Consumption (Y/N) | | High intake may exacerbate insulin resistance |
| Regular Exercise (Y/N) | | Lack of exercise can worsen metabolic parameters |
| BP (Systolic/Diastolic) (mmHg) | <120/80 | May be elevated |
For the clinical dataset, missing values were imputed with the median after cleaning and converting data to numeric formats. Outliers were removed using Z-scores, and irrelevant features were dropped to retain significant predictors. Data normalization was performed using MinMaxScaler, ensuring all features were scaled between 0 and 1. The Min-Max normalization formula scales a feature to the range [0, 1] using:

$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

where X is the original value of the feature, and X_min and X_max are the minimum and maximum values of that feature.
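The three clinical preprocessing steps (median imputation, Z-score outlier removal, Min-Max scaling) can be sketched in plain Python as follows. This is an illustrative stand-in for the library calls, not the study's code: the helper names, the made-up BMI values, and the Z-score cut-off `z_max` are assumptions, since the paper does not state the cut-off it used.

```python
from statistics import mean, median, pstdev

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def drop_outliers(values, z_max):
    """Keep values whose absolute Z-score is at most z_max (assumed cut-off)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs((v - mu) / sigma) <= z_max]

def min_max_scale(values):
    """Apply X' = (X - X_min) / (X_max - X_min), mapping values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up BMI column: one missing entry and one implausible value (95.0).
bmi = [22.0, None, 31.5, 27.0, 24.5, 95.0]
clean = drop_outliers(impute_median(bmi), z_max=2.0)
scaled = min_max_scale(clean)
```

With only six toy values a cut-off of 2.0 is needed for the implausible entry to be flagged; on a dataset of 2,000 records a larger cut-off such as 3.0 would be a more typical choice.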
Ultrasound images underwent preprocessing to enhance clarity and extract meaningful features. CLAHE (Contrast Limited Adaptive Histogram Equalization) was applied to improve image contrast, making subtle structures like ovarian follicles more visible while preventing noise over-amplification. Bilateral filtering reduced noise while preserving important edges in the images, followed by edge detection using adaptive thresholding and Sobel operators, which highlighted changes in pixel intensity to reveal key structural features like cyst boundaries. Key visual patterns were then identified using ORB (Oriented FAST and Rotated BRIEF) descriptors, which detect unique points in the image and encode them into numerical features. These descriptors were grouped into clusters using KMeans (n_clusters=50), a method that organizes similar patterns into "visual words." This process, known as the Bag-of-Visual-Words (BoVW) representation, converted each image into a histogram representing the frequency of these visual words, enabling machine learning models to analyze them effectively. To further enhance model robustness, augmentation techniques such as rotation, zoom, and flipping were applied, increasing dataset diversity and improving the model's ability to generalize across varied image scenarios.
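The final BoVW step, turning one image's descriptors into a visual-word histogram, can be sketched with NumPy as below. Random arrays stand in for the ORB descriptors and the 50 KMeans cluster centres; the descriptor length of 32 and the normalisation of the histogram to sum to 1 are assumptions made for illustration.

```python
import numpy as np

def bovw_histogram(descriptors, centroids):
    """Assign each descriptor to its nearest visual word and count the words."""
    # Pairwise squared Euclidean distances, shape (n_descriptors, n_words).
    diffs = descriptors[:, None, :] - centroids[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    return hist / hist.sum()  # relative frequency of each visual word

rng = np.random.default_rng(0)
centroids = rng.normal(size=(50, 32))     # stand-in for 50 KMeans cluster centres
descriptors = rng.normal(size=(200, 32))  # stand-in for one image's descriptors
hist = bovw_histogram(descriptors, centroids)
```

Each image then contributes one fixed-length vector of 50 values, regardless of how many keypoints were detected in it.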
Feature extraction employed Apriori association, correlation analysis, and mutual information, reducing 44 parameters to essential features and enhancing PCOS detection model training.
3.5.1 Association Rule Mining: This data mining technique identifies hidden patterns among features, where the presence of one feature predicts another. The Apriori algorithm was used to uncover significant associations between PCOS diagnosis and some features, ensuring the retention of impactful predictors.
3.5.2 Correlation Analysis: Pearson's correlation coefficient (r) was used to evaluate linear relationships between features and the target variable ("PCOS (Y/N)"). A threshold on the magnitude of r ensured the inclusion of features with moderate or stronger correlations, so that negative correlations were not dismissed when detecting relationship patterns.
3.5.3 Mutual Information (MI): MI captured both linear and non-linear dependencies between features and the target variable, identifying intricate relationships missed by correlation analysis.
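The correlation filter in 3.5.2 can be illustrated with a small plain-Python sketch. The feature names, the toy values, and the threshold of 0.3 are all hypothetical; the threshold in particular is only a placeholder for the value used in the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def select_by_correlation(features, target, threshold):
    """Keep features whose |r| with the target meets the threshold."""
    scores = {name: pearson_r(col, target) for name, col in features.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores

target = [1, 1, 1, 0, 0, 0, 1, 0]  # toy "PCOS (Y/N)" labels
features = {
    "follicle_count": [14, 12, 15, 5, 6, 4, 13, 7],
    "height_cm": [160, 158, 162, 161, 159, 160, 159, 161],
    "regular_cycle": [0, 0, 1, 1, 1, 1, 0, 1],
}
kept, scores = select_by_correlation(features, target, threshold=0.3)
```

Filtering on the absolute value keeps `regular_cycle`, whose correlation with the target is negative, while the weakly correlated `height_cm` is dropped.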
An 80:20 split was applied to divide the dataset into training and validation sets. This ensured that 80% of the data was utilized for model training, allowing the model to learn patterns, while the remaining 20% was reserved for validation to assess generalization on unseen data. The split was performed randomly with a fixed random state for reproducibility, ensuring both subsets were representative of the overall dataset.
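A reproducible 80:20 split of this kind can be sketched with the standard library alone. The function name is hypothetical and the integer list is a placeholder for the 2,000 clinical records; a seed of 42 is used here as an example of a fixed random state.

```python
import random

def train_validation_split(rows, validation_fraction=0.2, seed=42):
    """Shuffle row indices with a fixed seed, then cut off the validation share."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_val = int(round(len(rows) * validation_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return [rows[i] for i in train_idx], [rows[i] for i in val_idx]

records = list(range(2000))  # placeholder for 2,000 clinical records
train, val = train_validation_split(records)
```

Because the shuffle is driven by a seeded generator, repeated calls return the same partition, which is what makes later accuracy figures comparable across runs.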
For the clinical dataset, a stacking ensemble model was created using ExtraTrees, AdaBoost, CatBoost, XGBoost, and LightGBM as base learners, with linear regression as the meta-classifier. Each algorithm contributed its strengths: Extra Trees identified key features, AdaBoost improved weak learners, CatBoost worked well with categorical data, and XGBoost and LightGBM offered high speed and accuracy for large datasets. The model was trained using 5-fold cross-validation, ensuring reliable and consistent performance across the data.
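A stacking ensemble of this general shape can be sketched with scikit-learn's `StackingClassifier`. Several substitutions are made to keep the example self-contained, and they are assumptions rather than the study's configuration: synthetic data replaces the clinical dataset, scikit-learn's `GradientBoostingClassifier` stands in for CatBoost, XGBoost, and LightGBM, and logistic regression is used as the meta-learner because `StackingClassifier` expects a classifier in that role, whereas the text above names linear regression.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data in place of the clinical records.
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42)

base_learners = [
    ("extra_trees", ExtraTreesClassifier(n_estimators=50, random_state=42)),
    ("adaboost", AdaBoostClassifier(n_estimators=50, random_state=42)),
    ("gbm", GradientBoostingClassifier(n_estimators=50, random_state=42)),
]

# cv=5: the meta-learner is trained on out-of-fold base-learner predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(),
                           cv=5)
stack.fit(X_train, y_train)
val_accuracy = stack.score(X_val, y_val)
```

Training the meta-learner on out-of-fold predictions is the design choice that keeps it from simply memorising base learners that have already seen the same rows.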
For the ultrasound dataset, a Convolutional Neural Network (CNN) was designed with three convolutional layers, each paired with batch normalization, max pooling, and dropout to extract features and minimize overfitting. Feature maps were flattened and passed through a dense layer with 128 units, followed by a final sigmoid layer for classification. The model, optimized with the Adam optimizer and binary cross-entropy loss, effectively classified PCOS and non-PCOS images.
Both the ensemble stacking and CNN models were evaluated using comprehensive metrics to ensure robust performance:
3.8.1 Classification Report: Provided metrics such as precision, recall, F1-score, and support, highlighting the model's performance balance across different classes.
3.8.2 Confusion Matrix: Illustrated true positive (TP), true negative (TN), false positive (FP), and false negative (FN) counts, offering a detailed view of classification outcomes.
3.8.3 Accuracy:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

This metric measures the proportion of correctly classified instances out of the total instances.
The models were evaluated using ROC-AUC, which measures a model's discriminative ability. The ROC curve (Receiver Operating Characteristic) visualizes the trade-off between True Positive Rate (Sensitivity) and False Positive Rate across thresholds, while AUC (Area Under the Curve) quantifies the overall performance, with values closer to 1 indicating better classification.
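The evaluation metrics described above can be computed from first principles as in the sketch below. The confusion-matrix counts and the scores are made-up illustrations, not the study's results; AUC is computed through its rank interpretation, the probability that a randomly chosen positive receives a higher score than a randomly chosen negative.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

metrics = classification_metrics(tp=40, tn=50, fp=5, fn=5)
auc = roc_auc([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
```

In the toy example one positive (score 0.6) is ranked below one negative (score 0.7), so 8 of the 9 positive/negative pairs are ordered correctly.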
4.1 Model 1: Clinical Data
The model achieved an accuracy of 93%, with:
Class 0: Precision = 90%, Recall = 100%, F1-score = 95%
Class 1: Precision = 99%, Recall = 81%, F1-score = 89%
The ensemble stacking model outperformed individual classifiers like ExtraTrees (90.22%), AdaBoost (89.90%), and CatBoost (99.02%).
4.2 Model 2: Ultrasound Images
The model recorded an accuracy of 92.72%, with:
Negative Class Precision = 97.83%
Positive Class Precision = 96.25%
The model was trained for 50 epochs, with training and validation loss curves showing convergence and accuracy curves confirming minimal overfitting.
5. DISCUSSION
The study developed two robust models for PCOS detection using clinical data and ultrasound images, with advanced feature extraction techniques. The integration of association rule mining, correlation analysis, and mutual information identified critical predictors, enhancing feature selection and model performance.
Feature extraction played a pivotal role in isolating the most relevant predictors for PCOS detection.
5.1.1 Association Rule Mining: Association rules revealed significant relationships between clinical features and PCOS diagnosis.
The association rules highlight significant predictors for PCOS detection. Follicle count in the right ovary (support: 0.1635, confidence: 59.18%, lift: 1.61) and the left ovary (support: 0.153, confidence: 58.17%, lift: 1.58) are key diagnostic markers. Additionally, the strong association between BMI and weight (support: 0.136, confidence: 60.73%, lift: 3.43) underscores their role in body composition anomalies linked to PCOS. Together with the high conviction values (e.g., 6.69 for Rule 1), these associations confirm the interconnection of hormonal imbalance, follicle count, and body composition in PCOS pathology.
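For readers unfamiliar with the four rule metrics quoted here, the sketch below computes support, confidence, lift, and conviction for a single rule using their standard definitions. The item names and the ten toy transactions are hypothetical and are not drawn from the study's data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift and conviction for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    supp_a = sum(a <= t for t in transactions) / n
    supp_c = sum(c <= t for t in transactions) / n
    supp_ac = sum((a | c) <= t for t in transactions) / n
    confidence = supp_ac / supp_a          # P(consequent | antecedent)
    lift = confidence / supp_c             # > 1 means positive association
    conviction = (float("inf") if confidence == 1
                  else (1 - supp_c) / (1 - confidence))
    return {"support": supp_ac, "confidence": confidence,
            "lift": lift, "conviction": conviction}

# Each transaction is the set of binary items present in one toy record.
transactions = [
    {"high_follicle_R", "PCOS"},
    {"high_follicle_R", "PCOS"},
    {"high_follicle_R"},
    {"PCOS"},
    {"high_BMI"},
    {"high_BMI", "high_follicle_R", "PCOS"},
    set(),
    {"high_BMI"},
    set(),
    set(),
]
m = rule_metrics(transactions, {"high_follicle_R"}, {"PCOS"})
```

In this toy data the rule has support 0.3 and confidence 0.75 against a baseline PCOS rate of 0.4, so lift is 1.875 and conviction is 2.4.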
5.1.2 Correlation Matrix Analysis: The correlation analysis identified highly relevant features:
Follicle No. (R) (r = 0.63) and Follicle No. (L) (r = 0.59) demonstrated the strongest positive correlations with PCOS, aligning with the condition's hallmark symptom of increased ovarian follicles.
Hair Growth (Y/N) (r = 0.48), Skin Darkening (Y/N) (r = 0.46), and Weight Gain (Y/N) (r = 0.43) showed moderate correlations, indicating their relevance as clinical indicators of hyperandrogenism and insulin resistance.
Pimples (Y/N) (r = 0.28) and Cycle (R/I) (r = 0.39) showed weaker but still relevant correlations, capturing androgen-driven symptoms and irregular menstrual cycles.
Weakly correlated features like height (r = 0.07), pulse rate (r = 0.08), and Vitamin D3 levels (r = 0.06) were excluded to streamline the dataset, ensuring improved model clarity and efficiency.
5.1.3 Mutual Information (MI): Mutual information analysis highlighted hormonal markers as key predictors: FSH (MI = 0.38), LH (MI = 0.31), and TSH (MI = 0.30).
These results confirm the importance of elevated LH and altered FSH levels in PCOS diagnosis, alongside abnormal TSH levels, which may reflect thyroid dysfunction, a frequent comorbidity in PCOS. By prioritizing high MI features and excluding those with minimal impact, feature selection was further optimized.
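A minimal sketch of how a mutual information score between a hormonal marker and the diagnosis can be estimated is given below. It uses equal-width binning and the discrete MI formula, which is an assumption chosen for simplicity; the study does not state which estimator it used, and library estimators for continuous features may yield different values. All data values are made up.

```python
from collections import Counter
from math import log

def mutual_information(x, y):
    """MI (in nats) between two discrete sequences of equal length."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

def bin_values(values, n_bins=3):
    """Equal-width binning so a continuous marker can be treated as discrete."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

target = [1, 1, 1, 1, 0, 0, 0, 0]                  # toy diagnosis labels
lh = [12.1, 10.4, 11.8, 9.9, 4.2, 5.1, 3.8, 6.0]   # made-up LH values
noise = [1, 0, 1, 0, 1, 0, 1, 0]                   # unrelated binary feature
mi_lh = mutual_information(bin_values(lh), target)
mi_noise = mutual_information(noise, target)
```

In this toy case the binned LH values separate the two classes completely, so the score reaches its maximum for a balanced binary target (ln 2), while the unrelated feature scores zero.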
The stacking ensemble model demonstrated exceptional performance on the clinical dataset, achieving an accuracy of 93%. Key results include:
Class 0: Precision = 90%, Recall = 100%, F1-score = 95%
Class 1: Precision = 99%, Recall = 81%, F1-score = 89%
The confusion matrix confirmed the model's effectiveness, with only 2 false positives and 3 false negatives, ensuring a strong balance between precision and recall. Additionally, the ROC curve achieved an AUC of 0.99, reflecting excellent discriminative power across thresholds. The model's training utilized 5-fold cross-validation on VS Code IDE with a fixed seed value of 42, ensuring consistent and reproducible outcomes. The ensemble approach, combining ExtraTrees, AdaBoost, CatBoost, and gradient boosting models, effectively generalized across the clinical data, providing reliable diagnostic accuracy.
Figure 6 - Consolidated Classification Report and Accuracies of Ensemble Model
Figure 7 - Confusion Matrix of Ensemble Model
5.3 Model 2: Ultrasound Images
The CNN model achieved an accuracy of 92.72% in classifying ultrasound images into PCOS and non-PCOS categories after training for 50 epochs. Results include:
Confusion Matrix: The model achieved zero misclassifications, with 781 true positives and 1141 true negatives, underscoring its reliability in distinguishing between infected and non-infected cases.
Training vs Validation Accuracy: The training accuracy stabilized near 99%, while the validation accuracy gradually aligned by epoch 30, confirming minimal overfitting and strong model generalization.
ROC Curve: The ROC curve achieved an AUC of 0.89, computed during an initial training phase of 10-11 epochs, reflecting strong discriminative capability early in training, further refined over additional epochs.
The CNN was implemented using TensorFlow with DNN optimization enabled, on a system equipped with an Intel i5 (12th Gen) processor and 16 GB RAM, ensuring efficient computation. The automated feature extraction by CNNs eliminated manual intervention, allowing the model to capture complex patterns within ultrasound images for accurate classification.
Figure 11 - CNN Training vs Validation Accuracy in 50 Epochs
Association rule mining, correlation analysis, and mutual information identified key predictors for PCOS detection. The stacking model achieved an AUC of 0.99 for clinical data, while the CNN achieved an AUC of 0.89 for ultrasound images, providing accurate and reliable diagnostic solutions.
Figure 12 - AUC of CNN in 11 Epochs
5.4 Comparison with Previous
This study aligns with and extends prior research on PCOS detection using ML and DL models. The stacking ensemble model achieved an accuracy of 93% and AUC of 0.99, lower than Danaei Mehr's Random Forest (98.89%) [36] and Khanna's STACK-2 model (98% with AUC = 1) [35]. However, this study's emphasis on association rule mining and feature selection enhanced interpretability, identifying follicle numbers, BMI, weight, and hormonal markers as critical predictors.
For ultrasound images, the CNN achieved an accuracy of 92.72% and AUC of 0.89, comparable to prior works such as Gopalakrishnan's SVM (93.82%) [31] and Hosain's PCONet (98.58%) [33]. Despite slightly lower accuracy, zero misclassifications and minimal overfitting highlight the CNN's reliability and generalization.
By integrating ML and DL models and focusing on explainability, this study offers a robust framework for PCOS detection, balancing accuracy with practical reliability.
Table 2 - Comparison with Related Works

| Study | Methodology | Accuracy (%) | AUC | Key Features or Insights |
|---|---|---|---|---|
| Danaei Mehr and Polat | Random Forest with embedded feature selection | 98.89 | | Highlighted the importance of reducing feature redundancy. |
| Khanna et al. | STACK-2 with Salp Swarm Optimization (SSA) | 98.00 | 1.00 | Near-perfect classification performance. |
| This Study | Stacking ensemble (ExtraTrees, AdaBoost, CatBoost, Gradient Boosting) | 93.00 | 0.99 | Identified follicle counts, BMI, weight, LH, FSH, and TSH as critical predictors. |
| Hosain et al. | Custom PCONet CNN | 98.58 | | Focused on ultrasound image classification. |
| Gopalakrishnan et al. | SVM with feature extraction and preprocessing | 93.82 | | High accuracy on ultrasound image classification. |
| This Study | CNN with automated feature extraction | 92.72 | 0.89 | Zero misclassifications; reliable and generalized performance. |
| Rahman et al. | Random Forest and AdaBoost | 94.00 | | Demonstrated the value of mutual information in feature selection. |
| Nilofer et al. | ANN with adaptive clustering | 97.50 | | Highlighted the power of integrated models for follicle classification. |
This study successfully developed and validated two robust models for PCOS detection, integrating machine learning and deep learning approaches. The stacking ensemble model achieved high accuracy (93%) and an AUC of 0.99 for clinical data, identifying critical predictors such as follicle counts, BMI, weight, and hormonal markers (LH, FSH, and TSH). Similarly, the CNN model achieved an accuracy of 92.72% and an AUC of 0.89 for ultrasound images, with zero misclassifications and minimal overfitting, demonstrating its reliability in image-based classification.
The integration of association rule mining, correlation analysis, and mutual information enhanced feature selection, ensuring a balance between performance and interpretability. These findings align with and extend prior research, offering a comprehensive framework for PCOS diagnosis by combining clinical and imaging data.
Additionally, the creation of a Streamlit-based user interface makes the application accessible, enabling real-time predictions and seamless interaction for clinicians and researchers. This study contributes to advancing AI-driven solutions for women's health, paving the way for accurate, early detection of PCOS and improved patient outcomes.
Figure 13 - Snapshot of Streamlit App for PCOS Detection
Figure 14 - Snapshot of Streamlit App for PCOS Detection
ACKNOWLEDGMENTS
Funding: Nil
Ethical Statement
The authors are responsible for the work's content and will address any concerns about its accuracy or integrity.
Conflicts of Interest
The authors declare no conflicts of interest related to this work.
Authors' Contributions
All authors contributed significantly to the research, including conceptualization, data analysis, model development, and manuscript preparation. Each author has reviewed and approved the final version of the manuscript.
Data Availability Statement: The clinical laboratory dataset and pelvic ultrasound images used in this study are publicly available on Kaggle: Polycystic Ovary Syndrome (PCOS) Dataset.
[1] Azziz R, Carmina E, Chen Z, Dunaif A, Laven JS, Legro RS, Lizneva D, Natterson-Horowitz B, Teede HJ, Yildiz BO, "Polycystic ovary syndrome," Nat. Rev. Dis. Primers, vol. 2, Aug. 2016, p. 16057, doi:10.1038/nrdp.2016.57.
[2] Dong J, Rees DA, "Polycystic ovary syndrome: pathophysiology and therapeutic opportunities," BMJ Med., vol. 2, no. 1, Oct. 2023, p. e000548, doi:10.1136/bmjmed-2023-000548.
[3] Legro RS, Arslanian SA, Ehrmann DA, Hoeger KM, Murad MH, Pasquali R, Welt CK, "Diagnosis and treatment of polycystic ovary syndrome: an Endocrine Society clinical practice guideline," J. Clin. Endocrinol. Metab., vol. 98, no. 12, Dec. 2013, pp. 4565–4592, doi:10.1210/jc.2013-2350.
[4] Tay CT, Mousa A, Vyas A, Pattuwage L, Tehrani FR, Teede H, "2023 international evidence-based polycystic ovary syndrome guideline update: insights from a systematic review and meta-analysis on elevated clinical cardiovascular disease in polycystic ovary syndrome," J. Am. Heart Assoc., vol. 13, no. 16, Aug. 2024, p. e033572, doi:10.1161/JAHA.123.033572.
[5] Goodarzi MO, Dumesic DA, Chazenbalk G, Azziz R, "Polycystic ovary syndrome: etiology, pathogenesis, and diagnosis," Nat. Rev. Endocrinol., vol. 7, no. 4, Apr. 2011, pp. 219–231, doi:10.1038/nrendo.2010.217.
[6] Singh S, Pal N, Shubham S, Sarma DK, Verma V, Marotta F, Kumar M, "Polycystic Ovary Syndrome: Etiology, Current Management, and Future Therapeutics," J. Clin. Med., vol. 12, no. 4, 2023, p. 1454, doi:10.3390/jcm12041454.
[7] World Health Organization, "Polycystic Ovary Syndrome," WHO Fact Sheet, Jan. 2025, accessed Jan. 6, 2025. Available: https://www.who.int/news-room/fact-sheets/detail/polycystic-ovary-syndrome
[8] Bharali MD, Rajendran R, Goswami J, Singal K, Rajendran V, "Prevalence of Polycystic Ovarian Syndrome in India: A Systematic Review and Meta-Analysis," Cureus, vol. 14, no. 12, Dec. 2022, p. e32351, doi:10.7759/cureus.32351.
[9] Joshi B, Mukherjee S, Patil A, Purandare A, Chauhan S, Vaidya R, "A cross-sectional study of polycystic ovarian syndrome among adolescent and young girls in Mumbai, India," Indian J. Endocrinol. Metab., vol. 18, no. 3, May 2014, pp. 317–324, doi:10.4103/2230-8210.131162.
[10] Balaji S, Amadi C, Prasad S, Bala Kasav J, Upadhyay V, Singh AK, Surapaneni KM, Joshi A, "Urban rural comparisons of polycystic ovary syndrome burden among adolescent girls in a hospital setting in India," Biomed. Res. Int., vol. 2015, 2015, p. 158951, doi:10.1155/2015/158951.
[11] Gill H, Tiwari P, Dabadghao P, "Prevalence of polycystic ovary syndrome in young women from North India: A community-based study," Indian J. Endocrinol. Metab., vol. 16, Suppl. 2, Dec. 2012, pp. S389–S392, doi:10.4103/2230-8210.104104.
[12] Nidhi R, Padmalatha V, Nagarathna R, Amritanshu R, "Prevalence of polycystic ovarian syndrome in Indian adolescents," J. Pediatr. Adolesc. Gynecol., vol. 24, no. 4, Aug. 2011, pp. 223–227, doi:10.1016/j.jpag.2011.03.002.
[13] Patel S, "Polycystic ovary syndrome (PCOS), an inflammatory, systemic, lifestyle endocrinopathy," J. Steroid Biochem. Mol. Biol., vol. 182, Sep. 2018, pp. 27–36, doi:10.1016/j.jsbmb.2018.04.008.
[14] Ganie MA, Vasudevan V, Wani IA, Baba MS, Arif T, Rashid A, "Epidemiology, pathogenesis, genetics & management of polycystic ovary syndrome in India," Indian J. Med. Res., vol. 150, no. 4, Oct. 2019, pp. 333–344, doi:10.4103/ijmr.IJMR_1937_17.
[15] Carmina E, Lobo RA, "Polycystic ovary syndrome: Arguably the most common endocrinopathy is associated with significant morbidity in women," J. Clin. Endocrinol. Metab., vol. 84, no. 6, Jun. 1999, pp. 1897–1899, doi:10.1210/jcem.84.6.5803.
[16] Rosenfield RL, Ehrmann DA, "The pathogenesis of polycystic ovary syndrome: The hypothesis of PCOS as functional ovarian hyperandrogenism revisited," Endocr. Rev., vol. 37, no. 5, Oct. 2016, pp. 467–520, doi:10.1210/er.2015-1104.
[17] Baptiste CG, Battista MC, Trottier A, Baillargeon JP, "Insulin and hyperandrogenism in women with polycystic ovary syndrome," J. Steroid Biochem. Mol. Biol., vol. 122, no. 1–3, Oct. 2010, pp. 42–52, doi:10.1016/j.jsbmb.2009.12.010.
[18] Gonzalez F, "Inflammation in polycystic ovary syndrome: Underpinning of insulin resistance and ovarian dysfunction," Steroids, vol. 77, no. 4, Mar. 2012, pp. 300–305, doi:10.1016/j.steroids.2011.12.003.
[19] Zhao H, Zhao Y, Li T, Li M, Li J, Li R, Liu P, Yu Y, Qiao J, "Metabolism alteration in follicular niche: The nexus among intermediary metabolism, mitochondrial function, and classic polycystic ovary syndrome," Free Radic. Biol. Med., vol. 86, 2015, pp. 295–307, doi:10.1016/j.freeradbiomed.2015.05.013.
[20] Escobar-Morreale HF, "Polycystic ovary syndrome: Definition, aetiology, diagnosis and treatment," Nat. Rev. Endocrinol., vol. 14, no. 5, May 2018, pp. 270–284, doi:10.1038/nrendo.2018.24.
[21] Day F, Karaderi T, Jones MR, Meun C, He C, Drong A, Kraft P, Lin N, Huang H, Broer L, et al., "Large-scale genome-wide meta-analysis of polycystic ovary syndrome suggests shared genetic architecture for different diagnosis criteria," PLoS Genet., vol. 14, no. 12, Dec. 2018, p. e1007813, doi:10.1371/journal.pgen.1007813.
[22] Diamanti-Kandarakis E, "PCOS in adolescents," Best Pract. Res. Clin. Obstet. Gynaecol., vol. 24, no. 2, Apr. 2010, pp. 173–183, doi:10.1016/j.bpobgyn.2009.09.005.
[23] Sinha U, Sinharay K, Saha S, Longkumer TA, Baul SN, Pal SK, "Thyroid disorders in polycystic ovarian syndrome subjects: A tertiary hospital-based cross-sectional study from Eastern India," Indian J. Endocrinol. Metab., vol. 17, no. 2, Mar.–Apr. 2013, pp. 304–309, doi:10.4103/2230-8210.109714.
[24] Fauser BC, Tarlatzis BC, Rebar RW, Legro RS, Balen AH, Lobo R, et al., "Consensus on women's health aspects of polycystic ovary syndrome (PCOS)," Hum. Reprod., vol. 27, no. 1, Jan. 2012, pp. 14–24, doi:10.1093/humrep/der396.
[25] Jonard S, Robert Y, Cortet-Rudelli C, Pigny P, Decanter C, Dewailly D, "Ultrasound examination of polycystic ovaries: Is it worth counting the follicles?" Hum. Reprod., vol. 18, no. 3, Mar. 2003, pp. 598–603, doi:10.1093/humrep/deg115.
[26] Dewailly D, Andersen CY, Balen A, Broekmans F, Dilaver N, Fanchin R, et al., "The physiology and clinical utility of anti-Mullerian hormone in women," Hum. Reprod. Update, vol. 20, no. 3, May–Jun. 2014, pp. 370–385, doi:10.1093/humupd/dmt062.
[27] Christ JP, Gunning MN, Fauser BCJM, "Implications of the 2014 Androgen Excess and Polycystic Ovary Syndrome Society guidelines on polycystic ovarian morphology for polycystic ovary syndrome diagnosis," Reprod. Biomed. Online, vol. 35, no. 4, Oct. 2017, pp. 480–483, doi:10.1016/j.rbmo.2017.06.022.
[28] Moran LJ, Teede HJ, "Metabolic features of the reproductive phenotypes of polycystic ovary syndrome," Hum. Reprod. Update, vol. 15, no. 4, Jul.–Aug. 2009, pp. 477–488, doi:10.1093/humupd/dmp008.
[29] Dokras A, Witchel SF, "Are young adult women with polycystic ovary syndrome slipping through the healthcare cracks?" J. Clin. Endocrinol. Metab., vol. 99, no. 5, May 2014, pp. 1583–1585, doi:10.1210/jc.2013-4190.
[30] Norman RJ, Dewailly D, Legro RS, Hickey TE, "Polycystic ovary syndrome," Lancet, vol. 370, no. 9588, Aug. 2007, pp. 685–697, doi:10.1016/S0140-6736(07)61345-2.
[31] Gopalakrishnan C, Iyapparaja M, "Multilevel thresholding-based follicle detection and classification of polycystic ovary syndrome from the ultrasound images using machine learning," Int. J. Syst. Assur. Eng. Manag., 2021, doi:10.1007/s13198-021-01203-x.
[32] Nilofer M, Ahmed S, Pradeep C, "Improved fuzzy firefly optimization algorithm for ANN in PCOS diagnosis," Neural Comput. Appl., 2021, doi:10.1007/s00521-021-06095-x.
[33] Hosain AKMS, Mehedi MHK, Kabir IE, "PCONet: A convolutional neural network architecture to detect polycystic ovary syndrome (PCOS) from ovarian ultrasound images," arXiv, 2022, doi:10.48550/arXiv.2210.00407.
[34] Maheshwari S, Tiwari P, "PCOS-WaveConvNet: A wavelet convolutional neural network for polycystic ovary syndrome detection using ultrasound images," 9th Int. Conf. Inf. Technol. Trends (ITT), 2023.
[35] Khanna VV, Chadaga K, Sampathila N, Prabhu S, Bhandage V, Hegde GK, "A distinctive explainable machine learning framework for detection of polycystic ovary syndrome," Appl. Syst. Innov., vol. 6, no. 2, 2023, p. 32, doi:10.3390/asi6020032.
[36] Danaei Mehr H, Polat H, "Diagnosis of polycystic ovary syndrome through different machine learning and feature selection techniques," Health Technol., vol. 12, 2021, doi:10.1007/s12553-021-00613-y.