





















This project aimed to predict drug resistance in cancer patients using a multi-omics integration approach, combining gene expression, mutation profiles, protein abundance, methylation data, and miRNA expression. The goal was to build a machine learning model that accurately classifies resistant vs. responsive tumor samples and uncovers key biological drivers of resistance.



• Retrieved matched multi-omics data from TCGA for chemotherapy-treated breast and lung cancer patients.
• Normalized data using z-score standardization.
• Applied batch effect correction using ComBat for cross-platform consistency.
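
The z-score step above can be sketched as follows. This is a minimal illustration on synthetic data, not the project's actual pipeline: the array `expression` and the helper `zscore_standardize` are hypothetical names, and batch correction with ComBat is not shown.

```python
import numpy as np


def zscore_standardize(X: np.ndarray) -> np.ndarray:
    """Scale each column (feature) to mean 0 and standard deviation 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Guard against constant features, which would otherwise divide by zero.
    std[std == 0] = 1.0
    return (X - mean) / std


# Synthetic stand-in for one omics layer: 100 samples x 20 features.
rng = np.random.default_rng(0)
expression = rng.normal(loc=5.0, scale=2.0, size=(100, 20))
expression_z = zscore_standardize(expression)
```

When a held-out test set is used, the mean and standard deviation would normally be estimated on the training samples only and then applied to the test samples.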

















• Combined five data modalities:
⚬ Gene expression
⚬ Mutation scores
⚬ Protein abundance
⚬ DNA methylation
⚬ miRNA profiles

• Dimensionality reduction using PCA followed by feature selection with mutual information ranking.
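
One way to read the integration and reduction steps above is sketched below with scikit-learn on synthetic data. Several things here are assumptions rather than details from the project: the layers are joined by simple column-wise concatenation, PCA is applied to the joined matrix, mutual information is used to rank the resulting components, and the layer sizes and the values `n_components=20` and `k=10` are arbitrary.

```python
from functools import partial

import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
n_samples = 120
y = rng.integers(0, 2, size=n_samples)  # synthetic labels: 1 = resistant

# Five synthetic omics layers, each already standardized (hypothetical sizes).
layer_names = ["gene_expression", "mutation_scores", "protein_abundance",
               "dna_methylation", "mirna_profiles"]
layers = {name: rng.normal(size=(n_samples, 30)) for name in layer_names}
layers["gene_expression"][:, 0] += 2.0 * y  # inject a weak signal

# Early integration: concatenate the layers column-wise, one row per sample.
X = np.hstack([layers[name] for name in layer_names])

# Reduce dimensionality with PCA.
pca = PCA(n_components=20, random_state=0)
X_pca = pca.fit_transform(X)

# Rank the PCA components by mutual information with the label, keep the top 10.
mi_score = partial(mutual_info_classif, random_state=0)
selector = SelectKBest(score_func=mi_score, k=10)
X_selected = selector.fit_transform(X_pca, y)
```

In a real analysis both the PCA and the selector would be fitted inside each cross-validation fold, for example by wrapping them in a `Pipeline`, so that the labels of held-out samples do not influence feature selection.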

• Trained multiple models: Random Forest, SVM, Gradient Boosting, and XGBoost.
• Used 5-fold cross-validation and grid search for hyperparameter tuning.
• Final ensemble model chosen based on F1-score and AUC.
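
The training procedure above might look roughly like the following scikit-learn sketch. It runs on synthetic data from `make_classification`, the hyperparameter grids are illustrative guesses, XGBoost is left out to keep the example dependency-free, and combining the tuned models by soft voting is an assumption, since the slides do not say how the ensemble was built.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     train_test_split)
from sklearn.svm import SVC

# Synthetic stand-in for the integrated feature matrix and resistance labels.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# 5-fold stratified cross-validation shared by all grid searches.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Candidate models with small, illustrative hyperparameter grids.
candidates = {
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [50, 100], "max_depth": [3, None]}),
    "svm": (SVC(probability=True, random_state=0),
            {"C": [0.1, 1.0, 10.0]}),
    "gradient_boosting": (GradientBoostingClassifier(random_state=0),
                          {"n_estimators": [50, 100],
                           "learning_rate": [0.05, 0.1]}),
}

# Tune each model by grid search, scoring on cross-validated AUC.
tuned = []
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=cv, scoring="roc_auc")
    search.fit(X_train, y_train)
    tuned.append((name, search.best_estimator_))

# Combine the tuned models by averaging their predicted probabilities.
ensemble = VotingClassifier(estimators=tuned, voting="soft")
ensemble.fit(X_train, y_train)

# Evaluate on the held-out test split with the two metrics named above.
test_auc = roc_auc_score(y_test, ensemble.predict_proba(X_test)[:, 1])
test_f1 = f1_score(y_test, ensemble.predict(X_test))
```

Scoring the grid search on `roc_auc` is one reasonable choice; `scoring="f1"` would tune toward the other metric mentioned above.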








• AUC score: 0.94 for ensemble model on test data.
• Key predictors: Mutation scores and gene expression of top drug metabolism genes (e.g., ABCB1, TP53, EGFR).
• Achieved high sensitivity and specificity in separating resistant vs. sensitive samples.
• Identified potential biomarkers of resistance involved in:
⚬ Drug efflux (ABC transporters)
⚬ Apoptosis dysregulation
⚬ DNA damage response
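
For readers unfamiliar with the metrics reported above, the sketch below computes AUC, sensitivity, and specificity on a tiny hand-made example. The labels, scores, and the 0.5 decision threshold are invented for illustration and are unrelated to the project's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Toy example: 1 = resistant, 0 = sensitive.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1])

# AUC is threshold-free: the fraction of (resistant, sensitive) pairs
# in which the resistant sample receives the higher score.
auc = roc_auc_score(y_true, y_score)  # 13 of 16 pairs -> 0.8125

# Sensitivity and specificity depend on a decision threshold (0.5 assumed).
y_pred = (y_score >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # resistant samples correctly flagged -> 0.75
specificity = tn / (tn + fp)  # sensitive samples correctly cleared -> 0.75
```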








Each row is a patient sample; columns represent standardized values across five omics layers.





Gene expression and mutation scores were the most impactful features used by the model.






AUC = 0.94 reflects strong model discrimination between resistant and non-resistant samples.

This project successfully implemented a cutting-edge integrative bioinformatics workflow combining biological heterogeneity with ML prediction power. Our model is not only highly accurate but also biologically interpretable, offering candidate biomarkers and pathways for further experimental validation.









