Cancer Biology Project Case Study 3

Project 3: Predicting Drug Resistance in Cancer Using Multi-Omics Data Integration

Brief Project Description:

This project aimed to predict drug resistance in cancer patients using a multi-omics integration approach, combining gene expression, mutation profiles, protein abundance, methylation data, and miRNA expression. The goal was to build a machine learning model that accurately classifies resistant vs. responsive tumor samples and uncovers key biological drivers of resistance.

ADVANCED METHODOLOGY PIPELINE:

1.DATA COLLECTION & PREPROCESSING

• Retrieved matched multi-omics data from TCGA for chemotherapy-treated breast and lung cancer patients.

• Normalized data using z-score standardization.

• Applied batch effect correction using ComBat for cross-platform consistency.
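The z-score step above can be illustrated with a minimal sketch. It assumes samples are rows and features are columns, uses only the Python standard library, and runs on a made-up toy matrix rather than TCGA data. The function name `zscore_columns` is hypothetical, and batch correction with ComBat is not shown.

```python
from statistics import mean, pstdev

def zscore_columns(matrix):
    """Standardize each column (feature) to mean 0 and unit variance."""
    columns = list(zip(*matrix))  # transpose: one tuple per feature
    scaled_columns = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col)  # population standard deviation
        if sigma == 0:
            # A constant feature carries no signal; map it to zeros
            # instead of dividing by zero.
            scaled_columns.append([0.0] * len(col))
        else:
            scaled_columns.append([(x - mu) / sigma for x in col])
    # Transpose back to samples x features.
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical toy matrix: 4 samples x 2 features (values are made up).
toy_matrix = [
    [2.0, 10.0],
    [4.0, 10.0],
    [6.0, 10.0],
    [8.0, 10.0],
]
scaled = zscore_columns(toy_matrix)
```

Standardizing per feature puts the omics layers on a comparable scale before they are combined, so that a layer with large raw values does not dominate the others.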

2. FEATURE ENGINEERING & INTEGRATION

• Combined five data modalities:

⚬ Gene expression

⚬ Mutation scores

⚬ Protein abundance

⚬ DNA methylation

⚬ miRNA profiles

• Dimensionality reduction using PCA, followed by feature selection with mutual information ranking.
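As a rough illustration of mutual information ranking, the sketch below scores discrete features against the resistance label and sorts them. It makes several assumptions: continuous values have already been binned (here into "high"/"low"), the PCA step is skipped, and the feature names and values are invented for the example. The functions `mutual_information` and `rank_features` are hypothetical helpers, not the project's code.

```python
from collections import Counter
from math import log2

def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, l), count in joint.items():
        # p(f, l) * log2( p(f, l) / (p(f) * p(l)) ), written with raw counts.
        mi += (count / n) * log2(count * n / (feature_counts[f] * label_counts[l]))
    return mi

def rank_features(features, labels):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(vals, labels) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Made-up toy data: 1 = resistant, 0 = responsive.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_features = {
    "feature_a": ["high"] * 4 + ["low"] * 4,                                   # tracks the label
    "feature_b": ["high", "low"] * 4,                                          # unrelated to the label
    "feature_c": ["high", "high", "high", "low", "low", "low", "low", "high"], # partly related
}
ranking = rank_features(toy_features, toy_labels)
```

A feature that perfectly tracks the label scores 1 bit for a balanced binary outcome, and an unrelated feature scores 0, so keeping the top-ranked entries retains the most label-relevant inputs.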

3. PREDICTIVE MODEL DEVELOPMENT

• Trained multiple models: Random Forest, SVM, Gradient Boosting, and XGBoost.

• Used 5-fold cross-validation and grid search for hyperparameter tuning.

• Final ensemble model chosen based on F1-score and AUC.
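The tuning loop described above (5-fold cross-validation inside a grid search, scored by F1) can be sketched in a few lines. To stay self-contained, this example swaps in a tiny one-dimensional k-nearest-neighbour classifier in place of the Random Forest, SVM, and boosting models named in the pipeline. The data, the grid values, and all function names are assumptions made for illustration.

```python
import random

def stratified_kfold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class balance per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx

def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def knn_predict(train_x, train_y, x, n_neighbors):
    """Majority vote among the n_neighbors closest training points (1-D)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:n_neighbors]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > n_neighbors else 0

def grid_search_cv(x, y, grid, k=5):
    """Return the grid value with the best mean F1 across k folds, plus all scores."""
    cv_scores = {}
    for n_neighbors in grid:
        fold_scores = []
        for train_idx, test_idx in stratified_kfold(y, k):
            train_x = [x[i] for i in train_idx]
            train_y = [y[i] for i in train_idx]
            preds = [knn_predict(train_x, train_y, x[i], n_neighbors) for i in test_idx]
            fold_scores.append(f1_score([y[i] for i in test_idx], preds))
        cv_scores[n_neighbors] = sum(fold_scores) / k
    best = max(cv_scores, key=cv_scores.get)
    return best, cv_scores

# Made-up toy data: one score per sample; 0 = responsive, 1 = resistant.
toy_labels = [0] * 10 + [1] * 10
toy_x = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.55, 0.60,
         0.45, 0.50, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1.00]
best_k, cv_scores = grid_search_cv(toy_x, toy_labels, grid=[1, 3, 5])
```

Because the fold assignment uses a fixed seed, every grid value is evaluated on the same splits, which keeps the comparison between hyperparameter settings fair.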

Project Outcomes:

• AUC score: 0.94 for ensemble model on test data.

• Key predictors: Mutation scores and gene expression of top drug metabolism genes (e.g., ABCB1, TP53, EGFR).

• Achieved high sensitivity and specificity in separating resistant vs. sensitive samples.

• Identified potential biomarkers of resistance involved in:

⚬ Drug efflux (ABC transporters)

⚬ Apoptosis dysregulation

⚬ DNA damage response
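For readers less familiar with the metrics reported above, the following sketch shows one common way to compute AUC, sensitivity, and specificity from predicted scores. AUC is computed as the fraction of (resistant, responsive) pairs in which the resistant sample receives the higher score, with ties counted as half. The labels, scores, and the 0.5 threshold are made-up assumptions and do not reproduce the project's results.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up toy predictions: 1 = resistant, 0 = responsive.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(toy_true, toy_score)
sens, spec = sensitivity_specificity(toy_true, toy_score)
```

AUC is threshold-free, whereas sensitivity and specificity depend on the chosen cutoff, which is why the two kinds of metric are usually reported together.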

Visual Results:

Figure 1: Multi-Omics Integration Heatmap

Each row is a patient sample; columns represent standardized values across five omics layers.

Figure 2: Feature Importance for Drug Resistance Prediction

Gene expression and mutation scores were the most impactful features used by the model.

Figure 3: ROC Curve of Final Ensemble Model

AUC = 0.94 reflects strong model discrimination between resistant and non-resistant samples.

CONCLUSION:

This project successfully implemented a cutting-edge integrative bioinformatics workflow combining biological heterogeneity with ML prediction power. Our model is not only highly accurate but also biologically interpretable, offering candidate biomarkers and pathways for further experimental validation.
