Paper Presentation Final_Logic


Objectives Introduction Notations Example References

ANALYSIS OF ISCHEMIC HEART DISEASE DATA USING LOGIC REGRESSION

Nadeem Shafique Butt, Muhammad Qaiser Shahbaz, Asif Hanif
27 Nov 2010


Objectives

Many observational studies aim to establish whether certain risk factors are associated with a disease. In some situations it is important to study higher-order interactions, but with commonly available methods this is difficult, particularly when the covariates are binary. We describe the “Logic Regression” method proposed by Ruczinski et al. (2003). For illustration we use data collected from January 2006 to December 2008 from patients admitted to the cardiology department at Mayo Hospital, Lahore.



Introduction

Regression is among the most important tools in statistics for analyzing data and making inferences about associations between predictors and a response. However, in most regression problems the model relates the predictors to the response only through their main effects. Interactions between predictors are considered as well, but are usually kept very simple (two-way or three-way at most).



Logic Regression: Given a set of binary predictors X, logic regression tries to create new and better predictors for the response by considering Boolean combinations of those binary predictors.

Example: If the response variable is binary as well, the method attempts to find decision rules such as "if X1, X2, X3 and X4 are true, or X5 or X6 but not X7 is true, then the response is more likely to be in class 0".

The method tries to find Boolean statements involving the binary predictors that improve prediction of the response variable.
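The decision rule in the example can be written out as a small function. This is only a sketch: the function name and the test cases are made up for illustration, and the grouping "(X5 or X6) and not X7" is an assumed reading of the sentence.

```python
from typing import Sequence


def example_rule(x: Sequence[int]) -> bool:
    """Evaluate (X1 and X2 and X3 and X4) or ((X5 or X6) and not X7).

    x holds the seven binary predictors X1..X7 in order, coded 0/1.
    The grouping of "X5 or X6 but not X7" is an assumed reading.
    """
    x1, x2, x3, x4, x5, x6, x7 = (bool(v) for v in x)
    return (x1 and x2 and x3 and x4) or ((x5 or x6) and not x7)


# Hypothetical cases, not taken from the heart disease data.
cases = [
    (1, 1, 1, 1, 0, 0, 0),  # first clause holds
    (0, 0, 0, 0, 1, 0, 0),  # second clause holds
    (0, 0, 0, 0, 1, 0, 1),  # X7 blocks the second clause
    (1, 1, 1, 0, 0, 0, 0),  # neither clause holds
]
for case in cases:
    print(case, int(example_rule(case)))
```

Whatever the inputs, the rule returns a single 0/1 value, so it can be used as one new binary predictor in a regression model.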



The aim of this technique is to find those combinations of binary variables that have the highest predictive power for the response. These combinations are Boolean logic expressions, and since the predictors are binary, any Boolean combination of them is also binary.



It can easily be seen from the table that the higher-order interaction terms are all zero. The solution in this case is to find a classification rule, expressed as a Boolean equation, that correctly assigns each case to either Y=0 or Y=1.
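The point can be illustrated with a toy data set. The rows below are invented for illustration and are not the table from the presentation; the Boolean rule is likewise a hypothetical choice. The four-way product term is zero for every case, so it carries no information, while a Boolean rule still separates Y=0 from Y=1.

```python
# Hypothetical data, invented for illustration: columns are X1, X2, X3, X4, Y.
ROWS = [
    (1, 1, 0, 0, 1),
    (1, 1, 0, 1, 1),
    (0, 0, 1, 0, 1),
    (1, 0, 1, 0, 1),
    (0, 1, 0, 1, 0),
    (1, 0, 1, 1, 0),
    (0, 0, 0, 0, 0),
]


def four_way_interaction(x1: int, x2: int, x3: int, x4: int) -> int:
    """Classical interaction term: the product of all four predictors."""
    return x1 * x2 * x3 * x4


def boolean_rule(x1: int, x2: int, x3: int, x4: int) -> int:
    """Hypothetical rule: (X1 and X2) or (X3 and not X4)."""
    return int((x1 and x2) or (x3 and not x4))


interaction_column = [four_way_interaction(*row[:4]) for row in ROWS]
predictions = [boolean_rule(*row[:4]) for row in ROWS]
outcomes = [row[4] for row in ROWS]

print("interaction term:", interaction_column)
print("rule matches Y for every case:", predictions == outcomes)
```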



Search Algorithms: Given a fixed number of predictors, there are only finitely many Boolean expressions that yield different predictions. If there are k predictors, there are 2^k possible predictor patterns, each of which can be mapped to 0 or 1, so there are 2^(2^k) different prediction scenarios. With n cases and k predictors there might be up to k^n different logic trees. A greedy search algorithm is used to find the Boolean combination of the predictors that maximizes predictive performance.
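The count 2^(2^k) can be checked by brute force for small k. The sketch below (function names are illustrative) lists every possible truth table over k binary predictors and compares the number of tables with the formula. It shows why exhaustive search quickly becomes infeasible; it is not the search procedure used by the method.

```python
from itertools import product


def count_boolean_functions(k: int) -> int:
    """Number of distinct prediction scenarios for k binary predictors."""
    return 2 ** (2 ** k)


def enumerate_truth_tables(k: int) -> list:
    """List every mapping from the 2^k predictor patterns to an output in {0, 1}."""
    patterns = list(product((0, 1), repeat=k))
    return [
        dict(zip(patterns, outputs))
        for outputs in product((0, 1), repeat=len(patterns))
    ]


for k in (1, 2, 3):
    tables = enumerate_truth_tables(k)
    assert len(tables) == count_boolean_functions(k)
    print(k, len(tables))  # prints 1 4, then 2 16, then 3 256
```

For k = 5 the formula already gives 2^32, more than four billion scenarios, which is why a heuristic search is needed.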



In this paper we use the logic regression technique proposed by Ruczinski et al. (2003) to model the ischemic heart disease data, using the “LogicReg” package for R.



Notations

 

Logic expression L: a Boolean combination of the binary predictors X1, X2, X3, X4, X5, X6, X7

Operators

∧ : AND
∨ : OR
X^c : NOT X



Example

Details of Data

Data Collection Duration: Jan 2006 – Dec 2008

Venue: Cardiology Department, Mayo Hospital, Lahore

Patient Definition: Under treatment for chest pain, cardiac failure and syncope.

History Recorded: Clinical features and cardiovascular risk factors such as hypertension, DM, smoking habits and dyslipidaemia.

Exclusion Criteria: Severe liver disease, CLD, acute and chronic inflammatory diseases, immunological diseases and severe anemia.

Finally, coronary angiography was performed on all patients using the Judkins technique.



Variable | Label                                    | Codes
HD       | CHD                                      | 0=Normal, 1=CHD
Gender   | Gender                                   | 0=Female, 1=Male
DM       | Diabetes mellitus                        | 0=No, 1=Yes
HTN      | Hypertension                             | 0=No, 1=Yes
IHDF     | Family history of ischemic heart disease | 0=No, 1=Yes
FH       | Family history of hypertension           | 0=No, 1=Yes
DF       | Family history of diabetes mellitus      | 0=No, 1=Yes
Smoking  | Smoking history                          | 0=No, 1=Yes
Viral    | Viral ailment                            | 0=No, 1=Yes



Logic Regression Model

L1: +4.15 * ((((not DM) or (not Gender)) or (HTN or (not IHDF))) and ((Smoking or DF) and ((not DF) or (not HTN))))

L2: -2.35 * (((DM or (not DF)) and IHDF) or ((Gender and (not DF)) or (FH and (not Gender))))

L3: +3.6 * (((IHDF and (not DM)) or ((not Gender) or (not Smoking))) and ((DF or Smoking) or (FH or Gender)))
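To see how the three terms combine for one patient, the expressions can be transcribed directly into code. This is a sketch under stated assumptions: the slide reports neither an intercept nor the link function, so only the weighted sum of the three Boolean terms is computed. The function names and the example patient are hypothetical.

```python
COEFFICIENTS = (4.15, -2.35, 3.6)  # weights of L1, L2, L3 as given in the model


def logic_terms(p: dict) -> tuple:
    """Evaluate the Boolean terms L1, L2, L3 for one patient coded 0/1."""
    gender, dm, htn = bool(p["Gender"]), bool(p["DM"]), bool(p["HTN"])
    ihdf, fh, df = bool(p["IHDF"]), bool(p["FH"]), bool(p["DF"])
    smoking = bool(p["Smoking"])

    l1 = (((not dm) or (not gender)) or (htn or (not ihdf))) and (
        (smoking or df) and ((not df) or (not htn))
    )
    l2 = ((dm or (not df)) and ihdf) or (
        (gender and (not df)) or (fh and (not gender))
    )
    l3 = ((ihdf and (not dm)) or ((not gender) or (not smoking))) and (
        (df or smoking) or (fh or gender)
    )
    return int(l1), int(l2), int(l3)


def partial_score(p: dict) -> float:
    """Weighted sum of the three terms; the intercept is NOT included (not reported)."""
    return sum(c * t for c, t in zip(COEFFICIENTS, logic_terms(p)))


# Hypothetical patient: male, diabetic, hypertensive smoker, no family history.
patient = {"Gender": 1, "DM": 1, "HTN": 1, "IHDF": 0, "FH": 0, "DF": 0, "Smoking": 1}
print(logic_terms(patient), round(partial_score(patient), 2))
```

For this patient L1 and L2 are true and L3 is false, so the contribution is 4.15 - 2.35 = 1.80. The variable Viral does not appear in any of the three terms.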



• Logic Trees




References

1.


Thanks

