Meeting of the Minds digest, 2017 by Carnegie Mellon University in Qatar

Meeting of the Minds

Undergraduate Research Symposium

APRIL 25 | 2017

Meeting of the Minds is an annual symposium at Carnegie Mellon University that gives students an opportunity to present their research and project work to a wide audience of faculty, fellow students, family members, industry representatives and the larger community. Students use posters, videos and other visual aids to present their work in a manner that can be easily understood by both experts and non-experts. Through this experience, students learn how to bridge the gap between conducting research and presenting it to a wider audience. A review committee consisting of industry experts and faculty members from other universities reviews the presentations and chooses the best projects and posters. Awards and certificates are presented to the winners.

Table of Contents

A Message from Dean Ilker Baybars

Carnegie Mellon University in Qatar Leadership

Judges

Biological Sciences Posters

• Alkaline phosphatase isozymes and their use in gastrointestinal therapy
• Potential food poison analysis of phage DNA collected from Al Khor
• BCL2L11 as gene target of mir-92a and mir-10a: Gene expression, interaction evaluation and implication in type 2 diabetes and obesity
• Biofilm formation in water systems in Doha
• Developing CRISPR mutagenesis components for S. cerevisiae
• Caffeine as an inhibitor of calf intestinal alkaline phosphatase
• MAPK14 minor intron splicing as a novel biomarker for breast cancer
• Effect of glucagon-like peptide-1 analog on modulating metabolic stress: Possible role of heat shock response
• Expression, purification, and characterization of stem cell transcription factors Brn2, Sox17 and its mutant
• Crystallization and characterization of HMG domain of stem cell transcription factors Sox7, Sox17 and its mutant
• Effect of hydrogen peroxide at 100 µM on Calf Intestinal Alkaline Phosphatase (CIAP) enzyme kinetics
• Mechanisms of breast cancer escape from Natural Killer (NK) anti-tumor immunity
• Oxidative stress in kidney cells – effects of aspartame
• The effects of Mg and Zn on human placental alkaline phosphatase (PALP) activity
• Study of the role of Lactate Dehydrogenase C (LDHC) in the aggressive behavior of triple negative breast cancer
• Application based learning to reinforce academic concepts in Qatar biology curriculum

Computer Science Posters

• Acoustic Analysis of Text (AAT): Extracting sound out of words
• An agile platform for distributed computation in smart IoT environments
• Lifestyle disease surveillance using spatio-temporal search intensity models
• PolyHJ: A polymorphic main-memory hash join paradigm for multi-core machines
• Sherlock: A crowdsourced system for automated semantic tagging of indoor floorplans
• The Hive: An on-edge middleware solution for context and resource sharing in the Internet of Things

Information Systems Posters

• Optimizing electricity consumption in GEMTEC
• To read or to listen? A study of user engagement in a digital heritage artifact
• Trustmarks and trust in Qatar
• Influence of culture on social media advertisements through eye-tracking
• The effect of culture on image appeal and social presence in Arab e-commerce websites

Postgraduate Posters

• Arabic author profiling for cyber-security
• Multi-Arabic dialect lexicon extraction

A Message from Dean Ilker Baybars

The Meeting of the Minds student research symposium is a highlight of the academic year, a celebration of the ingenuity, hard work, scientific exploration and intellectual curiosity that characterize students in all disciplines at Carnegie Mellon University in Qatar.

Research is an essential element of the undergraduate experience. For some students, the process of hypothesis, experimentation and analysis will inspire them to pursue further study, perhaps even a career in scientific research. For others, the intellectual rigor of research is invaluable experience in problem solving, and they can apply these skills in their professional careers, regardless of the industry.

The fundamental process of scientific research is to bring together creativity and reason. The work that CMU-Q students are presenting is a showcase of this process: each project shows originality of thought and careful analysis. I encourage you to explore the projects, ask questions and learn about the unique perspectives that our students bring to scientific questions. The entire CMU-Q community can be exceptionally proud of this body of work.

Kind regards,

Ilker Baybars
Dean and CEO
Carnegie Mellon University in Qatar

Leadership

• Ilker Baybars, Dean and CEO
• John O’Brien, Associate Dean
• Selma Limam Mansar, Associate Dean, Education
• Kemal Oflazer, Associate Dean, Research

Contact

• Dean’s Office: deans-office@qatar.cmu.edu
• Research Office: cmuq-research@qatar.cmu.edu
• Admission Office: ug-admission@qatar.cmu.edu
• Media Inquiries: mpr@qatar.cmu.edu

Judges

External Judges

• Dr. Essam Abdelalim, Assistant Professor, Hamad Bin Khalifa University
• Dr. Ashraf Aboulnaga, Research Director, Qatar Computing Research Institute
• Ms. Hayfa Ahmed, Incubation Manager, Qatar Science and Technology Park
• Dr. Ali Alaboudy, Program Officer, Qatar National Research Fund
• Dr. Marco Ameduri, Associate Dean for Pre-medical Education, Weill-Cornell Medicine-Qatar
• Dr. Omar Boukhris, Postaward Administrator, Qatar National Research Fund
• Dr. Julie Decock, Assistant Professor, Hamad Bin Khalifa University
• Dr. Mohammed Dehbi, Assistant Professor, Hamad Bin Khalifa University
• Dr. Omar El-Agnaf, Professor, Hamad Bin Khalifa University
• Dr. Sebti Foufou, Professor, Computer Science and Engineering Department, Qatar University
• Dr. Henning Horn, Assistant Professor, Hamad Bin Khalifa University
• Dr. Karl Richard Alexander Knuth, Medical Director and CEO, National Center for Cancer Care and Research, Hamad Medical Corporation
• Dr. Qutaibah Malluhi, Professor, Computer Science and Engineering Department, Qatar University
• Dr. Lluis Marquez, Principal Scientist, Qatar Computing Research Institute
• Dr. Hamid Menouar, Scientist, Qatar Mobility Innovation Center
• Dr. Mourad Ouzzani, Principal Scientist, Qatar Computing Research Institute
• Dr. Walid Qoronflech, Director of Biotechnology Development, Qatar Biomedical Research Institute
• Dr. Ahmed Rebai, Program Officer, Qatar National Research Fund
• Dr. Klaus Schoenbach, Senior Associate Dean, Northwestern University in Qatar
• Dr. Munir Tag, Program Manager, ICT, Qatar National Research Fund
• Dr. Ingmar Weber, Senior Scientist, Qatar Computing Research Institute
• Dr. Stephan Vogel, Research Director, Qatar Computing Research Institute
• Dr. Barak Yehya, Expert, Ministry of Development Planning and Statistics

Carnegie Mellon University in Qatar Judges

• Dr. John Gasper
• Dr. Susan Hagan
• Dr. Niraj Khare
• Dr. Kemal Oflazer
• Dr. Giselle Reis
• Dr. Ihab Younis

Special Awards

Carnegie Mellon University in Qatar acknowledges and thanks the Ministry of Development Planning and Statistics and Qatar National Research Fund for recognizing students and researchers with special awards.

Note: All student and advisor affiliations in the posters are with Carnegie Mellon University in Qatar unless otherwise noted.

Alkaline phosphatase isozymes and their use in gastrointestinal therapy

Authors

Nairuz Elazzabi, Boshra Al Sulaiti

Advisor

Annette Vincent

Category

Biological Sciences

Abstract

Alkaline phosphatases (AP) are homodimeric enzymes whose catalytic sites contain three metal ion cofactors, two Zn and one Mg [1]. AP enzymes hydrolyze alkyl phosphates, which is why p-nitrophenyl phosphate (pNPP) is used as the standard substrate in this experiment [2]. The use of AP enzymes in therapeutic applications is also increasing; one example is the development of a recombinant AP enzyme (ChimAP) to treat gastrointestinal inflammation [3]. ChimAP is generated by substituting the crown domain of human intestinal AP (IAP) with that of human placental AP (PLAP). However, this substitution decreased the specificity of ChimAP towards negatively charged molecules such as ATP and PPi. Therefore, to assess whether bacterial AP (BAP) is a candidate for use in ChimAP, the active site stability of BAP was tested and compared with that of PLAP in the presence of EDTA. The results indicate that PLAP has a more stable active site than BAP, so BAP is not a suitable substitute for PLAP in the recombinant ChimAP for treating gastrointestinal inflammation.


Introduction

A chimeric alkaline phosphatase can be engineered to combine desired properties of different AP isozymes. Kiffer-Moreira et al. (2014) modified intestinal AP with the crown domain of placental AP for the purpose of treating conditions such as acute kidney injury, inflammatory bowel diseases and gut dysbiosis. The resulting chimeric AP, ChimAP, was meant to have the active site and thermal stability of placental AP but the substrate specificity of intestinal AP. However, it showed low binding affinity for negatively charged substrates such as ATP and PPi, a phenotype corresponding to that of PLAP. To maintain a normal homeostatic gut microbiota, ChimAP should be able to bind all the substrates that IAP binds, so that it can carry out all of IAP's functions and maintain the intestinal environment. In this paper, bacterial AP is tested as a candidate to replace placental AP in the construction of ChimAP. Placental AP is known to be the most thermostable human isozyme, and bacterial AP is also very thermostable compared with the human isozymes, hence its candidacy. However, placental AP was used in ChimAP mainly for its active site stability. Therefore, the active site stability of bacterial AP is tested and compared with that of placental AP, and bacterial AP is expected to match or come close to the stability of the human placental isozyme.

Methods

To find the optimal amount of alkaline phosphatase to use in the assays, the amount of enzyme was varied (0.03-2.0 units), and human placental AP (1 unit/µl) or bacterial AP (0.2 units/µl) was assayed with excess pNPP (11.2 mM) in 1.0 M diethanolamine buffer, pH 9.8, containing 0.50 mM MgCl2 and 20 µM ZnCl2. The rate of p-nitrophenol formation was measured with a UV spectrophotometer at 410 nm. A curve of enzyme amount (units) vs. rate of reaction (absorbance/min) was constructed and the optimal amounts were chosen: 0.2 units for bacterial AP and 2.0 units for human placental AP [4]. To determine Vmax and Km, each enzyme was assayed at the previously determined enzyme amount with varying concentrations of pNPP substrate (0.025-0.5 mM) in the same buffer (1.0 M diethanolamine, pH 9.8, with 0.50 mM MgCl2 and 20 µM ZnCl2). A Michaelis-Menten plot and a Lineweaver-Burk plot were constructed to find Vmax and Km. The same assay was then repeated with the addition of 250 µM EDTA, and Vmax and Km were determined again.
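The Lineweaver-Burk step described in the Methods can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis script: the substrate concentrations span the 0.025-0.5 mM range used in the assay, but the velocities are synthetic values generated from an assumed Km and Vmax, and the function name `lineweaver_burk_fit` is hypothetical.

```python
import numpy as np

def lineweaver_burk_fit(substrate_mM, velocity):
    """Estimate Km and Vmax from a double-reciprocal (Lineweaver-Burk) fit.

    Fits 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by ordinary least squares
    and also returns R^2 of that straight-line fit.
    """
    x = 1.0 / np.asarray(substrate_mM, dtype=float)
    y = 1.0 / np.asarray(velocity, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)      # slope = Km/Vmax, intercept = 1/Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    ss_res = np.sum((y - (slope * x + intercept)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return km, vmax, r_squared

# Synthetic, noise-free data (assumed Km = 0.05 mM, Vmax = 0.4 abs/min).
S = np.array([0.025, 0.05, 0.1, 0.2, 0.3, 0.5])
v = 0.4 * S / (0.05 + S)

km, vmax, r2 = lineweaver_burk_fit(S, v)
print(f"Km = {km:.3f} mM, Vmax = {vmax:.3f} abs/min, R^2 = {r2:.4f}")
```

Because the double-reciprocal transform amplifies measurement error at low substrate concentrations, a nonlinear fit to the Michaelis-Menten equation is a common alternative when the R^2 of the linear fit is low.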

Figure 1. Lineweaver-Burk plot of bacterial alkaline phosphatase activity in the presence of 20 µM Zn2+. The activity of bacterial alkaline phosphatase was determined by measuring the velocity of p-nitrophenol formation from pNPP substrate in buffer containing MgCl2 and ZnCl2, either without EDTA or treated with 250 µM EDTA. The reciprocal substrate concentrations were plotted against the reciprocal velocities, and a line equation was obtained with an R2 of 0.93164 for the assay without EDTA and 0.82061 for the assay with 250 µM EDTA.

Figure 2. Concentration vs. absorbance plot of human placental alkaline phosphatase activity in the presence of 20 µM Zn2+. The activity of human placental alkaline phosphatase was determined by measuring the velocity of p-nitrophenol formation from pNPP substrate in buffer containing MgCl2 and ZnCl2, either without EDTA or treated with 250 µM EDTA. The substrate concentrations were plotted against the velocities to obtain a sigmoidal curve.

Analysis

To determine the activity of BAP and PLAP, an appropriate amount of enzyme must first be chosen for the assays, so that further kinetic analysis does not depend on enzyme concentration. The rate of the reaction was measured by keeping the substrate in excess while gradually changing the amount of enzyme. The data suggested that the amount of enzyme needed is 0.2 units for BAP and 2.0 units for PLAP [4]. Next, the reciprocal of the velocity was plotted against the reciprocal of the pNPP substrate concentration to construct a Lineweaver-Burk plot. For BAP, the Km and Vmax were determined to be 0.043 mM and 0.41 abs/min, respectively. When the experiment was repeated in the presence of EDTA, the Km and Vmax were 0.21 mM and 0.52 abs/min, respectively. This approach could not be used for PLAP, which has allosteric properties and does not follow Michaelis-Menten kinetics. Instead, the Km and Vmax were estimated from the substrate vs. velocity graph to be 0.072 mM and 0.14 abs/min, respectively; in the presence of EDTA the values changed to 0.15 mM and 0.29 abs/min. The Km values were expected to increase in the presence of EDTA for both enzymes, since EDTA is a chelating agent that sequesters Zn2+ away from the enzyme, decreasing the enzyme's affinity for its substrate and hence increasing Km. However, the published Km value of E. coli AP is 0.013 mM. The inconsistency between the published and experimental values could be due to the low R2 values (<0.95) obtained from the Lineweaver-Burk plot (Fig. 1). A low R2 means that the data fit the regression line poorly, which implies poor precision. Hence, solid conclusions about the comparison of Km values cannot be made from this experiment.

In conclusion, it was hypothesized that bacterial AP would show a slower or equal Zn2+ dissociation rate and a lower or equal Km in the presence of EDTA compared with PLAP. However, the experimental data indicated that PLAP has the lower Zn2+ dissociation rate and the lower Km in the presence of EDTA (0.15 mM vs. 0.21 mM). Since Zn2+ dissociates more slowly from the crown domain of PLAP, its active site appears to be more stable, and PLAP is more likely than BAP to retain affinity for its substrate. Hence, the hypothesis that bacterial AP could substitute for PLAP in the recombinant ChimAP to treat gastrointestinal inflammation is rejected.

References:

[1] Millán, J. L. (2006). Alkaline phosphatases: structure, substrate specificity and functional relatedness to other members of a large superfamily of enzymes. Purinergic Signalling, 2.
[2] Say, J. C., Ciuffi, K., Furriel, R. P., et al. (1991). Alkaline phosphatase from rat osseous plates: Purification and biochemical characterization of a soluble form. Biochimica et Biophysica Acta, 1074, 256–262.
[3] Kiffer-Moreira, T., Sheen, C. R., da Silva Gasque, K. C., Bolean, M., Ciancaglini, P., van Elsas, A., ... & Millán, J. L. (2014). Catalytic signature of a heat-stable, chimeric human alkaline phosphatase with therapeutic potential. PLoS One, 9(2), e89374.
[4] Albader, L., & Ahmad, A. (2017). The effects of Mg+2 and Zn+2 on human placental alkaline phosphatase (PLAP) activity.

Potential food poison analysis of phage DNA collected from Al Khor

Authors

Raghid Bsat

Advisor

Valentin Ilyin

Category

Biological Sciences

Abstract

The aim of this research is to annotate a bacteriophage genome collected from Al Khor (25°44’21.9”N 51°32’51.0”E) and to determine its identity. Several DNA assembly programs (e.g., Mira, SPAdes, Newbler) were used to join the sequence reads obtained from an Ion Torrent S5 DNA sequencer. The assembly joined the reads into a single genome of 43,795 base pairs. DNA Master was then used to annotate the genome by identifying regions that are potential genes. The program reported 71 possible gene locations in the phage genome. The amino acid sequence of each predicted gene product was then looked up in the NCBI online database to identify its role. The final results show that 34 of the 71 genes code for known proteins that are found in the genome of Bacillus cereus, a bacterium that causes food poisoning (about 50% identity with Bacillus cereus). For the final project, five proteins were studied in depth: Clp protease, HNH endonuclease, site-specific integrase, AraC transcriptional regulator, and N-acetylmuramoyl-L-alanine amidase.

Figure 1. Tetradecameric structure of the ATP-dependent Clp protease found in the bacteriophage DNA.

Figure 2. Tetrameric structure of the AraC transcriptional regulator, specifically the Rob protein, found in the bacteriophage DNA.


Figure 3. Gene map showing the locations of genes in the tested genome; alternating colors represent different reading frames.

DNA Assembly & Annotation

Using the Unix command line, the bacteriophage DNA was assembled with several programs (SPAdes, Mira, Newbler). Assembly gives a better view of the phage genome because it joins the reads into one long sequence called a “contig”. Before assembly, the genome is a file made up of thousands of “reads” obtained from the Ion Torrent S5 DNA sequencer. After trying different parameters and file sizes with Mira, a large contig of 43,795 base pairs was obtained with a coverage of 58.76 (>50). Most bacteriophages have a genome of about 50,000 base pairs on average, so this contig represents a possible sequence of the bacteriophage genome. Assembly is followed by annotation of the genome, which is the process of identifying the locations of genes and working out each gene’s role. DNA Master is a program for annotating DNA that offers both automatic and manual annotation. DNA Master’s automatic annotation reported 71 genes.
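Locating candidate genes means, in its simplest form, scanning all six reading frames of the contig for open reading frames (ORFs). The sketch below illustrates only that idea. It is not how DNA Master calls genes (real gene callers use statistical models), and the function names, the minimum ORF length and the toy sequence are assumptions made for illustration.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=3):
    """Scan the three frames of each strand for ATG...stop stretches.

    Returns (strand, frame, start, end) tuples. Coordinates are 0-based and
    end-exclusive on the scanned strand, and include the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # look for the next ORF
    return orfs

# Toy sequence (hypothetical), far shorter than the 43,795 bp contig.
toy = "CCATGAAATTTTAGGG"
for strand, frame, start, end in find_orfs(toy):
    print(strand, frame, start, end)
```

On a real contig, a larger `min_codons` value would be needed to suppress the many short ORFs that occur by chance.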

Conclusion

The output for each gene was a string of letters representing the amino acids of the gene product. To understand what each gene does, I used the databases available on the NCBI website. Each amino acid sequence was “blasted” on the website, and the server returned similar sequences with a high identity ratio. BLAST is a tool on the website that compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches. After blasting all 71 genes, these were the final results:

• 37/71 genes code for “hypothetical/possible proteins”.
• 34/71 genes code for known proteins that are found in the genome of the food-poisoning bacterium Bacillus cereus.

About half of the genes could not be explained, but the functions of many genes have not been discovered so far. With about 50% identity to Bacillus cereus, the tested bacteriophage could possibly be related to this particular food-poisoning bacterium. Computational biology is a large field that uses mathematical algorithms and still has many questions to be answered.

References

• BLAST: Basic Local Alignment Search Tool. (n.d.). Retrieved March 21, 2017, from https://blast.ncbi.nlm.nih.gov/Blast.cgi
• 1D5Y: Crystal Structure Of The E. Coli Rob Transcription Factor In Complex With Dna - NCBI structure. (n.d.). Retrieved March 21, 2017, from https://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?Dopt=s&uid=71830
• C. (n.d.). Learn UNIX in 10 minutes. Retrieved March 21, 2017, from http://freeengineer.org/learnUNIXin10minutes.html
• Annotation and Bioinformatic Analysis of Bacteriophage Genomes: A User Guide to DNA Master

BCL2L11 as gene target of mir-92a and mir-10a: Gene expression, interaction evaluation and implication in type 2 diabetes and obesity

Author

Alya Al-Kurbi

Advisor

Mohamed Chikri, Qatar Biomedical Research Institute

Category

Biological Sciences

Abstract

Type 2 Diabetes (T2D) is the most common metabolic disorder and an expanding global health problem that affects millions of people worldwide; it has become one of the major public health concerns [2]. Given population aging and the high rate of obesity in Qatar, T2D is expected to reach 30% by 2025 and to affect about 366 million people by 2030 [1]. The precise cellular and molecular mechanisms underlying insulin resistance are still not fully understood; however, a potential involvement of miRNAs has recently emerged [2]. MicroRNAs are a class of small, conserved, noncoding RNAs that act as negative regulators of gene expression, inducing degradation and/or translational inhibition by pairing with the 3’UTR of potentially hundreds of target mRNAs. Based on preliminary data from a previous pilot study of circulating miRNAs in Qatari T2D/obesity patients, mir-92a was significantly down-regulated in type 2 diabetic non-obese human samples compared with normal non-obese human samples. One of the predicted mRNA targets of this miRNA is the pro-apoptotic factor Bim. The Bim protein, encoded by the BCL2L11 gene, contributes to apoptosis (programmed cell death). Data from the KEGG significant pathways suggest that mir-92a and mir-10a are involved in the apoptosis pathways, which means that these miRNAs may affect the BCL2L11 gene and other genes involved in apoptosis. Thus, it is predicted that mir-92a and mir-10a inhibit the expression of the BCL2L11 gene, possibly by targeting Bim directly through binding to the 3’UTR of its mRNA, and that they play an important role in inhibiting apoptosis of beta cells or of insulin-sensitive tissues such as muscle, liver or adipose. Based on our preliminary data from the pilot study, we assume that differential expression of mir-92a and mir-10a, as reflected by their circulating levels, could be an important biomarker for early T2D and may contribute to the development of T2D.

BCL2L11 as gene target of mir-92a and mir-10a: Gene Expression, Interaction evaluation and implication in type 2 diabetes and obesity

Alya Al-Kurbi (1); Mentor: Dr. Mohamed Chikri (2)

(1) Carnegie Mellon University in Qatar; (2) Qatar Biomedical Research Institute (QBRI)

Abstract

Type 2 Diabetes (T2D) is the most common metabolic disorder and an expanding global health problem that affects millions of people worldwide; it has become one of the major public health concerns (Whiting, D.R., et al. 2011). Given population aging and the high rate of obesity in Qatar, T2D is expected to reach 30% by 2025 and to affect about 366 million people by 2030 (SCH. 2015). The precise cellular and molecular mechanisms underlying insulin resistance are still not fully understood; however, a potential involvement of miRNAs has recently emerged. MicroRNAs are a class of small, conserved, noncoding RNAs that act as negative regulators of gene expression, inducing degradation and/or translational inhibition by pairing with the 3’UTR of potentially hundreds of target mRNAs (Bartel DP. 2004). Based on our preliminary data from the pilot study, we assume that differential expression of mir-92a and mir-10a, as reflected by their circulating levels, could be an important biomarker for early T2D and may contribute to the development of T2D and obesity.

Background Information

BCL2L11 gene:

• The protein encoded by this gene belongs to the BCL-2 protein family.
• BCL-2 family members form hetero- or homodimers and act as anti- or pro-apoptotic regulators involved in a variety of cellular activities.
• The Bim protein (BCL2L11) contributes to programmed cell death and apoptosis.

Mir-92a, Mir-10a, and BCL2L11 gene:

• The pro-apoptotic factor Bim/BCL2L11 gene is one of the predicted mRNAs targeted by mir-92a and mir-10a.
• Mir-92a inhibits the expression of the BCL2L11 gene.
• These miRNAs could target Bim directly and play important roles in inhibiting cell apoptosis.

KEGG significant pathways on putative target genes (3’-UTR region) of input miRNAs

Previous Work on mir-92a and mir-10a

Based on our preliminary data from the pilot study of circulating miRNAs in Qatari T2D/obesity patients.

The predicted binding sites of miR-92a in the mRNA of BCL2L11 and the base-pairing interaction between mir-92a and BCL2L11 mRNA.
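The base-pairing idea behind a predicted binding site can be illustrated with a simple seed-match scan: nucleotides 2-8 of the miRNA (the seed) are reverse-complemented and searched for in the 3’UTR. The sketch below is only an illustration under stated assumptions. Both sequences are made up and are not the real miR-92a or BCL2L11 sequences, the function name `seed_sites` is hypothetical, and target-prediction tools apply further criteria beyond a perfect seed match.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_sites(mirna, utr, seed_start=2, seed_end=8):
    """Find perfect seed matches of a miRNA in a 3'UTR.

    Both sequences are RNA strings written 5'->3'. Returns the seed, the
    complementary site as it reads on the mRNA, and the 0-based positions
    in the UTR where that site occurs.
    """
    seed = mirna[seed_start - 1:seed_end]                      # nucleotides 2-8
    site = "".join(RNA_COMPLEMENT[b] for b in reversed(seed))  # reverse complement
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return seed, site, positions

# Made-up sequences for illustration only.
mirna = "AGCAUGGAUCCGUAACGGUAUC"
utr = "AAGGUCCAUGCUUACGUCCAUGCAA"

seed, site, positions = seed_sites(mirna, utr)
print(seed, site, positions)
```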

Current Work

• NGT-NO: Normal Glucose Tolerance, n=11
• IR: Insulin Resistance
• T2D-NO: Type 2 Diabetes Non Obese, n=20
• T2D-OB: Type 2 Diabetes Obese, n=20

Circulating mir-92a and miR-10a are down-regulated in T2D-NO and T2D-OB compared with the NGT-NO group.

Rationale: miR-92a and miR-10a are down-regulated in T2D and obese patients.

Aim: Study BCL2L11 expression in normal and T2D human tissues by TaqMan qPCR as a first step towards understanding the molecular function of the miRNAs.

BCL2L11 gene expression in normal & T2D human tissues

Optimization of miRNA transfection in C2C12/3T3-L1

[Figure panels: Untransfected; Mock; 35 nM miRNA; 40 nM miRNA; 50 nM miRNA. Cells: 3T3-L1 preadipocytes.]

Figure 1: miRNA inhibitor transfection into 3T3-L1 preadipocyte cells with 5 µL Lipofectamine 2000.

Figure 3: Gene expression assay of BCL2L11 in normal and type 2 diabetic human cDNA samples. Expression of BCL2L11 was assayed by real-time TaqMan qPCR and normalized to the housekeeping beta-actin gene. Results are presented as the relative expression (RQ) of BCL2L11 using triplicate cDNA samples.
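Relative expression (RQ) normalized to a housekeeping gene is commonly computed with the comparative Ct (2^-ΔΔCt) method. The poster does not state which calculation was used, so the sketch below is an assumption, and every Ct value in it is invented for illustration. The function name `relative_quantity` is hypothetical.

```python
from statistics import mean

def relative_quantity(target_ct, reference_ct, calib_target_ct, calib_reference_ct):
    """Comparative Ct method: RQ = 2 ** -(dCt_sample - dCt_calibrator).

    dCt is the mean Ct of the target gene minus the mean Ct of the
    reference (housekeeping) gene, computed from replicate wells.
    """
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_calibrator = mean(calib_target_ct) - mean(calib_reference_ct)
    return 2 ** -(delta_sample - delta_calibrator)

# Invented triplicate Ct values (not data from the poster).
t2d_bcl2l11 = [24.9, 25.0, 25.1]
t2d_actin = [18.0, 18.1, 17.9]
normal_bcl2l11 = [26.0, 26.1, 25.9]
normal_actin = [18.0, 17.9, 18.1]

rq = relative_quantity(t2d_bcl2l11, t2d_actin, normal_bcl2l11, normal_actin)
print(f"RQ of BCL2L11 in T2D relative to normal: {rq:.2f}")
```

An RQ above 1 indicates higher expression in the sample than in the calibrator; the method assumes roughly equal amplification efficiency for the target and reference assays.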

Results and Conclusion

• Results confirm the elevated expression of the BCL2L11 gene in T2D human muscle tissues.
• The expression of the pro-apoptotic gene BCL2L11 is up-regulated in the muscle of T2D patients and probably leads to diabetic muscle atrophy.
• Confirmation has to be performed in animal T2D and cell line models.

Figure 2: miRNA inhibitor transfection into C2C12 cells with 5 µL Lipofectamine 2000.

Acknowledgments

I would like to thank Ms. Namat and Ms. Elham for their constant help and support throughout the research lab work.

Ongoing and future Work

• Confirm the qPCR result by western blot analysis using anti-Bim on normal and T2D human tissues.
• Use a luciferase assay to further assess the interaction between the miRNAs and the mRNA gene target, especially the mir-92a and mir-10a BCL2L11 target.

Biofilm formation in water systems in Doha

Author

Khawla Al-Darwish

Advisor

Annette Vincent

Category

Biological Sciences

Abstract

Most water systems try to limit microbial growth in drinking water distribution systems by using filters and disinfection. Nevertheless, drinking water distribution systems harbor high concentrations of diverse bacteria. It is therefore important to understand the diversity of the bacterial community by identifying the different types of bacteria and their relative abundance, in order to determine the species that are closely associated with water systems.

Aim: The goal of this research is to evaluate the efficiency of the water systems by characterizing the bacterial community structure of the water. In this study, the efficiency of two water systems was evaluated, one using drinking water samples and the other using wastewater samples. The water samples were collected from different stages of the water filtration system in Doha. The stages of the drinking water are: city water/source (A); influent water (B); water from filter (D); deposits after passing through the filter (POU); effluent water (C). The stages of the wastewater are: inlet works (1); aeration tanks, anaerobic (2); aeration tanks, anoxic (3); sedimentation tanks (4); after 10 sedimentation unit (5); balancing and chlorination (6); after sand filter samples (7); after sample (8); after UF sample (9); final polishing stage (10).

Method: To achieve this goal, next generation sequencing (NGS) was conducted and β-diversity analysis was used to compare the bacterial communities of water samples across the stages. DNA was isolated from the water samples at the different stages, and the integrity of the DNA was tested using IDT 16S PCR. PCR amplification was conducted using the Ion 16S™ Metagenomics Kit, which uses primers targeting the bacterial 16S rRNA gene. Two sets of primers were used: one set targeting hypervariable regions 3-6, 7-9 and the other targeting hypervariable regions 2-4-8. The amplification products were purified using Agencourt® AMPure® XP beads. The library was prepared using the Ion Plus Fragment Library Kit and was sequenced on the Ion S5 system using the 330™ chips. Finally, the Ion Reporter™ software was used to analyze the sequences and the different groups of bacteria in the water samples. The β-diversity based analyses were used to generate a graph of the relative abundance of bacterial groups in the water samples. In addition to the 16S metagenomics, the reduction of microbial contaminants was determined by evaluating three faecal indicators: Escherichia coli for the reduction in bacteria, F-specific coliphages for viruses, and Clostridium perfringens for protozoa. PCR was used to quantify pathogenic bacteria, a human-associated faecal marker, and tetracycline-resistant bacteria. Finally, metal ion analyses of the water samples were used to relate metal content to biofilm formation.

PCR amplification was conducted using Ion 16STM Metagenomics Kit (Cat. no. A26216), which uses primers targeting bacterial 16s DNA. The primers used in this research targeted the hypervariable regions 3, 6-7 and 9. The amplification products were purified using Agencourt® AMPure® XP beads. Then, Agilent® 2100 Bioanalyzer® instrument was then used to calculate DNA input for library preparation. The library was prepared using Ion Plus Fragment Library Kit (Cat. no. 4471252). To prepare the library, the amplicons were end-repaired, purified, ligated and neck-repaired. Then, the adapter-ligated and neckrepaired DNA was purified. Bioanalyzer instrument was used to determine library concentration. Each library was diluted to 26pM for template preparation, which was done using the IonChef System. The library was sequenced on Ion S5 system using the 330 TM chips. Finally, the Ion ReporterTM software was used to analyze the sequences and the different groups of bacteria in the water samples. The β-diversity based analyses were used to generate a graph of the relative abundance of bacterial groups in the water samples.

16S Metagenomics and Next generation sequencing

DNA was isolated from 25 ml of the waste water samples taken at the different stages, using the PowerWater© DNA isolation kit. The integrity of the isolated DNA was then tested using IDT 16S PCR. The concentration of the isolated DNA was determined using Qubit, which relies on a fluorescent dye that binds specifically to DNA.

DNA Extraction

METHOD

The goal of this research is to characterize the bacterial community structure of drinking water samples isolated from different stages of the water filtration system in Doha. The different stages are: city water/source (A); influent water (B); water from filter (D); deposits after passing through the filter (POU); effluent water (C). Another goal was to identify the species that are abundant in each particular stage of the filtration system. To achieve these goals, next generation sequencing (NGS) was conducted and β-diversity analysis was used to compare the bacterial communities of water samples across the five stages.

Most drinking water systems use filters and disinfection to limit microbial growth in their distribution systems. Nevertheless, drinking water distribution systems harbor high concentrations of diverse bacteria. It is therefore important to understand the diversity of the bacterial community by identifying the different types of bacteria and their relative abundance in drinking water systems, in order to determine the species that are closely associated with water systems.

INTRODUCTION

Biofilm formation in water systems in Doha

[Figure axes: y-axis, Relative Abundance; x-axis sample locations: Source, Influent Water, Filter, Effluent Water, POUs]

DATA

[Figure legend entries: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria; Proteobacteria, Firmicutes, Actinobacteria, Unclassified, Bacteroidetes, Spirochaetes, Cyanobacteria]

Figure 2. Relative abundance of bacterial classes of the dominant phylum, Proteobacteria. The water samples were derived from five locations: Source (A), influent water (B), filter (D), effluent water (C) and POUs (water deposits).

Figure 1. Relative abundance of bacterial phyla of drinking water samples. The water samples were derived from five locations: Source (A), influent water (B), filter (D), effluent water (C) and POUs (water deposits).

Khawla Al-Darwish; Annette Vincent, PhD; Biological Sciences Program, Carnegie Mellon University in Qatar


Pinto, A., Xi, C., & Raskin, L. (2012). Bacterial Community Structure in the Drinking Water Microbiome Is Governed by Filtration Processes. Environmental Science & Technology, 46(16), 8851–8859. http://dx.doi.org/10.1021/es302042t

Vincent, A. (2016). Microbial diversity of oil-water samples derived from Maersk.

Ion 16S Metagenomics Kit User Guide. (2015). Retrieved from https://tools.thermofisher.com/content/sfs/manuals/MAN0010799_Ion_16S_Metagenomics_UG.pdf

REFERENCE

For the future, I will extend the project by identifying the species that are closely associated with drinking water systems and then use the residues to grow them and try to isolate the phages. Also, I will study biofilm formation and how it relates to antibiotic resistance.

FUTURE DIRECTION

The figures show that there is a variety of bacterial groups in the water samples and that the relative abundance of those groups changes from one location to another through the water treatment process. We can conclude that the Proteobacteria phylum persists through the entire filtration process; however, the relative abundance of its subclasses changes as the water gets filtered.

CONCLUSION

Figure 2 shows that for all the locations except effluent water (C), the dominant subclass of the Proteobacteria is Gammaproteobacteria. For effluent water, however, the dominant subclass is Alphaproteobacteria. Also according to Figure 2, moving from the source to the effluent water, the relative abundance of Gammaproteobacteria decreased from 95% in the city water to 33% in the effluent water. Meanwhile, the abundance of Alphaproteobacteria rose sharply from ~2% in the source water to approximately 59% in the effluent water.

Figure 1 shows the bacterial phylum classification for the sequences of the water samples derived from five locations. According to the figure, the dominant phylum for all the locations is Proteobacteria. Figure 1 also shows that when moving from the source to the filter, the relative abundance of Proteobacteria increased from 93.7% to 99.7%, the abundance of Firmicutes decreased from 4.8% to 0.18%, and the abundance of Actinobacteria decreased from 1.2% to 0%. However, after passing the filter, the relative abundance of Proteobacteria decreased from 99.7% in the filter water to 96.7% in the effluent water, and the abundance of Firmicutes increased from 0.18% to 2.9%.
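As a rough illustration of how two of these relative-abundance profiles could be compared in a β-diversity analysis, the sketch below computes a Bray-Curtis dissimilarity. The poster does not say which β-diversity metric the Ion Reporter software applied, so the choice of Bray-Curtis is an assumption, as are the function and variable names. Only the three phyla whose percentages are quoted above are included.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each profile is a dict mapping taxon name -> abundance.
    Returns 0.0 for identical profiles and 1.0 for profiles sharing no taxa.
    """
    taxa = set(profile_a) | set(profile_b)
    diff = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    total = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    return diff / total if total else 0.0


# Phylum-level percentages quoted in the discussion of Figure 1.
# Other phyla are omitted because their values are not given in the text.
source = {"Proteobacteria": 93.7, "Firmicutes": 4.8, "Actinobacteria": 1.2}
filter_water = {"Proteobacteria": 99.7, "Firmicutes": 0.18, "Actinobacteria": 0.0}

if __name__ == "__main__":
    d = bray_curtis(source, filter_water)
    print(f"Bray-Curtis dissimilarity, source vs filter: {d:.3f}")
```

A small value here reflects that both samples are dominated by Proteobacteria; class-level profiles such as those in Figure 2 would be compared the same way.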

DISCUSSION

Developing CRISPR mutagenesis components for S. cerevisiae

Author

Dina Nayel

Advisors

Aaron Mitchell, Carnegie Mellon University

Category

Biological Sciences

Abstract CRISPR is a gene editing technique derived from the natural mechanism that bacteria use to protect themselves from invading viruses. CRISPR is composed of gRNA, the Cas9 enzyme and a target DNA sequence where mutagenesis is desired. The gRNA is designed to match the target sequence, which makes CRISPR a cutting-edge technique: mutagenesis can be targeted anywhere in the genome by altering the gRNA sequence. gRNA and Cas9 coding sequences are constructed in a plasmid and inserted into the cells. However, constructing these plasmids is a time-consuming way to implement CRISPR. If plasmid construction could be eliminated, CRISPR would be a gene editing technique that is efficient in both time and cost. The Mitchell lab previously worked on expressing linear CRISPR components in the yeast C. albicans; most recently, it has been developing linear CRISPR components in another yeast, S. cerevisiae.

Abstract

Control: Linear LEU2 alone

Experiment 1: Perform transformation with plasmid that codes for Cas9 and sgRNA

Experiment 2: Perform CRISPR with DNA fragments that encode Cas9 and sgRNA

[Diagram labels: LEU2, ADE2, Cas9, sgRNA]

Lack of ADE2 gene in S. cerevisiae produces a red phenotype which is easily identifiable

Method

Can CRISPR be performed with linear DNA to express Cas9 and sgRNA in S. cerevisiae?

Question:

Making plasmids is a time-consuming way to implement CRISPR/Cas9. If plasmid construction could be eliminated, the use of CRISPR in yeast would be simplified for researchers everywhere.

Problem with CRISPR:

- Double-stranded DNA: target
- Cas9 enzyme: scissors
- Single guide RNA (sgRNA): directs Cas9 enzyme to target

CRISPR Components:

Traditional gene editing is costly, time consuming and error prone. A groundbreaking system derived from the natural mechanism that bacteria use to protect themselves against viruses is CRISPR.

Figure 1: Transformation into dam- cells (C2992) was unsuccessful

- Designing/ordering primers
- PCR LEU2
- Initial run on agarose
- PCR purification
- Run on agarose to check size

Designing and making the recombination construct

Figure 2: Agarose gel results show no supercoiled plasmid, which should appear at ≈ 3 kb, so ligation was not successful

Results

Ligate pML104 and sgRNA

Transform into dam- cells

Cas9

pML104

Insert sgRNA oligos into Cas9 containing plasmid

- Transform pML104 into dam- cells
- Overnight culture
- Plasmid prep
- Restriction digest with SwaI and BclI

Create CRISPR component plasmid

Method Continued

Dina Nayel, Dr. Aaron Mitchell

Dr. Beckie Campanaro, Carol Woolfort, Dr. Gordon Rule, Yingli Seigh, Zeyu Hu

Acknowledgements

Follow-up Study

1) Troubleshoot PCR to obtain sgRNA product.

2) All procedures could be repeated to obtain components needed for CRISPR experiments.

3) Once PCR products are obtained, an experiment can be performed to see if DNA fragments for each CRISPR component can be used in S. cerevisiae mutagenesis.

In conclusion, the goal of this work with the CRISPR/Cas9 system was to eliminate the painstaking construction of the plasmids used for genetic modification. However, due to the difficulties faced while ligating the pML104 plasmid, a complete set of both linear DNA pieces and plasmid DNA with CRISPR coding regions could not be made in time.

Conclusion

Discussion

A possible explanation for the failure of ligation is that, in the transformation step, the chemically competent cells may not have been competent enough to take up the ligated pML104 plasmid, or that the ligase enzyme used in the ligation step needed to be deactivated. These two factors were addressed by deactivating the ligase at 75°C for 5 minutes after the overnight ligation was completed, and by ordering and testing new electrocompetent cells.

Developing CRISPR Mutagenesis Components for S. cerevisiae

Caffeine as an inhibitor of calf intestinal alkaline phosphatase

Authors

Maria Ali, Nourhan ElKhatib

Advisor

Annette Vincent

Category

Biological Sciences

Abstract Background literature states that caffeine is an inhibitor of intestinal alkaline phosphatase (IAP). The purpose of the experiment was to determine whether caffeine is a competitive or non-competitive inhibitor of the enzyme. We hypothesize that caffeine is a non-competitive inhibitor based on its structural similarity to other non-competitive inhibitors of IAP. The motivation is to reduce AP levels in cancer cells. Using varying concentrations of caffeine and p-NPP, we performed the NPP assay to test our hypothesis. The experimental Km and Vmax values suggest that caffeine is an activator, contradicting the initial hypothesis.

ABSTRACT

METHODS


All tests were performed using 1.5 units/µL of IAP. This is the concentration at which the enzyme is not related to the substrate. Caffeine concentrations of 2 mM, 0.5 mM and 0.125 mM were tested. The required concentrations were prepared with CutSmart Buffer from a 20 mM stock solution of caffeine dissolved in DMSO. These concentrations were incubated with IAP at 37°C in a water bath for 5 minutes. The residual activity of IAP was measured using the NPP assay. The effect of each caffeine concentration was tested against 4 different concentrations of p-NPP (1.12 mM, 5.6 mM, 2.8 mM, and 1.4 mM). The rate of enzyme activity was measured using a UV-Vis spectrophotometer.
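The dilutions from the 20 mM caffeine stock follow the usual C1·V1 = C2·V2 relation. The sketch below works out the stock and buffer volumes for the three tested concentrations. The 100 µL final volume is a hypothetical value chosen for illustration, since the poster does not state the volumes actually used, and the function name is likewise an assumption.

```python
def stock_volume_ul(stock_mM, target_mM, final_volume_ul):
    """Volume of stock (in µL) needed so that C1 * V1 = C2 * V2."""
    if target_mM > stock_mM:
        raise ValueError("target concentration exceeds stock concentration")
    return target_mM * final_volume_ul / stock_mM


STOCK_MM = 20.0              # caffeine stock in DMSO, from the Methods
TARGETS_MM = (2.0, 0.5, 0.125)
FINAL_VOLUME_UL = 100.0      # hypothetical final volume, not from the poster

# Map each target concentration to (stock volume, buffer volume) in µL.
dilution_plan = {}
for target in TARGETS_MM:
    v_stock = stock_volume_ul(STOCK_MM, target, FINAL_VOLUME_UL)
    dilution_plan[target] = (v_stock, FINAL_VOLUME_UL - v_stock)

if __name__ == "__main__":
    for target, (v_stock, v_buffer) in dilution_plan.items():
        print(f"{target} mM: {v_stock} µL stock + {v_buffer} µL buffer")
```

Volumes below about 1 µL are hard to pipette accurately, so in practice the lowest concentration would more likely be reached by serial dilution.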

Caffeine

IAP is a membrane-bound glycoprotein that facilitates the hydrolysis of phosphate monoesters under basic conditions. Elevated levels of alkaline phosphatase (AP) have been found in cancer patients. Caffeine has a structure similar to that of known non-competitive inhibitors of AP. This experiment tests whether caffeine behaves similarly to those inhibitors.

INTRODUCTION

Background literature states that caffeine is an inhibitor of intestinal alkaline phosphatase (IAP). The purpose of the experiment was to determine whether caffeine is a competitive or non-competitive inhibitor of the enzyme. We hypothesize that caffeine is a non-competitive inhibitor based on its structural similarity to other non-competitive inhibitors of IAP. The motivation is to reduce AP levels in cancer cells. Using varying concentrations of caffeine and p-NPP, we performed the NPP assay to test our hypothesis. The experimental Km and Vmax values suggest that caffeine is an activator, contradicting the initial hypothesis.

Data and Results

The published Km of pure E. coli AP was 0.013 mM, while that of pure calf intestinal AP (CIAP) obtained from the MM plot was 1.80 mM. CIAP has a much higher Km than E. coli AP, which suggests that AP from different species does not have the same activity despite having the same function. The R² value obtained from the Lineweaver-Burk plot was 0.97484, very close to 1. However, Km was negative, indicating experimental error, especially at lower p-NPP concentrations, which have a greater effect on the slope. The R² value does not indicate accuracy but rather reliability. A high R² therefore did not guarantee a meaningful Km: the negative value was not as predicted, because Km is a substrate concentration and cannot be negative.

Using varying enzyme concentrations, a Michaelis-Menten (MM) curve was plotted, which showed 1.5 units/µL as the optimal enzyme concentration that converts all p-NPP to nitrophenol. A Lineweaver-Burk plot was constructed from the linear portion of the MM plot to obtain Km and Vmax. However, it gave a negative Km from the x-intercept and a Vmax of 15.08 Abs/min from the y-intercept. Therefore, the MM plot was used to obtain Vmax at the point where the curve plateaus and its slope is zero. Vmax is reached when the rate no longer depends on the substrate concentration, and it was found to be 12 Abs/min. It is the maximum number of moles of substrate that can be broken down by 1 unit of enzyme per unit time. The Km value of 0.0018 M corresponds to the rate V (6 Abs/min) that is half of Vmax. The Vmax obtained from the MM plot was lower than that from the Lineweaver-Burk plot, suggesting an error or inconsistency while loading the sample, or changes in the incubation of the buffer.
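The relationship between the two plots can be sketched in code. The example below generates noise-free Michaelis-Menten rates from the values reported above (Vmax = 12 Abs/min, Km = 1.8 mM) and recovers them from a least-squares line through the double-reciprocal points, where the slope is Km/Vmax and the y-intercept is 1/Vmax. The data are synthetic, the substrate concentrations are borrowed from the Methods only for illustration, and the function names are assumptions; this is not the authors' analysis.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate at substrate concentration s."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, rate):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by ordinary least squares.

    Returns (Km, Vmax) in the units of the inputs.
    """
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in rate]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # Km / Vmax
    intercept = mean_y - slope * mean_x    # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free rates built from the reported MM-plot values.
substrate_mM = [1.12, 1.4, 2.8, 5.6]
rates = [michaelis_menten(s, vmax=12.0, km=1.8) for s in substrate_mM]

km_fit, vmax_fit = lineweaver_burk_fit(substrate_mM, rates)

if __name__ == "__main__":
    print(f"Km = {km_fit:.3f} mM, Vmax = {vmax_fit:.3f} Abs/min")
```

With real measurements, the points at low substrate concentration sit far out on the 1/[S] axis and carry the most leverage in the fit, which is one reason the double-reciprocal plot is sensitive to error in those readings.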

Kinetic analysis of CIAP

Analysis and Discussion

Advisor: Annette Vincent, Ph.D.

ACKNOWLEDGEMENTS

1. Chloe Glynn, Teaching Assistant
2. Maya KemalDean, Lab Technician

1. Raghav, N., & Ravish, I. (2014). 2. Chaudhuri G. G. (2013).

REFERENCES

In conclusion, the Km of calf intestinal AP is greater than that of E. coli AP, suggesting that the two isoforms have different activities and efficiencies depending on the environment they are in and the amount of substrate they need to break down per unit time. Caffeine did not behave as a competitive, non-competitive or uncompetitive inhibitor, but rather may be an activator. Therefore, the hypothesis was not supported. Future directions would be to perform the same experiment using lower caffeine concentrations and to ensure that the control and other tests are all performed on the same day.

CONCLUSION

We expected caffeine to be a non-competitive inhibitor of IAP, so we expected Km to stay constant and Vmax to decrease. However, we found that Km decreased and Vmax increased. This suggests that treatment with caffeine led to an increase in affinity for the substrate and an increase in the efficiency of enzyme activity. In the literature, Km = 0.76 mM and Vmax = 3.12 min⁻¹ unit⁻¹. The literature values cannot be compared to the experimental values because they were obtained under different conditions.
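The expectation of a constant Km and a reduced Vmax follows from the standard textbook rate law for pure non-competitive inhibition, written out below for reference. The inhibitor concentration [I] and inhibition constant K_i are symbols introduced here for illustration and do not appear on the poster.

```latex
% Michaelis-Menten rate law without inhibitor
v = \frac{V_{max}\,[S]}{K_m + [S]}

% Pure non-competitive inhibition: K_m is unchanged, the apparent V_{max} is reduced
v = \frac{V_{max}}{1 + [I]/K_i} \cdot \frac{[S]}{K_m + [S]}

% Double-reciprocal (Lineweaver-Burk) form used to read off K_m and V_{max}
\frac{1}{v} = \frac{K_m}{V_{max}} \cdot \frac{1}{[S]} + \frac{1}{V_{max}}
```

On a Lineweaver-Burk plot, a non-competitive inhibitor would therefore raise the y-intercept while leaving the x-intercept at -1/K_m; a lower Km together with a higher Vmax, as observed, fits none of the classical inhibition patterns.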

A Lineweaver-Burk plot was constructed from the results obtained at varying caffeine concentrations. The highest R² value, 0.98833, was obtained from the 0.125 mM caffeine curve, while the lowest, 0.96151, was from the 0.5 mM caffeine curve. All curves had the same overall trend, with little variability in the data. This does not agree with the hypothesis stated previously. Carrying out the experiments on the same day could tighten up the data.

Maria Ali, Nourhan ElKhatib, Biological Sciences Program, CMU-Q

Caffeine as an inhibitor of calf intestinal alkaline phosphatase

MAPK14 minor intron splicing as a novel biomarker for breast cancer

Authors

Ettaib El Marabti

Advisors

Ihab Younis

Category

Biological Sciences

Abstract In eukaryotes, gene expression requires the removal of introns from the pre-mRNA through a process called splicing. The majority of introns in eukaryotic pre-mRNA are spliced out by the major spliceosome. Nevertheless, a small subset of introns (minor introns, ~700 in humans) have a distinct sequence requiring a specialized, low-abundance “minor” spliceosome comprised of U11, U12, U4atac, U5 and U6atac snRNPs. Minor introns are highly conserved with regard to their position and sequence within a gene. Interestingly, minor introns are enriched in genes essential for information processing, DNA damage repair and the cell cycle. The dysregulation of these processes is considered a hallmark of cancer and transformation; hence our interest in breast cancer, which is a major cause of cancer-related deaths in women. Upon RNA-seq data analysis of breast cancer cell lines, several minor-intron-containing genes with altered splicing were identified. From these we focused on the splicing of the minor intron in MAPK14. This gene encodes the stress-induced p38 MAPK14 signaling kinase, and dysregulation of minor intron splicing in MAPK14 is expected to reduce the mRNA level or generate a truncated p38 MAPK with altered activity. Taking into account recent work showing that p38 MAPK functions in the regulation of the minor spliceosome in cells under stress¹, our data suggest a complex regulatory network with feedback loops that converge on minor intron splicing. We conclude that, in breast cancer cell lines, minor intron splicing is deregulated.

¹ Minor Introns Are Embedded Molecular Switches Regulated by Highly Unstable U6atac snRNA. Younis I, Dittmar K, Wang W, Foley SW, Berg MG, Hu KY, Wei Z, Wan L, and Dreyfuss G. 2013. eLife. 2: e00780.

MAPK14 Minor Intron Splicing as a Novel Biomarker for Breast Cancer

Ettaib El Marabti and Ihab Younis, Biological Sciences Program, Carnegie Mellon University in Qatar

Summary

In eukaryotes, gene expression requires the removal of introns from the pre-mRNA through a process called splicing. The majority of introns in eukaryotic pre-mRNA are spliced out by the major spliceosome. Nevertheless, a small subset of introns (minor introns, ~700 in humans) have a distinct sequence requiring a specialized, low-abundance “minor” spliceosome comprised of U11, U12, U4atac, U5 and U6atac snRNPs. Minor introns are highly conserved with regard to their position and sequence within a gene. Interestingly, minor introns are enriched in genes essential for information processing, DNA damage repair and the cell cycle. The dysregulation of these processes is considered a hallmark of cancer and transformation; hence our interest in breast cancer, which is a major cause of cancer-related deaths in women. Upon RNA-seq data analysis of breast cancer cell lines, several minor-intron-containing genes with altered splicing were identified. It is important to identify new genes that could serve as biomarkers for cancer. This enhances diagnosis and prognosis for cancer patients and enlarges the list of potential breast cancer therapeutic targets. From the genes identified, we focused on the splicing of the minor intron in MAPK14. This gene encodes the stress-induced p38 MAPK14 signaling kinase, and dysregulation of minor intron splicing in MAPK14 is expected to reduce the mRNA level or generate a truncated p38 MAPK with altered activity. Taking into account recent work showing that p38 MAPK functions in the regulation of the minor spliceosome in cells under stress¹, our data suggest a complex regulatory network with feedback loops that converge on minor intron splicing. We conclude that, in breast cancer cell lines, minor intron splicing is deregulated.

Overall Experimental Design

We downloaded publicly available RNA-seq data (GSE71862) of non-transformed breast cells (MCF10A) and breast cancer cells (MCF7) and analyzed them as shown below. We identified MAPK14 as one of the top genes whose minor intron is differentially spliced in breast adenocarcinomas. The MAPK14 gene encodes p38 MAPK, a serine/threonine kinase activated by various environmental stresses and pro-inflammatory cytokines. We have previously shown that activation of p38 MAPK stabilizes U6atac, the catalytic and low-abundance snRNP of the minor spliceosome, and increases its levels, leading to increased minor intron splicing.

Extracted RNA from cells in log phase

Converted to cDNA using reverse transcriptase

Ribo-depleted RNA-seq data (GSE71862) for MCF7 and MCF10A

Exon and intron specific primers to amplify different regions of MAPK14

75×10⁶ MCF10A reads and 81×10⁶ MCF7 reads were mapped to a reference human genome using TopHat and Bowtie

¹ Minor Introns Are Embedded Molecular Switches Regulated by Highly Unstable U6atac snRNA. Younis I, Dittmar K, Wang W, Foley SW, Berg MG, Hu KY, Wei Z, Wan L, and Dreyfuss G. 2013. eLife. 2: e00780.

Data analysis

Objective

Expression level(FPKM)

In cancer biology, there is a need to identify more biomarkers for better diagnosis and prognosis. This project aims to perform an in-depth analysis of the breast cancer transcriptome, specifically targeting the process of splicing, to identify novel breast cancer biomarkers.

PCR and agarose gel electrophoresis

Visual analysis of 605 genes

Gene functions and relevance to cancer

Figure 1. Splicing consequences Minor splicing is efficient

Ranking of genes

Minor splicing is NOT efficient

MAPK14

Figure 4. Predicted 3D and 2D structures of p38-MAPK based on splicing outcomes. The kinase domain of p38-MAPK is highlighted in red in all the figures

Full length protein

Degradation of mRNA by NMD

Truncated protein (non functional or dominant negative) Exon 9a

Figure 2. RNA binding proteins (RBPs) regulate intron splicing

Full length p38-MAPK from a fully spliced MAPK

p38-MAPK-9a isoform from a fully spliced MAPK

Truncated p38-MAPK due to unspliced minor intron 8

Internally deleted p38-MAPK due to mis-splicing

Figure 5. Differential splicing of the minor introns in MAPK14. Genome browser view of MAPK14 RNA-seq data in MCF10A and MCF7 cells. Reads in red correspond to the minor introns in MAPK14. Below is a set of RT-PCR results confirming that minor intron splicing in MAPK14 is differentially regulated in various breast cancer subtypes.

RNA mis-splicing in disease. Scotti, M. M. & Swanson, M. S. Nat. Rev. Genet. 2015; 17: 19–32

Figure 3. MAPK14 expression correlates with breast cancer prognosis

We analyzed the correlation between MAPK14 expression and breast cancer patients’ survival rate. Interestingly, we observed a significant inverse correlation (p value < 0.03) between the expression level of MAPK14 and survival.

RPKM >0.3738 (402 BRCA samples) RPKM < 0.03596 (401 BRCA samples) RPKM 0.03596 – 0.3738 (397 BRCA samples)
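The three expression groups above can be reproduced with a simple threshold rule. The sketch below assigns a sample to the low, intermediate or high MAPK14 group using the two RPKM cutoffs quoted in the legend. The sample names and RPKM values are hypothetical, and how values exactly equal to a cutoff were handled in the original analysis is not stated, so placing them in the intermediate group is an assumption.

```python
LOW_CUTOFF = 0.03596    # RPKM, from the figure legend
HIGH_CUTOFF = 0.3738    # RPKM, from the figure legend


def expression_group(rpkm):
    """Assign a sample to an MAPK14 expression group by its RPKM value."""
    if rpkm < LOW_CUTOFF:
        return "low"
    if rpkm > HIGH_CUTOFF:
        return "high"
    return "intermediate"   # includes values equal to a cutoff (assumption)


# Hypothetical samples for illustration only; not data from the poster.
samples = {"sample_1": 0.01, "sample_2": 0.2, "sample_3": 0.9}
groups = {name: expression_group(value) for name, value in samples.items()}

if __name__ == "__main__":
    for name, group in groups.items():
        print(f"{name}: {group}")
```

Survival curves for the resulting groups could then be compared with a standard test such as the log-rank test; the poster does not name the test behind its reported p value.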

Figure 6. MAPK14 differential exon expression in breast cancer patients. As shown in the bracketed area, exons 9a and 9b are differentially regulated. These exons surround the minor introns in MAPK14.

Figure 7. MAPK14 expression and minor intron splicing correlate with the expression of various RNA binding proteins. The expression of many RBPs either correlates or inversely correlates with that of MAPK14. Similar data were obtained for Exon 8-Exon 10 expression, indirectly suggesting that these correlations hold for minor intron splicing

Conclusions and future directions

• Minor introns of many genes are differentially regulated in breast cancer vs. non-cancer cells and across different breast cancer subtypes.
• Expression of MAPK14, which encodes p38 MAPK, correlates with splicing of its minor introns and with breast cancer survival.
• We will next test the consequences of this splicing alteration in MAPK14 on the protein product and on the ability of cancer cells to respond to stress.
• We will also investigate the mechanism of MAPK14 differential splicing, especially the role of RBPs.
• The other genes whose minor introns are differentially spliced will also be investigated.

Effect of glucagon-like peptide-1 analog on modulating metabolic stress: Possible role of heat shock response

Authors

Asma AlNaama

Advisor

Mohammed Dehbi, Qatar Biomedical Research Institute

Category

Biological Sciences

Abstract Insulin resistance (IR) and β-cell failure are the two core metabolic defects that lead to type 2 diabetes (T2D). These defects occur as a consequence of chronic metabolic stress that includes chronic low-grade inflammation, imbalance in the redox system and persistent ER stress. Failure of the heat shock response (HSR) to mitigate these various forms of metabolic stress is an early event that precedes IR, as manifested by impaired expression of heat shock proteins (HSPs). A new class of anti-diabetic drugs, referred to as incretin hormones, has become available and has shown efficacy with a higher therapeutic index (Drucker). They exert important actions that contribute to glucose homeostasis by stimulating insulin secretion by β-cells and improving insulin sensitivity at target tissues, reducing central satiety, promoting weight loss and mitigating metabolic stress (Nauck et al. Diabetologia 1986; 29:46–52). However, their effect on components of the heat shock response such as DNAJB3 has never been investigated. Therefore, measuring the HSPs will give an idea of the effect of GLP-1 drugs. The objectives of this study are to investigate whether GLP-1 analogs modulate the heat shock response; if so, to elucidate the molecular mechanisms underlying this effect; and to determine the relevance of the effect to metabolic stress and glucose homeostasis. To answer these questions, C2C12 cells were treated with either GLP-1R or Exendin, and western blotting was used to observe the effect. Based on the preliminary data, we concluded that HSP-60, HSP-90 and CryAB decrease after treatment with Exendin. We see an unexpected band for HSP-72 at a lower molecular weight, which is important to investigate further. Further studies will be carried out to understand the effect in more depth, covering the molecular profile, metabolic profile and clinical profile.

Effect of Glucagon-Like Peptide-1 Analog On Modulating Metabolic Stress: Possible Role Of Heat Shock Response

Asma AlNaama¹, Mohammed Dehbi²

1. Carnegie Mellon University, 2. Qatar Biomedical Research Institute

A A

BACKGROUND

EXendin

GLP-‐1R

Marker

Insulin resistance (IR) and β-‐cell failure are the two core metabolic defects that

DMSO

RESULTS A)

lead to type 2 diabetes (T2D). These defects occur as a consequence of cB hronic metabolic B

stress that includes chronic low-‐grade inflammaAon, imbalance in the redox system, persistent ER stress. Failure of the heat shock response (HSR) to miAgate these various forms of metabolic stress is an early event that precedes IR as manifested by impaired expression of heat shock proteins (HSPs). Developing strategies that miAgate metabolic stress or restore the HSR hold the promise to improve insulin sensiAvity and prevent β-‐cell failure in individuals at high risk, thereby, prevenAng the epidemic spread of T2D. IncreAn hormones emerged recently as potent and pleiotropic anA-‐ hyperglycemic drugs that exert their beneficial effects with a higher therapeuAc index (1).

Fig.1: A) Analysis of treated protein samples on SDS-‐PAGE Gel stained with Coomassie Blue dye. B) Western blot Analysis of proteins from C2C12 cells Treated with either GLP-‐1R or Exendin, at diﬀerent exposure 5me.

A) B) These hormones are made by the gastrointesAnal tract system and they consist of Glucagon-‐ Vehicle PMA Vehicle PMA GAGGGACTTTCCCAG.. GAGGGACTTTCCCAG.. GAGGGACTTTCCCAG.. SV40 promoter Luciferase p3xκB-Luc.

OBJECTIVES

METHODS

60 60 40 40 20 20 0 0

Vehicle: Vehicle: Vehicle: TNF-a PMA: PMA:

pCMV DNAJB3 pCMV DNAJB3 + + - - + - - + pCMV DNAJB3 pCMV DNAJB3 pCMV DNAJB3 pCMV DNAJB3 pCMV DNAJB3 pCMV DNAJB3 + + - - + - - +

pCMV pCMV pCMV

400.0 350.0

PBS

300.0

TNFa

250.0 200.0 150.0 100.0 50.0 0.0

Cell culture: Skeletal muscle (C2C12) were obtained from ATCC (USA). They were maintained in DMEM media supplemented with 10% Fetal Bovine Serum, anAbioAcs and Glutamine. GLP-‐1R agonist and Exendin-‐4 were purchased from Sigma. Cells were induced at 80%

IL6 DNAJB3

Hsp-40/DNAJB3 Hsp-40/DNAJB3

18000 16000 14000 12000 10000 8000 6000 4000 2000 0

B) a)

of various components of the heat shock response. DMSO was used as negaAve control (vehicle).

Western Blot: A_er inducAon with GLP-‐1 analogs, cells were washed with PBS, collected and lysed in RIPA buﬀer (50 mM Tris pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% Triton ×100, 0.5% Sodium deoxycholate, 0.1% SDS). Protein concentraAon was determined by Bradford method using gamma globulin as a reference. For Western blot, 50-‐80 μg of proteins were ﬁrst

IL6 CMV

Fig.2: A) NF-‐κB ac5va5on in response to TNF-‐α or PMA in HEK293. B) IL-‐6 promoter in response to TNF-‐ α in C2C12 pCMV 2-‐NBDG 150 ug/ml 20000 DNAJB3 A)

conﬂuence with either Exendin-‐4 or GLP-‐1RA at at 400 nM for 16h to monitor the expression

80 80

Glucose uptake (Fluorescence Intensity)

Ø  Relevance of the eﬀect to metabolic stress and glucose homeostasis?

* * * *

100 100

Ø  To invesAgate whether GLP-‐1 analogs modulate the heat shock response? Ø  If yes, elucidate the molecular mechanisms underlying this eﬀect?

PMA PMA

Surface Glut4/Total Glut4

improving its sensiAvity at target Assues, reducing central saAety, promoAng weight loss and miAgaAng metabolic stress (3). However, their eﬀect on modulaAng the C hC eat shock response Fig. 4: Illustra/on of the Luciferase based-reporter Fig. 4: Illustra/on of the Luciferase based-reporter systems (A). Overexpression of Hsp-40/DNAJB3 systems (A). Overexpression of Hsp-40/DNAJB3 has never been invesAgated. abolished both JNK-1-mediated ac/va/on of abolished both JNK-1-mediated ac/va/on of AP-1-Luciferase ac/vity (B) and IKKβ-mediated AP-1-Luciferase ac/vity (B) and IKKβ-mediated ac/va/on of κB-Luciferase ac/vity (C). ac/va/on of κB-Luciferase ac/vity (C).

A A

120 B 120 B

Rela5ve luciferase ac5vity (%)

acAons that contribute to glucose homeostasis by sAmulaAng insulin secreAon by β-‐cell and

Vehicle Vehicle

Rela6ve luciferase ac6vity (%) Rela6ve luciferase ac6vity (%)

like pepAde (GLP-‐1) and Gastric inhibitory polypepAde (GIP) (2). They exert important

1 0.8 0.6 0.4 0.2 0

pCMV DNAJB3

Fig.3: A) Eﬀect of Over-‐expression of DNAJB3 on glucose uptake level in HEK293 cells. HEK293 cells were

resolved on 10% SDS-‐PAGE gels and then, the proteins were transferred onto PVDF

cells starved for 2 hours in glucose free DMEM at 37°C incubator. Cells treated with 2NBDG + 200uM insulin to facilitate the uptake.

membranes, blocked with 5% non-‐fat dried milk in Tris-‐buﬀered saline containing 0.05%

immunostaining. a) Structure of the HA-‐GLut4-‐eGFP construct. In non-‐permeabilized cells, the staining of the HA tag represents the surface Glut4 while the GFP at the C terminal represent the total Glut4. the raAo of the surface over the total Glut4 reflects the percentage of the glucose transporter at the plasma membrane. B) example of 2 HEK293 cells transfected with HA-‐Glut4-‐GFP construct. Red: Staining of the HA epitope and reflects the surface Glut4 (membrane). Green: GFP reflect total Glut4 (membrane + cytosol). Blue: DAPI. c) Comparison of pCMV DNAJB3 expressing cells to see if DNAJB3 enhances the translocaAon of GLUT4. Cells heat shocked for 1h and recovered for 4hs. pCMV: 45.5% of GLUT4 on membrane. DNAJB3: 67.5% of GLUT4 on membrane.

Tween 20 (TBST) for 1 h at RT and then probed with diﬀerent primary anAbody for overnight at 4°C. A_er washing, the membranes were incubated conjugated secondary anAbody for 1 h at RT. Finally, protein bands were visualized by chemiluminescence and the images were captured by using the Versadoc 5000 system (BioRad, Hercules, CA). Primary anAbodies used were Hsp-‐40/DNAJB3, Beta-‐AcAn, GRP78, GRP-‐94, Crystallin AB, Hsp-‐60, Hsp-‐70, Hsp-‐72-‐ Hsp-‐90.

Luciferase assay: To monitor the anA-‐inﬂammatory eﬀect of DNAJB3, C2C12 cells were co-‐

Incubated for 1 hour at 37C. Fluorescence read at 485nm ex./535nm em. B) GLUT4 transloca5on on the membrane using

CONCLUSION •  Based on the preliminary data, we observe a decrease in HSP-‐60, HSP-‐90 and CryAB a_er treaAng with Exendin. •  We see unexpected band for HSP-‐72 which is very Important for us to elaborate more.

transfected by electroporaAon with DNAJB3 and a Luciferase reporter plasmid driven by either NF-‐κB (p3x-‐κB –Lucif.) or by IL-‐6 promoter. pCMV was used as negaAve control. 24 a_er transfecAon, cells were sAmulated for 16h with TNF-‐a (25ng/ml) or PalmiAc acid (200 μM) and then the luciferase acAvity was determined using Luciferase kit (Promega). Protein concentraAon was measured and taken into consideraAon to correct for transfecAon eﬃciency.

Glucose uptake: Glucose uptake was monitored in C2C12 cells transfected with DNAJB3 or pCMV (negative control) using the 2-NBDG glucose uptake assay kit (Cayman). The translocation of Glut-4 was assessed in C2C12 cells after their co-transfection with Glut-4 (tagged with HA and GFP) and either DNAJB3 or pCMV.

GLUT4 translocation: The localization of Glut-4 was monitored by confocal microscopy.

FUTURE DIRECTION
To study the:
•  Molecular profile: microRNAs, inflammation, CRP, immune modulation, oxidative stress.
•  Metabolic profile: insulin/C-peptide, insulin sensitivity/resistance, lipid profile, glucose profile.
•  Clinical profile: fat mass, BMI, blood pressure, VO2 max.

ACKNOWLEDGMENTS
Dr. Abdelilah Arredouani, Ms. Namat Khaoab, Ms. Ilham Bensmail

Expression, purification, and characterization of stem cell transcription factors Brn2, Sox17 and its mutant

Author

Dana Mazen Abou Samhadaneh

Advisor

Balasubramanian Moovarkumudalvan, Qatar Biomedical Research Institute
Prasanna Kolatkar, Qatar Biomedical Research Institute
Valentin Ilyin

Category

Biological Sciences

Abstract

Distinct Sox/POU transcription factor pairs have been implicated as key regulators of cellular fates. Some of these transcription factors are very similar in structure and interact with similar DNA motifs, yet they perform distinct functions depending on the specific pairs they form. Both Sox2 and Sox17 interact with Oct4 and are required for pluripotency and endoderm development, respectively. A single mutation in Sox17 (Sox17EK) allows it to gain the ability to specify pluripotency like Sox2. In addition, the POU family members Oct6 and Brn2 are expressed in embryonic stem cells and during neural development. They bind specifically to the MORE motif, so a defect in the motif or in the transcription factors can lead to neurological disorders. Pluripotent stem cells (PSCs) show promise for treating diseases such as type 2 diabetes. Elucidating how these transcription factors interact and bind to DNA could allow us to reprogram them and enhance their functionality, and potentially to prevent developmental disorders. The aim of the project is therefore to study how Sox17, Sox17EK, and Brn2 interact with their cognate DNA and carry out transcription. This is done through protein expression and purification, Electrophoretic Mobility Shift Assay (EMSA), crystallization and X-ray diffraction.
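To give a feel for what a binding affinity of about 5 nM (the value this project's poster reports from EMSA for Sox17 and Sox17EK on Lama1 DNA) means across a titration, the sketch below computes the expected fraction of DNA bound at each protein concentration labelled on the EMSA lanes. It assumes the reported affinity is an apparent dissociation constant (Kd), simple 1:1 binding, and protein in large excess over the DNA probe. These assumptions and the function name are illustrative and are not stated on the poster.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of DNA probe bound for 1:1 binding with protein in excess."""
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_nM / (kd_nM + protein_nM)


# Assumption: the ~5 nM binding affinity is treated as an apparent Kd.
APPARENT_KD_NM = 5.0

# Protein concentrations labelled on the EMSA lanes (nM).
TITRATION_NM = [0.1, 1, 2.5, 5, 7.5, 10, 25, 50, 100]

for concentration in TITRATION_NM:
    bound = fraction_bound(concentration, APPARENT_KD_NM)
    print(f"{concentration:>6} nM protein -> {bound:.2f} of DNA bound")
```

Under these assumptions half of the probe is shifted at 5 nM protein, which is how a Kd is usually read off an EMSA titration.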

Expression, Purification, and Characterization of Stem Cell Transcription Factors Brn2, Sox17 and its Mutant

Dana Abou Samhadaneh (1), Balasubramanian Moovarkumudalvan (2), Prasanna Kolatkar (2) and Valentin Ilyin (1)
(1) Biological Sciences, Carnegie Mellon University Qatar; (2) Qatar Biomedical Research Institute, Hamad Bin Khalifa University

Introduction

Background: Distinct Sox/POU transcription factor pairs have been implicated as key regulators of cellular fates. Some of these transcription factors are very similar in structure and interact with similar DNA motifs, yet they perform distinct functions depending on the specific pairs they form. Both Sox2 and Sox17 interact with Oct4 and are required for pluripotency and endoderm development, respectively. A single mutation in Sox17 (Sox17EK) allows it to gain the ability to specify pluripotency like Sox2. The POU family members Oct6 and Brn2 are expressed in embryonic stem cells and during neural development. They bind specifically to the MORE motif.

[Diagram labels: embryonic stem (ES) cells; Sox2 and Oct4; Sox2 and Brn2, neural development; Sox17 and Oct4, endodermal cells]

Significance: Pluripotent stem cells (PSCs) show promise for treating diseases such as type 2 diabetes. Elucidating how transcription factors interact and bind to DNA could allow us to reprogram them and enhance their functionality, and potentially to prevent developmental disorders.

Aim: Study how Sox17, Sox17EK, and Brn2 interact with their cognate DNA and carry out transcription.

Methods

Workflow: site-directed mutagenesis; sequencing; transformation; culturing; expression; purification; crystallization and EMSA; X-ray diffraction.

Results

Figure 1. Chromatograms showing purified Sox17 (A), Sox17EK (B) and Brn2 (C) after purification. Peak labels: 214 mAU, 275 mAU and 128 mAU.

Figure 2. 12% SDS-PAGE gel images showing purified Sox17 (A), Sox17EK (B) and Brn2 (C). Marker labels: 10, 15, 25, 35 and 40 kDa.

Figure 3. Electrophoretic Mobility Shift Assay (EMSA) gel showing binding of Sox17 (A) and Sox17EK (B) to Lama1 DNA with a binding affinity of ~5 nM. Lanes: (A) control, 1, 2.5, 5, 7.5, 10 and 100 nM; (B) control, 0.1, 1, 2.5, 5, 7.5, 10, 25 and 50 nM. Bands: protein/DNA complex and free DNA.

Figure 4. Sox17EK-Lama1 (A) and Brn2-MORE (B) crystals.

Figure 5. X-ray diffraction pattern obtained from Sox17EK+Lama crystals, with a resolution of 3.07 Å.

Conclusions

• Pure samples of Sox17, Sox17EK, and Brn2 were successfully obtained
• Both Sox17 WT and Sox17EK bind to Lama1 DNA with a binding affinity of ~5 nM
• Sox17EK-Lama1 DNA crystals were successfully obtained
• Brn2-MORE DNA crystals were successfully obtained

Future Directions

• Refine the preliminary structure of Sox17EK
• Use the Sox17EK structure to better analyze Sox17/Oct4 and Sox2/Oct4 interactions
• Optimize the Brn2-MORE DNA crystal condition and perform X-ray diffraction to obtain a structure
• Try combinatorial studies of Brn2 with its partners
• Use the pluripotent transcription factors for stem cell engineering

Acknowledgement

Additional thanks to Dr. Fatma Abdallah, Rinchu Mathew and Mohamad Aldaw for their support and guidance.

References

Jauch, R., et al. (2010). Crystal structure of the dimeric Oct6 (POU3f1) POU domain bound to palindromic MORE DNA. Proteins: Structure, Function, and Bioinformatics, 79(2), 674–677. doi:10.1002/prot.22916

Mistri, T. K., Devasia, A. G., Chu, L. T., Ng, W. P., Halbritter, F., Colby, D., … Wohland, T. (2015). Selective influence of Sox2 on POU transcription factor binding in embryonic and neural stem cells. EMBO Reports, 16(9), 1177–1191. doi:10.15252/embr.201540467

Crystallization and characterization of HMG domain of stem cell transcription factors Sox7, Sox17 and its mutant

Authors

Safa Salim

Advisors

Balasubramanian Moovarkumudalvan, Qatar Biomedical Research Institute
Prasanna R. Kolatkar, Qatar Biomedical Research Institute
Valentin Ilyin

Category

Biological Sciences

Abstract

Since the discovery of induced pluripotent stem cells (iPSCs), several factors have been tested to find the most efficient approach to iPSC generation. Oct4 and Sox2 stand out as two of the most critical factors. Sox2 is a pluripotency factor required for development of the epiblast, while Sox7 and Sox17 are involved in endodermal differentiation. Sox7 and Sox17 cannot replace Sox2 in reprogramming somatic cells into iPSCs. However, a recent discovery showed that a single mutation of glutamate (E) to lysine (K) in Sox7 and Sox17 allows them to gain the ability to specify pluripotency with increased efficiency. The aim of the project was to study the interactions of Sox7, Sox17 and its mutant Sox17EK with their cognate DNA (a 16-mer DNA element derived from the LAMA1 enhancer). This study also attempted to determine their structures through X-ray crystallography. Understanding these wild-type and mutant protein structures will reveal the differences in protein-DNA interactions at the molecular level, and the information gathered will contribute to efficient iPSC generation. iPSCs can differentiate into multiple cell types, and their application to regenerative therapy holds considerable promise, especially in diseases such as type 2 diabetes. For instance, pancreatic progenitor cells derived from iPSCs can differentiate into fully functional beta cells and alleviate the condition.

Crystallization and Characterization of HMG Domain of Stem Cell Transcription Factors Sox7, Sox17 and its Mutant

Safa Salim (1), Balasubramanian Moovarkumudalvan (2), Prasanna R. Kolatkar (2), and Valentin Ilyin (1)
(1) Biological Sciences Program, Carnegie Mellon University, Qatar; (2) Qatar Biomedical Research Institute, Hamad Bin Khalifa University, Qatar

Introduction

Since the discovery of induced pluripotent stem cells (iPSCs), several factors have been tested to find the most efficient approach to iPSC generation. Oct4 and Sox2 stand out as two of the most critical factors. Sox2 is a pluripotency factor required for development of the epiblast, while Sox7 and Sox17 are involved in endodermal differentiation. Sox7 and Sox17 cannot replace Sox2 in reprogramming somatic cells into iPSCs. However, a recent discovery showed that a single mutation of glutamate (E) to lysine (K) in Sox7 and Sox17 allows them to gain the ability to specify pluripotency with increased efficiency. The aim of the project was to study the interactions of Sox7, Sox17 and its mutant Sox17EK with their cognate DNA (a 16-mer DNA element derived from the LAMA1 enhancer). This study also attempted to determine their structures through X-ray crystallography. Understanding these wild-type and mutant protein structures will reveal the differences in protein-DNA interactions at the molecular level, and the information gathered will contribute to efficient iPSC generation. iPSCs can differentiate into multiple cell types, and their application to regenerative therapy holds considerable promise, especially in diseases such as type 2 diabetes. For instance, pancreatic progenitor cells derived from iPSCs can differentiate into fully functional beta cells and alleviate the condition.

Workflow: site-directed mutagenesis; plasmid DNA purification and quantification; confirmation of wild-type and mutant gene sequences; transformation into E. coli BL21 (DE3) cells using electroporation; small-scale overnight cultures; large-scale overnight cultures; purification using AKTA express; Electrophoretic Mobility Shift Assay (EMSA); crystallization and X-ray diffraction.

Results

Figure 1. Chromatograms showing purified (A) Sox17 and (B) Sox17EK after ion-exchange chromatography, and (C) Sox7 after gel filtration. Peak labels: 109 mAU, 214 mAU and 275 mAU.

Figure 2. 12% SDS-PAGE gel images showing purified (A) Sox17, (B) Sox17EK, and (C) Sox7. Marker labels: 10, 15, 25, 35, 40, 55, 70, 100, 130 and 180 kDa.

Figure 3. EMSA gel images showing binding of Sox17 and Sox17EK to DNA with a binding affinity of ~5 nM. Lanes: control and protein concentrations from 0.1 nM to 100 nM. Bands: protein-DNA complex (Sox17-DNA, Sox17EK-DNA) and free DNA.

Figure 4. Light microscopy images of (A) Sox17EK-Lama1 and (B) Sox7-Lama1 crystals.

Figure 5. X-ray diffraction pattern obtained from Sox17EK-Lama crystals, with a resolution of 3.07 Å.

Conclusions

• Pure Sox17 (~9.7 kDa), Sox17EK (~9.7 kDa), and Sox7 (~9.7 kDa) samples were obtained
• EMSA shows that both Sox17 and Sox17EK bind to Lama1 DNA with a binding affinity of ~5 nM
• Sox17EK-Lama1 and Sox7-Lama1 crystals were successfully obtained
• Sox17EK-Lama1 crystals diffracted X-rays to a resolution of 3.07 Å

Future Directions

• Refine the preliminary structure of the Sox17EK mutant and build a model
• Use the Sox17EK mutant structure to compare Oct4-Sox17 and Oct4-Sox2 interactions
• Optimize crystals and perform X-ray diffraction for Sox7 and Sox17 to obtain their structures
• Perform combinatorial studies of Sox7 with its interacting partners

Acknowledgement

I would like to thank Dr. Fatma Abdallah, Rinchu Mathew, and Mohamad Eldaw for their support and guidance.

References

Jauch, R., Aksoy, I., Hutchins, A. P., Ng, C. K., Tian, X. F., Chen, J., . . . Kolatkar, P. R. (2011). Conversion of Sox17 into a pluripotency reprogramming factor by reengineering its association with Oct4 on DNA. Stem Cells, 29(6), 940–951. doi:10.1002/stem.639

Merino, F., Ng, C. K. L., Veerapandian, V., Schöler, H. R., Jauch, R., & Cojocaru, V. (2014). Structural basis for the SOX-dependent genomic redistribution of OCT4 in stem cell differentiation. Structure, 22(9), 1274–1286. doi:10.1016/j.str.2014.06.014

Ng, C. K. L., Palasingam, P., Venkatachalam, R., Baburajendran, N., Cheng, J., Jauch, R., & Kolatkar, P. R. (2008). Purification, crystallization and preliminary X-ray diffraction analysis of the HMG domain of Sox17 in complex with DNA. Acta Crystallographica Section F: Structural Biology and Crystallization Communications, 64(12), 1184–1187.

Effect of hydrogen peroxide at 100 µM on Calf Intestinal Alkaline Phosphatase (CIAP) enzyme kinetics

Author

Amwaj Ahmed
Sherif Mostafa
Suzan Elzafarany

Advisor

Annette Vincent

Category

Biological Sciences

Abstract

The purpose of this research is to investigate the impact of hydrogen peroxide on the activity of calf intestine alkaline phosphatase. We set out to answer the question: can hydrogen peroxide alone increase the activity of alkaline phosphatase? Our hypothesis was that hydrogen peroxide is an activator that increases the activity of alkaline phosphatase. To test it, different concentrations of the enzyme substrate pNPP were added to calf intestine alkaline phosphatase in the presence of hydrogen peroxide, and the initial velocity of each reaction was measured. The same measurements were made for a control without hydrogen peroxide. The data were then used to construct Michaelis-Menten and Lineweaver-Burk plots to determine the Km and Vmax values. Km is the substrate concentration at which the enzyme reaches half of Vmax. As Km increases, the affinity of the enzyme for its substrate decreases, and vice versa. Vmax is the reaction rate when the enzyme is saturated with substrate; a higher Vmax indicates that the enzyme catalyzes the reaction faster. We predicted that, in the presence of hydrogen peroxide, Km would be lower and Vmax higher than in the control. The results showed this pattern, supporting our hypothesis.
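The Km and Vmax determination described in the abstract can be reproduced from the two linear fits printed on the poster's Lineweaver-Burk plot (control: y = 1.0518x + 5.9604; 100 µM H2O2: y = 0.4469x + 3.034). The sketch below is a minimal Python illustration, assuming the standard double-reciprocal form 1/V = (Km/Vmax)(1/[S]) + 1/Vmax. The function names are hypothetical; the slopes and intercepts are the poster's own.

```python
def lineweaver_burk_parameters(slope, intercept):
    """Return (Km, Vmax) from a fit of 1/V = slope * (1/[S]) + intercept."""
    if intercept <= 0:
        raise ValueError("intercept (1/Vmax) must be positive")
    vmax = 1.0 / intercept   # y-intercept equals 1/Vmax
    km = slope * vmax        # slope equals Km/Vmax
    return km, vmax


def michaelis_menten_rate(substrate_mM, km, vmax):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * substrate_mM / (km + substrate_mM)


# Slopes and intercepts taken from the fits shown on the poster.
km_control, vmax_control = lineweaver_burk_parameters(1.0518, 5.9604)
km_h2o2, vmax_h2o2 = lineweaver_burk_parameters(0.4469, 3.034)

for label, km, vmax in (
    ("Control", km_control, vmax_control),
    ("100 uM H2O2", km_h2o2, vmax_h2o2),
):
    print(f"{label}: Km = {km:.3f} mM, Vmax = {vmax:.3f} abs/min")
```

The computed values agree with the poster's table to within rounding (Km 0.176 and 0.147 mM; Vmax about 0.168 and 0.330 abs/min), and `michaelis_menten_rate` returns half of Vmax when the substrate concentration equals Km, which is the definition of Km given above.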

Effect of hydrogen peroxide at 100 µM on Calf Intestinal Alkaline Phosphatase (CIAP) enzyme kinetics

Amwaj Ahmed, Sherif Mostafa, Suzan Elzafarany, Dr. Annette Vincent

Abstract

The purpose of our research was to investigate the impact of hydrogen peroxide on the activity of calf intestine alkaline phosphatase. Our hypothesis was that hydrogen peroxide is an activator that would increase alkaline phosphatase activity. To test this, different concentrations of the enzyme substrate pNPP were added to the enzyme in the presence of hydrogen peroxide, and the initial velocity of each reaction was measured. The data were then used to construct Michaelis-Menten and Lineweaver-Burk plots to determine the Km and Vmax values of the enzyme in each condition. Km is the substrate concentration at which the enzyme reaches half of Vmax. As Km increases, the affinity of the enzyme for its substrate decreases, and vice versa. Vmax is the reaction rate when the enzyme is saturated with substrate; a higher Vmax indicates that the enzyme catalyzes the reaction faster. We predicted that, in the presence of hydrogen peroxide, Km would be lower and Vmax higher than in the control. The results showed this pattern, supporting our hypothesis.

Introduction

Alkaline phosphatase is an enzyme that removes phosphate groups from organic molecules. The enzyme is present in both eukaryotes and prokaryotes, but as different isoforms. The isoform used in our research is extracted from calf intestine tissue and is known as calf intestine alkaline phosphatase. Calf intestine alkaline phosphatase is a homodimeric metalloenzyme. It has binding sites for both zinc and magnesium ions, the cofactors necessary for its activity. Its activity can be inhibited by chelating agents, including EDTA, which bind and remove these ions. Calf intestine alkaline phosphatase functions optimally at pH 11 and 45 °C. In the background literature, a study conducted by researchers at CUNY and UMDNJ found that the activity of alkaline phosphatase in colon cancer cells increased in the presence of hydrogen peroxide-producing flavonols. They found that flavonols inhibit the proliferation of the cancer cells and induce their differentiation. They also found that the activity of the differentiation marker alkaline phosphatase increases in colon cancer cells once they differentiate. The authors suspected that the increase in alkaline phosphatase activity was due to hydrogen peroxide produced by the flavonols, but did not conduct further studies to test whether the cause was hydrogen peroxide or another factor in the experiment. Our research is a follow-up study that investigates the impact of hydrogen peroxide alone on alkaline phosphatase activity. This research is important because it contributes to our understanding of the enzyme's behavior and activity.

Methods

Day 1: First, the amount of enzyme needed to reach a plateau in reaction rate was identified under conditions where the rate did not depend on the amount of substrate. Using a calf intestine alkaline phosphatase stock of 1 Unit/µL and a pNPP stock of 112 mM, the amount needed of each component was determined. To dilute the pNPP to 11.2 mM, 30 µL of the stock solution was placed in the cuvette. A 10X CutSmart buffer was diluted into stock tubes of 1X buffer. 270 µL of the 1X CutSmart buffer was combined with 30 µL of pNPP to blank the UV-Vis spectrophotometer at 450 nm. A range of 0.5 to 2.5 Units of CIAP was then used to determine the amount of enzyme needed to reach a plateau in the enzyme velocity graph.

Day 2: The optimal amount of enzyme was used while varying the amount of substrate to determine the amount needed to reach a plateau. The range used was 0.5 to 2.5 mM of pNPP. Once the optimal range was established, 100 µM of hydrogen peroxide was used to construct a Lineweaver-Burk plot to determine whether hydrogen peroxide is an activator or an inhibitor.

Results

Figure 1: Plot of substrate concentration versus rate of reaction using 1.5 U of enzyme. Axes: substrate concentration (mM) versus reaction rate (abs/min). Series: control and 100 µM H2O2.

Figure 2: Lineweaver-Burk plot of the inverse of the reaction rate against the inverse of substrate concentration. Axes: 1/[S] (mM-1) versus 1/[V] (min/abs). Fits: control, y = 1.0518x + 5.9604, R² = 0.99028; 100 µM H2O2, y = 0.4469x + 3.034, R² = 0.98432.

Using the Michaelis-Menten kinetics curve of enzyme amount against reaction rate in abs/min, the reaction was first order up to 1 unit of AP enzyme, after which the curve flattened at a constant rate of 0.126 abs/min. 1.5 U was chosen because this value lay within the plateau. Vmax, the maximum rate at which AP catalyzes the reaction, was obtained from the y-intercept of the Lineweaver-Burk plot: the y-intercept equals 1/Vmax, so Vmax = 1/(y-intercept). It appears on the substrate versus velocity graph as the rate at which the curve plateaus. Km, the substrate concentration at half Vmax, was calculated from the slope of the Lineweaver-Burk plot: slope = Km/Vmax, so Km = slope × Vmax. It can also be read from the substrate versus velocity graph. The published Km value of pure E. coli AP is 0.013 mM, and our calculated Km for the calf intestinal AP used in this experiment was 0.176 mM. This means that calf intestinal AP binds its substrate with a lower affinity than bacterial AP. The control R² value was 0.990. Because Km is calculated from the slope of the graph (slope = Km/Vmax), a higher R² means a more reliable Km value. The Vmax values of the control and of the condition with 100 µM hydrogen peroxide are 0.167 abs/min and 0.330 abs/min respectively, suggesting that hydrogen peroxide acted as an activator at 100 µM, increasing enzyme efficiency.

Table 1. Km and Vmax values of both the control and 100 µM hydrogen peroxide
Control: 1/Vmax = 5.96 min/abs; Km/Vmax = 1.05; Vmax = 0.167 abs/min; Km = 0.176 mM
100 µM hydrogen peroxide: 1/Vmax = 3.03 min/abs; Km/Vmax = 0.447; Vmax = 0.330 abs/min; Km = 0.147 mM

Conclusion

Our experiment provides evidence to support our hypothesis that hydrogen peroxide can act as an activator that increases CIAP's enzyme activity and allows it to bind its substrate better. This matters for understanding the environmental conditions that surround colon cancer cells and for limiting their exposure to hydrogen peroxide, possibly from flavonol intake, to prevent further CIAP activity. These data could also be a step towards raising awareness about eating food rich in oxidative species or flavonols that could act as pro-oxidants.

Future directions

A future direction for this study is to test more hydrogen peroxide concentrations and observe when enzyme activity and binding are highest.

References

Chaudhuri, G., Chatterjee, S., Venu-Babu, P., Ramasamy, K., & Thilagaraj, W. R. (2013). Kinetic behaviour of calf intestinal alkaline phosphatase with pNPP.

Shin, J., Carr, A., Corner, G. A., Tögel, L., Dávalos-Salas, M., Tran, H., ... & Buchanan, D. D. (2014). The intestinal epithelial cell differentiation marker intestinal alkaline phosphatase (ALPi) is selectively induced by histone deacetylase inhibitors (HDACi) in colon cancer cells in a Kruppel-like factor 5 (KLF5)-dependent manner. Journal of Biological Chemistry, 289(36), 25306–25316.

Lea, M. A., Ibeh, C., Deutsch, J. K., & Hamid, I. (2010). Inhibition of growth and induction of alkaline phosphatase in colon cancer cells by flavonols and flavonol glycosides. Anticancer Research, 30(9), 3629–3635.

Ahmed-Belkacem, A., Pozza, A., Muñoz-Martínez, F., Bates, S. E., Castanys, S., Gamarro, F., ... & Pérez-Victoria, J. M. (2005). Flavonoid structure-activity studies identify 6-prenylchrysin and tectochrysin as potent and specific inhibitors of breast cancer resistance protein ABCG2. Cancer Research, 65(11), 4852–4860.

Helland, R., Larsen, R. L., & Asgeirsson, B. (2009). The 1.4 Å crystal structure of the large and cold-active Vibrio sp. alkaline phosphatase. Biochimica et Biophysica Acta, 1794(2), 297–308.

Mechanisms of breast cancer escape from Natural Killer (NK) anti-tumor immunity

Authors

Reem Hasnah

Advisor

Manale Karam, Qatar Biomedical Research Institute

Category

Biological Sciences

Abstract Natural Killer (NK) cells are lymphocytes of the innate immune system that have potent cytotoxic activity against cancer cells. They express a wide range of activating and inhibitory receptors on their surface that bind to ligands expressed on cancer cells; thus allowing their recognition and killing. However, cancers develop mechanisms to escape NK anti-tumor immunity in order to progress. Our aim is to study these mechanisms in breast cancer, which is the major cause of death by cancer in women. Therefore, peripheral NK (pNK) cells were isolated from the blood of healthy donors by using gradient density centrifugation to separate peripheral blood mononuclear cells (PBMCs) from which NK cells (CD56+/CD3-) were isolated by negative magnetic separation. Next, the cytotoxic activity of pNKs was analyzed against a range of nine breast cancer cell lines, one normal-like breast epithelial cell line and one hTERT-transformed breast epithelial cell line by using the calceinAM-release assay. K562 leukemia cell line was used as positive control for NK cytotoxic activity because it is known to be very sensitive to NK-mediated killing. The purity of the isolated NK cells was analyzed by flow cytometry (CD56+/CD3-) and was found to be 92.4%. The cytotoxicity assay results allowed the distribution of the cells in two groups: NK-sensitive (7 cell lines) and NK-resistant (5 cell lines). Next, one potential mechanism of NK-resistance (i.e. regulated expression of activating and inhibitory ligands for NK receptors in breast cancer cells) was analyzed. The expression of 24 known NK-receptor ligands was tested by semi-quantitative RT-PCR. The results allowed the identification of MICA mRNA expression as a biomarker for the susceptibility to NK-mediated cytotoxicity and generated several hypotheses that can be tested in future studies for a better understanding of breast cancer resistance to NK and the development of efficient NK-based immunotherapeutic strategies.
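The calcein-AM release readout mentioned in the abstract is usually converted to percent cytotoxicity, and the sensitive/resistant split can then be expressed as the NK:Target ratio needed to reach 50% killing. The sketch below is a minimal Python illustration under stated assumptions: it uses the conventional release formula (test minus spontaneous release, divided by maximum minus spontaneous release), which the source does not spell out, and linear interpolation between measured ratios. The function names and all fluorescence and cytotoxicity values are hypothetical; only the NK:Target ratios come from the poster.

```python
def percent_cytotoxicity(test, spontaneous, maximum):
    """Percent specific lysis from calcein fluorescence in the supernatant."""
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (test - spontaneous) / (maximum - spontaneous)


def ratio_at_level(ratios, cytotoxicity, level=50.0):
    """NK:Target ratio at which cytotoxicity crosses `level` (linear
    interpolation); returns None if the level is never crossed."""
    points = sorted(zip(ratios, cytotoxicity))
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        if c0 != c1 and (c0 - level) * (c1 - level) <= 0:
            return r0 + (level - c0) * (r1 - r0) / (c1 - c0)
    return None


# NK:Target ratios used on the poster; cytotoxicity values are hypothetical.
RATIOS = [0.6, 1.25, 2.5, 5.0, 10.0]
sensitive_line = [10.0, 25.0, 40.0, 60.0, 80.0]
resistant_line = [0.0, 2.0, 5.0, 9.0, 15.0]

print(percent_cytotoxicity(test=600.0, spontaneous=200.0, maximum=1000.0))
print(ratio_at_level(RATIOS, sensitive_line))
print(ratio_at_level(RATIOS, resistant_line))
```

A cell line that never reaches 50% cytotoxicity within the tested ratios returns `None`, which corresponds to the relatively resistant group in this kind of analysis.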

Mechanisms Of Breast Cancer Escape From Natural Killer (NK) AntiTumor Immunity ,2 1 Reem Hasnah , Manale Karam

1Carnegie Mellon University Qatar, 2Qatar Biomedical Research Institute, Hamad Bin Khalifa University

Introduc)on

Results

Unstained

Unstained Unstained

1.  Primary NK cells were used: peripheral blood samples of healthy donors were collected from HMC blood donor center (Figure b). Using gradient density centrifuga@on, peripheral blood mononuclear cells (PBMCs) were separated, then NK cells were isolated from PBMCs by nega@ve magne@c separa@on. Finally, NK cell purity was determined by staining of the isolated cells with CD56PE and CD3-FITC ﬂuorochrome-coupled an@bodies and ﬂow cytometry analysis.

CD56-PE

0.22

CD3-FITC

CD3-FITC staining

CD56-PE staining

CD56-PE/CD3-FITC staining 0.30

0.0011

5.63

0.089

1.14

92.4

CD56-PE

0.93

CD56-PE

93.4

0.50

99.2

CD3-FITC

0.65

5.82

CD3-FITC

Figure 1. Purity of peripheral blood-isolated NK cells Blood samples were collected from healthy donors. Using gradient density centrifuga@on, PBMCs were separated, then peripheral NK cells were isolated from PBMCs by nega@ve magne@c separa@on. Collected cells were stained with CD56-PE and CD3-FITC ﬂuorochrome-coupled an@bodies and analyzed by ﬂow cytometry. This is a representa@ve experiment. Values correspond to percentages. NK cells (CD56+/CD3-) represent 92.4% of the collected frac@on. (A)

0 10:1 5:1 2.5:1 1.25:1 0.6:1 1 2 3 4 5

Ra)o NK:Target

% Cytotoxicity

140 120 100 80 60 40 20 0

0 10:1 5:1 2.5:1 1.25:1 0.6:1 1 2 3 4 5

Ra)o NK:Target

0 1 2 3 4 5 10:1 5:1 2.5:1 1.25:1 0.6:1 Ra)o NK:Target

(B)

0 10:1 5:1 2.5:1 1.25:1 0.6:1 1 2 3 4 5

0 1 2 3 4 5 10:1 5:1 2.5:1 1.25:1 0.6:1

R2 140 120 100 80 60 40 20 0 0 1 2 3 4 5 -20 10:1 5:1 2.5:1 1.25:1 0.6:1 Ra)o NK:Target

140 120 100 80 60 40 20 0 1 2 3 4 5 10:1 5:1 2.5:1 1.25:1 0.6:1 -20 0 Ra)o NK:Target

Ra)o NK:Target

S1 % Cytotoxicity

140 120 100 80 60 40 20 0

Ra)o NK:Target

R4 % Cytotoxicity

140 120 100 80 60 40 20 0

R3 % Cytotoxicity

S6 % Cytotoxicity

0 1 2 3 4 5 10:1 5:1 2.5:1 1.25:1 0.6:1 Ra)o NK:Target

140 120 100 80 60 40 20 0 0 1 2 3 4 5 10:1 5:1 2.5:1 1.25:1 0.6:1 Ra)o NK:Target

% Cytotoxicity

140 120 100 80 60 40 20 0

% Cytotoxicity

0 10:1 5:1 2.5:1 1.25:1 0.6:1 1 2 3 4 5

Ra)o NK:Target

140 120 100 80 60 40 20 0

Transformed (S2)

Normal-like 140 120 100 80 60 40 20 0

% Cytotoxicity

140 120 100 80 60 40 20 0

% Cytotoxicity

Materials and methods

0.32

99.4

0.20

99.7

FSC-A

% Cytotoxicity

The objec@ve of this project is to study breast cancer suscep@bility and resistance to NK immunosurveillance and iden@fy mechanisms of immune escape. In this study, two aims were defined: 1.  Determine the cytotoxic ac@vity of NK cells against breast cancer cells in vitro 2.  Analyze the expression of stress ligands for NK receptors by breast cancer cells This will allow the iden@fica@on of a possible rela@onship between NK-receptor ligands’ expression and suscep@bility to NK-mediated cytotoxicity. Thereby, NK-receptor ligands might become poten@al biomarkers for the response to NK-based breast cancer treatment and new poten@al mechanisms of resistance might be iden@fied and targeted.

0.072

CD56-PE

SSC-A

Viable 80.7

K562

Objec)ve

Isotype controls 0.087

0.012

CD56-PE

Natural killer (NK) cells are a type of cytotoxic lymphocytes and the primary effectors of the innate immune response against virus-infected cells and cancer cells. They recognize their targets through a complex array of ac@va@ng and inhibitory receptors that are expressed on their surface. These receptors bind to ligands on target cells and the balance of ac@va@ng and inhibitory signal cascades determines NK cell ac@va@on (Figure a). Ac@va@on of NK cells induces apoptosis-mediated cytotoxicity in target cells by using a variety of Figure a. Ac@va@on and inhibi@on mechanisms of NK cells by mechanisms. These mechanisms include release of balancing signaling cascades granzymes, death receptor pathways and release of inflammatory factors. The laGer mechanism can also indirectly induce elimina@on of target cells by ac@va@on and recruitment of other immune cells. Many studies have shown that NK cells demonstrate cytotoxic ac@vity against many types of cancer including melanoma, renal cell-carcinoma, colorectal cancer, and leukemia. Furthermore, clinical trials using NK cells show increased an@-tumoral cytotoxicity in pa@ents as well as increased chances of remission. However, many cancers are resistant to NK cellmediated cytotoxicity. In fact, during tumor progression, cancer cells develop mechanisms to escape NK surveillance. Some mechanisms of NK-immune escape by cancer cells were proposed, including the modula@on of the expression of ligands for the NK receptors, the secre@on of inhibitory soluble factors and the development of resistance to cell death. A beGer understanding of these mechanisms would lead to the development of novel effec@ve immunotherapeu@c approaches for cancer treatment.

0 1 2 3 4 5 10:1 5:1 2.5:1 1.25:1 0.6:1

Ra)o NK:Target

140 120 100 80 60 40 20 0

0 1 2 3 4 5 10:1 5:1 2.5:1 1.25:1 0.6:1

Ra)o NK:Target

50% Cytotoxicity

K562 S-06 S1 S-05 S2 S-04 S3 S-03

Figure 1

S-02 S4

S-01 S5 R-06 S6 R-05 Normal R-04 R4 R-03 R3 R-02 R2 R-01 R1

Conclusion and perspec)ves •  Breast cancer cell lines have different suscep@bili@es to NK-mediated cytotoxicity. Their distribu@on into rela@vely resistant and sensi@ve groups provides models for the analysis of molecular mechanisms of breast cancer escape from NK immunity. •  MICA is an ac@va@ng ligand which posi@vely correlates with NK suscep@bility. However, in a resistant cell line (R2), the MICA mRNA level is high and comparable to the sensi@ve cells. Two hypothesis: 1) MICA protein might not be expressed or localized at the surface of the cells (western blot and flow cytometry studies can be used to test this hypothesis) 2) Other mechanisms of resistance can counteract the increased recogni@on of R2 cell line by NK cells If iden@fied and validated, these mechanisms can be targeted to increase NK-mediated cytotoxicity towards MICA mRNA-high breast cancer cells. •  Except for MICA gene, the mRNA expression of none of the other NK-receptor ligands tested correlated with the suscep@bility of breast cancer cells to NK-killing. This suggests that NK suscep@bility might be affected by other factors. Those could include modifica@on in protein expression, clivage of the ligands by soluble proteases, expression of soluble inhibitory factors by cancer cells and resistance of cancer cells to apoptosis. •  In future work, the expression of NK-receptor ligands at the surface of breast cancer cells will be iden@fied by flow cytometry using specific fluorochrome-labeled an@bodies. Finally, our results will be compared with those from databases on pa@ent tumors and check if there is a correla@on between the expression of the NK receptor ligands and the aggressiveness of the cancer and survival of the breast cancer pa@ents.

[Plot axis: NK:Target ratio (2.5:1, 5:1, 10:1)]

[Gel image labels, panel (A): MICA, MICB, ULBP1, ULBP2, ULBP3, ULBP4, ULBP5, ULBP6, B7-H6, BAT3, CMV pp65, Nectin 2, NECL2, NECL5, CD48, CD70, CD72, AICL, CEACAM1, SLAMF6, SLAMF7, SELL, HLA-C, HLA-A/B/C/E/F, HLA-E, HLA-A, HLA-B/C, CLEC2D, COL4A6, GAPDH; controls: water, K562]

Figure 2. Susceptibility of breast epithelial and breast cancer cell lines to NK-mediated cytotoxicity (A) Cytotoxic activity of NK cells towards normal breast epithelial and breast cancer cell lines: NK cells were co-cultured for 4 hours with CalceinAM-pre-stained normal-like breast epithelial or breast cancer cells (R1, R2, R3, R4, S1, S3, S4, S5, S6 and Transformed) at different NK:Target ratios. The target cells that were lysed by NK cells will release the calcein fluorescent dye in the supernatant. The fluorescence intensity in the supernatant was then measured by a plate reader to determine the percent of cytotoxicity. The results presented are the means +/- SD for two independent experiments. (B) Ratio NK:Target at 50% cytotoxicity.
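The two quantities in this caption, percent cytotoxicity and the NK:Target ratio at 50% cytotoxicity, can be illustrated with a short Python sketch. The poster does not state the formula it used, so the sketch assumes the common release-assay convention (experimental release corrected by spontaneous-release and maximum-release controls) and a simple linear interpolation between tested ratios. The function names and all readings are hypothetical.

```python
def percent_cytotoxicity(experimental, spontaneous, maximum):
    """Percent specific lysis from supernatant fluorescence readings.

    Assumed convention: spontaneous = targets alone, maximum = fully lysed targets.
    """
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


def ratio_at_half_killing(ratios, cytotoxicities, target=50.0):
    """Linearly interpolate the NK:Target ratio at which `target` % lysis is reached.

    `ratios` must be sorted in increasing order. Returns None when the
    target level is not crossed within the tested range.
    """
    points = list(zip(ratios, cytotoxicities))
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 != c0:
            return r0 + (target - c0) * (r1 - r0) / (c1 - c0)
    return None


# Hypothetical fluorescence readings (arbitrary units), not data from the poster.
spontaneous, maximum = 200.0, 1200.0
readings = {0.6: 300.0, 1.25: 450.0, 2.5: 600.0, 5.0: 800.0, 10.0: 1000.0}

ratios = sorted(readings)
cytotoxicity = [percent_cytotoxicity(readings[r], spontaneous, maximum) for r in ratios]
half_ratio = ratio_at_half_killing(ratios, cytotoxicity)
print(cytotoxicity, half_ratio)
```

A lower ratio at 50% cytotoxicity means fewer NK cells are needed per target cell, which corresponds to a more sensitive cell line.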

2. One normal-like breast epithelial cell line, one hTERT-transformed breast epithelial cell line (S2) and nine different breast cancer cell lines, named S1, S3, S4, S5, S6, R1, R2, R3 and R4, were used. K562 leukemia cell line was used as positive control for NK cytotoxic activity because it is known to be very sensitive to NK-mediated killing.

3. Calcein-AM cytotoxicity assay was used to analyze the cytotoxic activity of NK cells against cancer cells.

4. To determine the expression of NK-receptor ligands in breast cancer cells, we performed total RNA extraction (by using the RNeasy Plus Mini Kit from Qiagen) and cDNA synthesis by RT-PCR (by using the SuperScript® IV First-Strand Synthesis System from ThermoFisher Scientific) on the different breast cancer cell lines. Next, mRNA expression was analyzed by PCR using specific primers. PCR products were then separated by electrophoresis using 2% agarose gel containing 0.5 µg/mL ethidium bromide followed by visualization of the amplicons in a gel documentation system.

Figure b. Blood unit collected from HMC blood donor center

(B) Ligand mRNA expression and responsiveness to NK cells

Ligand                              Pearson correlation coefficient    P value

Activating ligands
MICA                                +0.65                              0.02
MICB                                -0.27                              0.39
ULBP1                               -0.17                              0.60
ULBP2                               -0.20                              0.53
ULBP3                               -0.12                              0.71
ULBP4                               +0.06                              0.84
ULBP5                               -0.34                              0.29
ULBP6                               +0.04                              0.91
B7-H6                               +0.24                              0.46
BAT3                                -0.11                              0.74
CMV pp65                            0.00                               0.00
Nectin 2                            +0.52                              0.08
NECL2                               +0.22                              0.48
NECL5                               +0.19                              0.55
CD48                                +0.09                              0.77
CD70                                -0.09                              0.77
CD72                                0.00                               0.00
AICL                                +0.09                              0.77
CEACAM1                             0.00                               0.00
SLAMF6                              +0.03                              0.92
SLAMF7                              -0.35                              0.27
SELL                                -0.22                              0.50

Activating and inhibitory ligands
HLA-C                               0.12                               0.71
HLA-A/B/C/E/F                       -0.31                              0.33
HLA-E                               -0.25                              0.44

Inhibitory ligands
HLA-A                               -0.24                              0.45
HLA-B/C                             0.21                               0.51
CLEC2D                              0.22                               0.49
COL4A6                              -0.47                              0.12

Figure 3. NK-receptor ligands’ mRNA expression in breast cancer cells and relationship with the susceptibility to NK-mediated cytotoxicity (A) Gel images representing the relative mRNA expression level of a wide range of NK-receptor ligands in breast cancer cell lines, which are ordered from the least susceptible to NK-mediated cytotoxicity (most resistant) to the most susceptible (most sensitive), left to right. (B) The table shows the relationship between ligand expression and susceptibility to NK-mediated cytotoxicity, as determined after quantifying the intensity of the amplicon bands representing the relative expression level (using ImageJ software) and calculating the Pearson correlation coefficient (using GraphPad Prism software).
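The Pearson correlation coefficient named in this caption can be computed as in the following Python sketch. It is an illustration only, not the authors' GraphPad Prism workflow: the function name `pearson_r` and the sample values are hypothetical, and the P value is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, one entry per cell line (not data from the poster):
# relative band intensity of one ligand and a susceptibility score.
band_intensity = [0.2, 0.5, 0.4, 0.9, 1.1]
susceptibility = [10.0, 30.0, 20.0, 55.0, 60.0]

r = pearson_r(band_intensity, susceptibility)
print(round(r, 2))
```

A coefficient near +1 or -1 indicates a strong linear relationship, while values near 0, as for most ligands in the table, indicate little or none.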

Oxidative stress in kidney cells – effects of aspartame

Author

Maria Ali

Advisor Annette Vincent

Category

Biological Sciences

Abstract In this project, the effect of aspartame was tested on human embryonic kidney cells (HEK-293). The cells were exposed to varying concentrations of aspartame, and a concentration of 250 μg/mL caused a significant decrease in cell viability in vitro. This suggests that aspartame consumed by pregnant women with gestational diabetes mellitus (GDM) may have a negative effect on kidney development in fetuses.

Oxidative stress in kidney cells – effects of aspartame Maria Ali Advisor : Annette Vincent


The t-test p-values for 250 μg/mL of aspartame at the 30-minute and 60-minute incubations were below 0.05, showing that the mean OD values at this concentration differ significantly from the control OD values.
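A comparison of this kind can be sketched in Python as follows. The poster does not say which t-test variant was used, so the sketch assumes a two-sample Student's t-test with equal variances and compares the statistic with the tabulated two-tailed critical value for 4 degrees of freedom. The function name and the OD readings are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic (equal variances assumed) and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical triplicate OD readings at 570 nm, not values from the poster.
control_od = [0.80, 0.82, 0.78]
treated_od = [0.50, 0.55, 0.45]

t_stat, df = pooled_t_statistic(control_od, treated_od)

# Two-tailed critical value of Student's t for alpha = 0.05 and 4 degrees of freedom.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4
print(round(t_stat, 2), df, significant)
```

When the absolute t statistic exceeds the critical value, the p-value is below 0.05, which is the criterion applied above.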

INTRODUCTION

Diabetes was found to cause 38.4% of all kidney failure cases. Around 239,100 cases of diabetes were recorded in Qatar in 2015 [1]. This project concerns the effect of aspartame in the diets of pregnant women with gestational diabetes mellitus (GDM), a condition that develops during pregnancy when the mother has unstable blood glucose levels that also affect the fetus. Women with GDM have a higher risk of developing hypertensive disorders during pregnancy, such as gestational hypertension and eclampsia [2]. Diabetic patients, including women with GDM, mostly consume foods that are low in glucose and contain artificial sweeteners. Here, we examine the effect of different concentrations of aspartame on HEK-293 cells, using the MTT assay to assess cell viability and DCF treatment to observe oxidative stress in the cells.

METHODS

HEK-293 cells were grown in cell culture media, then incubated with 0, 10, 20, 50, 100 and 250 μg/mL of aspartame for 30 and 60 minutes. The MTT assay was performed to measure cell viability, and the OD values were measured at 570 nm using the fluorescence microscopic reader. The cells were also treated with DCF and MitoSOX Red (double probes), then incubated for 30 minutes with 250 μg/mL of aspartame. Images of the cells were taken using the EVOS microscope.

RESULTS AND DISCUSSION

[Figures: MTT assay for HEK-293 cells incubated with different concentrations of aspartame for 30 minutes and for 1 hour. x-axis: aspartame concentration (μg/mL); y-axis: OD at 570 nm]

The fluorescence microscopy images show HEK-293 cells under three treatments. The cells in image A were treated with HBSS glucose, the cells in image B with 50 μM aspartame, and the cells in image C with 250 μM aspartame. Bright green staining marks cells that are under oxidative stress and dying. The cells treated with 50 μM aspartame are not as brightly stained as those treated with 250 μM aspartame, although the control has just as many intensely stained cells as the 250 μM treatment. This suggests that the 250 μM aspartame concentration causes oxidative stress in HEK-293 cells, leading to cell death. Because HEK-293 cells are fragile, they were not washed with buffer to remove the cell culture media. This could explain the bright green staining in the control, as components of the cell culture media could have interfered with the fluorescence.

Results for the MTT assay for HEK-293 cells incubated with aspartame show no significant difference in OD readings for concentrations of 0, 10, 20, 50 and 100 μg/mL in either the 30-minute or the 60-minute incubation. However, a significant drop in OD readings is observed in both graphs at 250 μg/mL. This suggests that aspartame decreases cell viability at this concentration. Higher concentrations should be tested in the future to look for possible trends.

CONCLUSION

The results show that aspartame at a concentration of 250 μg/mL negatively affects the viability of HEK-293 cells. This suggests that pregnant women with GDM should avoid consuming aspartame, as it may cause developmental issues in the kidneys of fetuses. To improve the results, the cells should be handled with care when the experiment is repeated, because their focal adhesions are weak.

REFERENCES

1. Qatar, International Diabetes Federation Middle East and North Africa (2015). http://www.idf.org/membership/mena/qatar
2. Alfadhli, E. M. (2015). Gestational diabetes mellitus. Saudi Medical Journal, 36(4), 399–406. http://doi.org/10.15537/smj.2015.4.10307

The effects of Mg2+ and Zn2+ on human placental alkaline phosphatase (PALP) activity

Authors

Aulia Ahmad, Latifa Al Badr

Advisors

Annette Vincent

Category

Biological Sciences

Abstract This research investigates the effects of the divalent cations Mg2+ and Zn2+ on human placental alkaline phosphatase (PALP) activity. The hypothesis states that high concentrations of Mg2+ will inhibit PALP activity regardless of Zn2+ concentration, and that a high concentration of Zn2+ has little inhibitory effect on PALP activity. This was tested with a standard enzyme assay that measures the rate of dephosphorylation of pNPP to nitrophenol, from which the Vmax and Km of the reaction were determined. The effect was measured by varying the Mg2+ concentration, the Zn2+ concentration, and both cations' concentrations at a specific ratio. The results showed an allosteric effect in which the binding of one cation regulates the binding of the other. In this case, Zn2+ acts as an allosteric activator of Mg2+ binding. In conclusion, although Mg2+ had an inhibitory effect on PALP activity, the presence of Zn2+ ions abolished this effect by binding to the allosteric site and activating PALP.

Introduction

Placental alkaline phosphatase (PALP) is an enzyme that catalyzes the hydrolysis of esters of phosphoric acid. Alkaline phosphatases have pH optima of about 9.0. PALP is a dimeric metalloenzyme in which each monomeric subunit contains three divalent cations: two Zn2+ and one Mg2+. A previous study found that the addition of MgCl2 decreased ALP activity, while the addition of ZnCl2 did not affect it (Farah et al., 2012).

The Km of placental alkaline phosphatase is 5.55 mM (Saini et al., 2005). This Km is expected to increase, indicating decreased affinity and therefore an inhibitory effect. Conversely, a decreased Km would indicate increased affinity and an activating effect.

Objective

To determine whether increasing the concentration of Zn2+, Mg2+ or both increases or decreases the activity of human placental alkaline phosphatase (PALP) compared to baseline activity. This study is expected to be useful because PALP levels are increased in patients with ovarian cancer, so information on how to decrease PALP activity could point toward a possible treatment.

Materials and Methods

Determination of placental alkaline phosphatase activity

The enzyme assay was carried out by the methods described by Farah et al. (2008), with minor modifications. Enzyme activity was measured with increasing enzyme concentration. Placental alkaline phosphatase activity was determined by measuring the increase in absorbance at a wavelength of 410 nm at 30-second intervals for 3 minutes. p-Nitrophenyl phosphate in phosphate buffered saline was used as the substrate for the reaction at 25°C.

Effect of divalent cations, Zn2+ and Mg2+, on enzyme activity

The enzyme assay was carried out with added Mg2+ at 0.016 M and 0.063 M, with added Zn2+ at 7 x 10^-5 M, and with Zn2+:Mg2+ combinations of 7 x 10^-5 M:0.016 M and 1.5 x 10^-4 M:0.016 M.

Evaluation of kinetic parameters

The kinetic parameters were evaluated from a plot of the reciprocal of the initial velocities (Vi) against the reciprocal of the corresponding substrate concentrations [4-NPP], that is, 1/Vi versus 1/[S] (Lineweaver and Burk, 1934).
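The activity measurement described for this assay (absorbance at 410 nm read every 30 seconds for 3 minutes) yields an initial rate as the slope of absorbance against time. The Python sketch below shows one way to obtain that slope by least squares. It is an illustration under assumed values: the function name and the absorbance readings are hypothetical, and the conversion from absorbance to product concentration is not shown because the poster gives no extinction coefficient.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Readings every 30 seconds for 3 minutes (time in minutes).
times_min = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
# Hypothetical absorbance values at 410 nm, not data from the poster.
a410 = [0.050, 0.081, 0.109, 0.141, 0.170, 0.199, 0.231]

initial_rate = least_squares_slope(times_min, a410)  # change in A410 per minute
print(round(initial_rate, 3))
```

Repeating this for each substrate concentration gives the initial velocities Vi that enter the 1/Vi versus 1/[S] plot.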

Results

Figure 1. Enzyme saturation curve of PALP. The purpose of this graph was to determine the minimum amount of enzyme that would produce maximum activity in the experiment. This was done by increasing the concentration of the enzyme while keeping the substrate concentration constant. The enzyme was saturated at 2 U/µL.

Figure 2. Michaelis-Menten plot of PALP with increasing substrate concentration, ranging from 0.1 to 0.6 µmol/L. This graph was constructed by measuring the rate of product formation per minute. The substrate used was pNPP, which is cleaved by PALP to give a yellow product that absorbs at 410 nm and can be measured with a spectrophotometer.

Figure 3. Lineweaver-Burk plot of PALP. This graph was constructed by taking the reciprocals of the velocity and substrate concentration values from the Michaelis-Menten curve (Figure 2). Its purpose is to determine the Km value used as the baseline for comparing the effects of Zn2+ and Mg2+ on PALP.

Figure 4. Lineweaver-Burk plots of PALP with different amounts of Mg2+ and Zn2+ added. The purpose of this graph is to study the effects of the ions on enzyme activity by comparing the Km values of each plot. The x-intercept of each plot equals the negative reciprocal of Km, so an x-intercept further from the origin means a smaller Km value. A decrease in Km would indicate an increase in the affinity of the enzyme and hence an activating effect.

Analysis

The enzyme saturation curve for human PALP (Figure 1) showed an increase in enzyme activity with increasing enzyme concentration, which eventually plateaued as the enzyme concentration reached 2 U/µL. This meant that the enzyme was saturated at 2 U/µL, and so 2 U/µL was chosen as the minimum enzyme concentration for examining the effects of Mg2+ and Zn2+ ions. At concentrations higher than 2 U/µL, enzyme activity no longer depended on enzyme concentration, which is referred to as zero-order kinetics. At concentrations lower than 2 U/µL, enzyme activity increased in proportion to enzyme concentration, which is referred to as first-order kinetics.

With the enzyme concentration kept constant at 2 U/µL, a Michaelis-Menten plot was constructed to show the change in enzyme activity with increasing substrate concentration (Figure 2). From this figure, a Lineweaver-Burk plot was created by taking the reciprocal of both the velocity and the concentration values. This allowed a more accurate determination of the Km and Vmax values. Vmax is the maximum rate of reaction that the enzyme can attain, and Km is the substrate concentration at which the enzyme activity is half its maximum. In the Lineweaver-Burk plot, the y-intercept is the reciprocal of Vmax (i.e. y-intercept = 1/Vmax), and the x-intercept is the negative reciprocal of Km (i.e. x-intercept = -1/Km). In this experiment, a total of 5 different manipulations were done on PALP, and so 5 Lineweaver-Burk plots were created. All 5 plots were drawn on the same axes, as shown in Figure 4, to allow a clearer comparison between the manipulations.

The Km value of PALP in this experiment was determined to be 0.0612 µM. The published Km value of alkaline phosphatase from E. coli is 0.013 mM, or 13 µM, so the Km of E. coli AP is much larger than that of human placental AP measured here. The higher the R-squared value of the fitted line, the higher the accuracy of the results. The R-squared value in Figure 3 was 0.522, which was considered to be low.
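The intercept relationships used in this analysis (y-intercept = 1/Vmax, x-intercept = -1/Km) and the R-squared of the fitted line can be illustrated with a short Python sketch. It fits a least-squares line to the reciprocal data and reads Km and Vmax from it. The function names are hypothetical, and the data are synthetic, noise-free Michaelis-Menten values generated from assumed constants, not measurements from this experiment.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def lineweaver_burk(substrate, velocity):
    """Estimate Km and Vmax from a double-reciprocal (1/V versus 1/[S]) fit."""
    inv_s = [1.0 / s for s in substrate]
    inv_v = [1.0 / v for v in velocity]
    slope, intercept, r_squared = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept   # y-intercept = 1/Vmax
    km = slope * vmax        # slope = Km/Vmax, equivalent to x-intercept = -1/Km
    return km, vmax, r_squared


# Synthetic data from assumed constants (hypothetical, arbitrary units).
TRUE_KM, TRUE_VMAX = 0.2, 1.0
substrate = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
velocity = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

km, vmax, r_squared = lineweaver_burk(substrate, velocity)
print(round(km, 3), round(vmax, 3), round(r_squared, 3))
```

With real, noisy measurements the R-squared falls below 1, and the reciprocal transform amplifies errors at low substrate concentrations, which is one reason low R-squared values make the estimated Km uncertain.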

The R-squared values of the Lineweaver-Burk plots varied considerably, as shown in Figure 4. The highest R-squared value, 0.919, was for Mg2+ at 0.063 M. The R-squared value was 0.602 for Mg2+ at 0.016 M, 0.511 for Zn2+ at 7 x 10^-5 M, and 0.566 for the combination of Mg2+ at 0.016 M and Zn2+ at 7 x 10^-5 M. These R-squared values were very low. The lowest R-squared value, 0.013, was for the combination of Mg2+ at 0.016 M and Zn2+ at 1.5 x 10^-4 M. The low R-squared values indicated large variability among the data points, which could mean that the lines of best fit for these plots were not very accurate. In that case, the x-intercepts of these plots would not be accurate either. Therefore, the Km values for each manipulated variable could not be stated with high confidence.

Although this was the case, an overall trend could be observed in the Km values determined from the Lineweaver-Burk plots. The pure PALP had a Km value of 0.0612 µM, as stated earlier. This value was taken as the control Km against which all other Km values were compared to determine the overall effects on the enzyme. With Mg2+ added at 0.016 M, the Km value was 0.079 µM. This was higher than the control Km, which meant that Mg2+ at this low concentration decreased the activity of PALP. With Mg2+ added at the higher concentration of 0.063 M, the Km value was 0.283 µM. This was higher again, so the higher Mg2+ concentration decreased PALP activity even further. With Zn2+ added at 7.0 x 10^-5 M, the Km value was 0.0214 µM. This was lower than the control Km, which meant that Zn2+ at this concentration increased PALP activity. When both ions were added, the Km value with Mg2+ at 0.016 M and Zn2+ at 7.0 x 10^-5 M was 0.0607 µM. This is very close to the control Km, so this combination of ions did not change PALP activity appreciably. The Km value with Mg2+ at 0.016 M and Zn2+ at 1.5 x 10^-4 M was 0.0107 µM. This was lower than the control Km and the lowest of all Km values calculated in this experiment. Therefore, Mg2+ and Zn2+ at these concentrations resulted in an increase in PALP activity.

The expected outcome was that Mg2+ would decrease PALP activity in a dose-dependent manner, regardless of the addition of Zn2+. However, the Km values show a different outcome. Adding Mg2+ reduced enzyme activity, while adding Zn2+ alone increased it. When both Mg2+ and Zn2+ were added at low concentrations, there was no difference in enzyme activity, possibly because the decrease caused by Mg2+ cancelled out the increase caused by Zn2+. However, when Zn2+ was added at a high concentration in combination with a low Mg2+ concentration, activity increased. This meant that the effect of Zn2+ is greater than the effect of Mg2+.

Conclusion

In conclusion, the hypothesis of this experiment was rejected because the experiment showed that Mg2+ had less effect on PALP activity than Zn2+. This could be due to the physiological function of PALP, which includes protecting the villi and preventing parasites from penetrating them (Sartori, Lin, et al., 2002). It is possible that the role of PALP and the environment that surrounds it require the enzyme to have this specific characteristic. In addition, the results showed that Zn2+ and Mg2+ affect each other, suggesting that alkaline phosphatase is an allosteric enzyme. Further studies include looking into the environment of PALP, particularly its Zn2+ and Mg2+ concentrations, and determining the allosteric effects of the ions on the enzyme.

References

Farah, H. S., Al-Atoom, A. A., & Shehab, G. M. (2012). Explanation of the decrease in alkaline phosphatase (ALP) activity in hemolysed blood samples from the clinical point of view: In vitro study. Jordan J Biol Sci, 5(2), 125-128.
Lineweaver, H., & Burk, D. (1934). The determination of enzyme dissociation constants. J. Am. Chem. Soc., 56, 658-666.
Saini, D., Kala, M., Jain, V., & Sinha, S. (2005). Targeting the active site of the placental isozyme of alkaline phosphatase by phage-displayed scFv antibodies selected by a specific uncompetitive inhibitor. BMC Biotech., 5(33). http://www.biomedcentral.com/1472.
Sartori, M. J., Lin, S., Frank, F. M., Malchiodi, E. L., & De Fabro, S. P. (2002). Role of placental alkaline phosphatase in the interaction between human placental trophoblast and Trypanosoma cruzi. Experimental and Molecular Pathology, 72(1), 84-90.

Experimental Biochemistry (Spring 2017), Carnegie Mellon University Qatar

Study of the role of Lactate Dehydrogenase C (LDHC) in the aggressive behavior of triple negative breast cancer

Author

Abdulrahman Al-Subaiey

Advisors

Remy Thomas, Qatar Biomedical Research Institute
Julie Decock, Qatar Biomedical Research Institute, Hamad Bin Khalifa University

Category

Biological Sciences

Abstract Triple Negative Breast Cancer (TNBC) is marked by the absence of the estrogen, progesterone, and human epidermal growth factor receptors. TNBC tumors are an aggressive subtype of breast cancer; they are highly proliferative, genetically unstable, and respond poorly to treatment. During the process of tumor growth, cancer cells undergo a ‘metabolic switch’ from aerobic glucose metabolism to anaerobic lactate metabolism, which provides a competitive advantage to the cancer cells over normal cells. Lactate dehydrogenases play an important role in the metabolism of lactate as they catalyze the conversion of pyruvate to lactate. Lactate Dehydrogenase C (LDHC) was studied in this project because it is a unique isoform that is almost exclusively expressed in the male human testis, yet it is markedly upregulated in numerous TNBC cell lines. In this study, knockdown (KD) of LDHC by RNA interference was confirmed with RT-qPCR and western blot. After KD, a significant decrease in motility was observed by the wound-healing assay and a significant increase in adhesion was observed by the cell adhesion assay. LDHC KD also resulted in the upregulation of the mesenchymal marker Vimentin and a downregulation of the epithelial marker E-cadherin, as observed by immunofluorescence.

Study of the Role of Lactate Dehydrogenase C (LDHC) in the aggressive behavior of Triple Negative Breast Cancer

Abdulrahman Al-Subaiey¹, Remy Thomas², Julie Decock²,³
¹Carnegie Mellon University Qatar, ²Qatar Biomedical Research Institute, ³Hamad Bin Khalifa University
asubaiey@qatar.cmu.edu

[Figure panels for MDA-MB-231, MCF7 and MDA-MB-453 cells treated with siCTR or siLDHC: LDHC mRNA expression (LDHC/RPLP0, %); lactate dehydrogenase expression (LDHC, LDHA, LDHB and β-actin bands); cell motility (wound closure, %); cell-matrix adhesion (% adhered cells on CN I, FN and LN V, each at 5 ug/ml)]

Figure 3. Reduction in LDHC expression does not alter expression of the LDHA and LDHB isoforms of lactate dehydrogenase. Cells were treated for 72 hours with either 20 nM siLDHC or siCTR. β-actin, loading control.

Abstract

Triple Negative Breast Cancer (TNBC) is marked by the absence of the estrogen, progesterone, and human epidermal growth factor receptors. TNBC tumors form an aggressive subtype of breast cancer characterized by genetic instability, a high proliferation rate and poor treatment response. We investigated the role of Lactate Dehydrogenase C (LDHC) in biological processes that facilitate tumor progression, including cell motility and cell survival. LDHC is a cancer testis antigen that is almost exclusively expressed in the male human testis, yet is markedly upregulated in numerous TNBC cell lines. LDHC knockdown was performed in 2 TNBC cell lines (MDA-MB-231, MDA-MB-453) and 1 non-TNBC cell line (MCF7), and was confirmed by RT-qPCR, western blot, and immunofluorescence. We found that a reduction in LDHC expression was associated with a significant decrease in cell motility and a significant increase in cell adhesion to collagen I, fibronectin and laminin V. In addition, LDHC knockdown shifted the MDA-MB-231 TNBC cell line to a less mesenchymal phenotype, as shown by reduced expression of the mesenchymal marker Vimentin. These preliminary data support further research into the role of LDHC in TNBC progression.


Figure 6. LDHC knockdown increases cell adhesion to various extracellular matrix (ECM) components. Cells were treated for 72 hours with either 20nM siLDHC or siCTR. Cell adhesion to 5 ug/ml collagen I, fibronectin, and laminin V was determined using crystal violet, with absorbance measurements at 590nm. Data are expressed relative to siCTR for each cell line.


Figure 1. LDHC mRNA expression in TNBC and non-TNBC cell lines. Cells were treated for 48 hours with either 20 nM LDHC siRNA (siLDHC) or scrambled control siRNA (siCTR). Data are expressed relative to expression in cells treated with siCTR. Bar chart represents 3 biological replicates with 2 technical replicates each. LDHC mRNA expression was determined by real-time qRT-PCR and normalized to the endogenous control RPLP0.
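Expression that is normalized to an endogenous control and expressed relative to a control treatment is often calculated with the comparative Ct (2^-ΔΔCt) approach. The poster does not state which quantification method was used, so the Python sketch below is only an illustration under that assumption, with roughly 100% amplification efficiency also assumed. The function name and the Ct values are hypothetical.

```python
def relative_expression_percent(ct_target_treated, ct_ref_treated,
                                ct_target_control, ct_ref_control):
    """Target expression in treated cells as a percentage of the control (2^-ddCt).

    Each target Ct is first normalized to the reference gene Ct from the same
    sample (delta Ct), then the treated sample is compared with the control.
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 100.0 * 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, not data from the poster.
knockdown = relative_expression_percent(
    ct_target_treated=28.0,  # target gene, siLDHC-treated cells
    ct_ref_treated=18.0,     # reference gene, siLDHC-treated cells
    ct_target_control=26.0,  # target gene, siCTR-treated cells
    ct_ref_control=18.0,     # reference gene, siCTR-treated cells
)
print(knockdown)
```

In this hypothetical case the target needs two more cycles to reach threshold after knockdown, which corresponds to a quarter of the control expression.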

* p < 0.05, ** p < 0.01

Figure 2. LDHC protein expression and cellular localization in TNBC and non-TNBC cell lines. Cells were treated with 20 nM siLDHC or siCTR for 72 hours. Cell nuclei, blue (DAPI); LDHC, green (Alexa Fluor 488 secondary antibody).

Figure 4. LDHC knockdown reduces cell motility as measured by the wound healing assay. Cells were treated with 20 nM siLDHC or siCTR, and the scratch was performed 48 hours later. A) Percentage wound closure measured 24 hours after the scratch. Bar chart for MDA-MB-231 and MCF7 represents 2 biological replicates with 3 technical replicates each. B) Representative pictures of wound closure for MDA-MB-231 cells (siCTR and siLDHC at 0 hours and 24 hours).

Figure 5. LDHC knockdown induces a change in cell morphology towards a more epithelial phenotype. The expression of E-cadherin, an epithelial marker, and vimentin, a mesenchymal marker, was determined in the MDA-MB-231 cell line because it showed the most significant change in cell motility. MDA-MB-231 cells were treated with 20 nM siLDHC or siCTR for 72 hours. Cell nuclei, blue (DAPI); LDHC, red (Alexa Fluor 594 secondary antibody).

Summary

LDHC expression

• LDHC mRNA expression was significantly reduced in all cell lines after knockdown, as determined by real-time qRT-PCR.
• Knockdown of LDHC protein expression was confirmed using immunofluorescence and western blot, albeit to varying degrees in different cell lines.
• Protein expression of the LDHA and LDHB isoforms was not affected by LDHC knockdown, as demonstrated by western blot.

Cell Motility

• LDHC knockdown significantly reduced the wound closure rate of MDA-MB-231 cells and, to a lesser extent, of MCF7 cells.
• A change in cell phenotype could be observed after LDHC knockdown in the MDA-MB-231 cell line. Cells showed a more epithelial phenotype, as demonstrated by reduced expression of the mesenchymal marker vimentin.

Cell Adhesion

• LDHC knockdown resulted in a significant increase in cell adhesion to collagen I, fibronectin and laminin V for MDA-MB-231 and MCF7 cells. In the MDA-MB-453 cell line, a similar increase was observed only for collagen I and fibronectin.
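The two readouts summarized here, percentage wound closure and adhesion expressed relative to siCTR, reduce to simple ratios. The Python sketch below shows one plausible way to compute them. The poster does not give its formulas, so the definitions, function names and numbers are assumptions for illustration only.

```python
def wound_closure_percent(area_start, area_end):
    """Percentage of the initial scratch area that has closed."""
    if area_start <= 0:
        raise ValueError("initial wound area must be positive")
    return 100.0 * (area_start - area_end) / area_start


def percent_of_control(value, control):
    """Express a measurement (e.g. absorbance at 590 nm) as a percentage of the control."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * value / control


# Hypothetical measurements, not data from the poster.
closure = wound_closure_percent(area_start=1000.0, area_end=250.0)
adhesion = percent_of_control(value=0.75, control=0.5)
print(closure, adhesion)
```

Here a wound that shrinks from 1000 to 250 area units is 75% closed, and an absorbance 1.5 times that of the control corresponds to 150% adhered cells.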

Conclusions

Our findings suggest that LDHC may contribute to the progression of breast cancer, in particular TNBC and, to a lesser extent, non-TNBC. Knockdown of LDHC resulted in a significant reduction in cell motility, a significant increase in cell adhesion to various extracellular matrix (ECM) components, and the induction of a mesenchymal-to-epithelial transition. Further research is warranted to identify the molecules involved in the LDHC-regulated processes.

Acknowledgments

This work was funded by the QBRI summer research and QBRI-CMU honors research internship programs. We would like to acknowledge Miss Ghaneya Al-Khadairi for her support and Dr. Mohamed Emara for guidance with microscopy.

Application based learning to reinforce academic concepts in Qatar biology curriculum

Author

Mohammad Osaama Bin Shehzad

Advisors

Annette Vincent Saquib Razak

Category

Biological Sciences

Abstract Surveys conducted among high school students and teachers highlighted the need to incorporate scientific inquiry into the local Qatari Biology curriculum. To address this, the Biological Sciences Program has developed a protein assay kit that can be readily adopted in any classroom to run an inquiry-based series of Biology lessons. To further support learning among high school students, this proposal aims to develop an iOS-based application that reinforces the concepts addressed in sections I and III of the kit. Through an interactive approach, the app will teach students the concept of dilutions and how to plot a standard curve for a colorimetric assay (section I). The app will also give students insights into proteins in food to aid their hypothesis testing in section III of the kit. This section can also address the need for nutritional balance in the students' diet, and hence the social issues of obesity and diabetes.

Application based learning to reinforce academic concepts in Qatar Biology curriculum

Mohammad Osaama Shehzad
Advisors: Annette Vincent, Saquib Razak

Introduction

Educational surveys conducted at local Qatari schools indicated that 91% of students feel that inquiry-based and practical biology lessons promote critical thinking.

91%: students who demonstrated the need for practical-based learning.

Therefore, an experimental protein biology kit was developed based on the chemical interaction between Coomassie Brilliant Blue dye and proteins. The dye turns blue as protein concentration increases.

To reinforce the experimental techniques and the reasoning behind them, an iOS application was developed to facilitate interactive learning.

Methods

One of the learning objectives was to ensure that students learn how to hypothesize. Therefore, the first feature of the application asks students to sort different items into a pyramid according to their protein content, and then to hypothesize by making an educated guess about where any given food would fit into the pyramid.

The second feature of the application deals with the visual representation of serial dilutions. Its interactive elements show the physical consequence (color change) of the chemical interaction between protein and Coomassie dye, and illustrate how concentrations are manipulated using the serial dilution technique. A calculation feature also guides students through calculating the necessary volumes, concentrations, and dilution factors for serial dilutions.

The last feature of the application aims to raise awareness of recommended protein intake through a protein shooting game and a protein intake tracking system. The goal is to help fight obesity in Qatar by raising awareness through applications, which is a novel approach.

Results Due to budget issues, the application has not been deployed yet which is why the efficiency results of application based learning could not be shared.
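The serial dilution calculation described for the second feature of the application can be illustrated with a short sketch. This is not the app's code; it is a minimal Python example that assumes a constant dilution factor and a fixed final volume per tube, and applies the standard relation C1·V1 = C2·V2. The function name, parameters and example values are hypothetical.

```python
def serial_dilution(stock_conc, dilution_factor, steps, final_volume):
    """Return one (concentration, transfer_volume, diluent_volume) row per tube.

    Assumes every tube is diluted by the same factor and filled to the same
    final volume. From C1*V1 = C2*V2 with C1/C2 = dilution_factor, the volume
    carried over from the previous tube is final_volume / dilution_factor.
    """
    if dilution_factor <= 1:
        raise ValueError("dilution_factor must be greater than 1")
    transfer = final_volume / dilution_factor
    diluent = final_volume - transfer
    rows = []
    conc = stock_conc
    for _ in range(steps):
        conc = conc / dilution_factor  # each step divides the concentration
        rows.append((conc, transfer, diluent))
    return rows


if __name__ == "__main__":
    # Hypothetical example: 2 mg/mL stock, two-fold dilutions, 1 mL per tube.
    for i, (c, v_t, v_d) in enumerate(serial_dilution(2.0, 2, 4, 1.0), start=1):
        print(f"tube {i}: {c:.3f} mg/mL ({v_t:.2f} mL transferred + {v_d:.2f} mL diluent)")
```

The resulting concentrations are the x-values students would use when plotting a standard curve against measured absorbance.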

Acoustic Analysis of Text (AAT): Extracting sound out of words

Author

Rohith Krishnan Pillai Umair Waheed Qazi

Advisor

Bhiksha Raj

Category

Computer Science

Abstract Systems like NELL (Never Ending Language Learner) and NEIL (Never Ending Image Learner) have existed for quite some time. These systems try to build knowledge about noun phrases and about objects in images, respectively. However, replicating this for a sound-based system first requires tackling a smaller task: finding sound phrases in text. In Acoustic Analysis of Text (AAT), we try to build a system that extracts word phrases alluding to sounds from text. We do this with supervised machine learning techniques, building classification systems that can accurately identify these “sound descriptors”. This system can later be added to the sound learner as a module, to allow for more complicated never-ending learning tasks with sounds. In this research we introduce a novel way of creating this system using tools such as the Stanford Parser, NLTK, Word2Vec and LIBSVM. Keywords: Document analysis, sound descriptors, Acoustic analysis of text, AAT

Acoustic Analysis of Text (AAT) Extracting sound out of words Rohith Krishnan Pillai & Umair Waheed Qazi Advisor: Bhiksha Raj

Introduction

Document classification can be done on the basis of genre, sentiment, or author, either intelligently or manually. We consider a subset of this problem that is novel and has not been explored before: how do we extract phrases which express sounds? Consider the sentence “Veli is a male”. ‘Veli’ is a proper noun, and ‘male’ is a common noun. This sentence describes a fact. But consider the sentence “The alarm is ringing.” ‘Ringing’ is a verb, and this is an example of a sentence which expresses sound. We would like to explore the descriptors of such sentences and build a classifier that identifies them intelligently, based on Machine Learning (ML) techniques. Acoustic Analysis of Text (AAT) is a solution to this research question, with applications in artificial intelligence, machine learning and textual analysis. However, our project goes beyond simple text interpretation. For artificially intelligent agents to effectively comprehend and interact with the world, they must also be able to interpret sound. Sound differs fundamentally from other forms of information: it does not exist by itself. Instead, it results from the actions or interactions of objects. Therefore, it is much more challenging to identify a phrase that denotes a sound than it is to identify more direct factual statements such as those extracted by NELL (Mitchell & Fredkin, 2014) and NEIL (Chen, Shrivastava, & Gupta, 2013). AAT is a new domain that fits into this larger goal of enabling a machine to better comprehend the world around us through the interpretation of sound. In order to comprehend the world completely, the machine must be able to describe sound in ways that permit additional inferences. Hence the first step is to determine what kinds of descriptors allude to sounds in text. This is the problem we are trying to solve.

Methodology

Since our research question deals with finding a good way to build a program that can detect a sound descriptor (SD), the procedures followed were an essential part of the research.

Results Since we had to use a supervised machine learning technique, and the classification was binary (positive or negative example of an SD), we decided that an SVM would be the most suitable technique. The training dataset contained 1001 positive instances, at a positive to negative instance ratio of 0.1783:1. The testing data, on the other hand, contained 193 positive instances with a positive to negative ratio of 0.199:1. Scaling was applied in every experiment, since it either increases accuracy or makes no difference.

Positive:Negative ratio in training data    Kernel    Accuracy
0.1783:1                                    Linear    81.1535% (788/971)
0.1783:1                                    Radial    81.1535% (788/971)
1:1                                         Linear    63.7195% (627/984)
1:1                                         Radial    62.1951% (612/984)

Discussion The methods and techniques used in this research provided some good results in terms of the accuracy of classifying an SD versus a non-SD. However, the methodology also has limitations that have affected the accuracy of the overall classifier, such as small datasets. Another limitation is that we used only three parts of speech when converting the labeled data instances into numerical fields. A further problem we encountered stems from the very small ratio of positive to negative instances. We observed that accuracy would be high, at around 81%, even though the predictions were almost all negative. The imbalance works in favor of a higher accuracy: because the ratio of positive to negative instances is so small, a program predicting only negative results would still show high accuracy. We hypothesize that the very small ratio of positive to negative examples leads to overfitting by the SVM, which in turn could be producing only negative predictions.
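The effect of class imbalance on accuracy can be checked with a few lines of arithmetic. The sketch below is an illustration rather than part of the original experiments: it computes the accuracy of a hypothetical classifier that always predicts "negative" for a given positive to negative ratio, and shows that such a classifier has zero recall. For the ratios quoted above, this trivial baseline lands in the low-to-mid 80% range, which is in the same region as the reported accuracy of about 81%. Function names are hypothetical, and returning 0.0 for undefined precision is a convention chosen here.

```python
def all_negative_accuracy(pos_to_neg_ratio):
    """Accuracy of a classifier that always predicts 'negative'.

    With p positives for every 1 negative, the share of negatives
    (and therefore the accuracy) is 1 / (1 + p).
    """
    if pos_to_neg_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 / (1.0 + pos_to_neg_ratio)


def binary_metrics(y_true, y_pred):
    """Accuracy, precision and recall for 0/1 labels (1 = sound descriptor)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Convention for this sketch: undefined ratios are reported as 0.0.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    for ratio in (0.1783, 0.199, 1.0):
        print(f"ratio {ratio}:1 -> all-negative accuracy {all_negative_accuracy(ratio):.3f}")
```

This is why precision and recall are more informative than accuracy for a dataset with so few positive examples.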

Definition of Sound Descriptor (SD)

• Settled on the concept of an auditory image in literature

Data Collection

• Used Wikipedia articles relating to sound, as they contained a good mix of both SD and non-SD phrases, along with a consistent, easy-to-parse format and an API for retrieving pages.
• Also ensured that the Wikipedia article text was devoid of any symbols and numerals.

Parsing and Cleaning the Corpus

• The collected and normalized data was then processed through another program to extract the noun phrases and verb phrases from the text.
• Noun phrases and verb phrases were heuristically cleaned of any anomalous or wrong parses caused by removing symbols and numerals in the data collection stage.

Labeling Data for Supervised Learning Methods

• Since this is a new domain of study, no pre-labeled datasets existed for use in our research. This pushed us to retrieve the data ourselves and to manually label each of the noun phrases and verb phrases from the last step.
• We built our own labeling program that gave the user the ability to quickly label all the noun and verb phrases. We made the decision to have a training and testing dataset size that contained at least 1000 positive examples of an SD each.

Converting Labeled Data into a Quantitative Dataset

• We trained a word2vec system on the entire corpus of text retrieved from Wikipedia, with the dimensions set to 300, to yield the model used to obtain 300-dimensional vectors.
• We then used the Stanford POS Tagger to find the parts of speech for each of the noun/verb phrases.
• For each noun/verb phrase, using the word2vec model trained in the first step, we summed the 300-dimensional vectors of the words in each of the three parts of speech (Nouns, Adjectives, and Verbs). The three 300-dimensional vectors were then appended in the order Nouns, Adjectives, Verbs to create a 900-dimensional vector.
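The conversion step above can be sketched in a few lines. The example below is only an illustration under stated assumptions: a tiny hand-written dictionary with 3-dimensional vectors stands in for the trained 300-dimensional word2vec model, the phrase is assumed to be already tagged with Penn Treebank style tags, and out-of-vocabulary words are simply skipped. All names and vector values are hypothetical.

```python
# Toy stand-in for a trained word2vec model: word -> vector.
# The real system used 300 dimensions; 3 keep the example readable.
DIM = 3
EMBEDDINGS = {
    "loud": [0.5, 0.0, 1.0],
    "alarm": [1.0, 2.0, 0.0],
    "bell": [0.0, 1.0, 1.0],
    "ringing": [2.0, 0.5, 0.5],
}

# Tag prefixes for the three groups, in concatenation order:
# Nouns, Adjectives, Verbs.
GROUPS = [("NN", "noun"), ("JJ", "adjective"), ("VB", "verb")]


def phrase_to_vector(tagged_phrase, embeddings=EMBEDDINGS, dim=DIM):
    """Sum word vectors per part of speech, then concatenate the three sums."""
    sums = {name: [0.0] * dim for _, name in GROUPS}
    for word, tag in tagged_phrase:
        vec = embeddings.get(word.lower())
        if vec is None:
            continue  # assumption: words missing from the model are ignored
        for prefix, name in GROUPS:
            if tag.startswith(prefix):
                sums[name] = [a + b for a, b in zip(sums[name], vec)]
                break
    features = []
    for _, name in GROUPS:
        features.extend(sums[name])  # Nouns, then Adjectives, then Verbs
    return features


if __name__ == "__main__":
    phrase = [("loud", "JJ"), ("alarm", "NN"), ("bell", "NN"), ("ringing", "VBG")]
    print(phrase_to_vector(phrase))  # 3 * DIM values; 900 in the real setting
```

A phrase with no words in one of the three groups keeps a zero block for that group, so every phrase maps to a vector of the same length.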

Training Support Vector Machines (SVMs)

• Separated the combined dataset that we have into two parts: the training dataset and the testing dataset. When creating the two files we made sure that the training data had exactly 1001 positive instances, while the testing file had the rest. • The training dataset was used to train a SVM using LIBSVM in two different experiments, using a linear and a radial (Gaussian) kernel.

Conclusion Since this research is one of the first in the domain of acoustic analysis of text, it could be explored in greater depth through other research projects. Future work in this field could look at other machine learning techniques, such as K-nearest neighbors and Decision Trees, to design AAT systems. Another possible direction is to look strictly for grammatical or syntactic patterns through part-of-speech tagging to find SDs. This would be similar to the use of Hearst Patterns in ontology population. Rule learners are also a possible alternative for discovering such patterns in text. AAT is a new domain in the AI field, and it requires much more in-depth study and research to reach the prominence and commonality that text analysis has. Through our research we were able to shed light on a novel way to acoustically analyze texts from Wikipedia. We did this by using a supervised machine learning technique, SVM, to create a classifier able to predict whether a given noun phrase or verb phrase contains a sound descriptor. As preliminary research on the matter, this work also sheds light on the many challenges and limitations of AAT. Further work is continuing to increase the accuracy, precision and recall of the current system. Further improvements would bring us several steps closer to creating a Never-Ending Learner for sounds, much like its counterparts NELL and NEIL.

References

Chen, X., Shrivastava, A., & Gupta, A. (2013). NEIL: Extracting visual knowledge from web data. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1409-1416).
Love, T. (2011). Analysing Sound. Retrieved from http://www2.eng.cam.ac.uk/~tpl/asp
Mitchell, T., & Fredkin, E. (2014, October). Never Ending Language Learning. In Big Data (Big Data), 2014 IEEE International Conference on (pp. 1-1). IEEE.

An agile platform for distributed computation in smart IoT environments

Author

Sannan Tariq

Advisors

Khaled Harras

Category

Computer Science

Abstract The current mobile cloud computing model that requires data to traverse long distances over the network places a load on the internet infrastructure, degrades application performance due to large round trip times and places private and sensitive user data at potential risk. All these problems are exacerbated in the case of an IoT rich environment consisting of many devices passively sensing and sending data for processing to the cloud. However, to our advantage, the increasing ubiquity of IoT devices has been coupled with an increase in their computational and communication capabilities. We leverage the presence of capable and idle IoT devices in a smart environment to allow communication and collaboration within a single wireless network to achieve a shared objective. We achieve this by allowing IoT devices to distribute tasks to other devices within a network, minimising internet utilisation, reducing latency and preventing private data from leaving the location. We build a framework that minimises the adverse effects of the underlying heterogeneity of the devices present in a potential smart environment in order to facilitate the development of platform agnostic applications. We also optimize the generic scheduling methods of our orchestration system for our test applications to fully leverage the proposed framework and display its potential.

Lifestyle disease surveillance using spatiotemporal search intensity models

Authors

Shahan Ali Memon

Advisor

Ingmar Weber, Qatar Computing Research Institute Saquib Razak

Category

Computer Science

Abstract The worldwide growth in “Googling” for health-related information on the Internet over the past few years has created new possibilities for using web search data for public health surveillance. Diseases that are typically tracked at the population level can be categorized into two domains: communicable diseases and non-communicable diseases (NCDs). The poster child for tracking communicable diseases was Google Flu Trends (GFT), launched by Google to monitor the spread of influenza. Although the system was shut down in 2015 after overestimating influenza epidemics, it started a whole line of health care research using Google Trends to nowcast anything from fast-moving communicable diseases such as dengue fever and chickenpox to slow-moving “lifestyle diseases” such as diabetes and obesity. As lifestyle diseases are typically slow-moving, statistics for these are only available on an annual basis, creating sparsity issues when training temporal models. Furthermore, these statistics are released with several months of lag, and the data for 2016 is not yet available as of March 2017. However, even though the prevalence rates for these diseases only change by a few percentage points year-over-year, that small change still translates to billions of US dollars, motivating attempts to bring down the latency of creating these statistics. In our research we present novel spatio-temporal calibration approaches that overcome data sparsity issues by leveraging both temporal and spatial trends for model fitting of such slow-moving diseases and trends. Our approach takes into account regional variation in population sizes and Internet penetration. Furthermore, we show how the predictive performance of the fitted models can be further improved by combining both historic offline data and recent online data. We also suggest a bootstrapping method of feature selection using Google Correlate and related-search queries. Finally, we describe important idiosyncrasies related to using Google Trends and suggest best practices.

Lifestyle Disease Surveillance Using Spatio-Temporal Search Intensity Models

Research by: Shahan A. Memon Advisors: Ingmar Weber & Saquib Razak

Goal

[Figure: Web Search Activity, Regression Modeling, Disease Prevalence]

Motivation

• Latency and cost: Creating health statistics through traditional methods (surveys, clinic visits) is expensive and has a high time lag of months or years.
• Data reliability & availability: Because the data for diseases such as obesity and depression is collected via sampling, it is not very reliable. In many countries such data is not even available.
• Hypothesis testing: Statistics do not help test hypotheses related to behaviors and attitudes of the general population, such as “Why are people obese?”
• Data sparsity: As lifestyle diseases are typically slow-moving, statistics for these are only available on an annual basis, creating sparsity issues when training temporal models.

Previous Work

• Google Flu Trends
• Using web Search to Nowcast NCDs

Research Gap

• Most of the work is done on communicable diseases and uses only temporal data*.
• Previous work does not meet the trivial “next year is the same as last year” baseline.
• Previous work does not take into account different normalization issues.
• Previous work does not take into consideration temporal or global shifts.

Methodology

• Select Target Variable (e.g. obesity)
• Collect offline data for the target variable
• Keyword Selection: Google Correlate, Related Queries, Semantic-link.com
• Collect Google Trends Data
• Data Normalization, including population size and internet penetration. Labeled terms: spatial data value from Google Trends; ratio of the temporal increase from year r to year i; ratio of the sum of spatial data in year r to year i
• Regression Modeling: Linear Regression with Lasso Regularization and 10-fold cross-validation, $y_i = \beta_0 + \beta^{T} x_i + e_i$
• Evaluation: SMAPE, SMAPE*, Spearman’s R, Mean Absolute Error

Results

[Table: results per target variable. Surviving column labels: λ, ω, Spearman’s R, Mean Absolute Error, SMAPE, SMAPE*, reported for models B, N and L. Row values are listed in extracted order.]

Diabetes: 0.02, 0.42, 0.95, 0.88, 0.94, 0.47, 0.74, 0.51, 4.61, 7.49, 5.06
Obesity: 0.06, 0.22, 0.93, 0.88, 0.94, 1.10, 1.73, 1.05, 3.77, 6.03, 3.61
Exercise: 0.02, 0.41, 0.91, 0.80, 0.88, 2.92, 2.58, 2.43, 3.92, 3.44, 3.25
Suicide: 0.07, 0.83, 1.22, 2.21, 7.63, 14.6, 0.94

The table shows our performance on different models. Highlighted in red are the values where we beat the baseline.

B: Represents the Baseline Model (this year’s prevalence is the same as last year’s)
N: Represents our Linear Regression Model after appropriate normalization
L: Represents our boosted model, combining B and N through a weighted linear combination, L = ωN + (1-ω)B
λ: Represents the tuning parameter for Lasso regularization used in N
ω: Represents the weight assigned to our model prediction N in the model L
Note: For Suicide, no weighted combination was found to improve on the baseline.

[Figure: Actual rates of diabetes for the year 2015 (top) closely matched our predictions (bottom).]

• Bootstrap the keyword selection using Google Correlate, Google Trends related queries and Semantic-link.com
• Appropriately normalize Google Trends’ data, and include state-level population distribution and internet penetration
• Experimentally validate normalization methods by improving over previously used methods
• Combine online and offline data for model improvement

Future Work

• Experiment with Multi-level Models
• Use Ensemble learning by using many predictors in a weighted linear combination
• Experiment with other slow-moving target variables and other countries

References:

1. Thin Nguyen, Truyen Tran, Wei Luo, Sunil Gupta, Santu Rana, Dinh Phung, Melanie Nichols, Lynne Millar, Svetha Venkatesh, and Steve Allender. 2015. Web search activity data accurately predict population chronic disease risk in the USA. Journal of Epidemiology and Community Health (2015).
2. https://www.google.org/flutrends/about/

Lifestyle Diseases

Lifestyle diseases is another term for Non-Communicable Diseases (NCDs) such as obesity, cancer, depression, smoking, diabetes, etc. They account for the major burden of deaths in the world.

Google Trends

Google Trends is a public web facility of Google Inc., based on Google Search, that shows how often a particular search term is entered relative to the total search volume across various regions of the world.

Google Correlate

Google Correlate is a tool which takes either a temporal or a spatial series as input and returns a ranked list of the web search queries that are correlated across time or space.

Spatial Data

Spatial data refers to Google Trends’ web search intensity for a given year and a particular keyword normalized across different U.S. states.

Temporal Data

Temporal data refers to Google Trends’ U.S. web search intensity for a particular keyword normalized across different years.

Offline Data

For any offline target variable (such as diabetes), offline data is the actual regional or global prevalence of the condition.

SMAPE

Symmetric mean absolute percentage error (SMAPE) is an accuracy measure based on percentage (or relative) errors. It is computed from the actual value A_t and the forecast value F_t.
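The SMAPE measure and the weighted combination L = ωN + (1-ω)B can be illustrated with a short sketch. The poster's own SMAPE formula did not survive extraction, so the code assumes one common variant, SMAPE = (100/n) Σ |F_t - A_t| / ((|A_t| + |F_t|)/2). The grid search used to pick ω is also an assumption made for illustration, as the poster does not state how ω was chosen. All function names and numbers are hypothetical.

```python
def smape(actual, forecast):
    """SMAPE in percent, assuming the variant
    (100 / n) * sum(|F - A| / ((|A| + |F|) / 2))."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        if denom == 0:
            continue  # assumption: a 0/0 term counts as zero error
        total += abs(f - a) / denom
    return 100.0 * total / len(actual)


def combine(baseline, model, omega):
    """Weighted combination L = omega * N + (1 - omega) * B, element-wise."""
    return [omega * n + (1.0 - omega) * b for b, n in zip(baseline, model)]


def best_omega(actual, baseline, model, steps=100):
    """Grid-search omega in [0, 1]; omega = 0 is the pure baseline B."""
    best_w, best_err = 0.0, smape(actual, baseline)
    for k in range(1, steps + 1):
        w = k / steps
        err = smape(actual, combine(baseline, model, w))
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


if __name__ == "__main__":
    # Hypothetical prevalence values for three regions.
    actual = [10.0, 12.0, 9.0]
    last_year = [9.5, 11.0, 9.2]    # B: "same as last year" baseline
    regression = [10.4, 12.5, 8.7]  # N: model prediction
    w, err = best_omega(actual, last_year, regression)
    print(f"omega = {w:.2f}, SMAPE = {err:.2f}%")
```

If no ω above zero lowers the error, the search returns ω = 0, which corresponds to the case reported for Suicide where no weighted combination improved on the baseline.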

PolyHJ: A polymorphic main-memory hash join paradigm for multi-core machines

Author

Omar Khattab

Advisor

Mohammad Hammoud

Category

Computer Science

Abstract Relational main-memory join is a fundamental data management operation, which highly influences the performance of almost every database query. Main-memory joins are typically classified into two major categories, hash joins and sort-merge joins. Recent studies show that hash joins generally surpass sort-merge ones, especially on large-scale multi-core machines. However, hash joins themselves are further categorized into No-Partitioning (NOP) and Partitioned Hash Join (PHJ) paradigms, which employ different approaches and show inconclusive results against each other on modern hardware. In this work, we show that different workload characteristics and hardware configurations necessitate different main-memory hash join models. Subsequently, we suggest four join models that extend beyond NOP and PHJ and can subsume any hash-based join implementation. We characterize the relative merits of each model and propose a polymorphic join scheme named PolyHJ, which dynamically identifies the most effective model for the given workload features and hardware setting. Afterwards, it executes an efficient implementation of the selected model, which incorporates redesigned partitioning, building, and probing phases of hash joins. More precisely, PolyHJ involves novel in-place, cache-aware partitioning (ICP) and collaborative building and probing (ColBP) mechanisms. ICP and ColBP substantially enhance scalability and improve multi-core cache and memory bandwidth efficiencies. In particular, ICP increases cache locality and saves memory bandwidth by re-using cached blocks in input relations. ColBP, in turn, significantly reduces partitioning cost and improves scalability by allowing each hash table to be as large as the total size of the last-level cache (LLC) in chip multi-core machines. This stems from our study of modern high-end CPUs, in which we observed that per-thread cache has been largely unchanged for over a decade, while the capacity of the LLC has actually been growing with larger numbers of cores. We implemented PolyHJ and thoroughly evaluated its performance. Our experimental results demonstrated that PolyHJ can successfully select the best hash join model for a wide range of datasets and hardware configurations. As a result, it outperformed the state-of-the-art NOP and PHJ schemes by averages of more than 1.8X and 2X, and up to 3.7X and 5.3X (excluding a few edge cases, which lead to speedups of up to 91X), respectively.

PolyHJ: A Polymorphic Main-Memory Hash Join Paradigm for Multi-Core Machines
Omar Khattab and Mohammad Hammoud

BACKGROUND

Relational Join

• A fundamental, pervasive, and (often) high-cost DBMS operation
• Cartesian Product Filter: for each tuple in inner relation R, find matching tuples in outer relation S
• Hash Joins build a hash table out of R and probe it from S
• Hash Join Paradigms: No-Partitioning Hash Join (NOP) and Partitioned Hash Join (PHJ)

No-Partitioning Hash Join (NOP) Paradigm

• Builds a single, undivided hash table out of R
• Hardware-oblivious

Partitioned Hash Join (PHJ) Paradigm

• Partitions R and S into many small parts
• Applies NOP-like join for each corresponding pair of parts
• Hardware-conscious: Enhances cache-locality and parallelism
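The two paradigms above can be contrasted in a small sketch. This is a single-threaded Python illustration of the general build/probe and partition-then-join ideas only; it is not PolyHJ, NOPA or CPRA, and it does not model caches, memory bandwidth or parallelism. Relations are assumed to be lists of (key, payload) tuples, and the function names and partition count are hypothetical.

```python
from collections import defaultdict


def nop_hash_join(R, S):
    """No-partitioning style: one undivided hash table over R, probed by S."""
    table = defaultdict(list)
    for key, payload in R:  # build phase
        table[key].append(payload)
    out = []
    for key, payload in S:  # probe phase
        for r_payload in table.get(key, ()):
            out.append((key, r_payload, payload))
    return out


def partitioned_hash_join(R, S, num_partitions=4):
    """Partitioned style: split R and S by key hash, then join matching parts."""
    r_parts = [[] for _ in range(num_partitions)]
    s_parts = [[] for _ in range(num_partitions)]
    for t in R:
        r_parts[hash(t[0]) % num_partitions].append(t)
    for t in S:
        s_parts[hash(t[0]) % num_partitions].append(t)
    out = []
    # Equal keys land in the same partition index, so only corresponding
    # pairs of parts need to be joined, each with a small hash table.
    for r_part, s_part in zip(r_parts, s_parts):
        out.extend(nop_hash_join(r_part, s_part))
    return out


if __name__ == "__main__":
    R = [(1, "a"), (2, "b"), (2, "c")]
    S = [(2, "x"), (3, "y"), (1, "z")]
    print(sorted(nop_hash_join(R, S)))
    print(sorted(partitioned_hash_join(R, S)))
```

Both functions return the same set of matches; the difference that matters on real hardware is that each per-partition hash table can be sized to fit in cache, at the cost of an extra partitioning pass.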

MOTIVATION

NOP vs. PHJ: The Size-Skew Dichotomy

• Different combinations of input characteristics
  • Size of Inner Relation, |R|
  • Ratio of Sizes of Outer and Inner Relations, |S|/|R|
  • Skewness in Key Distribution (usually of S), z

[Figure: Throughput of the state-of-the-art NOP and PHJ representatives (NOPA and CPRA resp.)]

Hardware Challenges

• Fast growth in numbers of cores and hardware contexts per chip
• Yet, per-core bandwidth and cache remain limited
• Bandwidth Wall and Cache Wall Problems

[Figure: The # of hardware contexts/threads and the last-level cache (LLC) capacity per hardware context/thread in MBs for 428 Intel Xeon CPU models from 2006 to 2016, sorted by date]

POLYHJ

I) Polymorphic Behavior

• Extends beyond NOP and PHJ via decoupling the partitioning of R and S
• Exploits locality naturally (for small R and/or highly-skewed S) and controllably (for large R and/or low-to-mid skewness)
• Selects dynamically the best paradigm for the given workload and hardware

II) Redesigned Hash Join Phases

• In-place, Cache-aware Partitioning (ICP)
  • Limits random scatters to cached blocks
  • Results in only 1 cache/TLB miss per line/page
• Collaborative Building and Probing (ColBP)
  • Allows each hash table to be as large as aggregate LLC capacity
  • Reduces coherence traffic and NUMA latency via scheduling write tasks on hash tables across LLCs

Simplified illustrations of ICP (left) and ColBP (right)

RESULTS and CONCLUSIONS

Throughput of PolyHJ against NOPA and CPRA as we scale the input sizes (left) and as we vary the skewness in the keys of S (right). Over all of our experiments, PolyHJ outperforms NOPA and CPRA by averages of more than 1.8X and 2X and up to 3.7X and 91X, respectively.

Conclusions:

• No one-size-fits-all hash join paradigm; dynamic selection is important and effective
• Aggregate LLC-awareness helps mitigate cache wall problem

Future Work:

• Generic, implementation-agnostic paradigm selection
• Collaborative partitioning and isolated building and probing as approaches to tackle the cache wall problem

Sherlock: A crowdsourced system for automated semantic tagging of indoor floorplans

Author

Muhammad A. Shah

Advisors

Khaled A. Harras Bhiksha Raj

Category

Computer Science

Abstract The existence of accurate, semantically rich indoor maps can lead to significant growth in indoor location-based applications. In recent years the problem of indoor mapping has received a lot of attention but, while the existing systems can generate accurate floorplans, they do not provide semantic tags for the spaces they map. Without knowledge of the environmental context of the user, location-based applications would remain highly limited in their efficacy. In this paper we propose Sherlock, a crowdsourced system for automatic tagging of indoor floorplans. Sherlock leverages the capability of modern smartphones to gather a wide variety of data pertaining to their physical environment and the behavior of their users, in order to infer the purpose of the surrounding space. The data is obtained from the myriad of sensors present on modern smartphones as well as the cameras and microphones. Sherlock's novelty lies in tackling environment recognition as a learning problem rather than a matching problem (as has been the case in the current literature): it builds up confidence in a particular label as more data becomes available, allowing the labels to adapt to the current usage of the room. Furthermore, Sherlock is unique in its usage of acoustics to model the physical characteristics of the environment within the domain of indoor mapping. The implementation of Sherlock is evaluated on data collected at Carnegie Mellon University in Qatar.

The Hive: An on-edge middleware solution for context and resource sharing in the Internet of Things

Author

Aliaa Essameldin

Advisors

Khaled A. Harras

Category

Computer Science

Abstract Today we are witnessing an unprecedented proliferation of smart devices, but regardless of how many microphones a user has lying around the house, they still will not be able to reach a voice-activated application if they are not close enough to its hosting device. This is because application developers are restricted by the hardware capabilities of the device on which their application is running, in this case the limited-range microphone. Despite the immense work being done on resource sharing in the Internet of Things, there are few systems that allow seamless sensor sharing and computational offloading between smart devices. Most of those systems either use the cloud, which is very costly, or require homogeneous hardware drivers, which is very restrictive. In this project, we propose a full architecture and supporting protocol for our middleware system, The Hive. The Hive relies on device-to-device communication to permit collaboration between IoT devices on-edge while providing an interface which completely decouples the application layer in the network from the hardware layer. We also describe our implementation and evaluation of a prototype of the boot-up and sensor-sharing portions of the system.

The Hive: An On-Edge Middleware Solution for Context and Resource Sharing in The Internet of Things
Aliaa Essameldin and Khaled A. Harras
Carnegie Mellon University in Qatar

I. MOTIVATION & BACKGROUND

A) Motivation

• Provide a low-cost and standardized mechanism for devices in an IoT to communicate and offload computations on-edge
• Harness opportunities missed by devices in an IoT to utilize sensor, contextual and computational resources on the network

B) Current Approach

Cloud servers are used to allow communication of context between applications on different devices. Sensing on an application is limited by the sensors the hosting device has on board.

C) Our Vision

Direct links permit on-edge communication of context between applications on different devices. Sensing on an application can be done seamlessly using any of the sensors on the same network.

II. PROPOSED SYSTEM

A) The Hive Protocol Overview

The protocol has three classes. 100 class packets are on UDP while 200 and 300 classes are on TCP. The classes interact as shown:

B) The Hive Architecture Overview

It is centralized in The Queen, which has the longest up-time. The election process is described below:

C) An Example Scenario

In this example, ALG checks if the user is in the same room as the providing bee.

1) Registers application
2) Requests sound providers
3) Reserves sound provider
4) Sends providers list
5) Confirms registration
6) Requests sound data
7) Streams sound data
8) ALG returns FALSE
9) Registers with Queen
10) Reserves provider
11) Updates providers list
12) Requests sound data
13) Streams sound data
14) ALG returns TRUE (data is passed to application)

III. AUDIO PROOF-OF-CONCEPT

We developed a prototype using an ALSA-based sound component and a localizer as a deciding factor (ALG) for picking the data stream. The prototype ran on the set-up shown to the right. In this set-up, we captured the sound of a consistent beep that is steadily moving from Bee 1 (the laptop) to Bee 2 (a Raspberry Pi).

[Figure: Sound as collected on Laptop (dark blue) and Pi (light blue) without using the localizer. Axes: Amplitude vs. Sample ID. Annotation: User moves towards Pi]

[Figure: Sound as collected on Laptop using the localizer. Axes: Amplitude vs. Sample ID. Annotations: Hive senses on Laptop, User moves towards Pi, Hive senses on Pi]

As shown, when the hive detected that the user moved from next to the laptop to next to the Pi, it successfully started sensing from the Pi instead of the laptop.

IV. EVALUATION AND RESULTS

To further evaluate the cost of using the Hive, we ran each of these experiments (on the same set-up) 10 times on recording lengths ranging from 1 to 5 and plotted averaged results:

Exp1: Application on Bee1 requests data directly from the microphone
Exp2: Application on Bee1 requests data from the hive, which collects it locally
Exp3: Application on Bee2 requests data from the hive, which pulls it from Bee1

As shown, using the hive does not add delay in the case of local sensing and adds negligible delay, proportional to the data size, in the case of remote sensing.

V. FUTURE WORK

In the future, we want to incorporate computational offloading systems into our prototype and develop our architecture to support distributed context awareness.
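The Queen election and provider lookup described in section II can be pictured with a small in-memory sketch. This is not the Hive protocol itself: the packet classes, transport and message formats are not modeled, and everything beyond "the Queen is the bee with the longest up-time" is an assumption made for illustration, including the class names, the registry structure and the way ALG is passed in as a function.

```python
from dataclasses import dataclass, field


@dataclass
class Bee:
    name: str
    uptime_s: float
    sensors: set = field(default_factory=set)


def elect_queen(bees):
    """Pick the bee with the longest up-time (first one wins on ties)."""
    if not bees:
        raise ValueError("cannot elect a queen from an empty hive")
    return max(bees, key=lambda b: b.uptime_s)


class Registry:
    """Hypothetical provider list kept by the Queen: sensor type -> bee names."""

    def __init__(self, bees):
        self.providers = {}
        for bee in bees:
            for sensor in sorted(bee.sensors):
                self.providers.setdefault(sensor, []).append(bee.name)

    def providers_for(self, sensor):
        return list(self.providers.get(sensor, []))


def pick_stream(provider_names, alg):
    """Return the first provider for which the decider ALG returns True."""
    for name in provider_names:
        if alg(name):
            return name
    return None


if __name__ == "__main__":
    hive = [
        Bee("laptop", uptime_s=120.0, sensors={"microphone"}),
        Bee("pi", uptime_s=3600.0, sensors={"microphone"}),
    ]
    queen = elect_queen(hive)
    registry = Registry(hive)
    # Stand-in for the localizer: pretend the user is next to the Pi.
    user_is_near = lambda name: name == "pi"
    print(queen.name, pick_stream(registry.providers_for("microphone"), user_is_near))
```

In the example scenario, a FALSE result from ALG for one provider followed by a TRUE result for another corresponds to `pick_stream` moving on to the next name in the providers list.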

[1] © Wikibon IoT Project. Reference Models AWS IoT Service & Pivot3 Server SAN. Assumption Edge reduces IoT Traffic by 95%.

Optimizing electricity consumption in GEMTEC

Author

Ameera Tag Maher Khan

Advisors

Selma Limam Mansar

Category

Information Systems

Abstract: General Electric has a factory in Dammam that consumes a lot of resources, mainly electricity, at a high and partly unnecessary cost. For example, even in the cooler winter months the AC units run 24/7. GE wants to optimize energy consumption at one of its factories, GEMTEC, and make it economically sustainable. The objective of this project was to acquire the data emitted by the various energy-consuming machines in GEMTEC and then to predict and optimize energy savings, in the form of power and other commodities, at the factory. After getting the necessary data, the team performed data analysis and market research and created a prediction model based on linear regression. Based on these results, the team proposed a few solutions, one of which is an application that suggests how to turn off certain parts of the air conditioning system while still maintaining the optimum temperature in the factory.

Optimizing Electricity Consumption in GEMTEC Ameera Tag & Maher Khan Advisor: Selma Limam Mansar Carnegie Mellon University Qatar


Methodology

Data Analysis:

GEMTEC provided electricity consumption data for 6 months, from August 2015 until January 2016. The data was screened to remove incomplete fields and to correct calculations. We then calculated all correlations and analyzed the data, focusing on the dependencies. The results showed that the chillers, pump and compressors together consume about 50% of the total electrical energy.

Solution

AHUs Optimizer:

We developed a solution in the form of a small JavaScript application. Based on the expected temperature for the next day and the predicted load on the machines in the factory, the app calculates the total heat that will need to be removed from the factory and then suggests the number of air handling units (AHUs) that need to be running. The application is expected to be used by the facilities management team at GEMTEC to optimize their use of resources.
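The poster does not give the formulas used by the AHUs Optimizer. The sketch below shows one way the calculation described above could work, written in Python rather than the app's JavaScript. It assumes that the heat to remove is the machine load plus a term proportional to the difference between outside and target temperature; every name, coefficient and capacity figure is hypothetical.

```python
import math


def estimate_heat_load_kw(outside_temp_c, target_temp_c,
                          machine_load_kw, envelope_kw_per_degree):
    """Heat to remove: machine heat plus heat gained through the building.

    The linear building-gain term is an illustrative assumption.
    """
    temp_gap = max(0.0, outside_temp_c - target_temp_c)
    return machine_load_kw + temp_gap * envelope_kw_per_degree


def ahus_needed(heat_load_kw, ahu_capacity_kw, total_ahus):
    """Smallest number of AHUs that covers the load, capped at what exists."""
    if heat_load_kw <= 0:
        return 0
    return min(math.ceil(heat_load_kw / ahu_capacity_kw), total_ahus)


# Hypothetical forecast for tomorrow.
load = estimate_heat_load_kw(outside_temp_c=40.0, target_temp_c=24.0,
                             machine_load_kw=300.0, envelope_kw_per_degree=12.5)
suggested = ahus_needed(load, ahu_capacity_kw=120.0, total_ahus=8)
print(f"Heat load: {load} kW, suggested AHUs: {suggested}")
```

Rounding up rather than down keeps the suggestion on the safe side: the factory never runs fewer units than the estimated load requires.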

Market Research:

We researched how other industries optimize energy consumption at their factories. The findings indicated that GEMTEC will need to collect data on the load and energy consumption of all machines at the facility. However, collecting this data would be costly for GEMTEC.

Prediction Model:

A prediction model was developed to predict how much energy will be consumed the next day based on the historic data given by GEMTEC. The model achieved an R-squared value of 0.8166 (81.66%), and the average error margin was approximately 1,000 kWh.
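As a minimal sketch of how a linear regression and its R-squared value are computed, the code below fits a one-variable least-squares line and reports the fit quality and the mean absolute error. The choice of outside temperature as the single predictor and all data values are hypothetical; the poster does not list the model's inputs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def mean_abs_error(xs, ys, slope, intercept):
    """Average absolute gap between predicted and observed values."""
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical daily data: outside temperature (deg C) and consumption (kWh).
temps = [30, 32, 35, 38, 40, 42]
kwh = [41000, 42500, 45200, 47800, 50100, 51500]

slope, intercept = fit_line(temps, kwh)
print("R-squared:", round(r_squared(temps, kwh, slope, intercept), 4))
print("Mean absolute error (kWh):", round(mean_abs_error(temps, kwh, slope, intercept), 1))
print("Predicted kWh at 36 deg C:", round(slope * 36 + intercept))
```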

An Optimization Strategy:

Another part of the solution is a long-term optimization strategy based on a graph theory technique: finding the shortest weighted path. Given a machine, we create a vertex for each state of the machine at each interval of time. These vertices are then connected with weighted directed edges, each representing the cost of changing from one state to another between one interval and the next. The cost of each edge is calculated using dynamic cost functions, which take parameters such as the outside temperature, the inside temperature and the number of people in the building. The strategy then calculates the minimum-cost paths from one vertex to the vertices at a future interval of time. In this way it predicts the future state changes of the machine that make the best use of resources.
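The strategy above can be sketched in a few lines. Because the graph is layered by time interval, the shortest weighted path can be found with a simple forward pass over the intervals, which is equivalent to a shortest-path search on a directed acyclic graph. The two states, the cost function and the temperature forecast below are hypothetical placeholders for the dynamic cost functions the poster mentions.

```python
def cheapest_schedule(states, n_intervals, start_state, edge_cost):
    """Shortest weighted path through a time-layered state graph.

    best[s] holds the cheapest (cost, path) that ends in state s
    at the interval processed so far.
    """
    best = {start_state: (0.0, [start_state])}
    for t in range(n_intervals):
        nxt = {}
        for prev, (cost, path) in best.items():
            for state in states:
                candidate = cost + edge_cost(t, prev, state)
                if state not in nxt or candidate < nxt[state][0]:
                    nxt[state] = (candidate, path + [state])
        best = nxt
    return min(best.values(), key=lambda item: item[0])


# Hypothetical forecast of outside temperature per interval (deg C).
outside_temp = [28, 29, 36, 38, 31]


def edge_cost(t, prev_state, next_state):
    """Illustrative cost: energy when on, discomfort when off, plus switching."""
    running = 10.0 if next_state == "on" else 4.0 * max(0, outside_temp[t] - 30)
    switching = 3.0 if prev_state != next_state else 0.0
    return running + switching


cost, path = cheapest_schedule(["off", "on"], len(outside_temp), "off", edge_cost)
print(cost, path)
```

With these placeholder costs the cheapest plan keeps the machine off while it is cool, switches it on for the two hottest intervals and switches it off again afterwards.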

Graph Theory:

To calculate the cost of switching chillers on and off and their running time, we explored a graph theory approach, finding the shortest weighted path, which is a fundamental tool in optimization tasks. The team came up with a general optimization template that can be scaled to any number of states for a particular machine. Together with the GE and GEMTEC teams, we agreed that graph theory should be considered a long-term recommendation for future improvements at the factory.

Project Sustainability

The team believes that the AHUs Optimizer will help GEMTEC save energy by switching AHUs on and off based on the outside temperature. The AHUs Optimizer is a sustainable solution and does not require advanced development skills, because it simply tells GEMTEC the number of AHUs needed to maintain the desired temperature. GEMTEC can focus on collecting data on the load of each machine during the day by placing sensors on machines that are used heavily in production. Such analysis was not possible in this project because of the limited number of sensors placed on machines and because the load on machines was not tracked. Collecting such data will help GEMTEC obtain a better prediction of energy consumption on production days versus non-production days and create a strategy to save more energy by changing the state of different machines.

To read or to listen? A study of user engagement in a digital heritage artifact

Author

Sara Jumah Albaloshi

Advisor

Divakaran Liginlal

Category

Information Systems

Abstract: This research studies user engagement with an intangible digital heritage artifact. It focuses on Arabs and whether they rely on hearing more than on vision. In the past, for instance, Arabs relied on their sense of hearing in many aspects of life, such as finding water, locating animals, schooling, and even entertainment through singing and storytelling. The study makes two main contributions. The first is a digital artifact named “Batoola” that contains songs, stories, pictures and animations from Qatar’s cultural heritage. The main aim of the artifact is to preserve the fragile intangible culture and heritage of Qatar through a digital platform. The second contribution is a methodology, also named “Batoola”. The Batoola methodology combines different strategies that are used to examine user engagement and affective response as users engage with a digital cultural artifact. The data collected will help gain insights into both design and content strategies for improving the rendering of digital intangible heritage materials.

To Read or to Listen? A Study of User Engagement in Digital Heritage Artifacts Sara Albaloshi | Professor Divakaran Liginlal

Batoola – A Cultural Heritage Artifact that helped to build a methodology

بطولة

This research studies user engagement with a digital artifact meant to preserve Qatari heritage. It asks the question, “do Arabs prefer to read or to hear?” I explored this question by creating “Batoola”, a digital artifact that contains songs, stories, pictures and animations from Qatar’s past. A major contribution of this research is a methodology named “Batoola”. The Batoola methodology combines three strategies to examine user engagement and affective response when users interact with a digital cultural artifact. The data collected provides insights into both design and content strategies for improving the rendering of digital intangible heritage materials.

Batoola as an artifact serves the purpose of preserving intangible cultural heritage of Qatar through a digital platform by displaying a collection of songs, stories, information, pictures and animations that reflect the past. The pictures used in Batoola are designed to interact with the user. For instance, some pictures have zooming effects and an audio that plays once the user hovers over a picture. Other pictures are divided into several zones, where in each zone there is a soundscape that reflects the nature of the environment in that zone.

Research Methodology and Procedure

For this study, we have developed a methodology specifically for measuring users’ engagement as they view an intangible digital heritage artifact. The methodology combines different strategies used to measure engagement such as the PANAS Scale, the 7-item focused attention subscale, and screen capture software (Morae). While some of these strategies are common to other user study methods, it is the combination of qualitative and quantitative methods that makes it unique and repeatable. We recruited 60 participants for this study. The study was divided into two main stages.

Stage I
• Participants are assigned to a task: either to listen to or to read the contents of selected pages from Batoola.
• The PANAS questionnaire and interview questions are administered.

Stage II
• Participants are assigned another task: to view and navigate Batoola for 5 minutes. Morae screen capture software records their interaction with the artifact.
• Participants are asked to fill in a Focused Attention questionnaire and answer interview questions.
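To illustrate how PANAS responses from Stage I can be turned into numbers for comparing the listening and reading groups, here is a minimal sketch. It assumes the standard 20-item PANAS, with 10 positive-affect and 10 negative-affect items each rated from 1 to 5; the poster does not say which variant was used, and all ratings below are made up.

```python
def panas_scores(positive_ratings, negative_ratings):
    """Return (positive affect, negative affect) sums for one participant."""
    for ratings in (positive_ratings, negative_ratings):
        if len(ratings) != 10:
            raise ValueError("PANAS expects 10 ratings per subscale")
        if any(r < 1 or r > 5 for r in ratings):
            raise ValueError("each rating must be between 1 and 5")
    return sum(positive_ratings), sum(negative_ratings)


def group_means(scores):
    """Mean positive and negative affect over a list of (PA, NA) pairs."""
    n = len(scores)
    return (sum(pa for pa, _ in scores) / n, sum(na for _, na in scores) / n)


# Hypothetical participants, two per task.
listening = [panas_scores([4] * 10, [2] * 10), panas_scores([5] * 10, [1] * 10)]
reading = [panas_scores([3] * 10, [1] * 10), panas_scores([2] * 10, [2] * 10)]

print("listening (PA, NA):", group_means(listening))
print("reading (PA, NA):", group_means(reading))
```

Each subscale score ranges from 10 to 50, so group means can be compared directly between the two tasks.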

Results

The PANAS scale showed higher negative and positive affective response for listening. Also, 70% of participants reported their preference for listening over reading. After exploring Batoola, most participants reported that they lost sense of the flow of time and were fully immersed in their task. Morae-based qualitative data analysis was used to back up this claim. The recordings showed that participants were deeply engaged in using the zoom features and listening to soundscapes.

[Chart: Users' preference. Listening 70%, Both 16.67%, Reading 13.3%.]

[Figure: Mean positive and negative affective responses are significantly higher for listening.]

[Figure: Morae data shows intense participant engagement with Batoola.]

Trustmarks and trust in Qatar

Author

Roda Al-Hor

Advisor

Daniel Phelps, Ph.D.

Category

Information Systems

Abstract The trustworthiness of online merchants depends on many factors. Much existing research focuses on how the design of a website affects customers’ trust. This project examines how trustmarks (verified marks from a credible source) influence the perceived trustworthiness of websites. This matters because trustmarks are often used on ecommerce websites, yet research has not considered their effect on customers. The project is divided into two parts: online and in person. In the online part, participants receive a link to the experiment. First, they encounter a consent form that they must agree to. Then they are directed to the ecommerce website, which they navigate either with or without a trustmark showing. Finally, they are redirected to the questionnaire, an instrument that measures the customer's perceived trust. The in-person part has the same steps as the online part, with one addition: participants calibrate to an eye tracking device so that we can examine whether they look at the trustmark displayed. An analysis of the data so far shows that a trustmark affects people’s trust positively: 85.7% of the participants who were shown the trustmark had higher trust measures, while only 28.6% of those who were not shown the trustmark chose to trust the website they saw for the first time.

Trustmarks and Trust in Qatar
Roda Al-Hor || Advisor: Daniel Phelps

Purpose:

The purpose of this research is to test whether trustmarks (verification from credible sources) affect the customer's trust in an ecommerce merchant. The result will be one of three: 1. trustmarks enhance people’s trust, so online merchants should incorporate a verified trustmark; 2. trustmarks reduce customers’ trust; or 3. trustmarks have no effect on customers’ trust.

Beneficiary

The results of this research benefit online merchants by showing them whether trustmarks have positive or negative effects on their websites.

Methodology

Inclusion Criteria
• Have lived in Qatar for 2 years
• 18 years or older
• Speaks English or Arabic

Online Part:

An online experiment was sent to people who met the criteria. It started with a consent form that participants had to agree to in order to continue. Then they were redirected to a webpage that they explored for 3 minutes. After that, participants were redirected to a survey that serves as an instrument to measure trust.

In Person Part:

The experiment started after the signing of the consent form, when participants spent a few seconds calibrating to the eye tracking machine. Then they explored the website for 3 minutes and were redirected to the survey afterwards. The in-person part of this study tests whether people notice the verified trustmark seal in the header of the website and whether it influences their perceived trust.

Results

Online: The experiment has been conducted on 14 people so far. The results show that 6 out of the 7 people who had the trustmark showing had a higher trust measurement than the 7 people who did not have the trustmark showing.

In person: The 10 participants who have taken part in the experiment so far glanced at the verified trustmark while browsing the website. The participants had high trust measurements, implying that the trustmark affects the perceived trust of customers.

The table shows participants' trust measures so far.

Trustmark                              Not Present   Present
People trusting the website            14.3%         46.4%
People slightly trusting the website   14.3%         39.3%

85.7% of participants had higher measurements of trust when shown the trustmark.
Only 28.6% of participants not presented with a trustmark trusted the website.

Conclusion

The results of the study so far show that the trustmark has a positive effect on perceived trust. Including a credible trustmark in an ecommerce website will result in higher customer trust. This could generate more revenue, as trust is the foundation of success in ecommerce. People trust websites that look and feel familiar to them; therefore, when a trustmark from a well-known, verified source is presented on a webpage they see for the first time, the trust measurements increase.

Influence of culture on social media advertisements through eye-tracking

Author

Manisha Dareddy

Advisor

Jim Jansen, Ph.D., Qatar Computing Research Institute

Category

Information Systems

Abstract The focus of this project was to use eye tracking software to analyze the visual behavior of users on social media platforms. Specifically, the experiment investigated whether localizing an online advert on social media attracts more attention from users than a non-local ad. Several hypotheses were identified, covering variations of advertisement types: localized and non-localized. A pilot test was conducted, and the heat maps from the eye tracking software were analyzed further through pixel analysis.

Influence of culture on social media advertisements through eye-tracking
Manisha Dareddy | Dr. Jim Jansen
mdaredd1@qatar.cmu.edu | bjansen@qf.org.qa

Motivation

Background:
• To make use of eye tracking technology, which has a lot of potential for analyzing how people respond visually to information.

Literature review:
• Previous research has analyzed how people search for information on a Google search page (e.g., Piao et al., 2013).
• Lenzner, Kaczmirek, & Galesic (2014) investigated the effect of positioning different elements of a webpage, such as search boxes, images and check boxes, at different locations on the page.
• Yang, Chang, Cheng, & Teng (2011) captured the different fixation times of a user who was given a picture and a word below it and then asked to say if the picture and the word matched.

Methodology

Research Objective: To use an eye tracking device to investigate whether localization of an online advert on social media attracts more attention from users than a non-local ad. To compare the effects, I chose 2 copies of a company's advertisement for each of the three hypotheses, and I photoshopped one copy of each to add a Qatari touch to it.

Hypotheses:
1) Costume of characters in an advertisement has an effect on attracting attention
2) Traditional font type text in an advertisement grabs more attention
3) Background scenery of an advertisement catches more attention if it is a local image.

Experiments

I conducted a pilot study consisting of 6 people.
Procedure:
• Introduced the subject to the hardware.
• Ran them through a “Where’s Waldo” task to make sure the results being collected were accurate.
• Each participant was assigned one of the 6 variations of experiments.
• They were then allowed to carry on with the study, with 4 scenarios written before each task.
• After completing the study they were asked a few questions based on what they saw, and their responses were recorded in the exit survey.
• Then they were asked to fill in a demographic survey with questions such as their age, the number of years they had lived in Qatar, whether they have an Instagram account and how often they use it.

Equipment

• Hardware:
  1. Eye tracker by EyeTribe
  2. USB cable
• Software:
  1. EyeTribe UI
  2. EyeProof recorder
  3. Adobe Photoshop CC

Results

The EyeProof software provided us with a heat map for every image in our experiment along with many other features such as scan path, bee swarm, etc. The results were analyzed via a pixel analysis of the heat map to see if the attention of the participant was more focused on the localized ads or the international ads. Some of the results that I found were:
• The viewing pattern was to look at the image first, then at the comments, and then at the image again.
• The costume had the biggest effect on catching attention, compared to the other two conditions.
• The users who read the Qatari news article gave consistent attention to the article until the end, whereas the attention faded after a few paragraphs for the foreign news article.

[Figure panels: heat maps for the Starbucks ad with international names, Starbucks ad with Qatari names, Nike ad with international costume, Nike ad with Qatari costume, Amazon Fire phone with international background, and Amazon Fire phone with Qatari background.]

References

• Piao, G., Jin, Q., Zhou, X., Nishimura, S., Wattanachote, K., Shih, T. K., & Yen, N. Y. (2013, December). Eye-tracking experiment design for extraction of viewing patterns in social media. In Ubiquitous Intelligence and Computing, 2013 IEEE 10th International Conference on and 10th International Conference on Autonomic and Trusted Computing (UIC/ATC) (pp. 308-313). IEEE.
• Lenzner, T., Kaczmirek, L., & Galesic, M. (2014). Left feels right: A usability study on the position of answer boxes in web surveys. Social Science Computer Review, 32(6), 743-764.
• EyeTribe. http://www.eyetracking.com/About-Us/What-Is-Eye-Tracking
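The poster does not describe how the pixel analysis of the heat maps was carried out. One plausible reading, sketched below under that assumption, is to measure what share of the total heat falls inside the rectangle occupied by the advertisement. The grid values, the region coordinates and the function name are hypothetical.

```python
def attention_share(heatmap, region):
    """Fraction of total heat that falls inside a rectangular region.

    `heatmap` is a list of rows of non-negative intensities.
    `region` is (top, left, bottom, right) with bottom and right exclusive.
    """
    top, left, bottom, right = region
    total = sum(sum(row) for row in heatmap)
    if total == 0:
        return 0.0
    inside = sum(sum(row[left:right]) for row in heatmap[top:bottom])
    return inside / total


# Hypothetical 4x4 heat map; the ad occupies the central 2x2 block.
heatmap = [
    [0, 0, 0, 0],
    [0, 5, 3, 0],
    [0, 2, 6, 0],
    [1, 0, 0, 3],
]
share = attention_share(heatmap, (1, 1, 3, 3))
print(f"Share of attention on the ad: {share:.0%}")
```

Computing the same share for the localized and the international version of an ad gives two numbers that can be compared directly.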

The effect of culture on image appeal and social presence in Arab e-commerce websites

Author

Noor AlQaedi

Advisor

Divakaran Liginlal

Category

Information Systems

Abstract Prior research has determined that using human images in e-commerce websites has certain benefits, such as gaining user trust, attention and loyalty. However, the influence of culture on those human images has not been tested appropriately. The ultimate objective of the research is to better understand the influence of cultural appeal on the design of Arab e-commerce websites, i.e., how well a website reflects Arab culture and to what extent Arabs feel culturally related to Arab websites. The proposed study combines multiple methods (eye tracking, a questionnaire and an interview) to understand how the use of Arab human images impacts culture affinity, perceived social presence and image appeal for an Arab user. Three versions of an e-commerce website selling cameras were built for this study: one with an Arab female image, another with a Western female image, and a third with no human image. All three websites displayed the same products (cameras) and sections. The specific objective of eye tracking is to test whether the human images gained participants' attention and for how long. Thus, eye tracking produced the time to first fixation, fixation count, first fixation duration and total fixation duration for the human images. The specific objective of the questionnaire is to gauge the level of culture affinity, perceived social presence and image appeal toward the website. Finally, the interview questions are meant to yield more qualitative information on the specific kinds of human images that are used and how best to display them on e-commerce websites. All experiments were done in Arabic and then translated to English. The experiment targeted Arab participants only. Results have shown that (in progress).
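The four eye tracking measures named in the abstract can be derived from a list of fixations and the rectangle that the human image occupies on the page. The sketch below shows one common way to compute them; the `Fixation` fields, the coordinates and the sample data are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start: float     # seconds since the page was shown
    duration: float  # seconds
    x: float         # screen coordinates in pixels
    y: float


def aoi_metrics(fixations, aoi):
    """Fixation metrics for one area of interest (left, top, right, bottom)."""
    left, top, right, bottom = aoi
    hits = [f for f in sorted(fixations, key=lambda f: f.start)
            if left <= f.x <= right and top <= f.y <= bottom]
    if not hits:
        return None  # the participant never looked at the area
    return {
        "time_to_first_fixation": hits[0].start,
        "fixation_count": len(hits),
        "first_fixation_duration": hits[0].duration,
        "total_fixation_duration": sum(f.duration for f in hits),
    }


# Hypothetical recording: two of the four fixations land on the image.
fixations = [
    Fixation(0.0, 0.2, 50, 50),
    Fixation(0.3, 0.25, 420, 130),
    Fixation(0.7, 0.5, 450, 160),
    Fixation(1.4, 0.3, 60, 300),
]
metrics = aoi_metrics(fixations, aoi=(400, 100, 600, 250))
print(metrics)
```

A shorter time to first fixation means the image drew attention sooner, while a larger total fixation duration means it held attention longer.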

THE EFFECT OF CULTURE ON IMAGE APPEAL AND SOCIAL PRESENCE IN ARAB E-COMMERCE WEBSITES
AUTHOR: NOOR ALQAEDI | ADVISOR: DIVAKARAN LIGINLAL

RESEARCH QUESTION

“Does the cultural appeal of an Arab e-commerce website result in enhanced social presence and image appeal, in turn increasing trust, loyalty and attitude toward the website?”

PRIOR WORK

(Qiaohong, Bo, 2016) & (Gefen, 2000): Human images make a difference on e-commerce websites.
(Hassanein & Head, 2007) & (Cyr, 2009): Human images in e-commerce portals are an effective way of appealing to consumers across specific and different cultures.

GAP

The influence of culture on those human images, specifically in the Arabic context, has not been tested appropriately.

RESEARCH AIM

To show that culturally friendly Arabic images make a difference to Arab users.

RESEARCH METHOD

Using three camera shopping websites:
• No human image (cameras only)
• With Arabic images
• With Western images

Methods: questionnaire, eye tracking, interview

RESEARCH MODEL

Prior research has determined that perceived social presence and image appeal lead to trust, which in turn leads to a positive user attitude and loyalty.

This study tests the construct ‘culture affinity’ of Arabic images, which is the degree to which the website reflects the culture of the targeted countries (in this research, Arab countries). Questionnaires will measure the level of culture affinity, perceived social presence and image appeal. Eye tracking will measure users’ attention.

[Figure: research model linking culture affinity of human images, attention, perceived social presence, image appeal, trust, attitude and loyalty. Legend: what other researchers already determined; what this research is testing.]

RESULTS

EYE TRACKING:

The quickest attention was paid to Arabic images. Participants seemed to focus longer on the culturally familiar Arabic image, indicating greater interest.

[Chart: time to first fixation, fixation count, first fixation duration and total fixation duration for the Western image and the Arabic image.]

QUESTIONNAIRE:

Culture affinity, perceived social presence and image appeal scored the highest for the website containing Arabic images.

[Chart: culture affinity, perceived social presence and image appeal scores for the no-image, Western and Arabic websites; scale from 0 to 4.5.]

INTERVIEW:

During the last part of the interview, participants were asked to choose the website, out of the three, from which they would purchase a camera. 73% chose the website that contained Arabic images because of loyalty, trust and a wish to support Arab e-commerce.

CONCLUSIONS, CONTRIBUTIONS, AND FUTURE RESEARCH

Greater attention was paid to the culturally Arab-influenced design. Limitations of the study design included the small size of the participant pool. This research provides insights for e-commerce businesses, web designers, and other researchers in Arab countries. This study may encourage other researchers to further investigate the use of culturally relevant human images in e-commerce websites.

Arabic author profiling for cyber-security

Authors

Wajdi Zaghouani

Advisors

Anis Charfi

Category

Postgraduate

Abstract Author profiling can be useful in forensic investigations to narrow the set of potential authors when a threat message is received. Our aim in this project is to build author profiling resources and tools for the Arabic language and to apply them to cyber-security as an instrument for fighting cyber-crime. Our research will address different characteristics for author profiling, such as age, gender, native language, language variety, and author interests. Furthermore, we will work on detecting deception and irony in Arabic text, which is necessary for distinguishing serious content from humorous content. The collected data can be used to intercept suspect events, like the preparation of a terrorist act, in a social media stream. For the resources, we plan to create a suite of seven large-scale annotated Arabic author profiling resources. This includes annotated corpora for: (1) author interests, (2) gender detection (e.g. males vs. females), (3) age detection (e.g. youth vs. adults), (4) native language (e.g. native vs. non-native), (5) Arabic dialect variety (e.g. Qatari Arabic vs. Iraqi Arabic), (6) irony detection (e.g. sarcasm vs. sincerity) and (7) deception detection in online messages (e.g. truth vs. lies). Based on the collected resources, we will build a set of tools that use a combination of linguistic features and machine learning techniques to determine automatically several aspects of a given text written by an anonymous author. We will build tools to automatically infer the author's topics of interest (e.g. sport, religion, extremism, violence) and to detect the gender and approximate age of the writer, whether the writer is a native Arabic speaker, and the Arabic dialectal variety used. Furthermore, we plan to profile the author for degree of sincerity or irony (e.g. whether the text is likely to be based on truth or lies).

Arabic Author Profiling For Cyber-Security
Wajdi Zaghouani, Anis Charfi

Author Profile of the Text in Example 1
Age: 18-25
Gender: Female
Dialect: American English
Native Language: English
Interest: Music

Current Project Outcomes
• Dialect Identification Corpus collected from Twitter for four Arabic dialectal varieties: Egyptian, Qatari, Lebanese and Moroccan (60K tweets collected for each dialect)
• Survey report on available Arabic author profiling tools and resources

Acknowledgment: This research was supported by Qatar National Research Fund (QNRF), NPRP grant 9‐175‐1‐033

Multi-Arabic dialect lexicon extraction

Authors

Mohammad Salameh
Nizar Habash, New York University Abu Dhabi
Houda Bouamor
Wajdi Zaghouani
Kemal Oflazer

Category

Postgraduate

Abstract Arabic dialects are the daily spoken varieties of the Arabic language and the dominant form of the language used by web and social media users. They are often classified in terms of geography, social class, and possibly ethnic and religious dimensions. The myriad of orthographic and lexical forms that represent similar concepts, together with the lack of dialectal lexical resources for Arabic, complicates several natural language processing tasks. The main goal of this work is to build a multi-dialect Arabic lexicon that maps the dialects of twenty-five cities in the Arab world to their Modern Standard Arabic form. We present a method that exploits multilingualism in a parallel corpus to extract a trilingual (English, French, and Arabic) lexicon with high coverage. We apply a graph-based approach to capture semantically related concepts from lexicon entries. We also build a multi-dialect Arabic Atlas by mapping and translating concepts to their various dialectal forms.

Multi-Arabic Dialect Lexicon Extraction
Mohammad Salameh (1), Nizar Habash (2), Houda Bouamor (1), Wajdi Zaghouani (1) and Kemal Oflazer (1)
(1) Carnegie Mellon University Qatar, (2) New York University Abu Dhabi

Main goals:
• Extract a trilingual lexicon with high coverage
• Capture semantically related concepts from lexicon entries
• Build a multi-dialect Arabic Atlas by translating/mapping the concepts to their various dialectal forms

Approach and Results:

Step 1: Tuple Extraction
• Align the English, Arabic and French BTEC parallel corpus at the word level
• Convert the cascaded alignment into an undirected graph
• Extract connected components from the graph
• Form tuples (EN, FR, AR) with their count (frequency)

[Figure: word-level alignment of one sentence across the three languages. English tokens: please, write, you, number, room, the back. French: "Veuillez inscrire le numéro de votre chambre au dos". Arabic tokens as extracted: رقم / فضل من +ك / اكتب / في غرفة +ال / خلف +ال.]
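The tuple extraction steps above can be sketched as follows. The code treats each word alignment link as an edge in an undirected graph, finds connected components, and keeps components that contain words from all three languages. Identifying words by (language, word) pairs without sentence positions is a simplification, and the alignment links are hypothetical examples rather than output from a real aligner.

```python
from collections import Counter, defaultdict


def connected_components(edges):
    """Connected components of an undirected graph given as edge pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(graph[cur] - comp)
        seen |= comp
        components.append(comp)
    return components


def extract_tuples(sentences):
    """Count (EN, FR, AR) tuples over the alignment links of many sentences."""
    counts = Counter()
    for edges in sentences:
        for comp in connected_components(edges):
            words = {lang: sorted(w for l, w in comp if l == lang)
                     for lang in ("en", "fr", "ar")}
            if all(words.values()):  # keep only fully trilingual components
                counts[tuple(" ".join(words[lang]) for lang in ("en", "fr", "ar"))] += 1
    return counts


# Hypothetical cascaded alignments (EN-AR and AR-FR links) for two sentences.
room = [(("en", "room"), ("ar", "غرفة")), (("ar", "غرفة"), ("fr", "chambre"))]
number = [(("en", "number"), ("ar", "رقم")), (("ar", "رقم"), ("fr", "numéro"))]
counts = extract_tuples([room + number, room])
print(counts)
```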

Step 2: Clustering into Concepts

Observation: Many semantics are shared between the extracted tuples.
Idea:
• Cluster tuples into concepts sharing similar meaning
• Convert the lexicon into a graph:
  • Vertex: corresponds to a tuple
  • Edge: connects two vertices that share two words in the 3 languages
• Connected components in the graph correspond to tuples sharing a similar concept

Example tuple: room, chambre, غُرْفَة
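A minimal sketch of the clustering idea above: two tuples are linked when they agree in at least two of the three languages, and each connected component becomes one concept. The union-find structure is one convenient way to collect the components and is our choice, not necessarily the authors'. The tuples are taken from the poster's examples, written without diacritics.

```python
from itertools import combinations


def shares_two_words(a, b):
    """True if two (EN, FR, AR) tuples agree in at least two languages."""
    return sum(x == y for x, y in zip(a, b)) >= 2


def cluster_tuples(tuples):
    """Group tuples into concepts via connected components (union-find)."""
    parent = list(range(len(tuples)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(tuples)), 2):
        if shares_two_words(tuples[i], tuples[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(tuples):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())


lexicon = [
    ("room", "chambre", "غرفة"),
    ("room", "chambre", "حجرة"),
    ("room", "salle", "غرفة"),
    ("write", "écrire", "كتب"),
    ("write", "noter", "كتب"),
    ("back", "dos", "خلف"),
    ("back", "arrière", "خلف"),
    ("back", "dos", "ظهر"),
    ("number", "numéro", "رقم"),
]
clusters = cluster_tuples(lexicon)
for cluster in clusters:
    print(cluster)
```

Note that ("room", "chambre", "حجرة") and ("room", "salle", "غرفة") share only one word with each other, but they end up in the same concept because both are linked to ("room", "chambre", "غرفة").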

Step 3: Translating into ten Arabic Dialects

• A concept is represented with the tuple of highest frequency in each cluster
• A concept is translated to a dialectal form using two approaches:
  1. Automatically, through joining with existing dialectal lexicons: Egyptian, Levantine, Iraqi, Tunisian, Moroccan, Yemeni
  2. Manually, through crowdsourcing. The Turkers are native speakers of different Arabic dialects. They provide lexicon feedback and dialect annotation.

[Figure: example tuples (English, French, Arabic) clustered into concepts]
room, chambre, حُجْرَة
room, salle, غُرْفَة
number, numéro, رَقْم
back, dos, خَلْف
back, arrière, خَلْف
back, dos, ظَهْر
write, écrire, كَتَب
write, ecrivez, كَتَب
write, rédiger, كَتَب
write, noter, كَتَب

[Table: concepts (English, French, MSA) and their dialectal forms]
room, chambre, غُرْفَة: Egyptian شَمْبانُوار; Levantine أوضَة
number, numéro, رَقْم: Egyptian رَقَم; Levantine رَئِم
hotel, hotel, فُنْدُق: Egyptian لُوكانْدَة; Levantine أُوْتِيْل; Iraqi فِنْدَق
table, table, مائِدَة: Egyptian طَرابِيزَة; Levantine سُفْرة; Iraqi مِيز; Tunisian طاولة; Yemeni المايده; Moroccan طبلة
Further forms shown without a legible dialect label: غُرْفَه, غُرْفِه, اُوتَيل

Acknowledgment:

This work was made possible by grant NPRP 7-290-1-047 from the Qatar National Research Fund (a member of Qatar Foundation)

About Carnegie Mellon University in Qatar

For more than a century, Carnegie Mellon University has challenged the curious and passionate to imagine and deliver work that matters. A private, global university, Carnegie Mellon stands among the world’s most renowned educational institutions, setting its own course with programs that inspire creativity and collaboration. Consistently top-ranked, Carnegie Mellon has more than 13,000 students and 100,000 alumni worldwide.

At the invitation of Qatar Foundation, Carnegie Mellon joined Education City in 2004 to deliver select programs that will support and contribute to the long-term development of Qatar. Today, Carnegie Mellon Qatar offers undergraduate programs in biological sciences, business administration, computational biology, computer science, and information systems. More than 400 students from 40 countries call Carnegie Mellon Qatar home.

Graduates from CMU-Q are highly sought-after: most choose careers in top organizations, both in Qatar and around the world, while a significant number pursue graduate studies at international institutions. With nine graduating classes, the total number of alumni is 570.

To learn more, visit www.qatar.cmu.edu and follow us on:
Twitter: @CarnegieMellonQ
Instagram: @carnegiemellonq
Facebook: CarnegieMellonQ
YouTube: CarnegieMellonQatar

P.O. Box 24866 | Education City, Doha, Qatar | Ph: +974 4454 8400 | www.qatar.cmu.edu/meeting-of-the-minds