Meeting of the Minds 2012


Carnegie Mellon University in Qatar


Meeting of the Minds is an annual symposium at Carnegie Mellon University that gives students an opportunity to present their research and project work to a wide audience of faculty, fellow students, family members, industry representatives and the larger community. Students use posters, videos and other visual aids to present their work in a manner that can be easily understood by both experts and non-experts. Through this experience, students learn how to bridge the gap between conducting research and presenting it to a wider audience. A review committee consisting of industry experts and faculty members from other universities reviews the presentations and chooses the best projects and posters. Awards and certificates are presented to the winners.




Table of Contents

POSTER # TITLE

Business Administration Poster
Q1 Islamic Finance Meets Wall Street

Computer Science Posters
Q2 Building a Virtual Computer from the Ground Up
Q3 Developing Scenarios for a Qatar-specific Road Safety Simulator
Q4 Evaluation of the Ability of a Robot to Embody Different Cultural Traits
Q5 Evaluation of Variations in Giving Directions Across Cultures
Q6 Image Processing on the Cloud: Characterizing Edge Detection on Biomedical Images
Q7 Malware Inc. – Facebook and Google AppEngine
Q8 Malware Inc - Web Browsers
Q9 Multi-Robot Simulation
Q10 Projecting Named Entity Boundaries from English to Arabic
Q11 SCOUT: Extending the Reach of Social-Based Context-Aware Ubiquitous System

Information Systems Posters
Q12 A3 (A-Cubed)
Q13 EZ Intern: Internship Tracking System
Q14 Lost & Found
Q15 MoltaQatartans: Tartans Forum System
Q16 Using Mobile Technology for Enhancing Young Qatari Health Behavior

Humanities and Social Sciences
Q17 Gettin’ the Flow; Makin’ Good Grades
Q18 Service Learning at CMU-Q: Motivations, Gains, and Challenges

Post-Graduate Posters
QG1 Challenges in Mobile Opportunistic Networks
QG2 Characterization of Hadoop MapReduce Applications
QG3 CoGRS: A Center-of-Gravity Reduce Task Schedule for MapReduce
QG4 GreenLoc: Energy Efficient Wi-Fi-based Indoor Localization
QG5 Hala 2.0: Considerations for Developing a Test Bed for Multi-Lingual, Cross-Cultural Human Robot Interaction
QG6 Performance Prediction of MapReduce Applications in Elastic Compute Clouds
QG7 SmartReader: A natural language processing-based active and interactive system for accessing English language content and advanced language learning
QG8 VOtus: A Flexible and Scalable Monitoring Framework for Virtualized Clusters


Islamic Finance Meets Wall Street

Author Edmond Abi Saleh (BA 2011)

Faculty Advisor Patrick Sileo, Ph.D.

Category Business Administration

Abstract This research explores the feasibility of establishing Islamic banking and finance within the capitalist American financial system. This is a relevant and pressing topic, seeing as Muslim communities have, for centuries, slowly and surely established themselves within American society. It is thus interesting to look at their financial participation in the American market, especially in the wake of the economic crisis. More precisely, the research asks how the American economy and traditional Islamic finance have each faced the crisis and how this potentially affects capitalism in the United States. In that regard, the research will argue that Islamic finance (through canonical texts and their application within Muslim communities) is only partly in accordance with American capitalist economic practices, which means that it cannot be implemented in the United States on a large scale or in a precise and correct manner (meticulously following Islamic law). The research will put forth this argument by first examining how Muslim communities came to America and how well-established or influential they are. Second, it will analyze what these specific communities’ religious values say about the economic sphere and how it should operate. Finally, all this information will be brought together to look at present-day Islamic finance in the United States. The objective is to assess how compatible it can be with American capitalist values, arguing that the two financial models are too divergent in their essences to co-exist extensively at present. In a second part, the research will analyze the IMXL, which is the Dow Jones Islamic market index. The analysis will look for similarities or gaps between Islamic markets and regular markets. This quantitative analysis will reveal that there is little or no difference between the Dow Jones Islamic market index and the regular Dow Jones Index. The analysis will also suggest hypotheses as to the behavior of the Islamic index in the market.




Building a Virtual Computer from the Ground Up

Authors Kenrick Fernandes (CS 2014) Jyda Moussa (CS 2014)

Faculty Advisor Kemal Oflazer, Ph.D.

Category Computer Science

Abstract: In this work, we present our exploration of understanding and building a full-fledged computer system from scratch – from the gate level all the way to the software levels. We implemented this computer using the tools contained in the Elements of Computing Systems framework. For this independent study, we were motivated to obtain a better understanding of how a modern computer functions, from the hardware level right up to the interfaces that users work with. Building a virtual computer from the gate level required us to build the individual gate components, followed by an Arithmetic Logic Unit (ALU) and finally a Central Processing Unit (CPU). We then implemented an assembler to translate assembly language programs to machine language that would run on the computer we built. Our poster aims to describe our experience in this independent study as well as the individual projects we have implemented to date.
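The gate-to-adder progression described above can be illustrated with a minimal Python sketch. It assumes NAND as the only primitive gate, as The Elements of Computing Systems does; the function names (nand, full_adder, add16) are chosen here for illustration and are not the authors' code, which was written with the framework's own hardware tools.

```python
def nand(a: int, b: int) -> int:
    """The single primitive gate; every other gate below is built from it."""
    return 0 if (a and b) else 1


def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    return and_(or_(a, b), nand(a, b))


def half_adder(a: int, b: int) -> tuple:
    """Return (sum bit, carry bit) for two input bits."""
    return xor(a, b), and_(a, b)


def full_adder(a: int, b: int, c: int) -> tuple:
    """Return (sum bit, carry bit) for two input bits plus a carry-in."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, c)
    return s2, or_(c1, c2)


def add16(x: int, y: int) -> int:
    """Ripple-carry addition of two 16-bit words; the final carry is dropped."""
    carry = 0
    result = 0
    for i in range(16):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result
```

An adder of this kind is one building block of an ALU; a complete ALU adds further operations and control bits on top of it.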




Developing Scenarios for a Qatar-specific Road Safety Simulator

Author Raggi al Hammouri (CS 2012)

Faculty Advisor Brett Browning, Ph.D.

Category Computer Science

Abstract: Driving simulators are increasingly being utilized as training devices to complement traditional in-vehicle driver training. By allowing scenario-based coaching and tailoring the instructive content to the student’s learning needs, driving simulators can play a significant role towards improving road safety. Williams Technology Centre in Qatar has started a project to design and build a customized driving simulation platform for the advanced training of drivers in Qatar. This work focuses on the creation of the programs that specify the scenario, and control the behavior of the simulated vehicles in the Williams’ driving simulator. We aim to develop a rich environment where subject drivers encounter realistic modeling of driving behavior as found in Qatar, and controlled presentation of events and situations. We have created a road and a number of traffic models for Doha, as well as specific training scenarios. We describe the architecture of the scenario subsystem of Williams’ driving simulator, focusing on the representation of the virtual world and the control of vehicle behavior. We present some example scenarios and show the results of these scenarios in practice.




Evaluation of the Ability of a Robot to Embody Different Cultural Traits

Author Amna AlZeyara (CS 2014)

Faculty Advisors Majd F. Sakr, Ph.D. Micheline Ziadee

Abstract: Our focus is the development of effective multi-lingual, cross-cultural Human-Robot-Interaction. In this work, we attempt to understand the different visual accents in Arabic and American facial expressions and create culture-specific facial expressions for a female multi-lingual, cross-cultural robot testbed. Our work is twofold: the identification of the existence of accent variation in facial expressions across cultures and the validation of human recognition of these accents. Facial expressions are embodied in culture and are crucial for effective communication; hence they play an important role in multi-lingual, cross-cultural Human-Robot-Interaction. Elfenbein and Ambady found that there are different accents in facial expressions which are culture-specific and that the differences in expressions between cultures can create misunderstandings. Several studies compared American expressions with expressions from other cultures, but none of them included Arabic facial expressions. In this work, we utilize two sets of facial expressions: one specific to the Arabic culture and the other specific to the American culture. We base the American expressions on samples from the MMI Facial Expression Database, a web-based database for facial expression analysis. However, no such database exists for Arabic facial expressions. Consequently, we recorded videos of young Arab women narrating stories that express six emotions: happiness, sadness, surprise, fear, disgust, and disappointment. These videos are analyzed to extract Arabic accents in facial expressions. Accents from both Arab and American cultures are then implemented on a 3D model. We validate the existence of these accents by investigating whether humans can detect differences between Arabic and American facial expressions. Our initial observations have shown differences in anger expressions. The Arabic expression of anger is characterized by raised eyebrows, while the American expression of anger, as described by Paul Ekman, is highlighted by lowered, drawn-together eyebrows. Our next step will involve creating culture-specific linguistic content and pairing it with culture-specific expressions.




Evaluation of Variations in Giving Directions Across Cultures

Authors Huda Gedawy (CS 2012) Micheline Ziadee (Research Assistant)

Advisor Majd F. Sakr, Ph.D.

Category Computer Science

Abstract This work explores the differences in direction-giving strategies between two groups, native Arabic and native English speakers. This study will help influence design decisions for multi-lingual, cross-cultural human robot interactions. There are clear cultural influences on modes of communication. Previous research studies found that direction-giving techniques and strategies vary between different cultural groups. Burhanudeen compared Japanese and English native speakers and found that locator remarks are more frequently used by Japanese natives, while the use of directives is more common with native English speakers. In this work, we examine the discourse for navigation instructions of members of two target groups, Arabic native speakers and English native speakers. We address the following questions: How do the language and strategies used for providing directions vary between these two groups? What are the differences and what are the similarities? Are there any possible gender-related differences in giving directions? We recorded 56 participants giving oral direction instructions for 3 specific locations in the Carnegie Mellon Qatar campus. We transcribed the oral recordings and annotated the texts by categorizing the directional expressions into: landmarks, units (time and distance), cardinals, movement commands, conditional movement commands, advisory commands and positional commands. Non-directional commands, like intermediate details, were categorized as fillers. Our analysis also included the number of pauses, repetitions and error corrections. Our results showed that when giving directions, English native speakers use a considerably higher number of words than Arabic native speakers. Also, the difference in number of words between males and females in the first group is much higher than that in the second group. English native speakers had a higher frequency of using cardinals, intermediate information, and filled pauses. On the other hand, Arabic native speakers differed greatly from English native speakers in their use of units of distance and repeated information. Based on these results, we conclude that culture, language, and gender influence a speaker’s discourse and strategy for giving directions.



Image Processing on the Cloud: Characterizing Edge Detection on Biomedical Images

Author Manoj Reddy (CS 2013)

Faculty Advisors Mohammad Hammoud Majd F. Sakr, Ph.D.

Category Computer Science

Abstract In order to analyze and deduce valuable information from big image data, we have developed a framework for distributed large-scale image processing in Hadoop MapReduce. A vast amount of scientific data is now represented in the form of images from sources including cameras, medical tomography and astronomical remote sensing. Applying algorithms on these images has been continually limited by the processing capacity of a single machine. MapReduce, created by Google, presents a potential solution. MapReduce efficiently parallelizes computation by distributing tasks and data across multiple machines. Hadoop, an open source implementation of MapReduce, is gaining widespread popularity due to features such as scalability, fault tolerance and the ability to use commodity clusters. Hadoop is primarily used with text-based input data. In contrast, its ability to process images has not been fully explored. We propose a framework that efficiently enables image processing on Hadoop. Existing approaches in distributed image processing suffer from two main problems: (1) input images need to be converted to a custom file format and (2) image processing algorithms require adherence to a specific API that might impose some restrictions on applying some algorithms to Hadoop. Our framework avoids these problems by: (1) bundling all small images into one large file that can be seamlessly parsed by Hadoop and (2) relaxing any restriction by allowing a direct porting of any image processing algorithm to Hadoop. A Reduce-less job is then launched in which the Mappers contain both the code for processing images and a mechanism to write them back separately to HDFS. We have tested the framework with a state-of-the-art image processing algorithm, Edge Detection, on a large dataset of 3,760 biomedical images. We observed that using Hadoop Sequence Files to bundle a large number of small images into a large sequence file might not be the most efficient solution, as we observed a storage overhead of up to 3x. Furthermore, to examine Hadoop’s behavior with image processing, we are characterizing Edge Detection along several dimensions, such as degree of parallelism and network traffic patterns. Our characterization study has shown that varying the number of map tasks has a significant impact on Hadoop’s performance. The best performance was obtained when the number of map tasks equals the number of available slots, as long as the application resource demand is satisfied. Additionally, we observed a speedup of 2.1x as compared to the default Hadoop configuration.
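The shape of a Reduce-less job over bundled images can be sketched in plain Python, without Hadoop. This is an illustration under stated assumptions, not the authors' framework: the abstract does not name the edge detection operator, so a Sobel filter is assumed here, and the names sobel, mapper and run_map_only are hypothetical. Each (name, image) record stands in for one key/value entry of the bundled file, and each mapper call emits one output image with no reduce step.

```python
from typing import Dict, Iterable, List, Tuple

Image = List[List[int]]  # grayscale pixels, one list per row

# Sobel kernels (an assumed choice of edge detector).
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(img: Image) -> Image:
    """Return the gradient magnitude |gx| + |gy|, clamped to 255.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


def mapper(record: Tuple[str, Image]) -> Tuple[str, Image]:
    """Process one bundled image and emit it under its own output name."""
    name, img = record
    return name + ".edges", sobel(img)


def run_map_only(records: Iterable[Tuple[str, Image]]) -> Dict[str, Image]:
    """Apply the mapper to every record; there is no shuffle or reduce phase."""
    return dict(map(mapper, records))
```

Because every record is processed independently, the records can be split across as many map tasks as there are available slots, which is the degree-of-parallelism dimension the characterization study varies.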




Malware Inc. – Facebook and Google AppEngine

Authors Talal Al-Haddad (CS 2013) Manoj Reddy (CS 2013)

Faculty Advisor Thierry Sans, Ph.D.

Category Computer Science

Abstract A new generation of software has emerged with the increasing popularity of web and cloud-based applications. Cloud computing platforms such as Facebook and Google AppEngine allow third party developers to build applications for users that run on a remote infrastructure. These new platforms give rise to new security threats and a new generation of malware. Malware, short for malicious software, is software designed to disrupt computer operation, gather sensitive information, or gain unauthorized access to computer systems. In this work, we aim to evaluate the security risk of exposure to malware on two popular cloud application ecosystems: Facebook and Google AppEngine. Google AppEngine applications are hosted on Google infrastructure and can access Google services as well as the user’s personal data, such as Gmail and Google Calendar. As a proof of concept, we have developed a malware application that pretends to compute useful statistics on the user’s mailbox but in fact searches through emails for logins, passwords and credit card information. When found, these data are sent to a remote web server controlled by the attacker. This operation cannot be detected by the user. Facebook allows developers to write applications that can perform a wide variety of tasks, such as collaborative gaming, and access user information for social statistics. Contrary to Google, these applications are not stored on Facebook servers but on servers controlled by the developer. As a consequence, when users use such applications, their profile data can be stolen and sent to a remote web server, similar to the case of Google AppEngine. The malware applications demonstrate that it is easy to deploy malicious applications on cloud platforms, and this is a huge cause for concern since an ever-increasing amount of sensitive data is stored online. Future work includes developing techniques for automated profiling of applications to sense malicious activity in a sandboxed environment.




Malware Inc - Web Browsers

Authors Baljit Singh (CS 2014) Fahim Dalvi (CS 2014)

Faculty Advisor Thierry Sans, Ph.D.

Category Computer Science

Abstract An extension is an application that enhances the functionality of a browser. The different types of extensions available today range from including a calculator in a browser to storing passwords in the browser in an encrypted form. The goal of the Malware Inc project was to evaluate the risk of exposure to malware in popular browsers such as Google Chrome and Mozilla Firefox. Malware is any software that maliciously tries to disrupt computer operations, steal sensitive information and so on. We were able to create several pieces of malware in the form of extensions, two of which are described below. In Google Chrome, we created an extension that could record the passwords of users when they try to log in using e-banking mode. When the user enters data using an on-screen keyboard, a screenshot of the keyboard is captured with emphasis on the button that was pressed. Using the series of screenshots received, the developer can then easily deduce the data entered. On Mozilla Firefox, we were able to create an extension that could remotely download and execute external programs (which logged all the data typed on the keyboard), and then read files (the logs) from the user’s system and transmit them back to the developer without any extra permissions. Antivirus software is unable to track this communication, as it takes place entirely through the browser, one of the trusted programs on the machine. From the information we have gathered through this project, our aim is to create new technologies that can detect and stop malicious extensions. One of our ideas is to have an extension that can keep track of all the communication initiated by other extensions, essentially creating an “antivirus” for the browser.




Multi-Robot Simulation

Authors Sidra Alam (CS 2013) Hanan Alshikhabobakr (CS 2013) Shailja Relwani (CS 2014)

Faculty Advisors M. Bernardine Dias, Ph.D. Brett Browning, Ph.D.

Category Computer Science

Abstract Multi-robot research can be very challenging due to the difficulty of operating several complex vehicles simultaneously. Simulation can be a powerful tool to speed up software development if the simulator is sufficiently accurate. The project builds on an existing robot simulation tool – the Urban Search and Rescue Simulator (USARSim). The focus of the research was to customize and enhance the USARSim multi-robot environment so that it captures, with high confidence, the complexities of real robot execution, specifically for the robots in the Qri8 robotics lab at Carnegie Mellon University in Qatar. In terms of an application domain, our focus was on disaster response scenarios. The primary task for the robots in this scenario was exploration. For the simulation testing, we created maps for the disaster response scenario and we added physical configuration models for the robots. We used generic robot control software called Player to control the robots in the simulator. To validate the experiment, we ran a test with real robots and an identical simulation test and compared the executed behaviors.




Projecting Named Entity Boundaries from English to Arabic

Author Nehal Elkady Hussein (CS 2013)

Faculty Advisor Behrang Mohit, Ph.D.

Category Computer Science

Abstract: Named Entity Recognition (NER) is the process of identifying spans of text that constitute proper names and then classifying them according to their type (e.g. names of persons, locations, organizations, etc.). English is one of the few languages which have decent amounts of human-annotated data for training NER systems. In contrast, languages like Arabic do not have enough annotated data or robust NER systems. To overcome this resource shortage for Arabic, we implemented software that automatically tags Arabic text with named entity information using cross-lingual projection. Named entity projection is a framework that uses word alignments to transfer named entity knowledge from a resource-rich language (e.g. English) to a resource-poor language (e.g. Arabic). A word-aligned parallel corpus contains sentences in English, their translation in Arabic, and the word-level alignment information between the parallel sentences. Our named entity projection software uses the word alignments to locate the named entities in the Arabic sentence. The output of the projection system is the annotation of the named entities on the Arabic side of the parallel corpus. Major challenges in our projection come from the divergence of the languages and the noise in the word alignments. Also, since Arabic words could be aligned to more than one English word, and vice versa, we observe cases where one Arabic word gets mapped to two English entities. We introduced several heuristics in our projection algorithm. Two important ones are: (1) we label a named entity if and only if it has a gap of two words or less, and the words forming the gap are not aligned to other words within the same sentence; (2) if any of the elements constituting a named entity are repeated, we remove them and then label the named entity. The evaluation of our projection system was based on comparing its output with the gold standard answers provided by human annotation. To do that, we selected thirty random Arabic sentences from the parallel corpus and annotated them with the gold standard named entity boundaries. We then compared the results of our projection output against this gold-standard data. The disagreement (error) between the projection output and the gold-standard data was about 6.3%. Moreover, we used the standard precision and recall metrics to evaluate the projection. Precision measures the accuracy of the procedure and recall measures its coverage. For our thirty-sentence corpus, we achieved a precision of 81.4% and a recall of 76%, which are promising baselines for creating new named entity resources for Arabic. Furthermore, we used our projected data to enhance a baseline named entity system and achieved a significant 4% average improvement.
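Projection through word alignments, and the precision and recall computation used to score it, can be sketched as follows. This is a simplified reading of heuristic (1) only, under assumptions made for illustration: entities are token-index spans, alignments are (English index, Arabic index) pairs, and a projected span counts as correct only if it matches a gold span exactly. The names Span, project_entity and precision_recall are hypothetical and are not taken from the authors' software.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int, str]      # (first token, last token inclusive, label)
Alignment = List[Tuple[int, int]]  # (English token index, Arabic token index)


def project_entity(entity: Span, alignment: Alignment,
                   max_gap: int = 2) -> Optional[Span]:
    """Project an English entity span onto the Arabic side of a sentence pair.

    The projection is rejected (None) when the Arabic span would contain more
    than max_gap unaligned-to-the-entity words, or when any gap word is
    aligned to an English word outside the entity.
    """
    start, end, label = entity
    targets = sorted({t for s, t in alignment if start <= s <= end})
    if not targets:
        return None
    lo, hi = targets[0], targets[-1]
    gap = [t for t in range(lo, hi + 1) if t not in targets]
    aligned_elsewhere = {t for s, t in alignment if not (start <= s <= end)}
    if len(gap) > max_gap or any(t in aligned_elsewhere for t in gap):
        return None
    return (lo, hi, label)


def precision_recall(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float]:
    """Exact-match precision and recall over sets of labeled spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

For example, with the alignment [(0, 1), (1, 2), (2, 0), (4, 3), (5, 4)], an English organization name spanning tokens 0 to 2 projects to Arabic tokens 0 to 2 even though the word order differs between the two languages.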



SCOUT: Extending the Reach of Social-Based Context-Aware Ubiquitous System

Author Dania Abed Rabbou (CS 2012)

Faculty Advisors Abderrahmen Mtibaa, Ph.D. Khaled Harras, Ph.D.

Category Computer Science

Abstract The proliferation of social-networks, localization systems, and high-end mobile devices has created a fertile ground for the development of systems that are aware of interests and adaptive to location. With the burgeoning of the domain of social-based context-aware systems, numerous challenges are becoming of increased importance. One such challenge, not addressed so far by the research community, is end-to-end communication between ubiquitous systems and their users. Communication in existing systems is either centralized or distributed. Centralized systems require users to be connected to a server at all times and thus assume the availability of internet connectivity everywhere. In reality, internet connectivity may be absent, charged, energy-consuming, heterogeneous, and over-loaded. As an alternative, distributed communication enables users to obtain information from neighboring devices but the availability of information and the extent of its dissemination are dictated solely by user mobility and contacts. We realize the need for a new hybrid mode that leverages the centralized and distributed communication modes. Using the hybrid mode, a ubiquitous system disseminates a message to interested connected and unconnected users by first selecting a subset of connected users sufficient to reach the unconnected ones and secondly by forwarding the message from these users to neighboring users most likely to meet the unconnected ones. We implemented this hybrid communication mode on a social-based context-aware ubiquitous system, SCOUT, which we built as a testing platform. 
We plan to evaluate our proposed solution against the existing architectures based on three metrics important to the performance of ubiquitous systems, namely: (i) success rate, or the ratio of the number of users reached to the total number of interested users, (ii) end-to-end delay, or the average time delay incurred to reach all interested unconnected users, and (iii) cost, or the total number of message forwards required to reach all the interested unconnected users.
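The three metrics can be written down directly from their definitions. The sketch below is illustrative only: the Outcome record, its field names, and the representation of message forwards as a list of (sender, receiver) pairs are assumptions made here, not part of SCOUT.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Outcome:
    user: str                 # an interested user
    connected: bool           # True if the user had a server connection
    delay_s: Optional[float]  # seconds until the message arrived; None if never


def success_rate(outcomes: List[Outcome]) -> float:
    """Metric (i): users reached divided by all interested users."""
    if not outcomes:
        return 0.0
    reached = sum(1 for o in outcomes if o.delay_s is not None)
    return reached / len(outcomes)


def mean_delay_unconnected(outcomes: List[Outcome]) -> Optional[float]:
    """Metric (ii): average delay over the unconnected users that were reached."""
    delays = [o.delay_s for o in outcomes
              if not o.connected and o.delay_s is not None]
    return sum(delays) / len(delays) if delays else None


def cost(forwards: List[Tuple[str, str]]) -> int:
    """Metric (iii): total number of message forwards."""
    return len(forwards)
```

Computing the same three numbers for the centralized, distributed and hybrid modes on one mobility trace would give a like-for-like comparison of the kind the evaluation plan describes.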




A3 (A-Cubed)

Authors Lakshmi Prakash (CS 2015) Sabih Bin Wasi (CS 2015) Tamim Jabban (CS 2015)

Faculty Advisor Selma Limam Mansar, Ph.D.

Category Information Systems

Abstract As they enter the world of undergraduate education, students all around the globe discover a new world. Universities use various models to advise them so that they can make informed decisions about their education. Indeed, at multiple stages during their academic life, students encounter one or more of the following problems in choosing courses: 1. It is not always easy to anticipate the impact of a change to a study plan, such as postponing, dropping or changing a ‘core course’ – especially in an academic system where most courses are interdependent. 2. If their view of the plan ahead is obstructed, students find it risky to switch majors midway through their academic career. 3. When it comes to choosing courses, the mere course description or course title may not be enough to make good choices. A3 was initiated as an independent study project in Information Systems, and attempts to solve this demanding problem. Based on patterns for major degree programs, A3 advises a student with a plan that is well-fitted for a student pursuing a career with varying interests. Additionally, A3 makes it easier to choose free electives. A3 lets users visualize their future undergraduate academic plans. It enables students to maximize satisfaction from their undergraduate degree, making it more meaningful for their professional life. With an extensive database of courses offered on campus, A3 attempts to provide comprehensive information about each course that would help students pick the right courses beforehand. It also aims to bring social life to the application by integrating peer suggestion, ranking and advising features for each course or an entire customized course plan. A3 can currently be used to its full potential by CMU-Q students, who are offered a choice of 5 majors and various minors. In the longer run, A3 aims to act as a much-needed advising tool for students at any stage. If they can visualize, bend and enhance their course plan wisely and in good time, it can benefit their future academic and professional life.
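The first problem listed above, anticipating the impact of postponing or dropping a course when courses are interdependent, amounts to a reachability query on a prerequisite graph. The sketch below illustrates that idea only; it is not A3's implementation, and the function name affected_courses and the course codes used in the example are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def affected_courses(prereqs: Dict[str, List[str]], changed: str) -> Set[str]:
    """Return every course that directly or indirectly requires `changed`.

    `prereqs` maps each course to the list of courses it requires.
    """
    # Invert the mapping: course -> courses that list it as a prerequisite.
    dependents: Dict[str, List[str]] = {}
    for course, required in prereqs.items():
        for r in required:
            dependents.setdefault(r, []).append(course)

    # Breadth-first search over the dependents graph.
    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for d in dependents.get(current, []):
            if d not in seen:
                seen.add(d)
                queue.append(d)
    return seen
```

With the hypothetical plan {"CS2": ["CS1"], "CS3": ["CS2"], "STATS": ["MATH1"], "ML": ["CS3", "STATS"]}, postponing CS1 affects CS2, CS3 and ML, while postponing ML affects nothing else.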




EZ Intern: Internship Tracking System

Authors Hissa Al-Bahr (IS 2013) Reham Al-Tamime (IS 2013) Nabeeha Haque (IS 2013) Abhay Valiyaveettil (IS 2013)

Faculty Advisor Ian Lacey, Ph.D.

Category Information Systems

Abstract The main objective of this research is to design a system for the Office of Professional Development that provides students with an attractive and user-friendly website containing all the internship and job opportunities available locally for CMUQ students, and that allows them to apply online. Students at CMUQ receive several important and interesting emails from time to time about job and internship opportunities from the Office of Professional Development at CMUQ. Each of those emails has a different subject and is aimed at a different group of students. Sometimes companies email the Office of Professional Development to inform students about a specific opportunity, and the Office then forwards it to students. The best part about this system is that it enables students to apply online and to store their résumé online so that they do not have to upload it each time they apply. This idea came about when the group members were thinking of a system that could be implemented at CMU-Q and received a series of e-mails regarding internships. This inspired the team to develop a system that could eliminate the unnecessary stack of e-mails. The group got in touch with the Office of Professional Development in order to conduct some more research and to understand the process more closely. This project is important because it can resolve a pressing issue that CMU-Q faces at the moment. With internship opportunities on the rise, we want to have an organized system for students to apply.




Lost & Found

Authors Shivani Arora (BA 2013) Nur Aysha Anggraini (IS 2013) Fatema Akbar (IS 2013) Maryam Yousuf (IS 2013)

Faculty Advisor Ian Lacey, Ph.D.

Category Information Systems

Abstract Education City is a community with students, faculty, staff, and visitors across six branch campuses. Members belonging to different campuses interact with each other on a daily basis for meetings and cross-registered courses. With so much moving around, community members often lose items within their own building or others. Currently, searching for lost items is a very tedious and tiresome process – a “loser” has to physically approach the security desks at one or more buildings and ask them about the lost item. If the item is available, the “loser” has to fill out several forms to claim it. If, on the other hand, the item has not been found by the security guards, the “loser” may have to approach the security desk several more times to find the item. The Lost&Found system automates this paper-based, time-consuming process, making it simple and convenient for EC community members. It serves as an online web application where EC members can check whether their lost item has been found by any security desk in any of the EC buildings and, if the item has not been found, report their lost item. If their item is found in the future, they will be notified. The Lost&Found system ultimately connects the two sides of the lost and found process in an efficient, user-friendly manner.




MoltaQatartans: Tartans Forum System

Authors Wadha Alajmy (IS 2013) Mariem Fekih (IS 2013) Tasneem Jahan (IS 2013) Abhinav Vemuri (IS 2013)

Faculty Advisor Ian Lacey, Ph.D.

Category Information Systems

Abstract

The poster describes a system the team is building as part of the Junior Project IS class. The team was looking for a way to link alumni with current students. After contacting Khadra Dualeh from the professional development office and Feras Villanueva from the marketing department, the team concluded that there is no CMUQ website that brings all members of the community together. Since there is already a website for the Alumni Association in Pittsburgh, the team realized that it would be valuable for the Qatar campus to have a similar system for students and alumni who studied in Qatar. Graduates currently stay connected to CMUQ through: (1) A CMUQ email account, which alumni use to email Khadra or professors to stay informed about upcoming events, talks, and opportunities to give back. (2) Social networking websites like Facebook or Twitter, where alumni can communicate with each other and with current students. Through these social websites they can keep up with their friends and share their post-graduation experiences with students at CMUQ. However, these means remain inefficient and present several problems:

• The alumni may have disabled their CMUQ email accounts, or may no longer use or check them.

• The alumni don’t receive or reply to any email coming from CMUQ or from the professional development office.

The “Tartans Forum System” is a website that will bring the different parts of the CMUQ community together in one place. It will help to:

• Enhance the bonds between alumni and current students by enabling them to exchange ideas, thoughts, advice, and even job opportunities.

• Enable the alumni to stay in touch with CMUQ faculty and staff.

• Keep the alumni updated about upcoming events on campus, or even let them organize events and invite people to attend.




Using Mobile Technology for Enhancing Young Qatari Health Behavior: An Experiment Design Authors Aysha Anggraini (IS 2013) Nawal Behih (IS 2014) Maahd Shahzad (IS 2014)

Faculty Advisor Selma Limam Mansar, Ph.D.

Category Information Systems

Abstract This project introduces the design of an experiment that tests the effectiveness of mobile applications for a weight loss program. A mobile application was developed, combining three modules: a tailored text messaging service, goal setting and progress monitoring support, and a social network setup. The intent of the project is to test its effectiveness in achieving weight management goals. An element of localization is added to the application design, addressing the needs of Qatar’s population. Once development of the mobile application is complete, it has to be tested, so an intervention study is performed with participants. This project describes a usability study design: verbal feedback is collected from participants from the local community as they interact with the technology. Each participant is asked to complete a list of tasks, structured according to the three interfaces. The feedback allows the developers to enhance the mobile application. This MoM submission summarizes progress on an awarded UREP entitled “Using Mobile Technology for Enhancing Young Qatari Health Behavior”. Our contribution so far has been developing some parts of the application and designing the usability testing experiment. We meet weekly as a research team and discuss progress according to a timeline. We use the IS lab resources (a server, Android phones, Android developer tools, and voice recorders to record participants’ feedback). This research project is important because it is in line with Qatar’s goals to reduce obesity and related health problems. It shows us how modern technology can be used to solve health problems.




Gettin’ the Flow; Makin’ Good Grades Authors Maha Al-Moghany (CS 2012) Sara Kawas

Faculty Advisor David Emmanuel Gray, Ph.D.

Category Humanities and Social Sciences

Abstract Csikszentmihalyi’s flow theory describes a state of mind in which a person becomes fully “immersed” and engaged in an activity. According to Csikszentmihalyi, a person can achieve flow even when doing what might be thought the dullest of activities, as long as that person’s skills are on par with the challenge of the activity. In our study, we explore this phenomenon inside the classroom, determining the extent to which flow influences student performance and their perceived learning. Some previous studies have focused on flow in learning environments, but our study focuses on a single university-level course over an entire semester, and compares the balance of student skills and coursework challenge to grades. Students who took a joint CMU-Q/NU-Q course in logical reasoning (cross-listed as 80-206 at CMU-Q and PHIL 242-70 at NU-Q) were surveyed throughout the semester after each quiz about the quiz’s perceived challenge and about their own perceived level of skill in solving that quiz’s problems. When we analyzed the results, we clustered the students into four groups, using a method proposed by Massimini and Carli. First, apathetic students are those who report lower than average on both challenge and skill for a quiz. Second, anxious students are those who report higher than average challenge but lower than average skill. Third, bored students are those who report lower than average challenge but higher than average skill. Fourth and finally, students in flow are those who report higher than average on both challenge and skill. At the end of the semester, students were asked to compare their logic abilities at the beginning of the semester with where they believed their abilities were currently. We then used this to calculate their perceived learning over the semester. From our analysis, we found that students in flow scored a higher average grade on quizzes than bored or apathetic students. 
Yet, anxious students did score a higher average grade than students in flow. However, students in flow reported the highest perceived learning distance among all four groups of students. Our results indicate a difference between grades and perceived learning among the four groups, which future studies ought to explore in order to discover a more detailed causal relationship among these variables. Furthermore, our results may encourage further work into discovering the extent to which instructors can encourage flow in their students, and whether this might lead students to get higher grades.
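
The four-group clustering described in this abstract can be illustrated with a short sketch. It is only an illustration of the rule as stated above (comparing each report with the class average for that quiz), not the authors' analysis code. The function name, the input layout, and the choice to treat a report exactly equal to the average as "lower" are assumptions made for this example.

```python
from statistics import mean


def classify_quiz_reports(reports):
    """Assign each student to one of the four groups for a single quiz.

    `reports` maps a student identifier to a (challenge, skill) pair of
    self-reported ratings. The layout is hypothetical, chosen for this sketch.
    """
    avg_challenge = mean(challenge for challenge, _ in reports.values())
    avg_skill = mean(skill for _, skill in reports.values())

    groups = {}
    for student, (challenge, skill) in reports.items():
        # Assumption: a rating exactly equal to the average counts as "lower".
        high_challenge = challenge > avg_challenge
        high_skill = skill > avg_skill
        if high_challenge and high_skill:
            groups[student] = "flow"
        elif high_challenge:
            groups[student] = "anxious"
        elif high_skill:
            groups[student] = "bored"
        else:
            groups[student] = "apathetic"
    return groups


# Made-up ratings on a 1-5 scale, one report per student.
example = {"s1": (5, 5), "s2": (5, 1), "s3": (1, 5), "s4": (1, 1)}
result = classify_quiz_reports(example)
```

With these made-up ratings both averages are 3, so each student falls into a different group. Comparing average quiz grades across the groups would then be a per-group mean over the students assigned to each label.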




Service Learning at CMU-Q: Motivations, Gains, and Challenges Authors Shivani Arora (BA 2013) Firas Bata (BA 2013)

Faculty Advisor Silvia Pessoa, Ph.D.

Category Humanities and Social Sciences

Abstract This study examines the motivations, gains, and challenges experienced by 27 undergraduate students at Carnegie Mellon University in Qatar (CMU-Q) who participated as volunteer teachers in an adult literacy program for migrant workers in Qatar. A total of 27 student reflections were analyzed for recurrent themes, 22 students completed a survey about their participation in the program, and seven students were interviewed in person. The findings indicate that, initially, students in higher education need incentives such as academic credit to enroll in community service programs. Beyond academic credit, the participating students gained several tangible and intangible benefits during their experience, including new skills such as communication and teaching, and an appreciation for the differences in the communities surrounding them. Although the initial motivation for the students to participate in Language Bridges may have been academic credit, this initial experience may later lead them to participate in similar programs for truly humanitarian and altruistic reasons. Based on these findings, we discuss recommendations for curricular and meta-curricular practices at CMU-Q.




Challenges in Mobile Opportunistic Networks Author Abderrahmen Mtibaa

Faculty Advisor Khaled Harras, Ph.D.

Category Postgraduate

Abstract Mobile devices such as smart-phones and tablets are becoming ubiquitous, with ever increasing communication capabilities. In situations where the necessary infrastructure is unavailable, costly, or overloaded, opportunistically connecting these devices becomes a challenging area of research. Data is disseminated using nodes that store-carry-and-forward messages across the network. In such networks, node cooperation is fundamental for the message delivery process. Therefore, the lack of node cooperation (e.g., a node may refuse to act as a relay and settle for sending and receiving its own data) causes considerable degradation in the network. In order to ensure node cooperation in such networks, we investigate three main challenges: (i) ensuring fair resource utilization among participating mobile devices, (ii) enabling trustful communication between users, and (iii) guaranteeing scalable solutions for a large number of devices. (i) Fairness is particularly important for mobile opportunistic networks since it acts as a major incentive for node cooperation. We propose and evaluate FOG, a real-time distributed framework that ensures an efficiency-fairness trade-off for users participating in the opportunistic network. (ii) Since users may not accept to forward messages in opportunistic networks without incentives, we introduce a set of trust-based filters to provide the user with an option of choosing trustworthy nodes in coordination with personal preferences, location priorities, contextual information, or encounter-based keys. (iii) Mobile opportunistic solutions should scale to large networks. Our hypothesis is that in large-scale networks, mobile-to-mobile communication has its limitations. We therefore introduce CAF, a Community Aware Forwarding framework, which can easily be integrated with most state-of-the-art algorithms, in order to improve their performance in large-scale networks.
CAF uses social information to break down the network into sub-communities, and forward messages within and across sub-communities. In the three contributions we propose above, we adopt a real-trace driven approach to study, analyze, and validate our algorithms and frameworks. Our analysis is based on different mobility traces including the San Francisco taxi cab trace, traces collected from conferences such as Infocom’06 and CoNext’07, and the Dartmouth campus wireless data set.




Characterization of Hadoop MapReduce Applications Authors Mohammad Hammoud M. Suhail Rehman

Faculty Advisor Majd F. Sakr, Ph.D.

Category Postgraduate

Abstract Hadoop is an open source implementation of MapReduce, one of the most successful realizations of large-scale data-intensive cloud computing platforms. Hadoop MapReduce is becoming a ubiquitous programming model, finding applications in many fields including web, medicine and astronomy, among others. As a result, Hadoop MapReduce is faced with an enormous space of application domains and system parameters that might impact MapReduce deployments and studies. As such, characterizing Hadoop becomes crucial for: (1) fostering the understanding of the MapReduce model so as to enable practical applications of MapReduce programs, (2) gleaning insights into the framework’s bottlenecks and potentially identifying desirable optimizations, (3) establishing a quantitative foundation for evaluating MapReduce applications, and (4) supporting research and comparisons across multiple related studies. We propose and apply a characterization methodology that serves these objectives. The only two current studies on characterizing Hadoop focus primarily on promoting benchmark suites and overlook establishing a methodology through which Hadoop can be effectively characterized. In contrast, we focus on developing an effective characterization methodology and suggest using that methodology at a later stage to create a standard benchmark suite. We utilize two representative and widely adopted MapReduce benchmarks and characterize Hadoop along seven dimensions: the degree of parallelism, the dataset distribution, the job system resource utilization, the job communication to computation ratio, the execution timelines, the dataset types and sizes, and the job network traffic pattern. As a result, we made multiple observations. For instance, we found that having the number of mappers equal to the number of available map slots leads to the best parallelism.
This not only improves performance, but also tackles one of the root causes of the common stragglers problem in MapReduce. As another example, we observed that the network bandwidth dissipation of the reduce stage exceeds that of the shuffle stage by approximately a factor of two. This might suggest addressing the network bandwidth problem at the reduce stage, especially since all research effort so far has been put into the shuffle stage.




CoGRS: A Center-of-Gravity Reduce Task Scheduler for MapReduce Authors Mohammad Hammoud M. Suhail Rehman

Faculty Advisor Majd F. Sakr, Ph.D.

Category Postgraduate

Abstract: MapReduce is one of the most successful realizations of large-scale data-intensive cloud computing platforms. As compared to traditional programming models, MapReduce automatically and efficiently parallelizes computation by running multiple Map and/or Reduce tasks over distributed data across multiple machines. Hadoop, an open source implementation of MapReduce, schedules Map tasks in the vicinity of their input splits, thereby reducing network traffic. However, for reduce tasks, Hadoop neither exploits data locality nor addresses the data partitioning skew inherent in many MapReduce applications. Consequently, MapReduce experiences a performance penalty and network congestion, as observed in our experimental results. In this study, we introduce the Center-of-Gravity Reduce task Scheduler (CoGRS), a practical strategy for improving MapReduce performance in clouds. CoGRS attempts to schedule each reducer at its center-of-gravity node. It controllably avoids scheduling skew, a situation where some nodes receive more reducers than others, and promotes effective pseudo-asynchronous Map and Reduce phases, resulting in earlier completion of submitted jobs, diminished network traffic, and better cluster utilization. We implemented CoGRS in Hadoop-0.20.2 and conducted extensive experiments to evaluate its potential. We found that CoGRS outperforms the native Hadoop scheduler by 11%, and by up to 26% in runtime performance for our benchmark studies. In addition, we deployed CoGRS on Amazon EC2 and found that our strategy scales well, reducing network traffic by as much as 38.6%, which consequently translates into a runtime improvement of up to 23.8%. From these studies, we believe that CoGRS is applicable to several cloud computing environments and applications, including but not limited to, shared environments and scientific applications. CoGRS paves the way for these applications, and others, to be ported to various clouds in an effective manner.
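
The abstract does not define how a reducer's center-of-gravity node is computed. As a rough illustration only, the sketch below takes one plausible reading: the candidate node that minimizes the total network distance to the reducer's input, weighted by how many bytes of its partition sit on each map node. The function names, the rack-based distance values, and the sample sizes are all assumptions for this example and are not taken from CoGRS or from Hadoop. The sketch also ignores the scheduling-skew control mentioned above.

```python
def center_of_gravity_node(partition_bytes, distance, candidates):
    """Pick the candidate node with the lowest weighted network distance.

    partition_bytes: dict mapping a map-side node to the number of bytes of
        this reducer's partition stored on it (hypothetical input layout).
    distance: function (src, dst) -> network distance between two nodes.
    candidates: iterable of nodes that could host the reducer.
    """
    def weighted_cost(node):
        return sum(size * distance(src, node)
                   for src, size in partition_bytes.items())

    return min(candidates, key=weighted_cost)


# Hypothetical two-rack cluster: distance 0 on the same node,
# 2 within a rack, 4 across racks.
RACK = {"n1": "A", "n2": "A", "n3": "B", "n4": "B"}


def rack_distance(src, dst):
    if src == dst:
        return 0
    return 2 if RACK[src] == RACK[dst] else 4


# Skewed partition: most of the reducer's input sits on n1.
partition = {"n1": 100, "n2": 10, "n3": 10}
chosen = center_of_gravity_node(partition, rack_distance, RACK.keys())
```

In this made-up case the weighted costs are 60 for n1, 240 for n2, 440 for n3 and 460 for n4, so the reducer would be placed on n1, next to the bulk of its input.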




GreenLoc: Energy Efficient Wi-Fi-based Indoor Localization Author Mohamed Abdellatif

Faculty Advisor Khaled Harras, Ph.D.

Category Postgraduate

Abstract User-localization and positioning systems have been a core challenge in the domain of context-aware pervasive systems and applications. GPS has been the de-facto standard for outdoor localization; however, the geo-satellite signals upon which GPS relies are inaccurate in indoor environments. Therefore, various indoor localization techniques based on triangulation, scene analysis, or proximity have been introduced. The most prominent technologies over which these techniques are applied include WiFi, Bluetooth, RFID, Infrared, and UWB. Due to the ubiquitous deployment of access points, WiFi-based localization via triangulation has emerged as one of the most prominent indoor positioning solutions. A major deployment obstacle for such systems, however, is the high energy consumption of WiFi adapters in mobile devices, where energy is the most valuable resource. We propose GreenLoc, an indoor green localization system that exploits sensors prevalent in today’s smartphones in order to dynamically adapt the frequency of location updates required. Significant energy gains can, therefore, be achieved when users are not mobile. For example, accelerometers can aid in detecting different user states such as walking, running or stopping. Based on these states, mobile devices can dynamically decide upon the appropriate update frequency. We accommodate various motion speeds by estimating the velocity of the device using the latest two location coordinates and the time interval between these two recorded locations. We have taken the first steps towards implementing GreenLoc, based on the well-known Ekahau system. We have also conducted preliminary tests utilizing the accelerometer, gravity, gyroscope, and light sensors residing on the HTC Nexus One and iPhone 4 smart-phones. To further save energy in typical indoor environments, such as malls, schools, and airports, GreenLoc exploits people’s proximity when moving in groups.
Devices within short-range of each other do not necessarily require that they each be individually tracked. Therefore, GreenLoc detects and clusters users moving together and elects a reference node (RN) based on device energy levels and needs. The elected RN will then be tracked via triangulation while other nodes in the group will be tracked based on the RN’s location using Bluetooth. Our initial analysis demonstrates very promising results with this system.
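
The velocity estimate mentioned in this abstract (the latest two location coordinates and the time between them) can be sketched as follows. This is a minimal illustration, assuming planar coordinates in meters and timestamps in seconds. The speed thresholds and update intervals are hypothetical values picked for the example, not figures from GreenLoc.

```python
import math


def estimate_speed(prev_fix, curr_fix):
    """Estimate speed in m/s from two location fixes.

    Each fix is an (x_meters, y_meters, t_seconds) tuple; this layout is an
    assumption made for the sketch.
    """
    (x0, y0, t0), (x1, y1, t1) = prev_fix, curr_fix
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("fixes must be in strictly increasing time order")
    return math.hypot(x1 - x0, y1 - y0) / dt


def update_interval_seconds(speed_mps):
    """Map an estimated speed to a location-update interval.

    The thresholds and intervals below are illustrative assumptions only.
    """
    if speed_mps < 0.2:   # effectively stationary: update rarely
        return 30.0
    if speed_mps < 2.0:   # walking pace
        return 5.0
    return 1.0            # running or faster: update often


# A device that moved 5 m in 5 s is moving at 1 m/s (walking pace).
speed = estimate_speed((0.0, 0.0, 0.0), (3.0, 4.0, 5.0))
interval = update_interval_seconds(speed)
```

The slower the estimated motion, the longer the WiFi adapter can stay idle between location updates, which is where the energy saving would come from.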




Hala 2.0: Considerations for Developing a Test Bed for Multi-Lingual, Cross-Cultural Human Robot Interaction Authors Micheline Ziadee Imran Fanaswala Maxim Makatchev Amna Al Zeyara Huda Gedawy Nawal Behih

Faculty Advisor Majd Sakr, Ph.D. Reid Simmons, Ph.D.

Category Postgraduate

Abstract

This research focuses on developing multi-lingual, cross-cultural human robot interaction by identifying, analyzing, and utilizing language and culture-related factors/variables that influence human interactions. Our test bed is a bi-lingual, cross-cultural robot receptionist, Hala, deployed at the Carnegie Mellon Qatar reception that helps us collect the necessary data. Hala can speak Arabic and English and she interacts with a variety of users. Hala as a platform models user interactions, invites users with a SICK laser, handles facial animations (borrowed from Paul Ekman’s models), text-to-speech and lip synchronization (borrowed English visemes for Arabic speech), as well as error and reporting, post dialogue analysis, networking/interprocess communication, and a rich client interface. Results from our prior work indicated variations in user interactions with the robot. For example, around 97% of the interactions with Hala were in English, but the average duration was higher for interactions in Arabic than for interactions in English (231.7 seconds for Arabic versus 184.4 seconds for English). Also, Arabic native speakers were twice as likely as English native speakers to accept an invite from the robot. Users in Qatar, as compared to those in the US, thank the robot less frequently. In this work, we discuss Hala 2.0, which has more Arabic features in expression and interaction. We constructed Hala’s personality taking into account the socio-cultural context in which her interactions will take place. We have expanded Hala’s stateless content, in English and in Arabic, to include previously unanswered queries that we gathered from Hala’s logs. We have introduced natural language processing with semantic and syntactic parsing and created new stateful content whereby Hala is capable of carrying on a meaningful conversation. We developed Arabic visemes to replace the borrowed English visemes that were used for Arabic speech.
In addition to that, we developed more facial animations that add to Hala’s ability to express emotions; our latest study focused on understanding the differences in accents of facial expressions that vary between cultures – in this case, Arabic and American. In another study, we investigated the strategies used by English and Arabic native speakers when giving directions. Results from these studies will influence the design of Hala 2.0.



Performance Prediction of MapReduce Applications in Elastic Compute Clouds Author Fan Zhang

Faculty Advisor Majd F. Sakr, Ph.D.

Category Postgraduate

Abstract The MapReduce programming model is a widely accepted solution to address the rapid growth of so-called big-data processing demands. Various applications with a very large volume of input data can run on an elastic compute cloud composed of many distributed computing instances. This elastic compute cloud is best represented by a virtual cluster, such as Amazon EC2. Performance prediction of MapReduce applications is challenging due to the complex interaction of the MapReduce framework running on highly parameterized distributed virtualized resources. In this study, we have characterized a series of MapReduce applications on Amazon EC2, and identified how input data size and cluster size affect the execution time. We studied the scaling curve of each application, and provided a number of data cleaning, mining and prediction methods to discover the data pattern. These MapReduce applications span data-intensive, compute-intensive, and iterative benchmarks. Initial observations suggest a near positive power-law distribution of execution time against the input data size and a negative power-law distribution against the cluster size. Based on all these observations, we have estimated the prediction error and suggested methods to reduce the estimation error. Five regression and prediction methods (linearly incremental regression, polynomial regression, exponential regression, moving average regression and power regression) are thoroughly investigated and compared for estimating prediction errors. We conclude that power regression performs best at performance prediction compared with the other methods evaluated. Our observations and performance prediction methods will aid users in choosing appropriate computing resources, both virtual and physical, from small-scale experimental test runs. This approach will predict performance speedups or slowdowns for MapReduce applications when scaling the infrastructure or the input datasets.
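
Power regression, the method this abstract finds most accurate, can be illustrated with a small sketch. One common way to fit a power law t = a * x^b is ordinary least squares on the log-transformed data, which is what the code below does. The abstract does not say how the authors fitted their model, so this is an assumption, and the sample measurements are made up for the example.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y).

    Returns (a, b). All xs and ys must be positive.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = log(a)
    return a, b


def predict(a, b, x):
    """Predicted value of the fitted power law at x."""
    return a * x ** b


# Made-up small-scale test runs: input size in GB vs. execution time in s,
# generated from t = 3 * size**0.5 so the fit can be checked.
sizes = [1.0, 2.0, 4.0, 8.0]
times = [3.0 * s ** 0.5 for s in sizes]
a, b = fit_power_law(sizes, times)
predicted_64gb = predict(a, b, 64.0)
```

A positive fitted exponent corresponds to execution time growing with input size. Fitting the same form against cluster size would be expected to give a negative exponent, matching the negative power-law trend reported above.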




SmartReader: A natural language processing-based active and interactive system for accessing English language content and advanced language learning Authors Kemal Oflazer Teruko Mitamura Hideki Shima Jun Araki Ahmed Salama

Faculty Advisor Kemal Oflazer, Ph.D.

Category Postgraduate

Abstract:

SmartReader is a general-purpose “reading appliance” being implemented at Carnegie Mellon University (Qatar and Pittsburgh), building upon an earlier prototype version. It is an artificial intelligence system that employs advanced language processing technologies and can interact with the reader and respond to queries about the content, words and sentences in a text. We expect it to be used by students in Qatar and elsewhere to help improve their comprehension of English text. SmartReader is motivated by the observation that text is still the predominant medium for learning, especially at the advanced level, and that text, being “bland,” is hardly a conducive and motivating medium for learning. This is especially true when one does not have access to tools that enable one to get over language roadblocks, ranging from unknown words to unrecognized and forgotten names, to hard-to-understand sentences. SmartReader strives to make reading (English) textual material an “active” and “interactive” process, with the user interacting with the text using an anytime-anywhere, contextually-guided query mechanism based on contextual user intent recognition. With SmartReader, a user can:

• Inquire about the contextually correct meaning or synonyms of a word or of idiomatic and multi-word constructions.

• Select a person’s name, and then get an immediate “flashback” to the first (or the last) time the person was encountered in the text to remind herself of the details of the person.

• Extract a summary of a section to remember important aspects of the content at the point she left off, and continue reading with a significantly refreshed context.

• Select a sentence that she may not be able to understand fully and ask SmartReader to break it down, simplify or paraphrase it to comprehend it better.

• Test her comprehension of the text in a page or a chapter, by asking SmartReader to dynamically generate quizzes and answering them.

• Ask questions about the content of the text and get answers, in addition to many other functions.

SmartReader is being implemented as a multi-platform (tablet/PC) client-server system using HTML5 technology, with Unstructured Information Management Architecture (UIMA) technology (used recently in IBM’s Watson Q/A system in the Jeopardy Challenge) as the underlying language processing framework.




VOtus: A Flexible and Scalable Monitoring Framework for Virtualized Clusters Authors Suhail Rehman Mohammad Hammoud

Faculty Advisor Majd Sakr, Ph.D.

Category Postgraduate

Abstract Cloud computing revolutionizes the way big data is processed and offers a compelling paradigm to organizations. Data-intensive scientific applications are being ported to cloud environments such as virtualized clusters; however, this process poses its own set of challenges. Given the complexity of the application execution environment as well as the infrastructure, routine tasks such as monitoring, performance analysis and debugging of applications deployed on the cloud become cumbersome and complex. These tasks often require close interaction and inspection of multiple layers in the application software stack. For example, when analyzing a distributed application that has been provisioned on a cluster of virtual machines, a researcher might need to look at the virtual resource usage (e.g., virtual CPU and virtual memory) and the corresponding physical resource usage (physical CPU and physical memory) of the cluster. This would require two different sets of tools to collect and analyze performance metrics from each level. One such tool is Otus, which currently reports only the virtual resource utilization and not the physical resource utilization on virtualized clusters. Through VOtus, we have extended Otus to include physical resource metrics that can be collected from the hypervisor. For example, a researcher can now view the virtual resource usage of his application on the VMs as well as the physical resource usage of the VMs on the physical machines. This enables the researcher to closely monitor the application and make modifications at the application level or at the VM level. This would help the researcher to optimize performance or manage infrastructure effectively. VOtus also scales to large clusters and can be used for real-time monitoring and to archive performance metrics for detailed analysis in the future. 
VOtus should prove to be an important tool for researchers who plan to design, develop and deploy distributed applications on virtualized clusters.




For more than a century, Carnegie Mellon University has been inspiring innovations that change the world. Consistently top ranked, Carnegie Mellon has more than 11,000 students, 90,000 alumni and 5,000 faculty and staff globally. In 2004, Qatar Foundation invited Carnegie Mellon to join Education City, a groundbreaking center for scholarship and research. Students from 39 different countries enroll at our world-class facilities in Education City. Carnegie Mellon Qatar offers undergraduate programs in biological sciences, business administration, computational biology, computer science and information systems. Learn more at www.qatar.cmu.edu



P.O. Box 24866 | Education City, Doha, Qatar | Ph: +974 4454 8400 | Fax: +974 4454 8410 | www.qatar.cmu.edu

