The School of Computational Science and Engineering 2020 Annual Report


Innovation

2020 ANNUAL REPORT

cse.gatech.edu



Faculty Awards

Professor Haesun Park and Joint CSE Professor Surya Kalidindi were named Regents’ Professors. The title of Regents’ Professor serves as recognition of the highest academic merit at Georgia Tech and is awarded by the Board of Regents to outstanding tenured full professors, based on excellence in research and contributions to their profession and to Georgia Tech. Both have made numerous and impactful contributions to their fields of research: data analysis, parallel computing, and data mining (Park), and physics-based multi-scale modeling and microstructure design (Kalidindi).

Assistant Professor Chao Zhang was selected as the winner of the 2019-2020 Google Faculty Research Award for the category of structured data. Zhang was recognized for his research developing new techniques that quantify uncertainty for deep learning models, which allow these models to better process unstructured data, improving the robustness and effectiveness of existing knowledge extraction technology. Zhang also received the ACM SIGKDD 2019 Dissertation Runner-up Award for his research in multidimensional mining of unstructured text data.

Associate Professor Polo Chau received the 2019 Outstanding Researcher Award from Intel, honoring his innovations in artificial intelligence and security and his contributions to the Intel Science & Technology Center for Adversary-Resilient Security Analytics (ISTC-ARSA) at Georgia Tech. The award is given annually to recognize researchers across scientific disciplines who have demonstrated exceptional innovation in work related to Intel initiatives.

COLLEGE OF COMPUTING

Regents’ Professor Haesun Park was named a Georgia Tech Face of Inclusive Excellence. This honor recognizes faculty, staff, and students whose accomplishments in their research, teaching, leadership, and service endeavors have earned special awards or recognition during the previous academic year.

Assistant Professor Tobin (Toby) Isaac was named the new Catherine M. and James E. Allchin Assistant Professor. This two-year professorship enables Isaac to undertake innovative and high-risk research at his discretion. This also marks the first time a CSE faculty member has received this prestigious professorship, which is earmarked for early-career faculty in the College.

Associate Professor Le Song received the 2020 College of Computing Outstanding Senior Research Faculty Award. This award is given to a faculty member who has made significant, high-quality, innovative contributions to their field of study, visibly impacting one or more mission areas.


Message from the Chair

The School of Computational Science and Engineering marked an important milestone this year, moving into its new permanent home on the 13th floor of the newly constructed 21-story CODA Building in the Technology Square area of the Georgia Tech campus. Our central location within this building, which co-locates data science and high-performance computing researchers and centers/institutes from across Georgia Tech alongside state-of-the-art data center facilities, provides a vibrant atmosphere for CSE to thrive well into the future. Tech’s strategic vision has nurtured a significant data science industry presence both within and near the CODA complex, which augurs well for the applied research and industry collaboration aspects of the CSE mission.

CSE added four new faculty during the last year. Two are junior faculty in their first academic appointment – Srijan Kumar, previously a postdoc at Stanford, and Xiuwei Zhang, previously a postdoc at the University of California at Berkeley. We also added two Associate Professors – Elizabeth Cherry and B. Aditya Prakash, who previously served at the Rochester Institute of Technology and Virginia Tech, respectively. Together, these faculty bolster our strengths in data science and applied machine learning (Kumar and Prakash), cardiac modeling (Cherry), and computational biology (Zhang).

We are a small school of 15 full-time and 4 joint academic faculty, but our achievements and contributions are anything but small. Two of our professors, Surya Kalidindi and Haesun Park, were awarded the Regents’ Professor title last year, the highest academic and research recognition in the University System of Georgia. Our faculty received six Fellow recognitions from professional societies such as IEEE, SIAM, ISCB, SCS, and I/ITSEC. We maintain a robust active funding portfolio of nearly $25 million spread across all major federal agencies and industry.

We are also pivotal to computation and data-driven interdisciplinary research and education at Georgia Tech, particularly with engineering and science disciplines. New initiatives well underway include the addition of the School of Mechanical Engineering to our flagship CSE Ph.D. program and the creation of a new M.S. program in Urban Analytics jointly with the School of City and Regional Planning. Our students continue to bring home fellowships and best paper and dissertation awards, both within and outside Georgia Tech. A team of outstanding staff serves as the backbone of our entire operation, taking pride in our collective advancement, well-being, and mission of outreach and service.

This annual report offers an eclectic sample of contributions and achievements by CSE students and by research and academic faculty. We remain committed to fueling the growth of computational and data-enabled advances in science, engineering, technology, and society, and to working with our partners in academia and industry worldwide. Come join us or collaborate with us!

Srinivas Aluru, Professor, Interim Chair



Research Spotlight

Georgia Tech and Intel Selected for Multimillion-Dollar DARPA Award

Researchers from CSE and Intel are working together to strengthen cybersecurity defenses for machine learning (ML) models designed for vision systems.

Bolstered by a new four-year, multimillion-dollar Defense Advanced Research Projects Agency (DARPA) grant, the team will create deception-resistant ML technologies with an emphasis on object detectors for the Guaranteeing AI Robustness against Deception (GARD) program. Object detectors identify objects within an image or video using labels and bounding boxes.

While no known real-world attacks have been made on these systems, a team of researchers first identified security vulnerabilities in object detectors in 2018 with a project known as ShapeShifter. Led by CSE Associate Professor Polo Chau at Georgia Tech’s Intel Science and Technology Center for Adversary-Resilient Security Analytics (ISTC-ARSA), the ShapeShifter project exposed adversarial machine learning techniques that were able to mislead object detectors and even erase stop signs from autonomous vehicle detection.

“As ML technologies have developed, researchers used to think that attacking object detectors would be difficult. ShapeShifter showed us that was not true, they can be affected, and we can attack them in a way to have objects disappear completely or be labeled as anything we want,” said Chau, who serves as the lead investigator from Georgia Tech on the GARD program.

“The reason we study vulnerabilities in ML systems is to get into the mindset of the bad guy in order to develop the best defenses. The GARD program provides us with an excellent opportunity for this,” he said.

UnMask combats adversarial attacks (in red) by extracting building-block knowledge (e.g., wheel) from the image (top, in green), and comparing them to expected features of the classification (“Bird” at bottom) from the unprotected model. Low feature overlap signals attack. UnMask rectifies misclassification using the image’s extracted features.

GARD is a DARPA-funded program that aims to establish theoretical ML foundations to identify system vulnerabilities in real-world applications. Intel and Georgia Tech are leading a program team together under this platform, with Intel serving as the prime awardee and Georgia Tech’s funding totaling $1.3 million.

The four-year program is divided into three phases, with the first phase focused on enhancing object detection technologies through spatial, temporal, and semantic coherence for both still images and videos. These three qualities provide contextual clues for determining whether a possible anomaly or attack is occurring.

“Our research develops novel coherence-based techniques to protect AI from attacks. We want to inject common sense into the AI that humans take for granted when they look at something. Even the most sophisticated AI today doesn’t ask, ‘Does it make sense that there are all these people floating in the air and are overlapping in odd ways?’ Whereas we would think it’s unnatural,” said Chau. “That is what spatial coherence attempts to address – does it make sense in a relative position?”

This idea of applying common sense to AI object recognition extends to other coherence-based techniques, such as temporal coherence, which checks for suspicious disappearances or reappearances of objects over time. The team’s UnMask semantic coherence technique, which is based on meaning, looks to identify the parts of an object rather than just the whole, and verifies that those parts indeed make sense.

In terms of defenses, the goal of all three coherence-based techniques is to force attackers to satisfy the rules of every category at once. This multi-perspective approach thwarts adversarial ML attempts that do not meet these complex rules, causing them to be flagged as security breaches.

As AI models with image recognition software are increasingly implemented and used in daily applications, the need to understand and thwart attacks on such programs is critical across fields. The GARD program aims to develop effective defenses across broad ranges of attacks, with Georgia Tech and Intel helping lead the way.
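The UnMask caption above describes a simple decision rule: extract part-level features from an image, compare them with the features expected for the predicted class, and treat low overlap as a sign of attack. A minimal Python sketch of that rule follows. The feature sets, the Jaccard overlap measure, the 0.5 threshold, and every function name here are illustrative assumptions, not details of the actual UnMask system.

```python
# Hypothetical sketch of a feature-overlap check; not the UnMask implementation.

# Assumed mapping from a class label to the parts we would expect to see.
EXPECTED_FEATURES = {
    "bird": {"beak", "wing", "tail", "leg"},
    "car": {"wheel", "door", "window", "headlight"},
}


def jaccard(a, b):
    """Overlap between two feature sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def check_prediction(predicted_label, extracted_features, threshold=0.5):
    """Flag a prediction whose expected parts do not match the extracted parts.

    Returns (is_attack, label). When an attack is flagged, label is the class
    whose expected features best overlap the extracted ones; otherwise it is
    the original prediction.
    """
    extracted = set(extracted_features)
    overlap = jaccard(extracted, EXPECTED_FEATURES[predicted_label])
    is_attack = overlap < threshold  # low feature overlap signals attack
    best_label = max(
        EXPECTED_FEATURES,
        key=lambda label: jaccard(extracted, EXPECTED_FEATURES[label]),
    )
    return is_attack, (best_label if is_attack else predicted_label)
```

Under these assumptions, `check_prediction("bird", ["wheel", "door", "window"])` flags the prediction and relabels it as "car", mirroring the caption's example of wheel-like parts contradicting a "Bird" classification.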



Spotlight on Faculty Fellowships

Regents’ Professor Richard Fujimoto was named a Fellow of three different organizations this year. Fujimoto was named an Institute of Electrical and Electronics Engineers (IEEE) Fellow, a 2019 Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) Fellow, and a 2020 Fellow of the Society for Modeling and Simulation International (SCS) for his work in parallel and distributed discrete event simulation. Discrete event simulations model operations within a system and are used in a wide variety of applications. Fujimoto has authored and co-authored hundreds of technical papers on the subject as well as several books, which span application areas including transportation systems, telecommunication networks, and multiprocessor and defense systems. The announcement of all three recognitions came only two years after he was named an Association for Computing Machinery Fellow in 2017.

Interim Chair and Professor Srinivas Aluru and CSE Professor and Associate Chair Ümit V. Çatalyürek have both been inducted into the 2020 Class of Society for Industrial and Applied Mathematics (SIAM) Fellows. Nominated for exemplary research and outstanding service to the SIAM community, Aluru and Çatalyürek account for two of the 28 Fellows inducted into this year’s international Fellows Program. Aluru was recognized by SIAM for his contributions to the field of computational genomics through research on sequential and parallel discrete algorithms, and for his leadership in data science and engineering. Çatalyürek was recognized by SIAM for his contributions to the fields of combinatorial scientific computing and high-performance and parallel algorithms – fields of research in which he has won a number of awards in prior years.

Joint CSE Regents’ Professor Mark Borodovsky was elected as an International Society for Computational Biology (ISCB) Fellow for his influential research in developing algorithms for genome analysis as well as his recognized leadership in education and community development. He is one of twelve ISCB Fellows elected in the Class of 2020. Borodovsky is best known for his work on gene-finding algorithms, which are widely used. He is also responsible for launching the interdisciplinary Bioinformatics Master’s and Ph.D. programs at Georgia Tech. He is the first faculty member at Georgia Tech, and in the state of Georgia for that matter, to become an ISCB Fellow.



CSE Stats and Infographic
April 1, 2019 – April 1, 2020

Program Info:
Total Student Enrollment: 177 (male 126, 71%; female 51, 29%)
Ph.D. in CSE: 71 (male 87%; female 13%)
M.S. in CSE: 106 (male 60%; female 40%)
Ph.D.: 2 (CSE and CS)
M.S.: 3 (CSE, CS, Analytics)
Interdisciplinary Programs: 4 (M.S. and Ph.D. in Bioinformatics / M.S. and Ph.D. in Bioengineering)

Undergraduate Diversity
Percentage of underrepresented minority computing undergraduates: 15.8
Percentage of female computing undergraduates: 25.2

People
Total Staff and Faculty: 47
Academic Faculty: 19
Other categories: Staff, Research Scientists, Professors, Associate Professors, Assistant Professors, Joint Appointments, Adjunct Appointments
Other counts (order not preserved): 15, 7, 6, 6, 5, 4, 4

FY20 Research Funding
Total Active Funding: $24,928,169
Research Expenditures: $2,970,504
CSE Active Research Projects: 74
Research Expenditures per Faculty: $228,500

Air Force (1): $492K
DARPA (1): $1.2M
National Science Foundation (29): $14.1M
National Institutes of Health (2): $1M
Industry (2): $268K
Georgia Tech Foundation (26): $4.6M
NASA (1): $153K
National Labs (PNNL, Sandia, etc.) (6): $1.2M
Department of Energy (3): $828K
Other (subawards) (3): $769K



CSE Provides COVID-19 Relief with Big Data Research

With the COVID-19 pandemic changing the global landscape, CSE faculty have harnessed their diverse research initiatives to unite under the goal of facilitating aid during the crisis. From cross-institute computational epidemiology projects to cybersecurity efforts, there is no shortage of ways in which CSE research is being used by both local governments and global organizations to address the defining challenge of the century. Read on for highlights of some of the CSE COVID-19 relief initiatives.

Disease Forecasting

A new COVID-19 forecasting project hosted by the Centers for Disease Control and Prevention (CDC) is helping predict the spread of the coronavirus disease, the peak weeks and days of new cases, and the number of hospitalizations caused by COVID-19 across the country. This critically timed project comprises more than 10 teams of leading data scientists, epidemiologists, statisticians, and high-performance computing researchers from national laboratories, public universities, public health institutions, and the private sector.

Associate Professor Aditya Prakash and CSE Ph.D. student Alexander Rodriguez are lead investigators for one of the teams on the project and are bringing a new data-driven approach to disease forecasting. Their team is using a deep learning model to predict characteristics of the novel coronavirus disease at the national, regional, state, and local levels. The CDC synthesizes their weekly and monthly predictions with other models to determine policy and resource allocation to help communities prepare for and fight the disease.

“We want to predict quickly and early to give lead times to decision makers to decide appropriately when to and how to allocate resources such as determining where to send ventilators, where additional beds are most critically needed, vaccine creation timelines, implementing temporary shelter in-place orders, whether additional guidance to state and local authorities is needed, and more,” said Prakash.



CSE Research Technologist Will Powell works on servers in the College of Computing Building Data Center during the COVID-19 pandemic.

Prakash is an expert in epidemiology and infectious disease forecasting and has been a lead researcher on the preexisting influenza forecasting project with the CDC since 2018.

Predicting Hate Crimes Amid COVID-19 Outbreak

CSE Assistant Professor Srijan Kumar has helped alleviate issues brought on by COVID-19 on the sociological front. As the novel coronavirus has spread across the globe, reports of harassment and violent attacks aimed at Asians and Asian Americans have dramatically increased. To combat this targeted harassment, Kumar and his new lab, CLAWS, have developed a data science pipeline that leverages social media signals to measure how targeted hate and racism have spread worldwide and to forecast these attacks with data-driven solutions.

“In many unfortunate documented cases, xenophobic behaviors excited by the coronavirus have led to extreme physical harm to the victims and mental distress for others in the community,” said Kumar.

The pipeline combines online data from news and social media platforms with offline data, such as reports of physical abuse and racial incidents, to measure how targeted harassment is spreading in online and physical communities, comparing that spread side by side with coverage of the disease. There are five levels of harassment, with the first level being the least damaging. Each subsequent level is more aggressive, with the fifth level including fatalities caused by hate crimes.

“We are coding up each incident with a corresponding number and then scraping news websites, collecting all the race incidents and physical abuse incidents in the real world, and essentially creating a timeline of how things have progressed,” said Kumar.

“Our hypothesis is as this virus is spreading through different cities and different states, people get more alarmed and scared, and that anxiety leads to more hate and harassment crimes in those particular cities,” he said.

Kumar’s team is the first to create a real-time database and pipeline that can forecast deviant behaviors using disease spread as an indicator. The pipeline is also general enough to be customized for other problems, such as monitoring and predicting harassment of other minority groups, drug use, child abuse, suicides, and more.

“When pandemics or crises happen, there is an increase in public health and safety issues. So, we are using this particular framework as this is a first-use case to track and predict these problems and we plan to expand to others,” said Kumar.

Increasing Test Kit Availability

A lack of widely available testing is a central obstacle to alleviating the pandemic crisis in the United States. As of April 2020, this is a particularly critical problem for the state of Georgia, which ranked among the bottom states in the country for testing per capita. To combat this issue, state leadership launched the Georgia Coronavirus Task Force, which was created to assess the state’s preparations and procedures for preventing, identifying, and addressing the spread of COVID-19.



As part of this program, universities across the state are coordinating with the task force on what is referred to as a “lab surge” to increase the availability of testing for Georgia residents. Efforts are also underway to manage the exchange of critical supplies and carry out pandemic modeling to inform policy makers. Through this effort, Georgia Tech will significantly increase the number of coronavirus test kits processed, which will also ease the supply chain for other critically needed public health items.

Using HPC to Model Pandemic Simulations

CSE Interim Chair and Professor Srinivas Aluru has contributed to these initiatives using high-performance computing and data science. He is also supporting the state’s efforts by coordinating the use of Hive, Georgia Tech’s new $5.3 million high-performance computing (HPC) system, for creating pandemic simulations.

Aluru said, “Many people use models to predict two things. One is to predict expected future scenarios, this could be in terms of hospitalizations, deaths, and infections. This could also be used to model resources such as personal protective equipment (PPE) and you could model what kind of resource needs a hospital will have. Models can also predict different types of use cases such as when school closures or businesses reopening could and should happen.”

However, given the daily changes in both human and virus behavior, models will likely never be perfect. But they help us develop a better understanding of what type of action a problem requires and are a valuable tool for policy makers facing critical decisions. This is where high-performance computers, such as the Hive, are a great asset.

The Hive was acquired by the Institute for Data Engineering and Science (IDEaS) through a $3.7 million National Science Foundation (NSF) Major Research Instrumentation award, and Aluru serves as the principal investigator on the project. The supercomputer has over 100 trillion bytes of memory, 11,500 compute cores, and 2.5 quadrillion bytes of storage.

Some of the Hive’s capacity has been redirected to accommodate COVID-19 research. School of Industrial and Systems Engineering Professor Pinar Keskinocak was the first to have her simulation research migrated to the Hive. CSE Associate Professor Aditya Prakash is also conducting his work on an NSF Expeditions Project using the Hive. This large multi-institution project, which received a $10 million grant, is actively working with multiple federal and state agencies to support response efforts for the current pandemic. The project is specifically helping support response efforts by capturing the complexities underlying the disease, providing new analytical capabilities to decision makers, and using AI to develop simulations of multi-scale, multi-layer networks that provide insights into how the pandemic can be controlled.



CSE Welcomes Four New Faculty

The School of Computational Science and Engineering continues to diversify and expand its faculty pool. Since 2019, CSE has added four new tenure-track faculty members, who come from diverse regions across the globe and bring equally diverse research backgrounds. With these new research strengths integrated into the already robust CSE research agenda, the school now boasts more data mining and bioinformatics research than ever before. In a time when misinformation and infectious disease are on the rise, this could be seen as the clearest sign of CSE leadership’s forward-thinking and strategic growth plan. Meet the new CSE faculty below.

ELIZABETH CHERRY
Associate Professor

Cherry joined CSE this month after teaching at the School of Mathematical Sciences at the Rochester Institute of Technology. While at the Rochester Institute, she found her research stride in mathematical biology with a focus on cardiac electrophysiology and arrhythmias.

According to Cherry, “Sudden death is secondary to ventricular fibrillation and remains a leading cause of mortality in the United States. And the occurrence of atrial fibrillation, which is responsible for about 15 percent of all strokes, continues to rise.”

In efforts to address this issue, Cherry utilizes mathematical modeling and simulation of the electrical dynamics of cardiac cells and tissue. This field of research is highly interdisciplinary and requires expertise in a number of sub-fields, including modeling, computational algorithms, scientific applications, and advanced interactive visualization.

She has won numerous awards and recognitions for her work in this field, including:

Recipient of Trustees Scholarship Award, Rochester Institute of Technology, 2019
Recipient of the Outstanding Student Mentor Award, Rochester Institute of Technology, College of Science, 2016
Recipient of the Outstanding Faculty of the Year Award, Rochester Institute of Technology, College of Science, 2013

SRIJAN KUMAR
Assistant Professor

With high-stakes decisions being made via the web, the ways in which malicious users engage with us online can have a profoundly negative impact on our lives and on society as a whole.

For Srijan Kumar, a new assistant professor in Georgia Tech’s School of Computational Science and Engineering, this is a concept that transcends social media, encompassing most, if not all, of the web and society. His research group, CLAWS (the Computational Lab for the Web and Society), was established with the goal of improving the safety and well-being of people worldwide. This is achieved by ridding the user experience of digital abuse and disinformation pitfalls and by using online social signals to forecast harmful real-world events, such as mass shootings.

“Broadly, my group’s research is in data science and applied machine learning and we create the next generation of algorithms to understand and improve how users behave online and how it impacts society,” said Kumar.

These next-generation algorithms are used to understand and forecast deceptive behavior that attempts to manipulate and misinform users. Instances in which these behaviors occur are vast and can, according to Kumar, be categorized based on the three areas of use that they impact.

“There are three major things people do online: interact with one another, consume information, and act on the recommendations they are shown. A way to unify and transform the user experience is to develop the user models, which are deep-learning and network-based models,” he said.

The applications of Kumar’s work stretch far beyond harassment, however, and his anti-abuse algorithms have been used by the likes of Flipkart, India’s largest e-commerce platform, and Wikipedia.

Prior to Georgia Tech, Kumar was a visiting research scientist at Google AI and a postdoctoral researcher at Stanford University. He is the recipient of the 2018 ACM SIGKDD Doctoral Dissertation Award runner-up, WWW 2017 Best Paper Award runner-up, Larry S. Davis Doctoral Dissertation Award 2017, and Dr. BC Roy Gold Medal.

B. ADITYA PRAKASH
Associate Professor

Prakash’s research invents new data science and machine learning techniques for networks and sequences. His work has applications in public health, cybersecurity, critical infrastructure systems, and the web. By using these techniques, Prakash is able to solve real-world problems and develop tools to help leading organizations such as the Centers for Disease Control and Prevention (CDC), Wal-Mart, Facebook, and Oak Ridge National Laboratory (ORNL).

“A big draw for me to these technically challenging problems is their inherent interdisciplinarity and potential for high societal impact. Simply put, progress here can save lives and make a real difference,” he said.

For Prakash, making a difference does not end with just understanding the data and using it for different applications. Instead, Prakash believes in using data science as a means to drive informed policies and decisions.

“Networks are a great abstraction for modeling real-world phenomena. As they give us both a local and a global perspective, they are able to provide an opportunity to bridge gaps between data, models, and actionable strategies,” he said.

His work is now used for a wide variety of these phenomena, including finding failure hot spots in energy grids, guiding users to relevant products on e-commerce websites, and designing policies to determine how best to allocate scarce resources for hospital infection control. His group is also taking part in the CDC forecasting project for past and current pandemics, which aims to use influenza-like illness surveillance data to understand the trajectory of disease outbreak in the US.

Prior to joining Georgia Tech, Prakash was an associate professor of computer science at Virginia Tech. He received his Ph.D. at Carnegie Mellon University and an undergraduate degree at IIT-Bombay. He is a recipient of the NSF CAREER award and multiple best paper awards, and was named one of the ‘AI 10 to Watch’ by the IEEE.

XIUWEI ZHANG
Assistant Professor

Zhang joined CSE on Aug. 1 after working as a postdoctoral researcher in the Electrical Engineering and Computer Sciences Department at the University of California at Berkeley. Zhang’s research focuses on data science, method development, and data analysis with an emphasis on computational biology.

While at Berkeley, her time centered on two different projects, each using single-cell sequencing. One project, called SymSim, published in the journal Nature Communications, developed a simulator to model processes observed during single-cell RNA sequencing experiments.

According to Zhang, SymSim simulates single-cell RNA data, which allows researchers to benchmark various computational methods.

“What we really want to understand is what is controlling all the changes in the cells and track their differences,” she said. “On a mechanism level, we need to not only look at the RNA sequencing data but also integrate other types of data such as protein analysis.”

Zhang has won several distinguished awards in the areas of computational biology and data analysis, including:

Swiss National Science Foundation (SNSF) Fellowship for Prospective Researchers, 2012
SNSF Advanced Postdoc Mobility Fellowship, 2014
Simons-Berkeley Research Fellowship, 2016


CSE Student Spotlight

Xinshi Chen

Hua Huang

Computational Science and Engineering offers a uniquely interdisciplinary pool of student researchers who specialize in bridging software and hardware together with real-world applications ranging from high-performance computing to cybersecurity and more. We’d like to introduce you to several CSE students with diverse research interests who have received awards for their outstanding work this year.

Xinshi Chen is a CSE Ph.D. student advised by CSE Associate Professor Le Song, who specializes in principled machine learning research with a focus on learning-based algorithm design and deep learning for structured data.

Chen was honored in 2020 for outstanding graduate research in machine learning, a field she began exploring only recently, having started the program with a background in math.

Since coming to Georgia Tech and working under Song’s guidance, Chen has thrived in the machine learning field. She is currently working on algorithmic design research that aims to automatically learn an algorithm from data and apply the learned algorithm to solve new problems.

“Both algorithms and deep learning models are solving problems and making predictions for various tasks. Our project investigates the connection between traditional algorithms and deep learning models, and the strengths of these two can be combined to help each other.”

According to Chen, the design of algorithms can be automated and improved upon by learning from data, with the data-driven components filling the gaps between the rules designed by experts and real-world observations.

“On the other hand, deep learning models can use algorithm structures as inductive bias for designing the architectures, which can improve the data efficiency and interpretability of deep learning models,” she said.


Austin Wright

Wendi Ren

“By viewing learning-based algorithms as deep learning models, currently we are designing a theoretical framework to understand their behaviors from the learning theory perspective, by characterizing their generalization, representation abilities, etc.,” said Chen.

When asked why she chose to study with the School of CSE, Chen said it was due to its interdisciplinary approach to modern science.

“There is a broad range of research topics that the CSE faculties are working on, and many of these research topics are closely related to our daily life, including high performance computing, healthcare, and computational biology. It is very useful to attend the seminars organized by CSE and interact with people with different backgrounds as it can bring in new ideas from communities outside machine learning,” she said.

CSE Ph.D. student Hua Huang won the Georgia Tech 2020 Sigma Xi Best MS Thesis award for 2019. Sigma Xi is a scientific research honor society that awards achievement in science or engineering research and communication as part of its mission to enhance the health of the research enterprise and promote the public’s understanding of science for the purpose of improving the human condition.

Huang is advised by CSE Associate Professor Edmond Chow and works in the field of computational chemistry. Specifically, Huang’s research focuses on creating effective and efficient frameworks that use high-performance computing to facilitate better processes for computational chemistry problems.

His recognized thesis, Performance Optimizations for Quantum Chemistry Calculations, focuses on three topics: (1) batching and vectorizing electron repulsion integral (ERI) calculations, since ERI calculation is a building block in quantum chemistry calculations; (2) improving network communication performance for a large-scale eigensolver in quantum chemistry calculations; and (3) implementation of a lightweight portable library for distributed matrix communications in quantum chemistry calculations.

“In my research, I pay a lot of attention to the difference between the theories and the actual performance of the software. The larger the difference, the more we can dig into it and learn something new,” said Huang.



Patrick Flick

“In the United States and other countries, about one third of computer time on supercomputers is used for quantum chemistry calculations. My thesis proposes several new methods for optimizing those commonly used computational kernels in quantum chemistry calculations.”

Currently, Huang and his research team are working with several developers of popular quantum chemistry packages to implement and benefit from his thesis’ findings. With the addition of Huang’s research, these packages will help facilitate research for materials science and drug design.

“Many CSE professors have a deep understanding of both high-performance computing and the scientific areas they are working in. The combination of domain specific knowledge and the experience of writing code for supercomputers allows us to better define the computational challenges we need to overcome. Meanwhile, CSE has a powerful supercomputer, the IDEaS-Hive system, which allows us to fully explore the possibilities of calculations,” he said.

Austin Wright is a CSE Ph.D. student advised by Associate Professor Polo Chau, with a research focus on human-centered design and AI for social good.

Wright was recently selected as a GT-GSU Public Interest Technology (PIT) Fellow for 2020. This program was founded to support collaborations between technologists and social scientists centered around the continued equity challenges of the Southeast region and provides a model for regional PIT work focused on community challenges.

Specifically, Wright will be collaborating with Professor Scott Jacques from the Andrew Young School of Public Policy at Georgia State to develop novel data analysis and visualization tools to make crime and criminology data and trends more accessible.

“Currently, the availability of standardized and up-to-date crime incidence data from the federal government is complicated by the tools used for dissemination. Furthermore, the audience for whom this data is the most important does not often have extensive training in data visualization and analysis, which can lead to erroneous conclusions or misleading charts in news publications or even in academic research,” said Wright.

By making the process of scraping up-to-date data easier, and then enabling subject matter experts to more effectively generate quantitative analyses, this research can make insights from the existing data more readily available to those in academia and the general public.

“Many of us make important decisions on the basis of crime trends, such as where to live or what public policy to support. By making



the data and analysis of this information more available and less prone to mathematical error, more of us can easily make informed decisions based on these trends,” he said.

The project makes use of a wide variety of methodologies including human-centered design practices, data visualization and automated data analysis, as well as computer science and social science analysis.

“My advisor, Polo Chau, has been immensely supportive of this collaboration and project. While it is interdisciplinary, and much of it does not always fit neatly into the standard delineations of academic disciplines, the flexibility and focus on impact of CSE has allowed me to pursue this goal in a way that I feel is very unique to this department,” said Wright.

Wendi Ren is a CSE M.S. student advised by Assistant Professor Chao Zhang, who specializes in machine learning for text data with an emphasis on improving label efficiency and model interpretation.

Ren is the recipient of the 2020 Marshall D. Williamson Fellowship, an award that honors second-year master’s students who embody values of academic excellence and leadership. She was honored with this award for her strong academic performance and high GPA, as well as for being recognized as an outstanding graduate teaching assistant (TA) with the Thank a Teacher award from the Georgia Tech Center for Teaching and Learning.

While Georgia Tech has always been a dream school for Ren, choosing a graduate program took some consideration as she searched for a flexible yet robust degree.

“Compared to other graduate programs, I feel that CSE at Georgia Tech could provide me a free space to choose between research and job prospects. I did not have a very clear career path, so I wanted to try different options. Luckily, CSE provides enough resources to be successful in both industry and academia,” she said.

“I like the course setting of CSE very much. It is very flexible for us to choose courses either related to research topics or job required skills. The high quality of each course with reasonable design and workload really gives us a solid CS background.”

Ren is currently creating training algorithms that are able to learn neural text classifiers without using any labeled data, relying only on easy-to-provide heuristic rules as weak supervision.

“It is challenging because rule-induced labeled data are often noisy and have low coverage. To address these challenges, we propose a model that is learned from multiple weak supervision sources with two key components,” she said.

To date, Ren’s approach, which uses two components (a rule denoiser and a neural classifier) integrated into a co-training framework, has outperformed state-of-the-art weakly-supervised and semi-supervised methods by 9.2 percent on average on five widely used benchmarks.

“Deep learning techniques have demonstrated superior performance in text mining tasks, however, deep neural networks (DNNs) are data hungry. And in many applications, large-scale labeled data are unavailable and manually annotating data at a large scale can be prohibitively expensive,” said Ren.

In such cases, the lack of training data becomes the key bottleneck of applying DNNs






for text classification. Ren’s work solves this problem by only using easily accessible heuristic rules, which eliminates the time-consuming annotation process but maintains the high classification accuracy.

CSE Ph.D. alumnus Patrick Flick has been selected for the prestigious ACM SIGHPC Dissertation Award for 2020. Flick is the first recipient in Georgia Tech history of the award, which honors one outstanding doctoral dissertation focused on high-performance computing (HPC) research each year.

The winning dissertation, Parallel and Scalable Combinatorial String Algorithms on Distributed Memory Systems, offers a new approach to solve large-scale string and graph problems used throughout computational biology applications. The computational methods introduced in Dr. Flick’s work achieve efficient and scalable execution on large-scale distributed compute clusters, enabling solutions to increasingly large problems.

Inspired by the advent of high-throughput DNA sequencing, which enables generation of billions of reads per minute, and the growing need to find a computational approach that can keep pace, this research expands on prior theoretical approaches. The resulting algorithms and data structures implemented by Flick advance the state-of-the-art by providing improved theoretical complexity and better practical performance, while minimizing overall and per-node communication volume within a computer’s distributed memory architecture.

Ultimately, these findings offer a more efficient method to represent, construct, and query data structures for large-scale and memory-intensive applications in text processing, information retrieval, and computational biology.

Flick joined CSE for his Ph.D. in 2014 under the guidance of CSE Professor and Interim Chair Srinivas Aluru.

According to Aluru, “Patrick’s Ph.D. work addresses some notoriously difficult problems in parallel string algorithms, and his dissertation gets it just right by providing both theoretical optimality and practical efficiency. His work, all published in top forums in the field, has lasting value. It is gratifying to see him win this year’s ACM SIGHPC Dissertation Award.”

Flick defended his thesis in March 2019 and officially graduated the following May. He is now a software engineer at Google.

Flick’s previous successes include authoring the first paper used for the Student Cluster Reproducibility Challenge at Supercomputing 2016 and winning the Best Student Paper Award at Supercomputing 2015.

SIGHPC is the ACM’s special interest group that focuses on providing a platform for high-performance computing (HPC) research and efforts internationally. The ACM SIGHPC Dissertation Award pulls from this professional society in an effort to highlight innovative and prolific research in the supercomputing and parallel processing fields.

The 2020 ACM SIGHPC Dissertation Award includes a $2,000 honorarium, travel support to the Supercomputing Conference, and an award plaque.



Faculty

Srinivas Aluru, Professor, Interim Chair
Executive Director, Institute for Data Engineering and Science
Ph.D., Iowa State University, 1994
IEEE Fellow, SIAM Fellow, AAAS Fellow, NSF CAREER Award, John V. Atanasoff Discovery Award, Outstanding Achievement in Research Program Development Award

Mark Borodovsky, Regents’ Professor, Joint with Wallace H. Coulter Department of Biomedical Engineering
Director, Center for Bioinformatics and Computational Genomics
Ph.D., Moscow Institute of Physics and Technology, 1976
ISCB Fellow, AIMBE Fellow

Ümit V. Çatalyürek, Professor, Associate Chair for Academic Programs
Director, CSE Graduate Programs
Ph.D., Bilkent University, Turkey, 2000
IEEE Fellow, SIAM Fellow, NSF CAREER Award

Polo Chau, Associate Professor
Associate Director, MS Analytics
Ph.D., Carnegie Mellon University, 2012
Raytheon Faculty Fellowship Award, James Edenfield Faculty Fellowship, Intel Outstanding Researcher Award, Google Faculty Research Award

Elizabeth Cherry, Associate Professor
Ph.D., Duke University, 2000

Edmond Chow, Associate Professor
Ph.D., University of Minnesota, 1997
PECASE Award, ACM Gordon Bell Prize

Barry Drake, Senior Research Scientist, Joint with Georgia Tech Research Institute of Cyber Technology and Information Security
M.S., University of Washington

Richard Fujimoto, Regents’ Professor
Ph.D., University of California Berkeley, 1983
ACM Fellow, IEEE Fellow, 2019 I/ITSEC Fellow, Class of 1934 Outstanding Interdisciplinary Activities Award

Felix Herrmann, Professor, Joint with School of Earth and Atmospheric Sciences and School of Electrical and Computer Engineering
Ph.D., Delft University of Technology, 1997
Georgia Research Alliance Eminent Scholar in Energy, 2019 Distinguished Lecturer of the Society of Exploration Geophysicists, Reginald Fessenden Award

Tobin Isaac, Assistant Professor
Ph.D., University of Texas at Austin, 2015
ACM Gordon Bell Prize, James E. Allchin Assistant Professor Award

Surya Kalidindi, Regents’ Professor, Joint with School of Mechanical Engineering
Ph.D., Massachusetts Institute of Technology, 1992
DoD Vannevar Bush Faculty Fellowship, Alexander von Humboldt Research Award, Khan International Award

Srijan Kumar, Assistant Professor
Ph.D., University of Maryland, 2017
Facebook Faculty Award, Adobe Faculty Award, ACM SIGKDD Doctoral Dissertation Award Runner-up

Haesun Park, Regents’ Professor
Ph.D., Cornell University, 1987
IEEE Fellow, SIAM Fellow, 2019 Faces of Inclusive Excellence

B. Aditya Prakash, Associate Professor
Ph.D., Carnegie Mellon University, 2012
Facebook Faculty Award, NSF CAREER Award, IEEE ‘AI 10 to Watch’

David Sherrill, Professor, Joint with School of Chemistry and Biochemistry
Ph.D., University of Georgia, 1996
AAAS Fellow, American Chemical Society Fellow, American Physical Society Fellow, Vasser Wooley Faculty Fellow, NSF CAREER Award

Le Song, Associate Professor
Associate Director, Center for Machine Learning
Ph.D., University of Sydney and National ICT Australia, 2008
NSF CAREER Award

Rich Vuduc, Professor
Director, Center for High Performance Computing
Ph.D., University of California Berkeley, 2004
NSF CAREER Award, Gordon Bell Prize, Lockheed-Martin Aeronautics Company Dean’s Award for Teaching Excellence

Hongyuan Zha, Professor
Ph.D., Stanford University, 1993

Chao Zhang, Assistant Professor
Ph.D., University of Illinois at Urbana-Champaign, 2018
Google Faculty Research Award

Xiuwei Zhang, Assistant Professor
Ph.D., École Polytechnique Fédérale de Lausanne, 2011


cse.gatech.edu Georgia Institute of Technology 801 Atlantic Drive Atlanta, GA 30332-3000

facebook.com/gtcomputing instagram.com/gtcomputing @gtcse
