Innovation
2020 ANNUAL REPORT
cse.gatech.edu
Faculty Awards
Professor Haesun Park and Joint CSE Professor Surya Kalidindi were named Regents’ Professors. The title of Regents’ Professor serves as recognition of the highest academic merit at Georgia Tech and is awarded by the Board of Regents to outstanding tenured full professors, based on excellence in research and contributions to their profession and to Georgia Tech. Both have made numerous and impactful contributions to their fields of research: data analysis, parallel computing, and data mining (Park), and physics-based multi-scale modeling and microstructure design (Kalidindi).
Assistant Professor Chao Zhang was selected as the winner of the 2019-2020 Google Faculty Research Award in the category of structured data. Zhang was recognized for research developing new techniques that quantify uncertainty for deep learning models, allowing these models to better process unstructured data and improving the robustness and effectiveness of existing knowledge extraction technology. Zhang was also awarded the ACM SIGKDD 2019 Dissertation Runner-up Award for his research in multidimensional mining of unstructured text data.
Associate Professor Polo Chau received the 2019 Outstanding Researcher Award from Intel, honoring his innovations in artificial intelligence and security and his contributions to the Intel Science & Technology Center for Adversary-Resilient Security Analytics (ISTC-ARSA) at Georgia Tech. The award is given annually to recognize researchers across scientific disciplines who have demonstrated exceptional innovation in work related to Intel initiatives.
COLLEGE OF COMPUTING
Regents’ Professor Haesun Park was named a Georgia Tech Face of Inclusive Excellence. This honor recognizes faculty, staff, and students whose accomplishments in research, teaching, leadership, and service have earned special awards or recognition during the previous academic year.
Assistant Professor Tobin (Toby) Isaac was named the new Catherine M. and James E. Allchin Assistant Professor. This two-year professorship enables Isaac to undertake innovative and high-risk research at his discretion. It also marks the first time a CSE faculty member has received this prestigious professorship, which is earmarked for early-career faculty in the College.
Associate Professor Le Song received the 2020 College of Computing Outstanding Senior Research Faculty Award. This award is given to a faculty member who has made significant, high-quality, innovative contributions to their field of study, visibly impacting one or more mission areas.
Message from the Chair
The School of Computational Science and Engineering marked an important milestone this year, moving into its new permanent home on the 13th floor of the newly constructed 21-story CODA Building in the Technology Square area of the Georgia Tech campus. Its central location within this building, which co-locates data science and high-performance computing researchers and centers/institutes from across Georgia Tech alongside state-of-the-art data center facilities, provides a vibrant atmosphere for CSE to thrive well into the future. Tech’s strategic vision has nurtured a significant data science industry presence both within and near the CODA complex, which augurs well for the applied research and industry collaboration aspects of the CSE mission.
CSE added four new faculty during the last year. Two are junior faculty in their first academic appointment – Srijan Kumar, previously a postdoc at Stanford, and Xiuwei Zhang, previously a postdoc at the University of California at Berkeley. We also added two Associate Professors – Elizabeth Cherry and B. Aditya Prakash, who previously served at the Rochester Institute of Technology and Virginia Tech, respectively. Together, these faculty bolster our strengths in data science and applied machine learning (Kumar and Prakash), cardiac modeling (Cherry), and computational biology (Zhang).
We are a small school of 15 full-time and 4 joint academic faculty, but our achievements and contributions are anything but. Two of our professors, Surya Kalidindi and Haesun Park, were awarded the Regents’ Professor title last year, the highest academic and research recognition in the University System of Georgia. Our faculty received six Fellow recognitions from professional societies such as IEEE, SIAM, ISCB, SCS, and I/ITSEC. We maintain a robust active funding portfolio of nearly $25 million spread across all major federal agencies and industry.
We are also pivotal to computational and data-driven interdisciplinary research and education at Georgia Tech, particularly with engineering and science disciplines. New initiatives well underway include the addition of the School of Mechanical Engineering to our flagship CSE Ph.D. program and the creation of a new M.S. program in Urban Analytics jointly with the School of City and Regional Planning. Our students continue to bring home fellowships and best paper and dissertation awards, both within and outside Georgia Tech. A team of outstanding staff serves as the backbone of our entire operation, taking pride in our collective advancement, well-being, and mission of outreach and service.
This annual report offers an eclectic sample of contributions and achievements by CSE students and by research and academic faculty. We remain committed to fueling the growth of computational and data-enabled advances in science, engineering, technology, and society, and to working with our partners in academia and industry worldwide. Come join us or collaborate with us!
Srinivas Aluru, Professor, Interim Chair
Research Spotlight
Georgia Tech and Intel Selected for Multimillion-Dollar DARPA Award
Researchers from CSE and Intel are working together to strengthen cybersecurity defenses for machine learning (ML) models designed for vision systems.
Bolstered by a new four-year, multimillion-dollar Defense Advanced Research Projects Agency (DARPA) grant, the team will create deception-resistant ML technologies with an emphasis on object detectors for the Guaranteeing AI Robustness against Deception (GARD) program. Object detectors are a type of technology used to identify objects within an image or video using labels and bounding boxes.
While no known real-world attacks have been made on these systems, a team of researchers first identified security vulnerabilities in object detectors in 2018 with a project known as ShapeShifter. Led by CSE Associate Professor Polo Chau at Georgia Tech’s Intel Science and Technology Center for Adversary-Resilient Security Analytics (ISTC-ARSA), the ShapeShifter project exposed adversarial machine learning techniques that were able to mislead object detectors and even erase stop signs from autonomous vehicle detection.
“As ML technologies have developed, researchers used to think that attacking object detectors would be difficult. ShapeShifter showed us that was not true, they can be affected, and we can attack them in a way to have objects disappear completely or be labeled as anything we want,” said Chau, who serves as the lead investigator from Georgia Tech on the GARD program.
“The reason we study vulnerabilities in ML systems is to get into the mindset of the bad guy in order to develop the best defenses. The GARD program provides us with an excellent opportunity for this,” he said.
GARD is a DARPA-funded program that aims to establish theoretical ML foundations to identify system vulnerabilities in real-world applications. Intel and Georgia Tech are leading a program team together under this platform with Intel serving as the prime awardee and Georgia Tech’s funding totaling $1.3 million.
The four-year program is divided into three phases with the first phase focused on enhancing object detection technologies through spatial, temporal, and semantic coherence for both still images and videos. These three defining qualities of object detectors look for contextual clues to determine if a possible anomaly or attack is occurring.
“Our research develops novel coherence-based techniques to protect AI from attacks. We want to inject common sense into the AI that humans take for granted when they look at something. Even the most sophisticated AI today doesn’t ask, ‘Does it make sense that there are all these people floating in the air and are overlapping in odd ways?’ Whereas we would think it’s unnatural,” said Chau. “That is what spatial coherence attempts to address – does it make sense in a relative position?”
This idea of applying common sense to AI object recognition extends to other coherence-based techniques, such as temporal coherence, which checks for suspicious objects’ disappearance or reappearance over time. The team’s UnMask semantic coherence technique, which is based on meaning, looks to identify the parts of an object rather than just the whole, and verifies that those parts indeed make sense.
In terms of defenses, the goal of all three coherence-based techniques is to force attackers to adhere to all categories’ laws created for continuity in the AI. This multi-perspective approach thwarts any future attempts by adversarial ML that do not meet the complex rules, causing any security breach to be flagged.
As AI models with image recognition software are increasingly implemented and used in daily applications, the need to understand and thwart attacks in such programs is critical across fields. The GARD program aims to develop effective defenses across broad ranges of attacks, with Georgia Tech and Intel helping lead the way.
Figure: UnMask combats adversarial attacks (in red) by extracting building-block knowledge (e.g., wheel) from the image (top, in green), and comparing them to expected features of the classification (“Bird” at bottom) from the unprotected model. Low feature overlap signals attack. UnMask rectifies misclassification using the image’s extracted features.
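The semantic coherence idea described for UnMask, comparing the parts extracted from an image with the parts expected for the predicted class, can be illustrated with a small sketch. This is not the team's implementation: the feature table, the Jaccard overlap score, the 0.5 threshold, and the function names below are all assumptions made for illustration only.

```python
from typing import Dict, Set, Tuple

# Hypothetical table of parts one would expect to see for each class.
EXPECTED_FEATURES: Dict[str, Set[str]] = {
    "bird": {"beak", "wing", "tail", "leg"},
    "car": {"wheel", "door", "window", "headlight"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two feature sets (assumed scoring choice)."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def check_prediction(
    predicted_label: str,
    extracted_features: Set[str],
    threshold: float = 0.5,  # assumed cutoff, not taken from the article
) -> Tuple[str, bool]:
    """Return (final_label, attack_flagged).

    Low overlap between the extracted parts and the parts expected for the
    predicted class is treated as a sign of attack; the label is then
    replaced by the class whose expected parts match best.
    """
    score = jaccard(extracted_features, EXPECTED_FEATURES[predicted_label])
    if score >= threshold:
        return predicted_label, False
    best_label = max(
        EXPECTED_FEATURES,
        key=lambda label: jaccard(extracted_features, EXPECTED_FEATURES[label]),
    )
    return best_label, True


if __name__ == "__main__":
    # The unprotected model says "bird", but the image contains car parts.
    print(check_prediction("bird", {"wheel", "door", "window"}))
```

In this toy setup, an image whose extracted parts are car parts but whose predicted label is "bird" has zero overlap with the expected bird parts, so the prediction is flagged and relabeled from the extracted features.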
Spotlight on Faculty Fellowships
Regents’ Professor Richard Fujimoto was named a Fellow of three different organizations this year. Fujimoto was named an Institute of Electrical and Electronics Engineers (IEEE) Fellow, a 2019 Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) Fellow, and a 2020 Fellow of the Society for Modeling and Simulation International (SCS) for his work in parallel and distributed discrete event simulation. Discrete event simulations model operations within a system and are used in a wide variety of applications. Fujimoto has authored and co-authored hundreds of technical papers on the subject as well as several books, which span application areas including transportation systems, telecommunication networks, and multiprocessor and defense systems. The announcements of all three recognitions came only two years after he was named an Association for Computing Machinery Fellow in 2017.
Interim Chair and Professor Srinivas Aluru and CSE Professor and Associate Chair Ümit V. Çatalyürek have both been inducted into the 2020 Class of Society for Industrial and Applied Mathematics (SIAM) Fellows. Nominated for exemplary research and outstanding service to the SIAM community, Aluru and Çatalyürek account for two of the 28 Fellows inducted into this year’s international Fellows Program. Aluru was recognized by SIAM for his contributions to the field of computational genomics with sequential and parallel discrete algorithms research and for his leadership in data science and engineering. Çatalyürek was recognized by SIAM for his contributions to the fields of combinatorial scientific computing and high-performance and parallel algorithms – fields of research in which he has won a number of awards in prior years.
Joint CSE Regents’ Professor Mark Borodovsky was elected an International Society for Computational Biology (ISCB) Fellow for his influential research in developing algorithms for genome analysis as well as his recognized leadership in education and community development. He is one of twelve ISCB Fellows elected in the Class of 2020. Borodovsky is best known for his work on gene-finding algorithms, which are widely used. He is also responsible for launching the interdisciplinary Bioinformatics Master’s and Ph.D. programs at Georgia Tech. He is the first faculty member at Georgia Tech, and in the state of Georgia for that matter, to become an ISCB Fellow.
CSE Stats and Infographic: April 1, 2019 – April 1, 2020
Program Info
Total student enrollment: 177 (male 126, 71%; female 51, 29%)
Ph.D. in CSE: 71 (male 87%; female 13%)
M.S. in CSE: 106 (male 60%; female 40%)
Ph.D. (2): CSE and CS
M.S. (3): CSE, CS, Analytics
Interdisciplinary programs (4): M.S. and Ph.D. in Bioinformatics; M.S. and Ph.D. in Bioengineering
Undergraduate Diversity
Percentage of underrepresented minority computing undergraduates: 15.8
Percentage of female computing undergraduates: 25.2
People
Total staff and faculty: 47
Academic faculty: 19
[Chart: headcounts by category (professors, associate professors, assistant professors, joint appointments, adjunct appointments, research scientists, staff); values shown: 15, 7, 6, 6, 5, 4, 4]
FY20 Research Funding
Total active funding: $24,928,169
Research expenditures: $2,970,504
CSE active research projects: 74
Research expenditures per faculty: $228,500
Funding by source (number of projects in parentheses)
Air Force (1): $492K
DARPA (1): $1.2M
National Science Foundation (29): $14.1M
National Institutes of Health (2): $1M
Industry (2): $268K
Georgia Tech Foundation (26): $4.6M
NASA (1): $153K
National Labs (PNNL, Sandia, etc.) (6): $1.2M
Department of Energy (3): $828K
Other (subawards) (3): $769K
CSE Provides COVID-19 Relief with Big Data Research
With the COVID-19 pandemic changing the global landscape, CSE faculty have harnessed their diverse research initiatives to unite under the goal of facilitating aid during the crisis. From cross-institute computational epidemiology projects to cybersecurity efforts, there is no shortage of ways in which CSE research is being utilized by both local governments and global organizations to address the defining challenge of the century. Read on for highlights of the CSE COVID-19 relief initiatives.
Disease Forecasting
A new COVID-19 forecasting project hosted by the Centers for Disease Control and Prevention (CDC) is helping predict the spread of the coronavirus disease and the peak weeks and days of new cases, and anticipate the number of hospitalizations caused by COVID-19 across the country. This critically timed project comprises more than 10 teams of leading data scientists, epidemiologists, statisticians, and high-performance computing researchers from national laboratories, public universities, public health institutions, and some private sector agents.
Associate Professor Aditya Prakash and CSE Ph.D. student Alexander Rodriguez are lead investigators on the project for one of the teams and are bringing a new data-driven approach to disease forecasting. Their team is using a deep learning model to predict characteristics of the novel coronavirus disease at the national, regional, state, and local levels. The CDC synthesizes their weekly and monthly predictions with other models to determine policy and resource allocation to help communities prepare for and fight the disease.
“We want to predict quickly and early to give lead times to decision makers to decide appropriately when to and how to allocate resources such as determining where to send ventilators, where additional beds are most critically needed, vaccine creation timelines, implementing temporary shelter-in-place orders, whether additional guidance to state and local authorities is needed, and more,” said Prakash.
CSE Research Technologist Will Powell works on servers in the College of Computing Building Data Center during the COVID-19 pandemic.
Prakash is an expert in epidemiology and infectious disease forecasting and has been a lead researcher on the preexisting influenza forecasting project with the CDC since 2018.
Predicting Hate Crimes Amid COVID-19 Outbreak
CSE Assistant Professor Srijan Kumar has helped address issues brought on by COVID-19 on the sociological front. As the novel coronavirus has spread across the globe, reports of harassment and cases of violent attacks aimed at Asians and Asian Americans have dramatically increased. To combat this targeted harassment, Kumar and his new lab, CLAWS, have developed a data science pipeline that leverages social media signals to measure how targeted hate and racism have spread worldwide and to forecast these attacks with data-driven methods.
“In many unfortunate documented cases, xenophobic behaviors excited by the coronavirus have led to extreme physical harm to the victims and mental distress for others in the community,” said Kumar.
The pipeline combines online data from news and social media platforms with offline data, such as reports of physical abuse and racial incidents, to measure how targeted harassment is spreading in online and physical communities while comparing the spread side by side with the disease coverage. There are five levels of harassment, with the first level being the least damaging. Each subsequent level is more aggressive, with the fifth level including fatalities caused by hate crimes.
“We are coding up each incident with a corresponding number and then scraping news websites, collecting all the race incidents and physical abuse incidents in the real world, and essentially creating a timeline of how things have progressed,” said Kumar.
“Our hypothesis is as this virus is spreading through different cities and different states, people get more alarmed and scared, and that anxiety leads to more hate and harassment crimes in those particular cities,” he said.
Kumar’s team is the first to create a real-time database and pipeline that can forecast deviant behaviors using disease spread as an indicator. The pipeline is also general enough to be customized for different activities, such as monitoring and predicting harassment of other minority groups, drug use, child abuse, suicides, and more.
“When pandemics or crises happen, there is an increase in public health and safety issues. So, we are using this particular framework as this is a first-use case to track and predict these problems and we plan to expand to others,” said Kumar.
Increasing Test Kit Availability
A lack of widely available testing is a central obstacle to alleviating the pandemic crisis in the United States. As of April 2020, this is a particularly critical problem for the state of Georgia, which ranked among the bottom states in the country for testing per capita. To combat this issue, state leadership launched the Georgia Coronavirus Task Force, which was created to assess the state’s preparations and procedures for preventing, identifying, and addressing the spread of COVID-19.
As part of this program, universities across the state are coordinating with the task force on what is referred to as a “lab surge” to increase the availability of testing for Georgia residents. Efforts are also underway to manage the exchange of critical supplies and carry out pandemic modeling to inform policy makers. Through this effort, Georgia Tech will significantly increase the number of coronavirus test kits processed, which will also alleviate the supply chain for other critically needed public health items.
Using HPC to Model Pandemic Simulations
CSE Interim Chair and Professor Srinivas Aluru has contributed to these initiatives using high-performance computing and data science. He is also supporting the state’s efforts by coordinating the use of Hive, Georgia Tech’s new $5.3 million high-performance computing (HPC) system, for creating pandemic simulations.
Aluru said, “Many people use models to predict two things. One is to predict expected future scenarios, this could be in terms of hospitalizations, deaths, and infections. This could also be used to model resources such as personal protective equipment (PPE) and you could model what kind of resource needs a hospital will have. Models can also predict different types of use cases such as when school closures or businesses reopening could and should happen.”
However, given the daily changes in both human and virus behavior, models will likely never be perfect. But they help us develop a better understanding of what type of action is needed for a problem and are a valuable tool for policy makers making critical decisions. This is where high-performance computers, such as the Hive, are a great asset.
The Hive was acquired by the Institute for Data Engineering and Science (IDEaS) through a $3.7 million National Science Foundation (NSF) Major Research Instrumentation award, and Aluru serves as the principal investigator on the project. The supercomputer has over 100 trillion bytes of memory, 11,500 compute cores, and 2.5 quadrillion bytes of storage.
Some of the Hive’s capacity has been redirected to accommodate COVID-19 research. School of Industrial and Systems Engineering Professor Pinar Keskinocak was the first to have their simulation research migrated to the Hive.
CSE Associate Professor Aditya Prakash is also conducting his work on an NSF Expeditions Project using the Hive. This large multi-institution project, which received a $10 million grant, is actively working with multiple federal and state agencies to support response efforts for the current pandemic. The project is specifically helping support response efforts by capturing the complexities underlying the disease, providing new analytical capabilities to decision makers, and using AI to develop simulations of multi-scale, multi-layer networks to provide insights into how the pandemic can be controlled.
CSE Welcomes Four New Faculty
The School of Computational Science and Engineering continues to diversify and expand its faculty pool. Since 2019, CSE has added four new tenure-track faculty members, who come from diverse regions across the globe and equally diverse research backgrounds. With these new research strengths integrated into the already robust CSE research agenda, the school now conducts more data mining and bioinformatics research than ever before. In a time when misinformation and infectious disease are on the rise, this is a clear sign of CSE leadership’s forward-thinking and strategic growth plan. Meet the new CSE faculty below.
ELIZABETH CHERRY
Associate Professor
Cherry joined CSE this month after teaching at the School of Mathematical Sciences at the Rochester Institute of Technology. While at the Rochester Institute, she found her research stride in mathematical biology with a focus on cardiac electrophysiology and arrhythmias.
According to Cherry, “Sudden death is secondary to ventricular fibrillation and remains a leading cause of mortality in the United States. And the occurrence of atrial fibrillation, which is responsible for about 15 percent of all strokes, continues to rise.”
In efforts to address this issue, Cherry utilizes mathematical modeling and simulation of the electrical dynamics of cardiac cells and tissue. This field of research is highly interdisciplinary and requires expertise in a number of sub-fields, including modeling, computational algorithms, scientific applications, and advanced interactive visualization.
She has won numerous awards and recognitions for her work in this field, including:
Recipient of Trustees Scholarship Award, Rochester Institute of Technology, 2019
Recipient of the Outstanding Student Mentor Award, Rochester Institute of Technology, College of Science, 2016
Recipient of the Outstanding Faculty of the Year Award, Rochester Institute of Technology, College of Science, 2013
SRIJAN KUMAR
Assistant Professor
With high-stakes decisions being made via the web, the ways in which malicious users engage with us online can have a profoundly negative impact on our lives and on society as a whole.
For Srijan Kumar, a new assistant professor in Georgia Tech’s School of Computational Science and Engineering, this is a concept that transcends social media, encompassing most, if not all, of the web and society. His research group, CLAWS (the Computational Lab for the Web and Society), was established with the goal of improving the safety and well-being of people worldwide. This is achieved by ridding the user experience of digital abuse and disinformation pitfalls and using online social signals to forecast harmful real-world events, such as mass shootings.
“Broadly, my group’s research is in data science and applied machine learning and we create the next generation of algorithms to understand and improve how users behave online and how it impacts society,” said Kumar.
These next-generation algorithms that Kumar references are used to understand and forecast deceptive behavior that attempts to manipulate and misinform users. Instances in which these behaviors occur are vast, and can, according to Kumar, be categorized based on the three areas of use that they impact.
“There are three major things people do online: interact with one another, consume information, and act on the recommendations they are shown. A way to unify and transform the user experience is to develop the user models, which are deep-learning and network-based models,” he said.
However, the applications of Kumar’s work stretch far beyond harassment, and his anti-abuse algorithms have been used by the likes of Flipkart, India’s largest e-commerce platform, and Wikipedia.
Prior to Georgia Tech, Kumar was a visiting research scientist at Google AI and a postdoctoral researcher at Stanford University. He is the recipient of the 2018 ACM SIGKDD Doctoral Dissertation Award runner-up, WWW 2017 Best Paper Award runner-up, Larry S. Davis Doctoral Dissertation Award 2017, and Dr. BC Roy Gold Medal.
B. ADITYA PRAKASH
Associate Professor
Prakash’s research invents new data science and machine learning techniques for networks and sequences. His work has applications in public health, cybersecurity, critical infrastructure systems, and the web. By using these techniques, Prakash is able to solve real-world problems and develop tools to help leading organizations such as the Centers for Disease Control and Prevention (CDC), Wal-Mart, Facebook, and Oak Ridge National Laboratory (ORNL).
“A big draw for me to these technically challenging problems is their inherent interdisciplinarity and potential for high societal impact. Simply put, progress here can save lives and make a real difference,” he said.
For Prakash, making a difference does not end with just understanding the data and using it for different applications. Instead, Prakash believes in using data science as a means to drive informed policies and decisions.
“Networks are a great abstraction for modeling real-world phenomena. As they give us both a local and a global perspective, they are able to provide an opportunity to bridge gaps between data, models, and actionable strategies,” he said.
His work is now used for a wide variety of these phenomena, including finding failure hot spots in energy grids, guiding users to relevant products on e-commerce websites, and designing policies to determine how best to allocate scarce resources for hospital infection control. His group is also taking part in the CDC forecasting project for past and current pandemics, which aims to use influenza-like illness surveillance data to understand the trajectory of disease outbreaks in the US.
Prior to joining Georgia Tech, Prakash was an associate professor of computer science at Virginia Tech. He received his Ph.D. at Carnegie Mellon University and an undergraduate degree at IIT-Bombay. He is a recipient of the NSF CAREER award and multiple best paper awards, and was named one of the ‘AI 10 to Watch’ by the IEEE.
XIUWEI ZHANG
Assistant Professor
Zhang joined CSE on Aug. 1 after working as a postdoctoral researcher in the Electrical Engineering and Computer Sciences Department at the University of California at Berkeley. Zhang’s research focuses on data science, method development, and data analysis with an emphasis on computational biology.
While at Berkeley, her time centered on two different projects, each using single-cell sequencing. One project, called SymSim, published in the journal Nature Communications, developed a simulator to model processes observed during single-cell RNA sequencing experiments.
According to Zhang, SymSim simulates single-cell RNA data, which allows researchers to benchmark various computational methods.
“What we really want to understand is what is controlling all the changes in the cells and track their differences,” she said. “On a mechanism level, we need to not only look at the RNA sequencing data but also integrate other types of data such as protein analysis.”
Zhang has won several distinguished awards in the areas of computational biology and data analysis, including:
Swiss National Science Foundation (SNSF) Fellowship for Prospective Researchers, 2012
SNSF Advanced Postdoc Mobility Fellowship, 2014
Simons-Berkeley Research Fellowship, 2016
CSE Student Spotlight
Computational Science and Engineering offers a uniquely interdisciplinary pool of student researchers who specialize in bridging software and hardware together with real-world applications ranging from high-performance computing to cybersecurity and more. We’d like to introduce you to several CSE students with diverse research interests who have received awards for their outstanding work this year.
[Photos: Xinshi Chen, Hua Huang, Austin Wright, Wendi Ren]
Xinshi Chen is a CSE Ph.D. student advised by CSE Associate Professor Le Song, who specializes in principled machine learning research with a focus on learning-based algorithm design and deep learning for structured data.
Chen was honored in 2020 for outstanding graduate research in machine learning, a field she has only recently begun exploring, having started the program with a background in math.
Since coming to Georgia Tech and working under Song’s guidance, Chen has thrived in the machine learning field. She is currently working on algorithmic design research that aims to automatically learn an algorithm from data and apply the learned algorithm to solve new problems.
“Both algorithms and deep learning models are solving problems and making predictions for various tasks. Our project investigates the connection between traditional algorithms and deep learning models, and the strengths of these two can be combined to help each other.”
According to Chen, the design of algorithms can be automated and improved upon by learning from data, with the data-driven components filling the gaps between the rules designed by experts and real-world observations.
“On the other hand, deep learning models can use algorithm structures as inductive bias for designing the architectures, which can improve the data efficiency and interpretability of deep learning models,” she said.
“By viewing learning-based algorithms as deep learning models, currently we are designing a theoretical framework to understand their behaviors from the learning theory perspective, by characterizing their generalization, representation abilities, etc.,” said Chen.
When asked why she chose to study with the School of CSE, Chen said it was due to its interdisciplinary approach to modern science.
“There is a broad range of research topics that the CSE faculties are working on, and many of these research topics are closely related to our daily life, including high performance computing, healthcare, and computational biology. It is very useful to attend the seminars organized by CSE and interact with people with different backgrounds as it can bring in new ideas from communities outside machine learning,” she said.
CSE Ph.D. student Hua Huang won the Georgia Tech 2020 Sigma Xi Best MS Thesis award for 2019. Sigma Xi is a scientific research honor society that awards achievement in science or engineering research and communication as part of its mission to enhance the health of the research enterprise and promote the public’s understanding of science for the purpose of improving the human condition.
Huang is advised by CSE Associate Professor Edmond Chow and works in the field of computational chemistry. Specifically, Huang’s research focuses on creating effective and efficient frameworks that use high-performance computing to facilitate better processes for computational chemistry problems.
His recognized thesis, Performance Optimizations for Quantum Chemistry Calculations, focuses on three topics: (1) batching and vectorizing electron repulsion integral (ERI) calculations, since ERI calculation is a building block in quantum chemistry calculations; (2) improving network communication performance for a large-scale eigensolver in quantum chemistry calculations; and (3) implementation of a lightweight portable library for distributed matrix communications in quantum chemistry calculations.
“In my research, I pay a lot of attention to the difference between the theories and the actual performance of the software. The larger the difference, the more we can dig into it and learn something new,” said Huang.
“In the United States and other countries, about one third of computer time on supercomputers is used for quantum chemistry calculations. My thesis proposes several new methods for optimizing those commonly used computational kernels in quantum chemistry calculations.”
Currently, Huang and his research team are working with several developers of popular quantum chemistry packages to implement and benefit from his thesis’ findings. With the addition of Huang’s research, these packages will help facilitate research for materials science and drug design.
“Many CSE professors have a deep understanding of both high-performance computing and the scientific areas they are working in. The combination of domain specific knowledge and the experience of writing code for supercomputers allows us to better define the computational challenges we need to overcome. Meanwhile, CSE has a powerful supercomputer, the IDEaS-Hive system, which allows us to fully explore the possibilities of calculations,” he said.
Austin Wright is a CSE Ph.D. student advised by Associate Professor Polo Chau, with a research focus on human centered design and AI for social good. Wright was recently selected as a GT-GSU Public Interest Technology (PIT) Fellow for 2020. This program was founded to support collaborations between technologists and social scientists centered around the continued equity challenges of the Southeast region and provides a model for regional PIT work focused on community challenges.
Specifically, Wright will be collaborating with Professor Scott Jacques from the Andrew Young School of Public Policy at Georgia State to develop novel data analysis and visualization tools to make crime and criminology data and trends more accessible.
“Currently, the availability of standardized and up-to-date crime incidence data from the federal government is complicated by the tools used for dissemination. Furthermore, the audience for whom this data is the most important does not often have extensive training in data visualization and analysis, which can lead to erroneous conclusions or misleading charts in news publications or even in academic research,” said Wright.
By making the process of scraping up-to-date data easier, and then enabling subject matter experts to more effectively generate quantitative analyses, this research can make insights from the existing data more readily available to those in academia and the general public.
“Many of us make important decisions on the basis of crime trends, such as where to live or what public policy to support. By making
the data and analysis of this information more available and less prone to mathematical error, more of us can easily make informed decisions based on these trends,” he said.
The project makes use of a wide variety of methodologies including human centered design practices, data visualization and automated data analysis, as well as computer science and social science analysis.
“My advisor, Polo Chau, has been immensely supportive of this collaboration and project. While it is interdisciplinary, and much of it does not always fit neatly into the standard delineations of academic disciplines, the flexibility and focus on impact of CSE has allowed me to pursue this goal in a way that I feel is very unique to this department,” said Wright.
Wendi Ren is a CSE M.S. student advised by Assistant Professor Chao Zhang, who specializes in machine learning for text data with an emphasis on improving label efficiency and model interpretation.
Ren is the recipient of the 2020 Marshall D. Williamson Fellowship, an award that honors second year master’s students who embody values of academic excellence and leadership. She was honored with this award for her strong academic performance and high GPA, as well as for being recognized as an outstanding graduate teaching assistant (TA) with the Thank a Teacher award from the Georgia Tech Center for Teaching and Learning.
While Georgia Tech has always been a dream school for Ren, choosing a graduate program took some consideration as she searched for a flexible yet robust degree.
“Compared to other graduate programs, I feel that CSE at Georgia Tech could provide me a free space to choose between research and job prospects. I did not have a very clear career path, so I wanted to try different options. Luckily, CSE provides enough resources to be successful in both industry and academia,” she said.
“I like the course setting of CSE very much. It is very flexible for us to choose courses either related to research topics or job required skills. The high quality of each course with reasonable design and workload really gives us a solid CS background.”
Ren is currently creating training algorithms that are able to learn neural text classifiers without using any labeled data, relying only on easy-to-provide heuristic rules as weak supervision.
“It is challenging because rule-induced labeled data are often noisy and have low coverage. To address these challenges, we propose a model that is learned from multiple weak supervision sources with two key components,” she said.
To date, Ren’s approach, which integrates two components (a rule denoiser and a neural classifier) into a co-training framework, has outperformed state-of-the-art weakly-supervised and semi-supervised methods by 9.2 percent on average on five widely used benchmarks.
“Deep learning techniques have demonstrated superior performance in text mining tasks, however, deep neural networks (DNNs) are data hungry. And in many applications, large-scale labeled data are unavailable and manually annotating data at a large scale can be prohibitively expensive,” said Ren.
In such cases, the lack of training data becomes the key bottleneck of applying DNNs
for text classification. Ren’s work solves this problem by using only easily accessible heuristic rules, which eliminates the time-consuming annotation process while maintaining high classification accuracy.
CSE Ph.D. alumnus Patrick Flick has been selected for the prestigious ACM SIGHPC Dissertation Award for 2020. Flick is the first in Georgia Tech history to receive the award, which honors one outstanding doctoral dissertation focused on high-performance computing (HPC) research each year.
The winning dissertation, Parallel and Scalable Combinatorial String Algorithms on Distributed Memory Systems, offers a new approach to solve large-scale string and graph problems used throughout computational biology applications. The computational methods introduced in Dr. Flick’s work achieve efficient and scalable execution on large-scale distributed compute clusters, delivering solutions to increasingly large problems.
Inspired by the advent of high-throughput DNA sequencing, which enables generation of billions of reads per minute, and the growing need to find a computational approach that can keep pace, this research expands on prior theoretical approaches. The resulting algorithms and data structures implemented by Flick advance the state-of-the-art by providing improved theoretical complexity and better practical performance, while minimizing overall and per-node communication volume within a computer’s distributed memory architecture.
Ultimately, these findings offer a more efficient method to represent, construct, and query data structures for large-scale and memory intensive applications in text processing, information retrieval, and computational biology.
Flick joined CSE for his Ph.D. in 2014 under the guidance of CSE Professor and Interim Chair Srinivas Aluru.
According to Aluru, “Patrick’s Ph.D. work addresses some notoriously difficult problems in parallel string algorithms, and his dissertation gets it just right by providing both theoretical optimality and practical efficiency. His work, all published in top forums in the field, has lasting value. It is gratifying to see him win this year’s ACM SIGHPC Dissertation Award.”
Flick defended his thesis in March 2019 and officially graduated the following May. He is now a software engineer at Google.
Flick’s previous successes include authoring the first paper used for the Student Cluster Reproducibility Challenge at Supercomputing 2016 and winning the Best Student Paper Award at Supercomputing 2015.
SIGHPC is the ACM’s special interest group that focuses on providing a platform for high-performance computing (HPC) research and efforts internationally. The ACM SIGHPC Dissertation Award pulls from this professional society in an effort to highlight innovative and prolific research in the supercomputing and parallel processing fields.
The 2020 ACM SIGHPC Dissertation Award includes a $2,000 honorarium, travel support to the Supercomputing Conference, and an award plaque.
Faculty
Srinivas Aluru, Professor, Interim Chair
  Executive Director, Institute for Data Engineering and Science
  Ph.D., Iowa State University, 1994
  IEEE Fellow, SIAM Fellow, AAAS Fellow, NSF CAREER Award, John V. Atanasoff Discovery Award, Outstanding Achievement in Research Program Development Award
Mark Borodovsky, Regents’ Professor, Joint with Wallace H. Coulter Department of Biomedical Engineering
  Director, Center for Bioinformatics and Computational Genomics
  Ph.D., Moscow Institute of Physics and Technology, 1976
  ISCB Fellow, AIMBE Fellow
Ümit V. Çatalyürek, Professor, Associate Chair for Academic Programs
  Director, CSE Graduate Programs
  Ph.D., Bilkent University, Turkey, 2000
  IEEE Fellow, SIAM Fellow, NSF CAREER Award
Polo Chau, Associate Professor
  Associate Director, MS Analytics
  Ph.D., Carnegie Mellon University, 2012
  Raytheon Faculty Fellowship Award, James Edenfield Faculty Fellowship, Intel Outstanding Researcher Award, Google Faculty Research Award
Elizabeth Cherry, Associate Professor
  Ph.D., Duke University, 2000
Edmond Chow, Associate Professor
  Ph.D., University of Minnesota, 1997
  PECASE Award, ACM Gordon Bell Prize
Barry Drake, Senior Research Scientist, Joint with Georgia Tech Research Institute of Cyber Technology and Information Security
  M.S., University of Washington
Richard Fujimoto, Regents’ Professor
  Ph.D., University of California Berkeley, 1983
  ACM Fellow, IEEE Fellow, 2019 I/ITSEC Fellow, Class of 1934 Outstanding Interdisciplinary Activities Award
Felix Herrmann, Professor, Joint with School of Earth and Atmospheric Sciences and School of Electrical and Computer Engineering
  Ph.D., Delft University of Technology, 1997
  Georgia Research Alliance Eminent Scholar in Energy, 2019 Distinguished Lecturer of the Society of Exploration Geophysicists, Reginald Fessenden Award
Tobin Isaac, Assistant Professor
  Ph.D., University of Texas at Austin, 2015
  ACM Gordon Bell Prize, James E. Allchin Assistant Professor Award
Surya Kalidindi, Regents’ Professor, Joint with School of Mechanical Engineering
  Ph.D., Massachusetts Institute of Technology, 1992
  DoD Vannevar Bush Faculty Fellowship, Alexander von Humboldt Research Award, Khan International Award
Srijan Kumar, Assistant Professor
  Ph.D., University of Maryland, 2017
  Facebook Faculty Award, Adobe Faculty Award, ACM SIGKDD Doctoral Dissertation Award Runner-up
Haesun Park, Regents’ Professor
  Ph.D., Cornell University, 1987
  IEEE Fellow, SIAM Fellow, 2019 Faces of Inclusive Excellence
B. Aditya Prakash, Associate Professor
  Ph.D., Carnegie Mellon University, 2012
  Facebook Faculty Award, NSF CAREER Award, IEEE ‘AI 10 to Watch’
David Sherrill, Professor, Joint with School of Chemistry and Biochemistry
  Ph.D., University of Georgia, 1996
  AAAS Fellow, American Chemical Society Fellow, American Physical Society Fellow, Vasser Wooley Faculty Fellow, NSF CAREER Award
Le Song, Associate Professor
  Associate Director, Center for Machine Learning
  Ph.D., University of Sydney and National ICT Australia, 2008
  NSF CAREER Award
Rich Vuduc, Professor
  Director, Center for High Performance Computing
  Ph.D., University of California Berkeley, 2004
  NSF CAREER Award, Gordon Bell Prize, Lockheed-Martin Aeronautics Company Dean’s Award for Teaching Excellence
Hongyuan Zha, Professor
  Ph.D., Stanford University, 1993
Chao Zhang, Assistant Professor
  Ph.D., University of Illinois at Urbana-Champaign, 2018
  Google Faculty Research Award
Xiuwei Zhang, Assistant Professor
  Ph.D., École Polytechnique Fédérale de Lausanne, 2011
cse.gatech.edu Georgia Institute of Technology 801 Atlantic Drive Atlanta, GA 30332-3000
facebook.com/gtcomputing instagram.com/gtcomputing @gtcse