“Big” Research Projects

DSI catalyzes and houses large-scale multidisciplinary collaborative efforts. These projects are deemed “big” for one or more reasons: number of participants, breadth and depth of reach into a larger community, level of funding, number of years of funding, or expected long-term impact.

Observational Health Data Sciences and Informatics (OHDSI)

Founded in 2014, the Observational Health Data Sciences and Informatics (OHDSI, pronounced “Odyssey”) initiative is a multi-stakeholder, interdisciplinary, open-science community that collaborates to bring out the value of health data through large-scale analytics. OHDSI generates reliable and reproducible real-world evidence to promote better health decisions and better care across a variety of health care issues. OHDSI’s effort involves 30 countries, 800 researchers, 150 data sets, and electronic health records on 600 million unique patients. Because all records are in the same format, queries over this data collection enable population-level estimation, patient-level prediction, and new analytic methods, tools, and models for clinical research.

Map of the OHDSI collaborator network with more than 2000 collaborators from 74 countries.

COSMOS

COSMOS (Cloud-enhanced Open Software-defined MObile wireless testbed for city-Scale deployment) is a beyond-5G, city-scale testbed that has been under deployment in West Harlem since 2018 by Columbia, Rutgers University, and New York University in partnership with New York City, Silicon Harlem, City College of New York, IBM, and the University of Arizona. COSMOS is one of the first two testbeds awarded as part of the NSF Platforms for Advanced Wireless Research program and includes a first-of-its-kind Federal Communications Commission Innovation Zone. The COSMOS project focuses on the design and deployment of an advanced wireless testbed that will support real-world experimentation on next-generation wireless technologies and applications. COSMOS also established a summer program for teachers, most of them from schools that serve students from populations underrepresented in STEM.

The COSMOS FCC Innovation Zone encompasses the area between Columbia University and City College of New York, including Morningside Heights, Manhattanville, and Hamilton Heights.

Northeast Big Data Innovation Hub

In 2015, NSF established a national network of Big Data Regional Innovation Hubs to help address some of the nation’s most pressing research and development challenges in extracting knowledge and insights from large, complex collections of digital data. Columbia was selected as the lead institution for the Northeast Big Data Innovation Hub, with its headquarters in DSI. The hub is a community convener and enabler for data science collaboration, research, education, and innovation in the region. It builds and strengthens partnerships across academia, industry, nonprofits, and government to address societal and scientific challenges in the community and to spur economic development in data science.

SCRIPTS

SCRIPTS (System for CRoss-language Information Processing, Translation, and Summarization) is a project funded under the IARPA MATERIAL program to develop a system that, given a search query, finds relevant text, audio, and video files in any foreign language and returns each with an accompanying summary in English. The system is geared toward low-resource languages, which lack the large-scale data collections and automatic tools needed to build automatic translators. SCRIPTS’ cross-language information retrieval module determines relevant documents, and its summarization module generates textual summaries for each document to help a human assess document relevance. The SCRIPTS team includes 13 faculty and senior researchers at Columbia, the University of Maryland at College Park, Yale University, the University of Cambridge, and the University of Edinburgh.

TRIPODS

TRIPODS (Transdisciplinary Research In Principles Of Data Science) is an NSF program that aims to bring together the statistics, mathematics, and theoretical computer science communities to develop the theoretical foundations of data science through integrated research and training activities. One of the major research themes of the Columbia TRIPODS institute has been the development of new theoretical and algorithmic results for deep learning. A second major theme is the development and analysis of algorithmic primitives for efficient learning.
