SIB Profile 2020 - Data scientists for life

SPECIAL FOCUS

MAKING SENSE OF HEALTH DATA (P. 36)

OPEN KNOWLEDGE (P. 48)


COVER IMAGE

A drug (here an immunosuppressant, Rapamycin, in red), tightly bound to its target (here the protein complex between FKBP12 on the right and mTOR on the left, in light blue). Molecular modelling specialists at SIB develop tools to generate and interpret such three-dimensional models of molecules (SEE P. 17). This enables them to predict, for instance, the consequences of mutations detected in patients on drug resistance, thus supporting clinicians in tailoring treatments to individual patients (SEE P. 43).

SOURCE: COMPUTER-AIDED MOLECULAR ENGINEERING GROUP, SIB & UNIVERSITY OF LAUSANNE




Forewords

I have been President of SIB’s Foundation Council since 2013, and have had the good fortune to witness SIB’s remarkable evolution and ever-widening span of competences, which now include clinical bioinformatics and data protection. In true Swiss fashion, SIB embraces Switzerland’s three major linguistic regions, while federating nearly 80 groups from the country’s most distinguished schools of higher education involved in multiple areas of bioinformatics research and infrastructure. As the former President of the Science, Education and Culture Committee of the Swiss Parliament, I am well aware that none of this could have been achieved without the continuous support of the Swiss government. SIB has gradually gained recognition as both a key and impartial partner, not only on its own soil but also beyond its borders, as exemplified by the role it plays in the recent Global Biodata Coalition, set up to ensure the sustainability of core life-science data resources. You will discover, in these pages, that the true strength of SIB lies in the complementarity of its activities rather than in the mere sum of its parts. As a physician and an epidemiologist, I am honoured to represent an institution that has a true impact both on society and research, and continues to lay down sturdy foundations in a field that did not exist when I began my career and has now become essential in resolving major health issues. Long live SIB!

Felix Gutzwiller, President of the Foundation Council

With SIB, Switzerland benefits from a world-class bioinformatics institute that offers both collaborative opportunities and an attractive environment for biomedical companies. The positive perception of SIB by the industry is something I have witnessed personally. As a bioinformatician and entrepreneur in the field of precision medicine at Merck Serono SA and Quartz Bio, I have successfully hired professionals trained by SIB as well as benefited from the Institute’s expertise through several collaborative research projects. My focus has long been on precision medicine and I recognize, in particular, the importance of the BioMedIT project and the Data Coordination Centre driven by SIB in the context of the Swiss Personalized Health Network (SPHN). Such infrastructure enterprises are a key part of the future of all health systems. Advances on the health front – in particular precision medicine – are parading down the catwalk these days, but bioinformatics is powering research and discoveries in many other exciting ventures too. Over the course of 2019, for instance, SIB Groups reported the discovery of a forgotten Siberian people thanks to ancient DNA, a greater understanding of DNA knots, how milkweed bugs protect themselves and how to identify bacterial toxin-antitoxin systems. In a society where bioinformatics has become an integral part of decisive fields of knowledge, I feel privileged to be involved in SIB’s strategic piloting.

Jérôme Wojcik, Chairman of the Board of Directors


From sensitive human data to the COVID-19 pandemic that is raging as we write: SIB’s activities are connected with some of the most burning topics of our society. While we will give you a full account in next year’s activity report, our teams have already demonstrated great responsiveness and flexibility in sharing research, data and tools to support the global fight against COVID-19 in record time. SIB’s active community is one of its key strengths. Looking back at 2019, last September, the [BC]2 international conference organized by SIB offered a multidisciplinary knowledge exchange platform to discuss big data and their conversion into clinically relevant knowledge. The event welcomed students, early-career researchers, world experts and private-sector scientists from 30 countries. SIB also joined the Life Sciences Switzerland (LS2) society to drive a new bioinformatics intersection and build even more bridges between life-science disciplines. Acknowledging the growing number of projects involving sensitive human data, in 2019 SIB created a Data Protection and Security Board to reinforce protection for its resources, data and infrastructure. In a true federative style, this board is multidisciplinary and composed of colleagues with expertise in data protection regulations, cybersecurity, bioinformatics and its clinical applications.

Our activity report focuses on the Institute’s multiple facets, its diverse nationwide scientific network and the variety of bioinformatics competences and services we develop. The novelty this year? Two ‘special focus’ topics, which are of particular interest for us: health data and Open Knowledge. We hope this will make an enjoyable read. As always, none of this could be achieved without the major support of the Federal government through the State Secretariat for Education, Research and Innovation (SERI), but most of all, the expertise, work and dedication of our employees and members, to whom we express our sincere gratitude and with whom we look forward to spending yet another fruitful year.

Christine Durinx, Joint Executive Director

Ron Appel, Joint Executive Director



TABLE OF CONTENTS

06  Bioinformatics: a definition
08  Converting biological questions into answers

10  DATA SCIENTISTS FOR LIFE
12  SIB in brief
14  Supporting our partners’ needs
20  A network of scientific expertise
30  Bringing bioinformatics to society
32  Organization and governance

36  SPECIAL FOCUS: MAKING SENSE OF HEALTH DATA
38  SIB in the Swiss Personalized Health Network
40  A vault for sensitive data
42  Supporting precision oncology
44  From tools to clues

48  SPECIAL FOCUS: OPEN KNOWLEDGE
50  On the road to Open Knowledge
54  Resources, interconnected
56  Joining forces

59  Index of Group and Team Leaders
63  Acknowledgements



Bioinformatics: a definition

Computer-based approaches are indispensable to science, by allowing researchers to advance their understanding of complex systems. Life scientists and clinicians have always tried to assemble data and evidence to find the right answers to fundamental questions. In 2020, there is no shortage of data. But a different kind of problem has emerged. Nowadays, new technologies are producing data at an unprecedented speed. Indeed, so much data – and of so many kinds – that they can no longer be interpreted by the human mind alone.

Enter bioinformatics.

Bioinformatics is the application of computer technology to the understanding and effective use of biological and clinical data. It is the discipline that stores, analyses and interprets the big data generated by life-science experiments, or collected in a clinical context, using computer science and dedicated data experts. This multidisciplinary field brings together scientists from a variety of backgrounds: biologists, computer scientists, mathematicians, statisticians and physicists.

Bioinformatics encompasses:
DATABASES for storing, retrieving and organizing information to maximize the value of biological data;
SOFTWARE TOOLS for modelling, visualizing, analysing, interpreting and comparing biological data;
ANALYSIS of complex biological datasets or systems in the context of particular research projects;
RESEARCH in a wide variety of biological fields using computer and data science and leading to applications in diverse areas, from agriculture to precision medicine (SEE P. 24);
COMPUTING AND STORAGE INFRASTRUCTURE to process and safeguard large amounts of data.

What sort of data are we talking about?

Bioinformatics deals with a broad spectrum of complex data types:
Sequence data from DNA, RNA or proteins
Expression data, such as the level of expression of a gene in a sample
Imaging data
Text data
And more...



Converting biological questions into answers

[Infographic, in three stages]

Converting biological questions ...
Life sciences and health actors (hospitals and clinics, research institutes, private sector, international institutions) bring a massive amount of data and data types: genetics, text, biochemical, imaging, etc.

SIB Swiss Institute of Bioinformatics
Dedicated multidisciplinary experts provide: secure services for sensitive data; data management; software engineering and tailoring; biostatistics and bioinformatics analysis; process optimization; training; expert biocuration.

... into answers with various applications
Application areas: basic research, medicine, environmental sciences, agriculture.
Examples shown: tailoring treatment to cancer patients; understanding the origin of beetle diversity; real-time tracking of pandemics.


Data scientists for life

This is who we are: a pool of multidisciplinary experts safeguarding data, sharing their value and making them speak to solve biological questions.

DATA SCIENTISTS FOR LIFE - MAKING SENSE OF HEALTH DATA - OPEN KNOWLEDGE

SIB in brief

As a non-profit foundation, we lead the field of bioinformatics in Switzerland, in order to foster advances in life sciences and health.

Key figures (as of 1 January 2020)
77 research and service groups
784 members, including 184 employees
21 institutional partners across Switzerland
Over 160 databases and software tools developed by our members and accessible via the ExPASy web portal
Over 2,770 peer-reviewed articles published since SIB’s creation in 1998

Infrastructure

SIB provides the national and international life-science community with a state-of-the-art bioinformatics infrastructure, including resources, collaborative support and services.

DATABASES AND SOFTWARE TOOLS
We create, maintain and disseminate worldwide a large portfolio of databases and software tools, including Core Data Resources, enabling researchers to leverage knowledge about life and foster innovations.

COMPETENCE CENTRES
We offer in-depth expertise and support in bioinformatics, from secure infrastructure for sensitive data and analyses of all kinds of biological data to software development and data management.

Community

SIB brings together world-class researchers based in Switzerland and delivers training in bioinformatics.

SCIENTIFIC COLLABORATION
Through knowledge exchange networks, collaborative projects and events, we strengthen cooperation on shared issues among bioinformatics research and service groups from Swiss schools of higher education and research institutes.

TRAINING IN BIOINFORMATICS
To ensure that life scientists and clinicians make the best of the data, we provide them with a large portfolio of courses and workshops. We also foster exchanges among bioinformatics and computational biology PhD students, and empower them to use the most up-to-date methods for their research.


A FEW WORDS FROM THE CHAIRMAN OF SIB’S SCIENTIFIC ADVISORY BOARD

Alfonso Valencia
Chairman of the Scientific Advisory Board
ICREA Research Professor
Director, Life Sciences, Barcelona Supercomputing Center (BSC)
Director, Spanish National Bioinformatics Institute (INB-ISCIII)

What is so special about SIB tools and software?

Alfonso Valencia: What is really special is that they are designed to be global leaders in their respective fields. The striving for excellence emanates directly from SIB’s global vision. The fact that they are developed by top groups in their specific research area and embedded in Swiss schools of higher education also ensures a close match with the needs of the field, e.g. the Molecular Modelling Group develops tools it actually uses, without losing sight of the goal of making them usable worldwide.

How does the Scientific Advisory Board function and what is its modus operandi?

AV: The SAB is composed of expert computational scientists who, based on reports and interviews, evaluate the evolution of SIB resources and proposals for new ones. The assessment takes into account general aspects such as the scientific / social impact and the interrelation with other SIB resources, as well as detailed technical and managerial developments. Additionally, the SAB provides input on SIB’s general strategy and future plans.

Expert biocuration is often viewed as an essential part of biological knowledge creation. Do you agree with this statement?

AV: Yes, but I would go much further. Expert biocuration, such as that conducted at SIB, is also a key driver of innovation and policies, whether in establishing a common language within and across communities – thus favouring Open Science – or in driving the development of machine learning and natural language processing for incremental curation. SIB’s accumulated experience in biocuration is, without doubt, one of the Institute’s key assets.

What major trends are you observing in bioinformatics resources globally?

AV: The rise of new fields of applications for bioinformatics resources – from electron microscopy-based structural methods to chromatin capture and single cell genomics – represents interesting challenges and opportunities. Among them is the onboarding of end users with formal training in mathematics, physics and computer science. Also, the developments of medical genomics and personalized medicine involve technical and sociological issues that are already influencing bioinformatics developments – and consequently SIB’s strategy for the future.


Supporting our partners’ needs

Discover our competences, explore our portfolio of leading tools and software and meet our experts.

PROCESS OPTIMIZATION
Gaining efficiency, from analysis pipelines to quality control

Our experts harmonize and optimize internal data management processes, analysis pipelines or software tools. We also organize and support clinical benchmarking activities with Swiss or international partners.

DATA MANAGEMENT
Organizing data for long-term reuse

We assist our partners by:
defining and implementing their Data Management Plans (DMP) for research proposals;
reaching data interoperability targets, from local to international scales, within academic or regulated environments;
ensuring the long-term management and storage of biological data.

BIOSTATISTICS AND BIOINFORMATICS ANALYSIS
Making biological data speak

With more than 20 years’ experience in applying data and computer science to biological questions, our specific areas of expertise include:
biomarker identification;
de novo assembly from sequencing data;
genome comparative data analysis;
targeted, exome and whole genome sequencing analysis;
metagenomic data analysis;
omics analysis;
data integration from multiple sources;
annotation and gene prediction;
machine learning.

Our approach: collaborative, independent and reliable

From one-off services to long-term collaborative support, we turn our in-depth expertise into complete collaborative solutions, in line with strict delivery goals and regulatory requirements, to make your projects a reality.


SECURE SERVICES FOR SENSITIVE DATA
Dedicated, secure IT environment to process sensitive human data

Our encrypted information technology infrastructure conforms with all current data protection regulations (incl. GDPR) and IT security standards. Our partners can therefore process and host both sensitive data, such as genomic information or health records, and non-sensitive data with complete confidence. We use modern virtualization technologies such as OpenStack to protect our computing environments.

SOFTWARE ENGINEERING AND TAILORING
Developing engaging and customized software tools

Our software engineers and User eXperience specialists contribute to some of the world-leading databases and tools used by life scientists around the globe. We also develop tailored applications for personalized medicine in hospitals or industry settings. Our teams assist our partners in creating user-friendly software products – or tailoring existing ones to specific needs – based on the most up-to-date web technologies.

TRAINING
Boosting bioinformatics skills

Our comprehensive – and constantly evolving – course portfolio provides hands-on experience of the most up-to-date bioinformatics techniques and resources, including clinical applications for researchers or healthcare professionals. We offer 99 course-days, provided by 80 trainers each year.

Key figures
1,377 trainees
54 courses and workshops
99 training days

Who takes part in SIB courses?
33% PhD candidates
25% Postdoctoral researchers
20% Senior scientists / Principal investigators
10% Other scientists
9% Research assistants / Technicians
3% Master's students

Find the full list of courses at sib.swiss/training

EXPERT BIOCURATION
Generating high-quality, up-to-date annotations

Our pool of biocuration experts excels in the art of generating knowledge from a growing body of publications, ensuring the most appropriate and informative evidence is reported. We provide expert biocuration on specific data types (e.g. proteomics, lipidomics or transcriptomics), or help with setting up expert-annotated resources for a wide range of applications, such as understanding protein function, facilitating clinical interpretation of cancer variants or enabling biomarker discovery.


DISCOVER OUR TOOLS AND DATABASES

To unlock life’s mysteries, life scientists often rely on dozens of databases or software tools to reconstruct DNA sequences, interpret the function of genes, or predict the 3D structure of proteins. Among the 160 databases and software tools developed by SIB Groups, these 11 SIB Core Data Resources are deemed of particular importance to the life-science community (SEE P. 13).

Genes and genomes

Bgee
Gene expression expertise
TYPE Knowledgebase with expert curation and software tool
DESCRIPTION Gene expression data (including all types of transcriptomes), allowing retrieval and comparison of expression patterns between animals, human, model organisms and diverse species of evolutionary or agronomical relevance.
HIGHLIGHT Only resource to provide homologous gene expression between species.
NEW IN 2019 Curated information of anatomical homology made useful: new expression comparison page and new anatomical homology page. Launch of the BgeeCall package.

SwissOrthology
One-stop shop for orthologs
TYPE Phylogenomic databases and software tools
DESCRIPTION Web portal of resources to infer orthologs, i.e. corresponding genes across different species, a key aspect of predicting gene function or reconstructing species trees. It includes OrthoDB, BUSCO as well as OMA and the Quest for Orthologs benchmark service.
HIGHLIGHT World-leading orthology and comparative genomic resources.
NEW IN 2019 Launch of the SwissOrthology.ch web portal for federated orthology queries and an interactive guide. Release of BUSCO 4 for both eukaryotic and prokaryotic genomes in sync with OrthoDB 10. An in silico talk about this resource is available.

SwissRegulon Portal
Tools and data for regulatory genomics
TYPE Software tools and knowledgebases
DESCRIPTION Web portal for regulatory genomics, including genome-wide annotations of regulatory sites and motifs, the web server ISMARA for automated inference of regulatory networks, CRUNCH for automated analysis of ChIP-seq data, and REALPHY for reconstructing phylogenies from raw sequence data.
HIGHLIGHT The ISMARA and CRUNCH web servers allow users to upload raw microarray, RNA-seq or ChIP-seq data to automatically infer the core regulatory networks acting in their system of interest.
NEW IN 2019 ISMARA can now be run for several more model organisms, including zebrafish, rat, dog and E. coli. Rather than uploading the data, users can specify a URL where their data are located, and ISMARA will fetch and run the data automatically.

V-Pipe
Viral genomics pipeline
TYPE Software tool
DESCRIPTION Pipeline integrating various open-source software packages for assessing viral genetic diversity from next-generation sequencing data.
HIGHLIGHT Enabling reliable and comparable viral genomics and epidemiological studies and facilitating clinical diagnostics of viruses.
NEW IN 2019 An in silico talk was recorded about the resource.

EPD
Eukaryotic Promoter Database
TYPE Knowledgebase with expert curation and software tools
DESCRIPTION Quality-controlled information on experimentally defined promoters of higher organisms, as well as web-based tools for promoter analysis.
HIGHLIGHT Over 180,000 promoters downloadable, analysable over a web interface and viewable in the UCSC genome browser.


Proteins & proteomes

UniProtKB/Swiss-Prot
Protein knowledgebase
TYPE Knowledgebase with expert curation
DESCRIPTION Hundreds of thousands of protein descriptions, including function, domain structure, subcellular location, post-translational modifications and functionally characterized variants.
HIGHLIGHT Expert-curated part of UniProt, the most widely used protein information resource in the world, with over six million pageviews per month. An ELIXIR Core Data Resource.
NEW IN 2019 New chemical structure and ontology searches from the UniProt website, API, and SPARQL endpoint – for enhanced data mining and simpler integration of genomics, proteomics, and metabolomics data.

SWISS-MODEL
Protein structure homology-modelling
TYPE Software tools and repository
DESCRIPTION Automated protein structure homology-modelling platform for generating 3D models of a protein using a comparative approach, and database of annotated models for key reference proteomes based on UniProtKB.
HIGHLIGHT Easy-to-use web-based platform processing over two million model requests per year, providing model information for experts and non-specialists.
NEW IN 2019 The model quality estimation approach of SWISS-MODEL was updated and published after being evaluated as one of the top performers in the CASP13 community-wide experiment.

STRING
Protein-protein Interaction Networks and Functional Enrichment Analysis
TYPE Knowledgebase and software tool
DESCRIPTION Resource for known and predicted protein-protein interactions, including direct (physical) and indirect (functional) associations derived from various sources, such as genomic context, high-throughput experiments, (conserved) co-expression and the literature.
HIGHLIGHT An ELIXIR Core Data Resource. STRING networks cover over 5,000 different organisms with over 25 million high-confidence links between proteins.
NEW IN 2019 New functionality gives an option to upload entire, genome-wide datasets, allowing users to perform visualization and gene-set enrichment analysis on their entire input.

neXtProt
Human protein knowledgebase
TYPE Knowledgebase with expert curation and associated tools
DESCRIPTION Information on human proteins such as function, involvement in diseases, mRNA/protein expression, protein-protein interactions, post-translational modifications, protein variations and their phenotypic effects.
HIGHLIGHT High data coverage through integration of multiple sources. Advanced semantic search functionalities. Tools specifically designed for the proteomics community.
NEW IN 2019 Allele frequencies now available for about 2.7 million variants (imported from gnomAD). New tool performing in silico protein digestion for multiple enzymes.

Lipids

SwissLipids
A knowledge resource for lipids
TYPE Knowledgebase
DESCRIPTION Information about known lipids, including knowledge of lipid structures, metabolism and interactions, providing a framework for the integration of lipid and lipidomics data with biological knowledge and models.
HIGHLIGHT Contains information on more than 590,000 lipid structures from over 640 lipid classes.
NEW IN 2019 Expert curation and structural enumeration of over 400 new classes of complex glycosphingolipids, which will bring the total number to over 1,000 lipid classes.

Structural biology

SwissDrugDesign
Widening access to computer-aided drug design
TYPE Software tools
DESCRIPTION Web-based computer-aided drug design tools, from molecular docking (SwissDock) to pharmacokinetics and druglikeness (SwissADME), through virtual screening (SwissSimilarity), lead optimization (SwissBioisostere) and target prediction of small molecules (SwissTargetPrediction).
HIGHLIGHT Comprehensive and integrated web-based drug design environment.
NEW IN 2019 The new version of SwissTargetPrediction, released in 2019, was particularly well received by the community of users, with a 150% increase in the number of submitted jobs.

An active collaboration with core facilities

SIB provides support to the BCF Group in Lausanne, renowned for its biostatistical capabilities and led by Mauro Delorenzi and Frédéric Schütz, as well as to the sciCORE Group in Basel, led by Torsten Schwede and Thierry Sengstag, which is also a BioMedIT node (SEE P. 39).


MEET OUR TEAMS

Our Internal Groups, composed of and headed by SIB Employees, harness their complementary expertise to collaborate with external partners and other SIB Groups on a daily basis.

Clinical Bioinformatics
Valérie Barbié

“The clinical realm is undergoing a data revolution. To ensure health professionals from academic and regulated environments make the most of this information flow, we evaluate and define their needs and develop specific tools and methods for hospitals and the pharma industry to foster optimal patient care and well-being.”

EXAMPLES
Diagnostic applications (cancer, genetic diseases, etc.) for the medical and pharmaceutical domain.
Collaborative databases to enable data sharing for research or clinical purposes.

TAGS
personalized medicine; oncology; infectious disease; human genetics; interoperability; medicine and health; outreach; training

Core-IT
Heinz Stockinger

“With our Sensitive Data Processing Platform we support research to extract knowledge from human data, for the benefit of society. We enable researchers to use sensitive biomedical data in a lawful and efficient way by using appropriate technologies and building on our expertise in IT service provision, data protection and information security.”

EXAMPLE
Acting as a secure gateway to national (e.g. the Swiss Personalized Health Network via BioMedIT) and international (e.g. Innovative Medicines Initiative) data networks.

TAGS
data protection; information security; high-performance computing; interoperability; medicine and health; personalized medicine; training; software engineering; infrastructure provision

Personalized Health Informatics
Katrin Crameri

“We aim to make data-driven medicine a reality in Switzerland, to ensure better health outcomes for more people in the long term. We are making biomedical information for personalized health research interoperable, by building a secure IT infrastructure, and harmonizing standards for Swiss health data semantics and exchange mechanisms.”

EXAMPLE
BioMedIT, the national infrastructure for the secure handling of health data, which can be jointly used by Swiss universities, research institutions and hospitals.

TAGS
information security; interoperability; medicine and health; personalized medicine; training


Some of our most taught skills in 2019

[Word cloud] Single-cell; Data privacy; R Shiny; Best practices; NGS; Python; Big data; Databases; Deep learning; Statistics; Cancer diagnosis; Protein; Data carpentry; RNA-Seq; Data analysis; IT Security; Data management; R; Coding practices

Swiss-Prot
Alan Bridge

“As a competence centre for biocuration and knowledge management, we develop, annotate and maintain internationally renowned knowledge resources such as UniProtKB/Swiss-Prot. Our resources provide an essential framework for biological data science – supporting integrated analyses of genomic, proteomic and metabolomic data to promote human health and wellbeing.”

EXAMPLES
Some of our flagship resources include: UniProtKB/Swiss-Prot, ENZYME, Rhea, SwissLipids, HAMAP, PROSITE and ViralZone.

TAGS
database curation; proteins and proteomes; systems biology; biochemistry; ontology; lipidomics; metabolomics; proteomics; semantic web

Training
Patricia Palagi

“Thanks to the unique pool of leading experts making up SIB’s scientific network, we are able to provide a rich nationwide training offer across the spectrum of bioinformatics techniques, methods and tools, and thus support the fast-evolving needs of researchers.”

KEY FIGURES 2019
16 SIB Groups engaged in teaching activities
69 experts and trainers

Vital-IT
Mark Ibberson

“As both computational biologists and software developers, we understand data as well as the underlying biological questions. Our focus is on data analysis, but also on finding innovative ways to approach it, such as overcoming constraints related to sensitive data.”

EXAMPLES
Setting up a federated data analysis system across several countries to enable access to large patient cohorts while addressing legal, ethical and FAIR principles.

TAGS
structural biology; systems biology; machine learning; data management; mass spectrometry; next-generation sequencing; data mining; genome reconstruction; software engineering; personalized medicine


A network of scientific expertise

Bioinformatics is an interdisciplinary field, where the encounter between genetics, physiology, chemistry and physics leads to many fields of activities and applications.

[Figure: number of groups per domain and top resources (over 160 tools and databases developed). Values shown in the original graphic: 17, 31, 8, 19, 15, 26.]

Genes and genomes
Life’s instruction manual

A genome is the sum of genetic material of an organism, including all of its genes. It is composed of DNA and contains all the information needed to create and maintain an organism, as well as the instructions on how this information should be expressed.

Bioinformatics develops tools able to read genomes, and store, analyse and interpret the resulting data.

Proteins and proteomes
More than meets the eye

A proteome is the sum of proteins expressed by a cell, a tissue or an organism, at a given time. Proteins are the products of genes, and are involved in nearly every task carried out within an organism – from carrying oxygen to fighting off pathogens.

Bioinformatics develops tools to understand the role of proteins.

Evolution and phylogeny
Splitting ends

Changes that occur in genomes tell life scientists how an organism has evolved over time. Comparisons made between genomes from different species or populations tell them how they are related to one another – this is the field of phylogenetics.

Bioinformatics develops tools able to compare the genomes of organisms, as well as computing methods to reconstruct their past and build their ‘family’ trees.



Structural biology
The third dimension
6 / 12
Macromolecules such as DNA and proteins have specific 3D structures that are dictated by their sequence. A protein’s function is defined by its 3D structure, or architecture, which in turn defines the way it reacts with other molecules. Bioinformatics develops software to create 3D models of proteins to study their interactions with other molecules, such as drugs.

Systems biology
Never alone
12 / 12
Life occurs and is sustained by a mesh of interactions within and between cells, tissues, organisms, and their environment. Understanding how these complex systems function allows scientists to predict what happens if one of the components changes or the conditions are altered. Bioinformatics develops models to delineate metabolic pathways.

Text mining and machine learning
Rise of the machines
6 / 9
Text mining algorithms are designed to recognize patterns within text so that computers can extract the information of interest, such as biomedical terms. Using machine learning, computers can also acquire the ability to learn without explicit instructions. Bioinformatics develops text-mining tools that complement expert biocuration, as well as machine learning techniques that can improve protein structure prediction.

Competence centres and core facilities
The means to an end
12 / 10
The quantity of data generated by the life sciences has grown exponentially over the years, and needs to be stored and processed. Researchers also need support in making sense of their data. Core facilities centralize research resources, and provide tools, technologies, services and expert consultation to this end. Bioinformatics core facilities are located in the major Swiss academic institutions.

…AGRICULTURE 8 GROUPS
from predicting the spread of bird flu outbreaks and understanding the lifecycle of agricultural pests, to improving crop productivity.

…BASIC RESEARCH 47 GROUPS
from unravelling the evolutionary processes that have shaped today’s biodiversity, to solving the equation behind a lizard’s scale colour pattern.

…ENVIRONMENTAL SCIENCES 8 GROUPS
from understanding how organisms adapt to climate change, to how microbial communities can be used to break down pollutants in oil spills.

…MEDICINE AND HEALTH 48 GROUPS
from designing optimized proteins in cancer immunotherapy, to creating biomedical decision-support tools using text-mining.

The activities of our Training Group are transversal to all these fields and domains



OUR COMMUNITY AT A GLANCE

Through partnerships with major Swiss schools of higher education and renowned Swiss research institutes, we are proud to federate a diverse and skilled community of data scientists.

784 SIB Members, incl. 184 Employees (SEE P. 35)
178 students taking part in the SIB PhD Training Network

BASEL 172 MEMBERS 15 GROUPS
BERN 31 MEMBERS 2 GROUPS
YVERDON 9 MEMBERS 1 GROUP
FRIBOURG 15 MEMBERS 3 GROUPS
LAUSANNE 273 MEMBERS 25 GROUPS
GENEVA 133 MEMBERS 9 GROUPS



ZURICH 109 MEMBERS 16 GROUPS
WÄDENSWIL 23 MEMBERS 2 GROUPS
DAVOS 2 MEMBERS 1 GROUP
BELLINZONA 13 MEMBERS 2 GROUPS
LUGANO 4 MEMBERS 1 GROUP

“SIB successfully operates a national bioinformatics community with over 20 partner institutes.”
Evaluation of SIB (Swiss node of ELIXIR) by ELIXIR’s Scientific Advisory Board, March 2019



RESEARCH HIGHLIGHTS

320 publications by SIB Groups in 2019

A cancer-associated mutation modifies the activity of entire chromatin domains

A study co-led by an SIB Group at the University of Lausanne shows how a mutated gene can affect the three-dimensional interactions of genes in the cell, leading to various forms of cancer.

SIB GROUP INVOLVED Computational Systems Oncology Group, led by Giovanni Ciriello

Published in Nature Genetics, DOI: 10.1038/s41588-018-0338-y

TASmania – A new resource for microbiologists

Toxin-Antitoxin Systems or TAS are pairs of adjacent genes – widespread in bacteria – one coding for a toxin, the other for the corresponding antitoxin. They are involved in a range of mechanisms, from nutritional stress to antibiotic resistance. A new tool, developed by SIB Scientists at the University of Fribourg, enables users to identify and discover existing and new TAS, as well as associated networks, in a given genome.

SIB GROUP INVOLVED Bioinformatics Unravelling Group, led by Laurent Falquet

Published in PLoS Computational Biology, DOI: 10.1371/journal.pcbi.1006946

Software promises to redefine the tree of life as we know it

A new method to analyse genetic data reveals that the relationships between living species, from bacteria to plants and animals, differ from previous estimates. The software was created by SIB Groups at the University of Lausanne.

SIB GROUPS INVOLVED Computational Phylogenetics Group, led by Nicolas Salamin; Bioinformatics Core Facility Group, co-led by Mauro Delorenzi and Frédéric Schütz

Published in PNAS, DOI: 10.1073/pnas.1813836116


How the dragon got its frill

The frilled dragon exhibits a distinctive large erectile ruff. A multidisciplinary team led by an SIB Group at the University of Geneva reports that an ancestral embryonic gill of the dragon embryo turns into a neck pocket that expands and folds, forming the frill.

SIB GROUP INVOLVED Artificial & Natural Evolutionary Development of Complexity Group, led by Michel Milinkovitch

Published in eLife, DOI: 10.7554/eLife.44455

A new approach to unravel genetic determinants of complex and clinical traits

While genome-wide association studies (GWAS) have identified tens of thousands of common genetic variants associated with complex traits, understanding the underlying mechanisms in terms of gene function is not straightforward. A new method, developed by SIB Scientists from the University of Lausanne and colleagues, makes it possible to estimate the causal effects of gene expression on human traits.

SIB GROUP INVOLVED Statistical Genetics Group, led by Zoltán Kutalik

Published in Nature Communications, DOI: 10.1038/s41467-019-10936-0

Ancient teeth lead to the discovery of a population group

Two 31,000-year-old milk teeth from an excavation site in northeastern Siberia have led to the discovery of a previously unknown population that lived in the area during the last ice age. An SIB Group from the University of Bern was involved in the discovery of these “Ancient North Siberians”.

SIB GROUP INVOLVED Computational and Molecular Population Genetics Group, led by Laurent Excoffier

Published in Nature, DOI: 10.1038/s41586-019-1279-z



The sequencing of the milkweed bug genome

The milkweed bug feeds on poisonous milkweed seeds and uses the acquired toxins for its own protection and bright red warning coloration. But how? An international study including SIB Researchers at the Universities of Lausanne and Geneva sequenced its genome and carried out extensive comparisons with other insects using SIB orthology resources.

SIB GROUPS INVOLVED Evolutionary-Functional Genomics Group, led by Robert Waterhouse; Computational Evolutionary Genomics Group, co-led by Evgeny Zdobnov and Evgenia Kriventseva

Published in Genome Biology, DOI: 10.1186/s13059-019-1660-0

How asexual reproduction is purging genomes of their parasitic invaders

Over 40% of our genome is made of transposable elements. These ‘genomic parasites’ may contribute to the genetic diversity of organisms to some extent – but often they disrupt gene functions. SIB Scientists at the University of Lausanne and colleagues show that asexually reproducing yeast is better at controlling the spread of these sequences.

SIB GROUP INVOLVED Evolutionary Bioinformatics Group, co-led by Marc Robinson-Rechavi and Frédéric Bastian

Published in eLife, DOI: 10.7554/eLife.48548

Of the importance of glycans in vaccine design and viral infection

Sugars present at the surface of a virion or of a host’s cells play a key role in the success of a vaccine as well as during the infection process. An SIB Glycoinformatician and an SIB Molecular Virologist provide a tour d’horizon of the tools developed at SIB that can be used to understand viral activity in the context of glycosylation.

SIB GROUPS INVOLVED Swiss-Prot Group, led by Alan Bridge; Proteome Informatics Group, led by Frédérique Lisacek

Published in Viruses, DOI: 10.3390/v11040374


An algorithm for large-scale genomic analysis

A haplotype is a set of genetic variations that, located side by side on the same chromosome, are transmitted as a single group to the next generation. Examining haplotypes makes it possible to understand the heritability of certain complex traits, such as the risk of developing a disease. Researchers at SIB and the Universities of Geneva and Lausanne have developed SHAPEIT4, a powerful computer algorithm that allows the haplotypes of hundreds of thousands of unrelated individuals to be identified very quickly.

SIB GROUPS INVOLVED Systems and Population Genetics Group, led by Olivier Delaneau; Genomics of Complex Traits Group, led by Manolis Dermitzakis

Published in Nature Communications, DOI: 10.1038/s41467-019-13225-y

Compound with anti-aging effects passes human trial

A dietary supplementation with Urolithin A, a pomegranate metabolite, could help slow down certain aging processes by improving mitochondrial and cellular health in humans: this is the finding of a clinical trial led by EPFL spin-off Amazentis, in conjunction with EPFL and SIB Scientists who helped analyse the data.

SIB GROUP INVOLVED Vital-IT Group, led by Mark Ibberson

Published in Nature Metabolism, DOI: 10.1038/s42255-019-0073-4

Full references of the papers mentioned are available at sib.swiss/about-sib/news-2019



COMING TOGETHER

At SIB, we create opportunities for our scientists to exchange know-how and collaborate within a lively community.

Promoting diversity in bioinformatics

SIB’s Annika Lisa Gable (PhD Student in Christian von Mering’s lab at the University of Zurich) and Ute Roehrig (Senior Scientist in the Group of Vincent Zoete and Olivier Michielin at SIB in Lausanne) obtained an SIB Travel Fellowship to attend a conference of a different kind: the Bioinfo4women conference (28-29 November 2019), where both organizers and speakers were women. It “provided an interesting reminder that things could be different and better if there was an equal representation of men and women in leading scientific positions”.

Reinforcing the links between life scientists and bioinformaticians

In 2019, SIB and Life Sciences Switzerland (LS2) created a joint “Bioinformatics intersection”, whose mission is to promote exchanges and bridge activities between the two communities, and to accelerate the development of methods and tools supporting the latest experimental developments. One of the first activities proposed in this context was a symposium on the occasion of the LS2 Annual Meeting, organized by SIB with the LS2 Microscopy Intersection, and co-chaired by SIB Group Leader Sara Mitri, on “Smart Microscopy: Machine Learning Applied to Life Sciences”.

Turning big data into clinically relevant knowledge at the [BC]2 Basel Computational Biology Conference

As it does every two years, SIB organized [BC]2, this time around the theme of big data and their clinical applications, through a series of workshops and plenary sessions. The conference welcomed students, early-career researchers, world experts and private-sector scientists from 30 countries. Three themes of broad significance emerged from the conference’s tracks and sessions:

➀ From single-cell data to precision oncology

➁ From pathogen sequencing to fighting infectious diseases

➂ Biological big data analysis and methods

“The [BC]2 conference is highly timely, it’s really the future of medicine and digital health.”
Roy Kishony, Technion, Israel Institute of Technology, keynote speaker at [BC]2 2019



Launch of the in silico talks series

A new series of short online scientific talks that focus on recent advances in bioinformatics methods, research & resources by SIB Members – and on how they might help life scientists and clinicians in their work.

8 videos
2,298 views
677 views for the most popular video (V-Pipe)

Congratulations to the winners of the SIB Bioinformatics Awards 2019

The 10th SIB Awards ceremony took place during [BC]2.

Early Career Bioinformatician Award – CHF 10,000
ELEONORA PORCU, University of Lausanne, for her outstanding early career, and in particular her work on the development of a new approach to unravel genetic determinants of complex and clinical traits. Before joining the SIB Group led by Zoltán Kutalik at the University of Lausanne and CHUV and the Group led by Alexandre Reymond (UNIL) in 2016 as a postdoctoral scientist, Eleonora obtained a PhD in Biomedical Science at the University of Sassari (IT), during which she spent more than two years at the University of Michigan (USA).

Best Swiss Bioinformatics Graduate Paper Award – CHF 5,000
JOCHEN SINGER, today Bioinformatics Scientist at Novartis Pharma AG, for his paper “Single-cell mutation identification via phylogenetic inference”, published in 2018 in Nature Communications. This paper, which proposes a new approach to call mutations in individual cells, was published as part of his PhD in Niko Beerenwinkel’s lab at the ETH Zurich.

Bioinformatics Resource Innovation Award – CHF 10,000
VELOCYTO, a resource offering a new framework for the analysis of RNA velocity in single cells, in order to quantitatively infer cell fates. The resource was represented by Sten Linnarsson, Karolinska Institute, Sweden, on behalf of the team that contributed to its development and which includes Gioele La Manno (EPFL), Peter Kharchenko (Harvard) and Ruslan Soldatov (Harvard).

Look back at 10 editions of the SIB Awards with 20 mini-interviews with past winners: where they are today, what fascinating discovery bioinformatics made possible, and more.

“...the assembly of the human genome sequence is one of the major scientific breakthroughs that opened the door for many other important discoveries”
Slavica Dimitrieva, Novartis Institutes of Biomedical Research (NIBR), winner of the 2013 SIB Best Swiss Bioinformatics Graduate Paper Award, on what she believes is the most fascinating discovery made possible by bioinformatics.



Explaining bioinformatics to the public matters greatly to SIB. Science fairs, activities within the classroom, a mobile game, a new web site or hackathons… our team’s imagination has no limits!

Bringing bioinformatics to society

Bioinformatics is increasingly tied to health and societal issues. Through its outreach activities, SIB informs the public about the potential applications of this discipline. In 2020, meet the team at:

NUIT DE LA SCIENCE, Geneva, 4-5 July
Discover outdoor bioinformatics activities on the shores of Lake Geneva!

In 2019, the SIB Outreach Team was present at:

SCIENTIFICA, ZURICH
…what is the usefulness of bioinformatics in the identification of variants in oncology? The answer was at the Zurich Science Days, and included a... LEGO sequencer! scientifica.ch
The LEGO sequencer at Scientifica

EXPANDING YOUR HORIZONS, GENEVA
…a biennial event to encourage girls aged 11-14 to study science, technology, engineering and mathematics. This year they discovered how to index and interpret the variants found in tumour cells. elargisteshorizons.ch/en
“Now I am sure of it: I want to do the same job as you when I grow up!”
Participant in the 2019 Expanding Your Horizons event

PLANÈTE SANTÉ LIVE 2019, MARTIGNY
…the Swiss Health Fair, where visitors to the SIB stand learned about the latest advances in the field of precision medicine. planetesante.ch/salon

OPEN GENEVA INNOVATION FESTIVAL, GENEVA
…at this edition, SIB organized a biohackathon for children aged 9-14 who used the SCRATCH programming language to tackle biological problems. opengeneva.org

Over 3,100 participants in our activities each year, including 2,000 students (12-18 years old)

Bioinformatics as a career path

For years, SIB has been promoting bioinformatics as a career path for women, through the activities led by SIB Scientists – also acting as role models – at schools and at events such as Expanding Your Horizons, specially designed to encourage young girls to engage in scientific careers.


Existing information on altered proteins can be found in databases, or inferred thanks to bioinformatics predictive tools or molecular modelling. This illustration is from the new PrecisionMed.ch website launched by SIB in 2019. (SEE P. 42)

“It is really interesting to see that we do not all think the same way when tackling a biological problem – and that this can be very useful!” Participant in a 2019 Drug Design workshop


Bioinformatics in the classroom

SIB offers regular activities in schools and has designed several workshops to introduce bioinformatics to schoolchildren, teachers and students:

DRUG DESIGN

A web-based workshop to acquire a simple yet realistic picture of how drugs are designed with the help of computers. drug-design-workshop.ch

THE METAGENOMICS PIZZA

A workshop to grasp the concept of metagenomics, and identify all the species present in a pizza, based on their DNA.

PRECISION MEDICINE

A new workshop to discover how to index and interpret variants in tumour cells in order to make a diagnosis and to choose a suitable treatment. precisionmed.ch

TECDAYS

Since 2013, SIB has taken part in these Switzerland-wide events organized by the Swiss Academy of the Technical Sciences (SATW) for students aged 15-20. In 2019, SIB took part in 14 sessions, all across Switzerland. satw.ch/en/tecdays/

More events and news on Facebook, in our dedicated outreach channel in French and English: goo.gl/4c6xCZ



Organization and governance

Federating such a pervasive domain as bioinformatics, even across a modestly sized country like Switzerland, requires a unique organizational structure.

COMPOSITION OF SIB’S GOVERNING BODIES

The Foundation Council

Each of SIB’s partner institutions is represented on the Council.

President
Prof. Felix Gutzwiller, Former Senator

Founding Members
Prof. Ron Appel, SIB Executive Director
Prof. Amos Bairoch, Group Leader, SIB and University of Geneva
Dr Philipp Bucher, Group Leader, SIB and EPFL
Prof. Denis Hochstrasser, Former Vice-Rector, University of Geneva
Prof. C. Victor Jongeneel, Carl R. Woese Institute for Genomic Biology, University of Illinois, USA
Prof. Manuel Peitsch, Chief Scientific Officer Research at Philip Morris International

Ex officio Members
Prof. Cezmi A. Akdis, Director, Swiss Institute of Allergy and Asthma Research (SIAF)
Mr Thomas Baenninger, Chief Financial Officer, Ludwig Institute for Cancer Research
Prof. Edouard Bugnion, Vice-President for Information Systems, EPFL
Prof. François Bussy, Vice-Rector for Research, International Relations and Continuing Education, University of Lausanne
Prof. Daniel Candinas, Vice-Rector Research, University of Bern
Prof. Carlo Catapano, Director, IOR Institute of Oncology Research
Prof. Boas Erez, Rector, Università della Svizzera Italiana
Prof. Nicolas Fasel, Vice-Dean for Research and Innovation, Faculty of Biology and Medicine, University of Lausanne
Prof. Katharina Fromm, Vice-Rector, University of Fribourg
Prof. Cem Gabay, Dean, Faculty of Medicine, University of Geneva
Prof. Brigitte Galliot, Vice-Rector, University of Geneva
Prof. Antoine Geissbühler, Vice-Rector, University of Geneva; Head of eHealth and Telemedicine Division, Geneva University Hospitals (HUG)
Prof. Detlef Günther, Vice-President Research and Corporate Relations, ETH Zurich
Dr Corinne Jud, Head of the Competence Division Method Development and Analytics, Agroscope
Dr Caroline Kant, Executive Director, EspeRare Foundation Switzerland
Prof. Jérôme Lacour, Dean, Faculty of Science, University of Geneva
Dr Vincent Peiris, Dean, School of Business and Engineering Vaud (HEIG-VD), HES-SO
Prof. Jean-Marc Piveteau, President, Zurich University of Applied Sciences (ZHAW)
Prof. Alexandre Reymond, Director, Centre for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne
Prof. Patrick Ruch, Head of Research, School of Business Administration (HEG-Geneva), HES-SO
Prof. Michael Schaepman, Vice-President for Veterinary Medicine and Natural Sciences, University of Zurich
Prof. Falko Schlottig, Director, FHNW School of Life Sciences
Prof. Dirk Schübeler, Co-Director, Friedrich Miescher Institute for Biomedical Research (FMI)
Prof. Torsten Schwede, Vice President of Research and Talent Promotion, University of Basel
Prof. Juerg Utzinger, Director, Swiss Tropical and Public Health Institute

Co-opted Member
Dr Laurent Duret, CNRS Research Director, Laboratory of Biometry and Evolutionary Biology, Claude Bernard-Lyon 1 University, France

The Board of Directors (BoD)

The BoD consists of two Group Leaders elected jointly by the Council of Group Leaders and the BoD, two external members elected by the Foundation Council on the recommendation of the BoD, and the SIB Executive Directors. Members of the BoD are appointed for a renewable five-year period.

Dr Jérôme Wojcik (Chairman), Managing Director, Head of Data Science and Diagnostics at STALICLA
Prof. Ron Appel and Dr Christine Durinx, Joint SIB Executive Directors
Ms Martine Brunschwig Graf, Former National Councillor
Prof. Christian von Mering, Group Leader, SIB and University of Zurich
Prof. Christophe Dessimoz, Group Leader, SIB and University of Lausanne

The Scientific Advisory Board (SAB)

The SAB is made up of at least five members, who are internationally renowned scientists from the Institute’s fields of activity.

Prof. Alfonso Valencia (Chairman) ICREA Professor, Life Sciences Department Director, Barcelona Supercomputing Centre, Spain

Prof. Alexey I. Nesvizhskii Department of Pathology and Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, USA

Prof. Søren Brunak Founder of the Centre for Biological Sequence Analysis, Technical University of Denmark, Denmark

Prof. Christine Orengo Department of Structural and Molecular Biology, University College London, UK

Dr Laurent Duret CNRS Research Director, Laboratory of Biometry and Evolutionary Biology, Claude Bernard-Lyon 1 University, France

Prof. Ron Shamir Computational Genomics Group at the Blavatnik School of Computer Science, Tel Aviv University, Israel

Prof. Melissa Haendel Director of the Ontology Development Group, Oregon Health & Science University, Portland, USA

Council of Group Leaders

The Council consists of the Group Leaders and the SIB Executive Directors.

Honorary SIB Members

Prof. Ernest Feytmans, Honorary Director

Dr Johannes R. Randegger, Former National Councillor, Honorary President of the SIB Foundation Council



As a non-profit foundation uniting bioinformatics in Switzerland and with 21 partner institutions (SEE P. 22), SIB has a robust governance structure ensuring both scientific independence and efficient internal functioning.

Foundation Council
Highest authority in the Institute, with supervisory powers. Its responsibilities include changes to SIB’s statutes, nomination of Group Leaders, and approval of the annual budget and financial report.

Scientific Advisory Board
Acts as a consultative body, providing recommendations to the Board of Directors and the Council of Group Leaders. Its main tasks consist in monitoring service and infrastructure activities, such as SIB Resources. (SEE P. 13)

Board of Directors
Takes the decisions necessary to achieve the aims of the Institute, such as defining the scientific strategy and internal procedures, and allocating federal funds to service and infrastructure activities.
Members: external members from the political and industrial sectors; Executive Directors; Group Leaders

Executive Directors; Management and support teams
Define and implement the Institute's strategic goals and ensure the organization’s representation at the national and international level. Support functions include finance & grant services, legal & technology transfer, human resources and communication & scientific events. They are dedicated to institutional and group matters.

Council of Group Leaders
Discusses all matters relating to SIB Groups as a whole, and proposes new Group Leaders for nomination.

SIB Internal Groups
Staffed and headed by SIB Employees, they focus on SIB’s core missions.

SIB Affiliated Groups
Academic groups from partner institutions across Switzerland. They include those groups maintaining and developing an SIB-supported infrastructure, such as an SIB Resource (SEE P. 16) or a core facility (e.g. sciCORE and BCF, SEE P. 17) and can thus include SIB Employees as well.

(SEE P. 18-19)



FINANCES

SIB’s funds remained stable in 2019, thanks to the sustained support of its funders.

Detail of funding sources

39 % Swiss government – SERI: 9 million
13 % Swiss universities: 3.1 million
12 % Swiss government – SERI BioMedIT/SPHN²: 2.8 million
11 % National Institutes of Health (NIH): 2.6 million
9 % Private sector / Foreign universities: 2 million
7 % European funds: 1.6 million
5 % Swiss National Science Foundation (SNSF) / Innosuisse: 1.1 million
3 % Other⁴: 0.6 million
1 % Swiss hospitals: 0.3 million
Total: 23.1 million

Allocation to SIB’s core missions

[Chart: allocation to SIB’s core missions by funding source. Missions: Databases & software tools, Competence centres, Training, Scientific collaboration (grouped under Infrastructure and Community). Chart totals: CHF 11.1 million, CHF 9.5 million, CHF 1.7 million, CHF 0.8 million.]

Allocation by activities
■ 77 % Infrastructure
■ 10 % Community
■ 13 % Management & support teams³

85 % of financial resources allocated to the payment of salaries, reflecting the expertise-driven activities of the Institute.

¹ Swiss government funds are allocated to SIB Resources and Core facilities as per the recommendations of SIB’s external Scientific Advisory Board.
² SIB received CHF 4.5 million in 2019 for BioMedIT (SEE P. 39). 1.7 million were used for BioMedIT projects, the rest (2.8 million) is carried over to next year. SIB also received 1.1 million from SAMS for SPHN's DCC project.
³ SIB’s Management and support teams (SEE P. 33) are financed by the Swiss government as well as through overheads on external funds. The groups having entrusted SIB with the management of their funds benefit from specific support in legal affairs, human resources and financial monitoring.
⁴ Loss of income insurances etc.

As per 2019 audited figures
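The percentage shares in the detail of funding sources follow from the reported amounts. As a minimal sketch, assuming the nine amounts (in CHF million) given in this section, the shares and the CHF 23.1 million total can be recomputed as follows; the names `funding_chf_million` and `funding_shares` are illustrative only and do not come from the report.

```python
# Funding sources for 2019 in CHF million, as listed in this section.
# Dictionary and function names are illustrative assumptions.
funding_chf_million = {
    "Swiss government - SERI": 9.0,
    "Swiss universities": 3.1,
    "Swiss government - SERI BioMedIT/SPHN": 2.8,
    "National Institutes of Health (NIH)": 2.6,
    "Private sector / Foreign universities": 2.0,
    "European funds": 1.6,
    "SNSF / Innosuisse": 1.1,
    "Other": 0.6,
    "Swiss hospitals": 0.3,
}


def funding_shares(amounts):
    """Return each source's share of the total, rounded to a whole percent."""
    grand_total = sum(amounts.values())
    return {
        source: round(100 * amount / grand_total)
        for source, amount in amounts.items()
    }


# Round the total to one decimal to absorb floating-point noise.
total = round(sum(funding_chf_million.values()), 1)
shares = funding_shares(funding_chf_million)

if __name__ == "__main__":
    print(f"Total: CHF {total} million")
    for source, pct in shares.items():
        print(f"{pct:>3} %  {source}")
```

Rounding each share independently is a simplification: with other figures, rounded shares need not add up to exactly 100.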



EMPLOYEES

SIB Employees share a common passion: making a positive impact on society through the science of data.

SIB has 184 Employees, of 25 different nationalities.*

There are 85 women (46 %) and 99 men (54 %) working at SIB.

Geneva: 76 Employees, 7 Groups
Basel: 20 Employees, 4 Groups
Lausanne: 85 Employees, 8 Groups
Zurich: 3 Employees, 2 Groups

The median age at SIB is 43 years old, with a balanced pyramid of ages favouring knowledge exchange by bringing early-career scientists together with senior experts.

The median length of service is 7 years, with 36 % of employees having been at SIB for over 10 years.

* As of 31 December 2019



SPECIAL FOCUS

Making sense of health data

Health data are complex and have different properties, not the least being their sensitivity. With recent technological advances, new analytical methods and the promises of personalized health, discover how SIB Scientists attempt to harness these data for the benefit of society.



SIB in the Swiss Personalized Health Network

To promote the development of personalized health research, the Swiss Federal Government launched the Swiss Personalized Health Network (SPHN) in 2017. Its mandate: to make health data interoperable and broadly accessible for research, while ensuring that they are processed on IT infrastructure that fulfils stringent ethical and legal requirements regarding data protection and information security. SIB, through its Personalized Health Informatics Group (PHI), is in charge of the SPHN Data Coordination Centre and of the implementation of the national secure infrastructure, called BioMedIT, linking providers and users of sensitive data in Switzerland. The Group develops and implements nationwide standards for data semantics and exchange mechanisms to make data interoperable so that researchers can use them for their projects. It orchestrates the activities of several expert national Working Groups, and maintains strong ties with other national and international initiatives working on such standards.

The infrastructure to enable nationwide use and exchange of health data for research around personalized health is being set up.

[Graph: Data types typically used in biomedical projects]
■ Hospital clinical data: routine data such as diagnosis, demographics, drug allergy or samples, as well as imaging data and specialized medicine registries
■ Clinical research data: cohorts, clinical study data and patient self-reported data
■ Molecular or -omics data: data generated in hospitals or research facilities
■ Healthy citizen data: lifestyle, social media and wearable device data
■ Reference data: environmental data, geographical data, statistical data, etc.

From health data to biomedical research: an interoperability journey

Setting out common principles

The variety of different data sources (SEE GRAPH ABOVE) required by typical biomedical research projects, and the resultant lack of interpretability, result in the need for harmonization and standardization of data. Therefore, in 2019, an overarching ‘SPHN Semantic Framework’ was laid out to make Swiss health data interoperable. This framework was developed by the Clinical Data Semantic Interoperability Working Group, one of the five nationwide expert teams coordinated by the SIB Personalized Health Informatics Group, in close collaboration with the University Hospitals and eHealth Suisse. A strong semantic definition (a common understanding), an agreement on standards and a flexible exchange format are being set up, among other aspects, to serve the needs of the different use-cases and allow interoperability within projects, between projects as well as between institutions and over time.



Raising awareness and providing essential IT Security and Data Protection skills
Researchers working with sensitive human data on the BioMedIT platform are provided with training opportunities (online or in a classroom) followed by a mandatory online exam. The training is delivered by representatives of each BioMedIT node.

Since autumn 2019, the national secure IT infrastructure dedicated to sensitive data processing has been operational at the three BioMedIT nodes based in Basel, Lausanne and Zurich. The infrastructure is now entering a refining phase, in collaboration with various SPHN projects.

The BioMedIT Network
A national secure infrastructure for Switzerland

[Figure: the BioMedIT network. Labels: Hospitals, Universities, platforms; BioMedIT analysis environment; SIS (ZH); sciCORE (BS); Core-IT (LS); DCC*; Researcher]

* DCC: Data Coordination Center

The classroom course was given four times between November 2018 and November 2019, in the cities of Zurich, Basel, Lausanne and Geneva, with over 130 participants in total.

Sensitive data require specific infrastructure
The use of sensitive human data poses administrative, legal and technical challenges. Which data are available and where? Are they consented? Who is allowed to use them and for which purpose? What are the appropriate security standards? National and international data protection laws require specific technical and organizational measures when working with sensitive human data, including the use of IT infrastructure designed for data protection and security.

Switzerland to contribute to the definition of international genomic data sharing standards
The Global Alliance for Genomics and Health (GA4GH) has selected SPHN among its “2019 Driver Projects”, an international group of 22 leading genomic data initiatives that develop and pilot standards for sharing genomic and health-related data.



A vault for sensitive data
How to exploit the wealth of knowledge contained in health data, while preserving the patients’ or study participants’ privacy? With a secure infrastructure, such as the Sensitive Data Processing Platform SIB set up in 2019, connected to the BioMedIT network (SEE PREVIOUS PAGE).

Understanding the respective influence of environmental and genetic factors on a given trait, being able to diagnose a disease before the onset of its symptoms and to prevent its spread, and adapting a treatment to a patient: these are only a few of the promises medical big data hold. To harness their power, however, biomedical researchers must wade through rivers of sensitive human data, such as clinical trial data or molecular data generated in hospitals. SIB provides biomedical researchers with a platform that allows the secure processing of sensitive human data, combined with unique bioinformatics expertise. Discover how it operates and how it is supporting, in practice, a European project to fight diabetes.

2,314 exabytes
The estimated amount of health data that will be produced worldwide in 2020. An exabyte is one quintillion, or 10^18, bytes.
SOURCE: EMC DIGITAL UNIVERSE WITH RESEARCH AND ANALYSIS BY IDC, APRIL 2014

In April 2019, SIB set up a Data Protection and Security Board. It ensures a global multidisciplinary vision (legal, cybersecurity, IT, clinical and broad bioinformatics fields) to provide and enforce recommendations on topics pertaining to sensitive data and information security.



Leveraging sensitive data to fight diabetes at the European scale

As part of the Hypo-RESOLVE project, which aims to find better solutions to alleviate the burden and consequences of hypoglycaemia – a common and serious complication of diabetes – the Vital-IT Group is orchestrating the transfer and hosting of hundreds of clinical trials onto SIB’s Sensitive Data Processing Platform (see “A gateway...”).

At the interface between clinical data managers (at the pharma companies) and end users (analysts in academic research groups), our computational biologists organize and structure the data in order to answer research questions related to the causes and impact of hypoglycaemia in people living with diabetes. The steps involve:

■ The secure transfer of the clinical trial data from pharma and device companies onto SIB’s servers;
■ The setting up of a clinical database on the secure server, only accessible to approved users via remote desktop;
■ The harmonization of the heterogeneous data, often collected using different clinical protocols and experimental designs;
■ Finally, enabling queries through an API while ensuring data never leave the servers.

This pan-European project is supported by the Innovative Medicines Initiative (IMI) and includes 23 leading academic, pharmaceutical and device-manufacturing partners, as well as patients’ organizations.
HYPO-RESOLVE.EU/PROJECT

A gateway to a national network of secure infrastructure

Within the BioMedIT network of secure IT infrastructure (SEE PREVIOUS PAGE), SIB has developed a Secure Data Processing Platform, managed by its Core-IT Group, to serve the needs of international and local projects.

In practice: a tailored compute and storage environment is defined for each scientific project (including number of compute nodes, storage space, software tools and services) to match its respective needs. Secure hosting and access are guaranteed for projects with sensitive data. Research projects further benefit from SIB’s expertise in the handling of contracts (with hospitals, for example), data protection and IT security training, data management and analysis, and research support. Finally, the platform, being seamlessly connected to the BioMedIT national network, provides a gateway to the Swiss Personalized Health Network. This way, researchers can access data sources across Switzerland.


Supporting precision oncology

Cancer treatment is among the medical disciplines where molecular approaches are already used to identify targetable alterations in cells.

In the field of oncology, precision medicine is already a reality. The analysis of tumours at the molecular level (DNA and proteins) – also known as molecular profiling – is already used in many cases today, and helps understand the causes of development of an individual cancer. Indeed, a single change in DNA can modify the three-dimensional structure and the function of a protein, such as the control of cell division. Molecular approaches thus enable a more precise diagnosis and, sometimes, an offer of personalized treatment – hence the term precision oncology. They also necessitate the combined multidisciplinary expertise of oncologists, pathologists, biologists, geneticists and bioinformaticians. Such expertise exists in Switzerland, in the shape of the Molecular Tumour Board (MTB) of the Réseau Romand d’Oncologie, in which SIB Scientists play an active role. To support clinicians and citizens in this revolution, SIB teams also develop tools, training and awareness material, such as the PrecisionMed.ch website.

Understanding precision oncology: a new website and a travelling exhibition

Thanks to the support of the SantéPerSo initiative, the PrecisionMed.ch website, developed by SIB in collaboration with the University of Lausanne and Lausanne University Hospital, offers illustrated information accompanied by videos and a glossary to improve understanding of the new approaches enabled by precision medicine.

As part of the Personalized Health & Society initiative, SIB also collaborated with the travelling exhibition A notre santé, organized by the Bioscope and the Musée de la Main, Lausanne. The new workshop developed with PrecisionMed.ch provided an opportunity to explain the key role played by bioinformatics in the choice of custom cancer treatment to patients, caregivers, teachers and members of associations active in this field.

PRECISIONMED.CH

Professional context of the participants on the first CAS course ■ Regional hospital ■ University ■ University hospital ■ Biotech ■ Private clinic

A successful first course for the CAS in Personalized Molecular Oncology
The first Certificate of Advanced Studies in Personalized Molecular Oncology course, organized by SIB’s Clinical Bioinformatics Group jointly with the University Hospitals of Basel and Lausanne, saw a 100% success rate among its 20 participants. Among the four modules taught, the clinical bioinformatics module was particularly praised, with 90% of the participants (all with an MD and/or PhD degree) saying they would recommend it to others. The second CAS course started in November 2019.
PMO.UNIBAS.COM


Interpreting genetic variants in cancer
The Swiss Variant Interpretation Platform (SVIP), a project funded by SPHN and BioMedIT, aims to provide a central resource for clinicians and researchers to interpret clinically verified cancer variants. In practice, all variants identified in patients are fed from medical institutions into SVIP. For variants which were already annotated by the hospitals, annotations are submitted to the database, and, in case of discrepancies, resolved by a clinical panel composed of experts from partnering hospitals. For variants without previous annotation, expert biocurators from the SIB Swiss-Prot Group compose a draft annotation and likely clinical interpretation, which are also validated by the clinical expert panel. SVIP includes cross-reference links to complementary resources such as Swiss-PO (see “When mutations...”) and the Swiss Molecular Pathology and Tumour Immunology Breakthrough Platform (SOCIBP), an SPHN Driver project.
SVIP.CH


When mutations found in patients are unknown…
…the resource Swiss-PO has been developed as a decision-support tool for clinical bioinformaticians and clinicians. In a single interface, it gathers a collection of publicly available data to guide them in predicting how problematic (i.e. pathogenic) a new mutation could be: functional information on similar variants, experimental 3D structures of the involved protein to infer the impact on its interactions with other molecules such as drugs, and degree of conservation of the residue in the protein sequence among model organisms, which is a good indicator of its importance. The resource is developed by the Computer-Aided Molecular Engineering Group (University of Lausanne and SIB) and the SIB Molecular Modelling Group, in the context of their work with the Molecular Tumour Board of the Réseau Romand d’Oncologie.
PRECISIONMED.CH/EN/WHAT-IS-A-MUTATION/
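The last indicator mentioned above, the degree of conservation of a residue, can be illustrated with a toy calculation. This is only a minimal sketch and not how Swiss-PO computes conservation: the function name, the pre-aligned toy sequences and the simple "share of sequences matching the reference residue" score are all assumptions made for illustration.

```python
def residue_conservation(aligned_sequences, position):
    """Return the share of sequences that carry the same residue as the
    first (reference) sequence at a 0-based alignment column.

    Assumes the sequences are already aligned and of equal length;
    a gap ('-') in another sequence simply counts as a mismatch.
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    if not 0 <= position < length:
        raise IndexError("position outside the alignment")

    reference_residue = aligned_sequences[0][position]
    matches = sum(1 for seq in aligned_sequences if seq[position] == reference_residue)
    return matches / len(aligned_sequences)


# Hypothetical toy alignment: a reference protein followed by three orthologs.
alignment = ["MKTAY", "MKSAY", "MRTAY", "MKTA-"]
scores = [residue_conservation(alignment, i) for i in range(len(alignment[0]))]
```

In this toy alignment the first and fourth columns are fully conserved, so a mutation there would be flagged as more likely to matter than one in a variable column. Real tools typically rely on curated alignments and more refined scores.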

“The SVIP project is a great example of the unifying and collaborative potential of bioinformatics. The project brings together six SIB Groups * with complementary expertise in clinical bioinformatics, biocuration, text mining and software architecture. Partnering with Swiss medical institutions, we are working towards a single, nationwide clinical solution.”

Daniel Stekhoven, Project co-lead, Group Leader at SIB and Head Clinical Bioinformatics at NEXUS Personalized Health Technologies, ETH Zurich

* CO-PRINCIPAL INVESTIGATORS: NEXUS (DANIEL STEKHOVEN, ETH ZURICH); CLINICAL BIOINFORMATICS (VALÉRIE BARBIÉ); TEXT MINING (PATRICK RUCH, HES-SO), IN COLLABORATION WITH: SWISS-PROT, SIS (ETH ZURICH) AND CORE-IT



From tools to clues
A glimpse of some of the tools and techniques used to make sense of health data.

Some are regularly in the spotlight, while others are most often unheard of. Techniques, resources and methods are not usually the focus of research breakthroughs, but are nonetheless fascinating for the sheer possibilities and range of applications they offer. While machine learning *, for instance, receives a lot of public attention, a flurry of approaches is used by scientists to extract knowledge from complex biomedical data. Sequencing at the individual cell level or at the community level with metagenomics, resources amassing knowledge on sugars, software to track pathogen outbreaks, and much more: here is a selection.

* A BRANCH OF ARTIFICIAL INTELLIGENCE BASED ON THE IDEA THAT COMPUTERS CAN LEARN FROM DATA TO IDENTIFY PATTERNS AND MAKE DECISIONS WITHOUT EXPLICIT INSTRUCTIONS

At least 30 SIB Groups are currently involved in machine learning, applied to text mining (as a support to annotation for instance), biological or medical research.

#Machine Learning

Machine learning techniques applied to health data: a training oldie
Since 2015, SIB has organized introductory courses on key machine learning algorithms used in bioinformatics and applied to genomics, text mining, the clinic or microbiology. For the first time in 2019, deep learning methods for the analysis of histopathological image data were taught by experts from SIB, Novartis and IBM in Basel (co-hosted with the Swiss Digital Pathology Consortium).

“In precision medicine, machine learning techniques are becoming essential – both to integrate the large variety of data types used to characterize each patient, as well as to identify, in these complex high-dimensional data, hidden patterns which may then be used as biomarkers that predict susceptibility to a disease”

Julia Vogt SIB Group Leader, ETH Zurich



#Single-cell

Single-cell analyses driving new therapeutic approaches
SIB Scientists from the BCF Group are using data from single-cell sequencing to monitor how the proportion of immune cells (in orange, pink and red on the image) varies in comparison to tumour cells (in blue) over time, and to assess the impact of novel therapeutic approaches on tumour growth.

#Machine Learning

Exploring the human / machine interface in expert curation
Together with colleagues at the National Center for Biotechnology Information (NCBI), the Swiss-Prot Group is evaluating the ability of machine learning algorithms to accelerate the expert curation of biochemical transformations in Rhea – an approach already successfully applied to the curation of human variant data.
DOI:10.1371/JOURNAL.PCBI.1006390

#Glycomics

A step closer to the secret recipe of maternal milk
The molecular content of breast milk and its biosynthesis, despite obvious public health benefits, have eluded industrialists for more than a century. In recent years, significant progress was achieved through uncovering the variety of human milk oligosaccharides (HMOs) and their beneficial role in protecting infants. The SIB Group led by Frédérique Lisacek is working with an Irish consortium to support HMO analysis in response to the presence of pathogens. It develops databases and tools to characterize the HMO repertoire, simulate biosynthesis and understand interactions with pathogens.

A single-cell autumn school
A joint SIB / SciLifeLab (Sweden) Autumn School on single-cell analysis was held in 2019, achieving a record number of 111 applications from scientists across 19 countries. With a blend of lectures and practical exercises, participants were guided through the main single-cell applications and techniques, with the aim of being able to apply them to their research when back at their labs.

“I really liked the topics that were covered, which included not only single-cell RNA sequencing but also single-cell techniques applied to proteomics, cytometry, etc. The continuous companionship of the other students and teachers also stimulated a lot of discussions. I really learned a lot!”
A participant at the Autumn School



#Metagenomics

Thanks to metagenomics approaches, clinicians are able to detect potential pathogens (incl. bacteria, fungi and viruses) in a patient’s sample. They rely on the comprehensive analysis of microbial and host genetic material present in the samples.

Virosaurus, an encyclopedia of viruses
Together with Vital-IT and the Geneva University Hospitals, the Swiss-Prot Group launched a new resource that provides high-quality, complete viral genome sequences to facilitate clinical metagenomics analyses.

[Images]
1. Representation of the Bornavirus, the first virus to have led to a metagenomic diagnosis of atypical viral meningitis.
2. Coloured transmission electron microscopy image of a Bornavirus in a tissue culture. The virus has an outer membrane (blue) enclosing the genetic material (purple), which is composed of ribonucleic acid (RNA).

Swiss-wide ring trial to harmonize viral metagenomics practices in the clinic
This first nationwide quality-control test aimed at benchmarking current metagenomic workflows used at Swiss clinical virology laboratories, and contributing to the definition of common best practices for their routine use. The study was designed and implemented by the Clinical Bioinformatics Group with colleagues from EPFL and the University of Zurich, and published in Genes.

#Genetic variants

Diabetes, obesity, and SIB Resources
Genetic variants can inform clinicians on the most appropriate treatment to adopt in cases of pediatric diabetes. At the Diabetes & Obesity Days of the University of Geneva, our outreach team demonstrated how to use the protein database UniProtKB/Swiss-Prot in this context.
DIABETE.UNIGE.CH

“By allowing a harmonized interpretation of pathogen sequences within a centralized platform, SPSP represents a fantastic jump forward for our work and will support earlier detection and potentially better management of outbreaks.”

Jacques Schrenzel, Head of the Bacteriology Laboratory and of the Genomic Research Laboratory, Service of Infectious Diseases and Service of Laboratory Medicine


#Pathogen sequencing

A nationwide secure platform for near real-time sharing of pathogen genomes
Many diseases are caused by rapidly mutating and increasingly drug-resistant microorganisms. Hence, characterizing pathogens at the molecular and genomic level is essential for designing drugs and vaccines as well as for monitoring disease outbreaks. The first version of the Swiss Pathogen Surveillance Platform (SPSP) has been released: this nationwide secure platform enables near real-time sharing of pathogen whole genome sequences and their associated clinical and epidemiological metadata for surveillance of outbreaks. The initial focus is on antibiotic-resistant strains; ultimately, any sequenced bacterial or viral isolate would be uploaded, analysed, stored and interpreted within a single harmonized platform.


The visualization module is built on the Nextstrain tool for the real-time tracking of pathogen evolution (see map above of Zika virus transmissions), co-developed by the SIB Group led by Richard Neher (University of Basel). This National Research Programme “Antimicrobial Resistance” (NRP72) project is driven by a national consortium including the SIB Clinical Bioinformatics Group, the University Hospitals of Basel, Geneva and Lausanne, the University of Basel, and VetSuisse Bern and Zurich.
SPSP.CH

“Tools such as Nextstrain already play an important role in analysing and tracking outbreaks of pathogens such as SARS-CoV-2 or Zika virus in real time. Public health interventions will increasingly rely on such bioinformatics tools to allocate their resources in the future.”

Richard Neher SIB Group Leader, University of Basel



SPECIAL FOCUS

Open Knowledge
Open Science – the movement to make scientific research and its dissemination accessible to society as a whole – has many facets. Bioinformatics is at the intersection of all these, and SIB has an essential role to play in fostering the generation of, and access to, knowledge in the long term.



On the road to Open Knowledge
Open Access, Open Source, Open Data… there are many ways to add even more value to research and generate lasting and accessible knowledge.

As bioinformaticians are well used to the complexity of life-science data, they know how crucial it is that the results are robust and replicable. They have thus developed a long tradition of open-access publishing, code sharing and defining community standards to make data more open. But even when data are accessible, poor data quality, opaque reuse licences, heterogeneous formats, missing metadata, and resources under financial jeopardy are all potential barriers to reuse and interpretation by the community. SIB is committed to the principle that research data and findings should be freely accessible to everyone: scientists and clinicians, but also members of the public who are playing an increasingly active role as informed participants in biomedical research and health care initiatives. So, what are the key ingredients, for an institute federating bioinformatics at a national scale, for going beyond Open Data, towards Open Knowledge?

Bioinformatics, a philosophy of Open Access
With 56% of their papers published in Open Access since 2015, SIB Members are setting the example in the Swiss life-science research ecosystem.

[Figure: % of Open Access publications including Swiss affiliations in MEDLINE since 2015, by institution (vertical axis from 10% to 50%)]

Cumulative proportion of biomedical publications in Open Access including Swiss affiliations in MEDLINE since 2015. Only institutions with over 1000 publications are displayed, in alphabetical order. The data underlying this graph were generated as part of the SONAR project (Swiss open access repository) funded by swissuniversities, coordinated by the Library Network of Western Switzerland and co-piloted by the SIB Text Mining Group of Patrick Ruch. * + Hospital

Since 2012, SIB has taught popular courses and workshops on best practices in coding and software development to foster reproducible research:

■ First steps with Git for reproducible research
■ Best practices in programming
■ Data Management Plan
■ Docker for reproducible computational research



From Open Data to Open Knowledge
Harmonizing licences of databases and tools, expertly curating the information they contain, looking into sustainable funding schemes: the video “From Open Data to Open Knowledge” sheds light on what is needed to guarantee scientific knowledge in the long term and for the benefit of all.

“Beyond the bottom-up commitments that individual researchers make towards Open Science, it is crucial that institutions step in to steer efforts in the right direction from the top down. I'm proud that SIB as an organization recognizes the value of open scholarship, and thereby encourages transparency, research integrity and knowledge discovery.”

WATCH THE VIDEO

Mark D. Robinson Open Science Delegate at the University of Zurich SIB Group Leader



Enabling knowledge reuse by standardizing licences
Being able to redistribute or reuse parts of scientific knowledge, such as information encoded in biological databases, is an essential driver of innovation. While all SIB Resources are freely accessible, standard licences that further facilitate their use by life scientists are being progressively implemented, with UniProt, STRING, SWISS-MODEL, neXtProt and more adopting the CC-BY-4.0 licence, while Bgee entered the public domain in 2019.

“In 2019 the licence of Bgee was clarified to be CC0 for all the data that we provide. This is the most liberal licence, allowing any and all reuse with no legal strings attached. The academic norm of citation is expected to still be respected. This followed discussions within the biocuration and bioinformatics communities on how to maximize reusability of our resources, and how to treat the reuse of other resources within a database such as Bgee.”

Marc Robinson-Rechavi SIB Group Leader, University of Lausanne

Bgee, a public domain database to retrieve and compare gene expression patterns in multiple animal species



“I am grateful for the contribution of our collaborators who shared their pre-publication data to allow me to expand my set of species and pursue my own research questions. For me, this highlights the importance of early data access, which, in the end, maximizes their value.”
Mathieu Seppey, First author, SIB Group of Evgeny Zdobnov, University of Geneva

The use of freely accessible SIB Resources and early access to data leads to discoveries on the origin of beetle biodiversity
Just a few days after the 2019 IPBES’ global biodiversity assessment was presented in Paris, SIB Scientists at the University of Lausanne published a paper, in Open Access, explaining the origin of the biodiversity observed among beetles, arguably the largest species richness in the animal kingdom. The study, led by SIB Group Leader Robert Waterhouse (University of Lausanne), involved the SIB Bioinformatics Resources BUSCO and OrthoDB to contrast the gene repertoires of plant-eating beetle species with predatory ones. The results support the view that coevolution with the plants beetles feed on, and in particular the evolutionary adaptations to neutralize toxic plant compounds, are key drivers of their diversification. The multi-species analyses were made possible by the availability of insect genomic data from initiatives such as i5k (Sequencing Five Thousand Arthropod Genomes), often as part of pre-publication datasets.
DOI: 10.1186/S13059-019-1704-5



Resources, interconnected
How to quickly access and leverage the massive amount of complex knowledge contained in bioinformatics databases?

Each bioinformatics resource holds answers to a particular question: what is the function of this protein? What is the most plausible 3D structure for it? Where, in the body, is this gene expressed? While this information is freely available from each resource, making them interoperable, both on the semantic and technical level, and enabling user-friendly queries would add even more value to every minute of a researcher’s time. Data integration represents the ‘next step’ in Open Knowledge, by enabling new findings from the wealth of biological data already available. At SIB, such integration takes different shapes: SPARQL endpoints, common ontologies and controlled vocabularies, and more.

RDF: A method to describe information on the web in a way that complies with FAIR principles (Findable, Accessible, Interoperable, Reusable)

SPARQL: Pronounced “sparkle”, a language to express questions over information expressed in RDF

SPARQL endpoint: A website where you can enter questions in SPARQL to get answers

In December 2019, a “Querying SIB Resources with SPARQL” tutorial was taught at the SWAT4HCLS conference in Edinburgh. The participants learned how to use the data from nine independent SIB Resources (GlyConnect, UniProt, Rhea, OrthoDB, OMA, Bgee, HAMAP, MetaNetX and neXtProt) to answer biological questions. Access this SPARQL query example: bit.ly/uniprot39rhea

    SELECT ?protein ?rhea
    WHERE {
      # ECO 269 is experimental evidence
      BIND (<http://purl.obolibrary.org/obo/ECO_0000269> as ?evidence)
      ?protein up:reviewed true ;
               up:organism taxon:9606 ;
               up:classifiedWith keywords:1185 ;
               up:annotation ?a ;
               up:attribution ?attribution .
      ?a a up:Catalytic_Activity_Annotation ;
         up:catalyticActivity ?ca .
      ?ca up:catalyzedReaction ?rhea .
      [] rdf:subject ?a ;
         rdf:predicate up:catalyticActivity ;
         rdf:object ?ca ;
         up:attribution ?attribution .
      ?attribution up:evidence ?evidence .
    }

“Data integration across heterogeneous biological databases promises to be one of the catalysts for gaining new biological insights in the post-genomic era.”

Sima A. C. et al., Enabling semantic queries across federated bioinformatics databases, Database, 2019

SPARQLing insights from linked data ...
The Swiss-Prot Group is a leader in the use of semantic web technologies such as RDF and SPARQL to connect and combine data and knowledge resources to gain new insights into biology. The UniProt SPARQL endpoint – a portal to the semantic web of connected resources – was one of the first of its kind and remains one of the most widely used in the life sciences today. It links UniProt to expression patterns in Bgee, orthologous relationships in OMA and OrthoDB, glycan structures in GlyConnect, metabolic models in MetaNetX, drug target interactions in ChEMBL, and many more.

... with a focus on human proteins
In 2019, the interoperability of neXtProt, the SIB knowledgebase on human proteins developed by the CALIPHO Group led by Amos Bairoch and Lydie Lane, was further improved. This was done by publishing new SPARQL federated queries with OMA as well as with IDSM (Integrated Database of Small Molecules), DisGeNET (knowledge platform for disease genomics) and WikiPathways (database of biological pathways). SPARQL federated queries were also developed to retrieve all positional annotations from neXtProt that correspond to a list of clinically observed genomic variants in the frame of the European Joint Project on Rare Diseases. neXtProt’s SPARQL tools were presented at over seven major conferences in 2019.
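SPARQL endpoints such as those described above are queried over plain HTTP, so they can be reached from a script as well as from a web form. The sketch below is only an illustration under stated assumptions: it uses Python's standard library, the public UniProt endpoint address, and a deliberately small query written for this example (not one of the tutorial or neXtProt queries); the helper names `build_request` and `run_query` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Public UniProt SPARQL endpoint (assumed reachable from the reader's network).
ENDPOINT = "https://sparql.uniprot.org/sparql"

# A small illustrative query: five reviewed human protein entries.
QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT ?protein
WHERE {
  ?protein a up:Protein ;
           up:reviewed true ;
           up:organism taxon:9606 .
}
LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Encode a SPARQL query as an HTTP GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the list of result bindings (needs network access)."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=60) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling `run_query(QUERY)` would return one dictionary per result row, with the entry IRI available under `row["protein"]["value"]`. Federated queries of the kind mentioned above follow the same pattern; only the query text changes.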


SwissBioPics – navigating the cell
An open resource of interactive cellular maps is being developed by the Swiss-Prot Group. It makes it possible to navigate cells with their constituent organelles (subcellular structures that perform specific functions within a cell, such as the mitochondria or the nucleus) “signposted” using the UniProtKB vocabulary of subcellular locations and the Gene Ontology. SwissBioPics users – which will include UniProt – will be able to visualize the cellular components for any annotated molecule and embed the resulting visualizations in their own websites. SwissBioPics already covers 300 cellular components from 16 cell types and 18 model organisms. The image shown is that of an epithelial animal cell containing 55 different organelles, in which actin – a protein involved in numerous structural and functional roles – is highlighted.

Towards an intuitive search function for bioinformatics databases
Complex bioinformatics databases hold enormous amounts of knowledge that can usually only be retrieved with in-depth technical know-how. A recent study enables easy access to the wealth of complementary information contained in several SIB Resources through editable template queries in natural language. For instance, by asking "What are the human genes associated with cancer, and their orthologs expressed in the rat’s brain?", users retrieve, within seconds, several human genes associated with cancer (UniProtKB) for which orthologs exist in the rat (OMA) and corresponding genes expressed in the rat brain (Bgee). The study was powered by the BioSODA project, supported by the National Research Programme “Big Data” (NRP75), whose goal is to develop an intuitive search engine designed to identify new correlations in stored data. Four SIB Groups are involved in the project, together with colleagues from the Zurich University of Applied Sciences.
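The idea of an editable template question can be sketched in a few lines. This is not the BioSODA implementation: the template text mirrors the example question above, while the slot names, the `RESOURCES` mapping and the `fill_template` helper are hypothetical and serve only to show how a user might change one field without writing a query.

```python
from string import Template

# Template mirroring the example question; each $slot is an editable field.
QUESTION = Template(
    "What are the $source_species genes associated with $disease, "
    "and their orthologs expressed in the $target_species's $tissue?"
)

# Which SIB Resource answers which part of the question, as described in the text.
RESOURCES = {
    "disease association": "UniProtKB",
    "orthology": "OMA",
    "expression": "Bgee",
}


def fill_template(template, **slots):
    """Fill every slot of a question template; raises KeyError if one is missing."""
    return template.substitute(**slots)


question = fill_template(
    QUESTION,
    source_species="human",
    disease="cancer",
    target_species="rat",
    tissue="brain",
)
```

In a full system, each filled template would presumably be translated into a federated query across the three resources; that translation step is beyond this sketch.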



Joining forces
SIB takes part in global initiatives to improve data sharing and foster the long-term sustainability of essential resources.

From contributing to the development of tools and guidelines for making life-science data FAIR (Findable, Accessible, Interoperable, Reusable), to joining forces as part of a Global Biodata Coalition for the long-term funding of Core Data Resources, SIB is well placed to serve as a national and international expertise platform for Open Knowledge.

A European project to improve data sharing and reuse in the life sciences

SIB, through its Vital-IT Group, is lending its expertise in data management to the FAIRplus project, aiming to increase the discovery, accessibility and reusability of data from selected projects funded by the European Innovative Medicines Initiative (IMI), as well as internal data from pharmaceutical industry partners. Tools and guidelines for making life-science data FAIR will be developed. Training for data scientists in academia, SMEs and pharmaceutical companies will be organized to enable wider adoption of best practices in life-science data management.

Towards the long-term sustainability of Core Data Resources for life sciences

The wheels are set in motion at the international level to ensure the long-term sustainability of Core Data Resources for life sciences. In October, funders from across continents sat together to work towards this very goal at the first interim board meeting of the Global Biodata Coalition (GBC). SIB attended the meeting as part of the GBC’s Steering Group.

SIB takes part in a cloud system to facilitate transcontinental human data exchange

The CINECA project, led by EMBL-EBI, aims to unite more than a million human datasets from across Africa, Canada and Europe. SIB is among the initiative’s 18 partner organizations, with the Group led by Patrick Ruch at HES-SO bringing its text mining and automated annotation expertise to help uncover clinically relevant information.



“We specialise in developing innovative text mining and automated annotation techniques with a particular focus on personalized health. Such know-how will be crucial in the CINECA project, to help uncover clinically relevant information from a uniquely broad and complex range of data: genomics, proteomics and physiological data, as well as narratives.”

Patrick Ruch, SIB Group Leader, HES-SO – Geneva School of Business Administration (HEG), one of the two Swiss co-Principal Investigators of the CINECA project

ELIXIR-CONVERGE, a European initiative to promote better data management in life-science research

SIB, in collaboration with the Netherlands, Slovenia and the UK, is leading efforts to educate and train researchers on how to build and implement Data Management Plans, in order to make research reproducible, make data open and FAIR, and foster knowledge generation.



INDEX OF SIB GROUP AND TEAM LEADERS
as of 1 January 2020

NAME | FIELDS OF ACTIVITY | LOCATION

A
Ahrens Christian | Proteins and proteomes | Agroscope
Anisimova Maria | Evolution and phylogeny | Zurich University of Applied Sciences (ZHAW)
Arguello Roman (NEW) | Evolution and phylogeny | University of Lausanne

B
Baerenfaller Katja | Proteins and proteomes | SIAF – University of Zurich
Bairoch Amos | Proteins and proteomes | University of Geneva
Barbié Valérie (NEW) | Competence centres and core facilities | SIB
Bastian Frédéric | Evolution and phylogeny | University of Lausanne
Baudis Michael | Genes and genomes | University of Zurich
Beerenwinkel Niko | Evolution and phylogeny | ETH Zurich, D-BSSE
Bergmann Sven | Genes and genomes | University of Lausanne
Boeva Valentina (NEW) | Systems biology | ETH Zurich
Borgwardt Karsten | Text mining and machine learning | ETH Zurich
Bridge Alan | Proteins and proteomes | SIB
Bruggmann Rémy | Competence centres and core facilities | University of Bern
Bucher Philipp | Genes and genomes | EPFL

C
Cascione Luciano | Competence centres and core facilities | Institute of Oncology Research
Cavalli Andrea | Structural biology | Università della Svizzera italiana
Chopard Bastien | Systems biology | University of Geneva
Ciriello Giovanni | Systems biology | University of Lausanne
Correia Bruno | Structural biology | EPFL
Crameri Katrin | Competence centres and core facilities | SIB

D
Dal Peraro Matteo | Structural biology | EPFL
Delaneau Olivier (NEW) | Genes and genomes | University of Lausanne
Delorenzi Mauro | Competence centres and core facilities | University of Lausanne
Deplancke Bart | Genes and genomes | EPFL
Dermitzakis Emmanouil | Genes and genomes | University of Geneva
Dessimoz Christophe | Evolution and phylogeny | University of Lausanne


E
Excoffier Laurent | Evolution and phylogeny | University of Bern

F
Falquet Laurent | Genes and genomes | University of Fribourg
Fellay Jacques | Genes and genomes | EPFL

G
Gfeller David | Proteins and proteomes | University of Lausanne
Gonnet Gaston | Evolution and phylogeny | ETH Zurich
Goudet Jérôme | Evolution and phylogeny | University of Lausanne

I
Ibberson Mark | Competence centres and core facilities | SIB
Iber Dagmar | Systems biology | ETH Zurich, D-BSSE
Ivanek Robert | Systems biology | University of Basel & University Hospital Basel

K
Kriventseva Evgenia (NEW) | Genes and genomes | University of Geneva
Kutalik Zoltán | Genes and genomes | University of Lausanne

L
Lane Lydie | Proteins and proteomes | University of Geneva
Lisacek Frédérique | Proteins and proteomes | University of Geneva

M
Malaspinas Anna-Sapfo | Genes and genomes | University of Lausanne
Mazza Christian | Systems biology | University of Fribourg
Michielin Olivier | Structural biology | University of Lausanne
Miho Enkelejda (NEW) | Systems biology | FHNW University of Applied Sciences and Arts Northwestern Switzerland
Milinkovitch Michel | Systems biology | University of Geneva
Mitri Sara | Evolution and phylogeny | University of Lausanne


N
Neher Richard | Evolution and phylogeny | University of Basel

P
Palagi Patricia | Training | SIB
Payne Joshua | Evolution and phylogeny | ETH Zurich
Pedrioli Patrick (NEW) | Proteins and proteomes | ETH Zurich
Peña-Reyes Carlos-Andrés | Text mining and machine learning | HEIG-VD
Pivkin Igor | Systems biology | Università della Svizzera italiana

R
Rätsch Gunnar | Text mining and machine learning | ETH Zurich
Rehrauer Hubert | Competence centres and core facilities | ETH Zurich & University of Zurich
Riedi Marcel | Competence centres and core facilities | University of Zurich
Rinaldi Fabio | Text mining and machine learning | University of Zurich
Rinn Bernd | Competence centres and core facilities | ETH Zurich & D-BSSE
Robinson Mark | Genes and genomes | University of Zurich
Robinson-Rechavi Marc | Evolution and phylogeny | University of Lausanne
Ruch Patrick | Text mining and machine learning | HES-SO - Geneva School of Business Administration (HEG)

S
Salamin Nicolas | Evolution and phylogeny | University of Lausanne
Schütz Frédéric | Competence centres and core facilities | University of Lausanne
Schwede Torsten | Structural biology, Competence centres and core facilities | University of Basel
Sengstag Thierry | Competence centres and core facilities | University of Basel
Snijder Berend (NEW) | Systems biology | ETH Zurich
Stadler Michael | Genes and genomes | Friedrich Miescher Institute
Stadler Tanja | Evolution and phylogeny | ETH Zurich & D-BSSE
Stasiak Andrzej | Genes and genomes | University of Lausanne
Stekhoven Daniel | Competence centres and core facilities | ETH Zurich
Stelling Jörg | Systems biology | ETH Zurich & D-BSSE
Stockinger Heinz | Competence centres and core facilities | SIB
Sunagawa Shinichi | Genes and genomes | ETH Zurich


V
van Nimwegen Erik | Genes and genomes | University of Basel
Vogt Julia | Text mining and machine learning | ETH Zurich
von Mering Christian | Proteins and proteomes | University of Zurich

W
Wagner Andreas | Evolution and phylogeny | University of Zurich
Waterhouse Robert | Genes and genomes | University of Lausanne
Wegmann Daniel | Evolution and phylogeny | University of Fribourg
Wollscheid Bernd (NEW) | Proteins and proteomes | ETH Zurich

Z
Zavolan Mihaela | Systems biology | University of Basel
Zdobnov Evgeny | Genes and genomes | University of Geneva
Zoete Vincent | Structural biology | University of Lausanne



ACKNOWLEDGEMENTS

We gratefully acknowledge the following funders, sponsors and partners for their financial support and encouragement in helping us fulfil our mission in 2019.

The Swiss government and in particular:
The State Secretariat for Education, Research and Innovation (SERI)
The Swiss National Science Foundation (SNSF)
Innosuisse
Our institutional partners
The European Commission
The Leenaards Foundation
The National Institutes of Health (NIH)
The Research for Life Foundation

We thank the generous SIB Fellowship programme sponsors:
The R. Geigy Foundation
The University of Geneva

We also thank all industrial and academic partners who trust SIB’s expertise.

IMPRESSUM

© 2020 – SIB Swiss Institute of Bioinformatics

SIB MEMBERS PORTRAITS BY
Nicolas Righetti, www.lundi13.ch

ILLUSTRATION BY
Aurel Märki, www.aurelmaerki.ch

DESIGN AND LAYOUT BY
Bogsch & Bacco, www.bogsch-bacco.ch

IMAGE CREDITS (from top to bottom and from left to right)
P. 6: Franziska Gruhl - SIB. All rights reserved; Franziska Gruhl - SIB. All rights reserved; Sutthaburawonk / iStock; Fabio Rinaldi - SIB. All rights reserved
P. 13: Pilar Garrido
P. 24: CC BY 4.0 | 2019 Akarsu et al.
P. 25: Dmitry Petlin / iStockphoto; Russian Academy of Sciences
P. 26: Akslocum / Shutterstock; Tatyana Vyc / Shutterstock
P. 27: Sara Cervera
P. 28: CC BY 4.0 | SIB Swiss Institute of Bioinformatics
P. 31: CC BY-NC 4.0 | SIB Swiss Institute of Bioinformatics, Atelier Poisson
P. 43: CC BY-NC 4.0 | SIB Swiss Institute of Bioinformatics, Atelier Poisson
P. 45: Nadine Zangger, BCF Group, Lausanne
P. 46: Dr Jurgen Richt / Science Photo Library; CC BY-NC-ND 4.0 | ViralZone: www.expasy.org/viralzone, SIB Swiss Institute of Bioinformatics; F. Schrenzel
P. 47: CC BY 4.0 | Nextstrain.org
P. 50: CC BY 4.0 | Julien Gobeill, SIB / HES-SO Geneva
P. 51: CC BY 4.0 | SIB Swiss Institute of Bioinformatics; Frank Brüderli
P. 52: Felix Imhof; Dan Olsen / Shutterstock; Marie Lemerle / Shutterstock
P. 53: Anton Kozyrev / Shutterstock; Tanachai Chaisri / Shutterstock
P. 55: CC BY-NC-ND 4.0 | SwissBioPics, SIB Swiss Institute of Bioinformatics




SIB Swiss Institute of Bioinformatics Quartier Sorge Bâtiment Amphipôle CH – 1015 Lausanne T. +41 21 692 40 50 www.sib.swiss

