

Annual Report 2018



Foreword by Jean-Eric Paquet


Introduction by ELIXIR Director

Platforms

ELIXIR Platforms


Tools - services and connectors for access and exploitation


Data - sustaining Europe’s life science data infrastructure


Compute - access, exchange and storage


Interoperability - integration of data and services


Training - professional skills for managing and exploiting data

Communities

ELIXIR Communities

Members

ELIXIR Nodes updates 2018

Highlights

Highlights from 2018

EU Grants



EU grants Supporting activities


ELIXIR 2019-23 Scientific Programme


Capacity building and Node development


Operations and programme management


Collaboration beyond Europe


Impact and sustainability


Industry engagement




ELIXIR Hub staff

Governance committees and financial data

ELIXIR committees


Financial data

ELIXIR Annual Report 2018


Foreword by Jean-Eric Paquet

I took up the office of Director-General for Research and Innovation at the European Commission in April 2018 with the ambition to help give new momentum to the development of the European Research Area (ERA) and to deliver on the main Commission priorities for the EU’s policy for research and innovation: Open Science, Open Innovation and Open to the World. Research infrastructures play a central role in this, as they help structure the ERA and touch on all the pillars of the 3 O policy: they promote and encourage open science and innovation, and they build bridges to the EU’s partners across the globe. The EU’s Research Infrastructures programme and the collective efforts of the Member States - through the European Strategy Forum for Research Infrastructures (ESFRI) - to prioritise and set up the key infrastructures for European research have yielded very important achievements during the last decade. ELIXIR is a perfect example, demonstrating how research infrastructures can integrate national resources, drive forward research, foster innovation and play a major role on the global stage. Bringing together essential, publicly funded life sciences data resources from across Europe, ELIXIR is a champion for the open sharing and reuse of data. I have been delighted to see how ELIXIR has grown since it was first included on the ESFRI Roadmap in 2006. It is now one of the largest ESFRI infrastructures in terms of membership by countries, and has been recognised as a priority infrastructure by the European Council1, ESFRI and by the G7’s Group of Senior Officials2. Acknowledging the progress ELIXIR has achieved in establishing a coordinated data infrastructure for the life sciences, many funding agencies, including the European Research Council (ERC)3, now encourage their grantees to deposit data into the relevant ELIXIR Deposition Databases or to use ELIXIR’s Recommended Interoperability Resources. Major new initiatives, such as the declaration by the Member States to provide open access to one million genomes by 2022, rely on the ELIXIR infrastructure for their implementation.

I have watched with interest the support ELIXIR provides to industry and to small and medium-sized enterprises (SMEs), as well as the fundamental role that public bioinformatics resources play in stimulating innovation. ELIXIR’s Innovation and SME forum, which allows industrial users to learn about the resources provided by ELIXIR, has so far supported hundreds of research-intensive companies across Europe and across a range of sectors, including pharma, biotech, aquaculture, microbiome and agritech. Progress in these areas is key to tackling global challenges related to health, food security and the environment.

ELIXIR has led or participated in many projects funded by Horizon 2020, the EU’s research and innovation programme. For example, it supports life sciences researchers through the ELIXIR-EXCELERATE project, and fosters interoperability with other ESFRI research infrastructures, providing support for their data-related needs, through the CORBEL project. It also contributes to the establishment of the European Open Science Cloud through the EOSCpilot, EOSC-hub and the new EOSC-Life project.

In December 2018, ELIXIR celebrated its fifth anniversary in Brussels, bringing together ELIXIR partners, national funders, and representatives of the European Commission and industry, and presented some of its major achievements during this period. More than ever, Europe needs sustainable and efficient research infrastructures, and I very much look forward to seeing ELIXIR grow even further in the next five years.

Jean-Eric Paquet
Director General, DG Research and Innovation, European Commission

1 2 3 Open Research Data and Data Management Plans: Information for ERC grantees by the ERC Scientific Council, online: https://erc.



Introduction by ELIXIR Director

The year 2018 was the final year of the first ELIXIR programming cycle (2014-2018). With the successful conclusion of its first Scientific Programme, I would like to look back at 2018 and at the first five years of ELIXIR. Back in December 2014, when we first launched ELIXIR in Brussels, ELIXIR had just six members: five European countries and EMBL-EBI. In December 2018, when we celebrated ELIXIR’s Fifth Anniversary, ELIXIR had 23 members, including all the major bioinformatics centres in Europe. There is obviously more to ELIXIR than the number of member countries, but it illustrates the rapid growth of ELIXIR over the past years. Today, we bring together over 150 services and resources, skills and expertise from across 23 ELIXIR Nodes, and over 600 scientists and technical experts. We’ve made an immense leap forward through the ELIXIR-EXCELERATE project, which has helped us to establish our technical Platforms, our Training and Industry programmes, and which supports users through our Use Cases (now Communities). Our portfolio of Implementation studies is now a mature technical programme, which brings our services closer to users. In 2014, we ran four Implementation studies (then called Pilot actions); in 2018, we ran 34.

Scientific Programme 2019-23

Even though 2018 represented the final year of ELIXIR’s first programming cycle, we were busy preparing the ELIXIR Scientific Programme for the next five years 2019-2023. The main goal of the first Scientific Programme was to establish ELIXIR’s governance and operations, and to develop a portfolio of bioinformatics resources, covering all aspects of working with life science data. The goals for the next five years are to extend this portfolio of databases, data services, tools, workflows and clouds, into a fully integrated infrastructure for accessing, analysing and reusing life science data. The ambition is to build a truly pan-European system of federated life-science data services that meets the demands for FAIR data stewardship within every European life science project. We presented this Scientific Programme at our Fifth Anniversary Conference in Brussels in December 2018. Building on the results of ELIXIR’s first five years, this programme presents an unprecedented effort to align national and international bioinformatics services using common standards and state-of-the-art technologies for the secure storage, access and analysis of data.

Recommended Interoperability Services A key goal of the first Scientific Programme was to develop a new model to identify and select bioinformatics resources that are vital for the long-term sustainability of life science data and for FAIR data management. In 2017, we established the ELIXIR Core Data Resources and the ELIXIR Recommended Deposition Databases; in 2018, we added the Recommended Interoperability Resources (RIRs). The RIRs have been selected by external reviewers, based on how they facilitate scientific research and improve FAIRness of life science data. The initial portfolio comprises ten tools and registries from across ELIXIR Nodes and includes resources for standards and APIs, applications, integrators and pipelines. The portfolio of ELIXIR Recommended Interoperability Resources will be regularly evaluated as ELIXIR evolves to accommodate emerging technologies and changing scientific needs.

ELIXIR Biohackathon

In 2018, ELIXIR also brought to Europe the concept of Biohackathons. Following a decade of successful Biohackathons organised in Japan, the first European Biohackathon was organised by ELIXIR in November 2018 in Paris. During this event, around 100 participants worked together on nearly 30 ‘hacking’ projects, covering the main challenges identified by the ELIXIR Platforms and Communities. The impact of the event on all of the projects was clear to anyone present at the meeting. Having 100 software developers, data management experts, bioinformaticians and scientists working together for five days not only boosted the 30 selected projects, it also generated plenty of new ideas that can be tested and explored very quickly. We are already working on the second edition of the ELIXIR Biohackathon, which will take place in November 2019, again in Paris.

Looking ahead

Our programme for 2019, and for the years that follow, is defined in the Scientific Programme. In addition to explaining the overall objectives for ELIXIR, this programme also contains detailed work plans for ELIXIR’s technical Platforms and Communities, which translate the strategic objectives into concrete goals and clear deadlines. We will monitor our progress towards these goals and will report back to our stakeholders at the European and national level. In 2019, we will also launch a number of important new projects. In January 2019, we started the FAIRplus project, an industry-academia collaboration supported by the Innovative Medicines Initiative to develop tools and guidelines for making life science data FAIR. In March 2019, we started the EOSC-Life project, which brings together 13 ESFRI biomedical research infrastructures to create an open collaborative digital space for life science in the European Open Science Cloud (EOSC). Bringing all the infrastructure resources together in EOSC and linking them to open and reusable tools and workflows, made accessible to users across Europe, offers tremendous potential. Our ultimate goal in EOSC is to maximise the collective potential of biomedical ESFRI research infrastructures and the potential of researchers using them. I’m excited to continue working with all our partners within and outside of ELIXIR and look forward to further contributing to ELIXIR’s growth.

Niklas Blomberg
ELIXIR Director



ELIXIR Platforms


Many of ELIXIR’s activities are divided into five Platforms, each focusing on one area of bioinformatics: Data, Tools, Interoperability, Compute and Training. The Platforms rely on and build upon technical expertise and resources from ELIXIR Nodes, and the activities of each Platform reflect the current needs and challenges within the bioinformatics community. Their strategic priorities are defined in the ELIXIR Scientific Programme (see chapter Scientific Programme). The strategic direction of each ELIXIR Platform is developed by a Platform Executive Committee (Ex-Co), which brings together senior scientists from ELIXIR Nodes (Platform Leaders) and is supported by a Platform Coordinator at the ELIXIR Hub. ELIXIR’s Chief Technical Officer is responsible for the overall coordination of the Platforms. ELIXIR Platforms are primarily funded by the ELIXIR-EXCELERATE project (2015-2019), in which each Platform is represented by a Work Package. Other EU grants (CORBEL, AARC2, eTransafe and others) often support specific activities within the Platforms. Each Platform also manages a corresponding portfolio of ELIXIR Implementation Studies, funded through the ELIXIR Hub budget.

Data Platform Aims to identify key data resources across Europe and support the linkages between data and literature.

Tools Platform Helps researchers find the best software tools to analyse their data.

Compute Platform Develops services to make it easier to store, share and analyse large datasets.

Interoperability Platform Develops and encourages the adoption of standards to describe life science data.

Training Platform Helps scientists and developers find the training they need, and also provides that training.



People in ELIXIR Platforms Platform Coordinator

Platform leaders Training Skills for managing and exploiting data Patricia Palagi ELIXIR Switzerland


Celia van Gelder ELIXIR Netherlands

Gabriella Rustici ELIXIR UK

Jerry Lanfear, ELIXIR Chief Technical Officer, ELIXIR Hub

Pascal Kahlem


Access, exchange and storage

Bio tools for data access and exploitation

Platform leaders

Platform leaders

Luděk Matyska ELIXIR Czech Republic

Tools Services for integration of data

Steven Newhouse EMBL-EBI

Tommi Nyrönen ELIXIR Finland

Platform leaders

Carole Goble ELIXIR UK

Alfonso Valencia ELIXIR Spain

Chris Evelo ELIXIR Netherlands

Søren Brunak ELIXIR Denmark

Helen Parkinson EMBL-EBI

Platform Coordinator

Platform Coordinator

Platform Coordinator

Jonathan Tedds

Jen Harrow

Sira Sarntivijai

Platform leaders

Platform Coordinator

Data Sustaining Europe’s life science data infrastructure

Jo McEntyre EMBL-EBI

Christine Durinx ELIXIR Switzerland

Rachel Drysdale




Services and connectors for access and exploitation

The ELIXIR Tools Platform improves the discovery, quality and sustainability of software tools for exploiting life science data. The Platform’s main objective is to help researchers to find, re-use, deploy and benchmark software resources and analysis workflows, and to help research software engineers to develop and better describe their tools. The focus of the Platform in 2018 was to: (1) further develop the bio.tools registry; (2) promote best practices for sustainable software development; (3) perform technical benchmarking of software tools; and (4) deploy software.

Image caption: The five groups within the Tools Platform are Biocontainers, bio.tools, OpenEBench, Galaxy and Software sustainability. They are supported by Tools interoperability for the integration of these tools into a joined-up ecosystem for the development, discovery and use of software tools in the life sciences.

bio.tools: an ELIXIR registry for software tools and services

The bio.tools registry group has been focussing on its core missions of providing a high quality registry, and of exploring new ways to foster partnerships between developers, end users and scientific communities. In 2018, the number of entries registered in the portal increased to over 12,000, with 214,771 annotations, including 67,851 scientific EDAM4 annotations. This represented an annual increase of over 3,000 entries. There have been many improvements to the website and methods, including utilities that facilitate the community annotation process, and content improvement through text mining tools and browsing of the EDAM ontologies. In September 2018, the source code of the portal was released under an open (GPL 3.0) licence, allowing anyone to re-use and build upon the portal and to extend its functionalities.

Together with the ELIXIR Training Platform, the Tools Platform completed an ELIXIR Implementation study to integrate ELIXIR portals from a user perspective. The goal of this study was to integrate and cross-link the information from the ELIXIR Training portal (TeSS) and the bio.tools portal. A preliminary implementation is available at

Benchmarking

OpenEBench provides a platform for the evaluation of bioinformatics tools, web services and workflows from a scientific and technical perspective. In 2018, OpenEBench entered the second phase of development, which provides a private workspace in which communities can upload and maintain reference datasets (either public or restricted), and upload and execute workflows to perform quality assessments. The basic infrastructure for tools monitoring was also developed in 2018 to provide a hub for technical information about software tools. At present, information is retrieved for over 15,000 tools from repositories such as bio.tools, BioConda, BioContainers and Galaxy, and more than 10 metrics for software quality are provided. The integration with bio.tools enables all of the information for each individual tool to be displayed on the portal. An important milestone in the development of OpenEBench was the design and initial development of a three-level architecture to support benchmarking by scientific communities. Level one serves to import data from mature communities to ensure their long-term preservation. Level two offers communities the ability to run evaluation workflows in OpenEBench via its Virtual Research Environment (VRE), effectively outsourcing the evaluation workflows to OpenEBench. In 2018, the pilot use case of The Cancer Genome Atlas’ Benchmarking Group5 was of paramount importance for driving the development of evaluation workflows by other communities.

4 Ison, J., Kalaš, M., Jonassen, I., Bolser, D., Uludag, M., McWilliam, H., Malone, J., Lopez, R., Pettifer, S. and Rice, P. (2013). EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats. Bioinformatics, 29(10): 1325-1332. doi: 10.1093/bioinformatics/btt113
5 Bailey, M. H. et al. Comprehensive Characterization of Cancer Driver Genes and Mutations. Cell 2018, 173, 371–385.e18.



To support efforts around software packaging and containers (e.g. BioConda/BioContainers) and support sustainable integration into bio.tools and OpenEBench

bio.tools, a discovery portal for bioinformatics software information, providing curated descriptions of tools and data services

OpenEBench, an infrastructure providing services for hosting scientific benchmark activities and technical monitoring of bioinformatics tools and services

To drive the development of execution platforms (e.g. Galaxy) and ensure integration with bio.tools, OpenEBench and workflows using CWL

To raise software quality and sustainability, by producing and promoting software best practices and developing training activities

Tools Interoperability, guidelines and resources for guaranteeing integration across the ELIXIR Tools Platform ecosystem, with other ELIXIR Platforms and beyond

Level three aims to fully automate evaluation efforts. This will be achieved by OpenEBench receiving participating workflows and executing them in similar computational environments, to ensure a fair evaluation of their technical and scientific performance.


The evaluation for the CAMEO group (Continuous Automated Model Evaluation)6 - released in 2018 - is an example of a community-driven effort that relies on the level-two architecture, together with some level-three features (the automation of the evaluation).

Biocontainers allows researchers to bundle up software into a ‘software container’. In 2018, the group carried out an ELIXIR Implementation Study to begin building a stable infrastructure for common container solutions across ELIXIR Communities. The resulting infrastructure will provide an access point for end-users to find, generate, store, and monitor software container solutions. Additionally, this technology will be used to support ELIXIR training activities, and to provide trainers with a stable and replicable technological framework for the delivery of training.

The software deployment group in the Tools Platform works to support community efforts on bioinformatics software deployment methods, in particular, the Biocontainers initiative7.

6 Haas J., Barbato A., Behringer D., Studer G., Roth S., Bertoni M., Mostaguir K., Gumienny R., Schwede T. Continuous Automated Model EvaluatiOn (CAMEO) complementing the critical assessment of structure prediction in CASP12. Proteins. 2018, 86, 387-398. doi: 10.1002/prot.25431

7 da Veiga Leprevost F, Grüning BA, Perez-Riverol Y et al. BioContainers: an open-source and community-driven framework for software standardization, Bioinformatics 33, 2017 Aug 15;33(16):2580-2582. doi:



The Biocontainers team has also developed a complete framework for the continuous integration and deployment of Docker containers, which can be retrieved from three different registries8. By the end of 2018, the Biocontainers registry of containers for working with biological data featured over 8,000 tools, with automated testing and validation of the minimum metadata required to publish a container. A new API9 that follows GA4GH standards was released, and a new, improved website for Biocontainers is expected to be released in early 2019. The BioContainers community published two manuscripts in 2018: the first presents the ‘Best practices for containerization and packaging of bioinformatics software’10, and the second presents the Bioconda project11.

Workflows and Galaxy

Galaxy is a workflow management system that: 1) provides support for reproducible science; 2) facilitates the sharing of data and results; and 3) removes the need for users to compile and install tools. The Galaxy team in de.NBI (University of Freiburg) also hosts a European Galaxy instance (see also the ELIXIR Communities section). The Galaxy group combined a system for building highly portable packages of bioinformatics software (Bioconda) with containerization (BioContainers) and with virtualization technologies that isolate reusable execution environments for these packages, in order to build an integrated workflow system that automatically orchestrates the composition of software packages for analysis pipelines. This integrated workflow system significantly improved the computational reproducibility of data analysis tasks12.

The Galaxy group is an active part of the Galaxy Training Network (GTN). This network aims to enhance and deliver training on how to work with Galaxy to scientists, developers and admins. In 2018, Galaxy ran over 33 registered training events all over the world and held its first Galaxy conference in Cape Town, South Africa.

Software development best practices

The Software development best practices Group focuses on promoting Open Source Software principles and best practices for sustainable software within the research software community. Following the 2017 publication of a research paper supporting Open Source principles in research software development13, the Group, in partnership with the ELIXIR Training Platform, The Carpentries and other communities, created a collection of training materials to help researchers and developers implement the four Open Source Software (4OSS) recommendations14. 4OSS are a set of four simple recommendations that aim to help researchers and software developers to adopt Open Source Software (OSS) practices. The project started with a workshop at CarpentryCon in Dublin, Ireland, in June 2018, and with a hackathon in Utrecht in August 2018, which produced a first draft of the training materials15. This draft was further reviewed and improved in an online process involving a wider community of contributors during the sprint in October 2018 at the NETTAB 2018 workshop in Genoa, Italy. The first version of the training materials was released in early 201916.

8
9
10 Gruening B, Sallou O, Moreno P et al. Recommendations for the packaging and containerizing of bioinformatics software. F1000Research 2018, 7(ELIXIR): 742 doi:
11 Gruening B, Dale R, Sjoedin A et al. Bioconda: sustainable and comprehensive software distribution for the life sciences, Nature Methods 2018, 15, doi:
12 Grüning B, Chilton J, Köster J et al. Practical Computational Reproducibility in the Life Sciences, Cell Systems 2018, 6, doi: https://
13 Jiménez RC, Kuzak M, Alhamdoosh M et al. Four simple recommendations to encourage best practices in research software. F1000Research 2017, 6:876, doi:
14
15
16 Martinez PA, Wilson G et al. SoftDev4Research/4OSS-lesson: first lesson release (Version v1.0). Zenodo 2019, doi: http://doi.org/10.5281/zenodo.2565040




Sustaining Europe’s life science data infrastructure The aim of the ELIXIR Data Platform is to build and manage a sustainably funded portfolio of Core Data Resources and Deposition Databases as flagships of excellence. Based on quality criteria for data resources, the Platform also drives ELIXIR’s sustainability strategy around databases, working together with funders and policy makers as part of the Global Biodata Coalition (GBC)17. In 2018, the Platform focused on three main topics: (1) expanding the list of ELIXIR Core Data Resources and strengthening collaboration among them; (2) working with international partners to establish the Global Biodata Coalition; and (3) establishing the infrastructure to support the automated deposition and sharing of text-mined annotations from multiple sources, for use in database curation workflows. In addition, seven new Data Platform Implementation studies were launched.

ELIXIR Core Data Resources and Deposition Databases ELIXIR Core Data Resources are the central component of ELIXIR Data Platform activities and form a major component of ELIXIR’s long-term sustainability strategy. They represent a set of life science data resources, operated by ELIXIR Nodes, that are of fundamental importance to life science research and to the long-term preservation of biological data. The initial list of ELIXIR Core Data Resources was established in 2017. In 2018, the Data Platform ran the second round of the selection process, and three new resources were added to the list: the BRENDA and SILVA databases18 (ELIXIR Germany), and Orphadata (ELIXIR France)19.

The Data Platform published updated guidelines for the ELIXIR Core Data Resources selection process to facilitate its second round20. This work supported the ELIXIR Interoperability Platform, which developed a related selection process for the selection of ELIXIR Recommended Interoperability Resources (see chapter ELIXIR Interoperability Platform).

The Core Data Resources Forum

The Data Platform has established the Core Data Resources Forum, which consists of Senior Managers and Principal Investigators of the ELIXIR Core Data Resources, to facilitate collaboration and to agree on common strategies vis-a-vis funders and policy makers, for example in areas of data sustainability, open access, and in demonstrating the impact of data infrastructure. The first meeting of the Forum took place in October 2017, with two subsequent meetings held in February and September 2018. In conjunction with the Forum, the Data Platform also began work on a scientific article, expected to be published in 2019, that presents the Core Data Resources as a coherent and integrated infrastructure with a demonstrable major impact on life science research.

Global Biodata Coalition ELIXIR continues, as part of the Global Biodata Coalition (GBC), to work towards the long term sustainability of life-sciences data resources worldwide. The goals of the GBC include ensuring a sustainable shared global ecosystem of life-sciences data resources, where data submission is open to all researchers, data access is free and unrestricted, and support for data resources, and in particular for core life-sciences data resources, is provided by international partners via a coordinated and proportionate funding mechanism. The GBC will improve the management of the global life-sciences data resource ecosystem, reduce the risk of data loss through loss of funding or lack of support for growth, reduce redundancy, strengthen international coordination, and improve research collaborations worldwide.

17 Warwick P. Anderson, A global coalition to sustain core data, Nature 2017, 543, page 179, doi:
18 BRENDA is the world’s largest, and one of the most widely used, information systems on all aspects of enzymes, including function, structure, mutants and properties. SILVA is the European ribosomal RNA gene (rDNA) reference database, a comprehensive web resource for databases of small and large subunit aligned rDNA sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services, www.
19 Orphadata is an extension of Orphanet, serving as a dedicated download platform for the datasets collected by Orphanet. It gives access to aggregated scientific data related to rare diseases from the Orphanet knowledge base, as well as to the Orphanet rare disease ontology and to the HPO-ORDO ontological module,
20 Drysdale R, McEntyre J, Durinx C and Blomberg N. The Process for the Selection of ELIXIR Core Data Resources. F1000Research 2018, 7(ELIXIR):1712 (document) doi:



The Global Biodata Coalition is owned by funders, currently represented by the Heads of International (biomedical) Research Organizations. Initial support has been secured from the Wellcome Trust (UK) and the NHGRI (USA), to develop the governance structure and to develop a pilot process to select the global core data resources. Current activities include stakeholder consensus building with data producers and users, with data resource operators and custodians, and with research funding organisations.

Infrastructure for Scalable Curation

In 2018, the annotations infrastructure for sharing text mining results via Europe PMC was extended in terms of its technical capabilities, contributions and usage. Contribution highlights include:
• Gene mutations: Text-mined annotations on gene mutations were incorporated from PubTator21.
• Transcription factor/gene target associations: Annotations on transcription factors and their gene targets were provided by the Gene Regulation Consortium (GRECO).
• Database accession and resource name mentions: The accession numbers of three databases (CATH, MINT and Human Protein Atlas) were introduced in the Europe PMC search methods, and pipelines for UniProt, EGA and ENA were updated to incorporate new accession number patterns. Resource name variations for the String-DB database and for the databases listed above were also included.

Most significantly, the Annotations database can now support the automatic ingest of text-mined concepts, relationships and phrases from any number of providers. These are then openly available via the annotations API and viewable by users in the Europe PMC interface via the SciLite application.

ELIXIR Data Platform Implementation Studies

In 2017, the Data Platform initiated a peer-review based process to select a portfolio of Implementation Studies. The overall objective of these Implementation Studies was to strengthen the existing ELIXIR Data Resources in ELIXIR Nodes by increasing the degree of sustainability, coordination, and integration between them, by promoting good practice in resource management, and by facilitating ease of use and the capacity for the re-use of the data in these resources. The selection process, which concluded in the first half of 2018, selected seven studies for funding, which involve 13 Nodes. These studies, which all began work in summer 2018, are:
• Apple as a Model for Genomic Information Exchange
• Establishment of an ELIXIR Contextual Data Clearinghouse
• Extending open proteomics data analysis pipelines in the cloud: Additional tools and focus on scalability, supporting the dramatic growth of public proteomics data
• FAIRness of the current ELIXIR Core resources: Application (and test) of newly available FAIR metrics, and identification of steps to increase interoperability
• Integration and standardization of intrinsically disordered protein data
• Increasing Interoperability between ELIXIR Protein Structure and Sequence Resources and Expanding these Resources with 3D-Models of CATH Domains, built by SWISS-MODEL
• Integrating reference taxonomic databases for metabarcoding and metagenomics identification

21 Wei CH, Kao HY, Lu Z. PubTator: a web-based text mining tool for assisting biocuration. Nucleic Acids Res. 2013, 41(Web Server issue) W518-22. doi:




Access, exchange and storage

The ELIXIR Compute Platform focusses on (1) researcher identity, via which researchers can access services, and (2) secure transfer of key datasets to cloud providers in order to perform a range of bioinformatics analyses.

ELIXIR AAI (Authentication and Authorisation Infrastructure)

Since November 2016, the ELIXIR AAI22 has been the ELIXIR Compute Platform service for authenticating researchers and for helping ELIXIR services to decide what users are permitted to do in the service. The emerging AAI service has since been developed through a service definition with cost estimates, service desk and key functionalities. In 2018, the focus was to operate a stable AAI service and to integrate new Relying Parties (that is, services run within the community that make use of the ELIXIR AAI), thus paving the way for it to be launched as the first ELIXIR Infrastructure Service from 2019-20. Key outcomes were:
• To continue to provide a sustainable AAI service. A report23 on the day-to-day operations and support, produced in December 2018, included the Key Performance Indicators (KPIs): the number of active ELIXIR IDs, the number of logins per month, the number of Relying Parties, and the availability of the Proxy IdP (as a percentage over a month). During 2018, 38,038 logins were performed in total.
• To ensure the service’s future sustainability, according to the ELIXIR AAI strategy (by starting a migration to the Life Science AAI24). Obstacles to this future migration were removed by contributing to the Life Science AAI pilot, which is managed by the AARC2 project (see also the EU grants section). The Life Science AAI, which is funded by the upcoming EOSC-Life project, will enable all biological and medical sciences research infrastructures to rely on a common AAI, whose major components will be operated by the e-infrastructures. This will make the cross-infrastructure use of resources easier and will provide further opportunities for collaboration within the European Open Science Cloud.
• To develop new features for the ELIXIR AAI25 through an ELIXIR Implementation Study, with support from other ELIXIR-related projects.
• To liaise with the Global Alliance for Genomics and Health (GA4GH) in relevant work streams26. Via its contribution to the GA4GH Data Use and Researcher Identity (DURI) workstream27, ELIXIR AAI has established a strong position within the GA4GH and, together with NCBI eRA Commons, it was identified as a key actor in the implementation of the GA4GH specifications for researcher identification.
• To increase the number of Relying Parties (service providers) connected to ELIXIR AAI, e.g. cloud services suitable for workloads in biological computing. During 2018, 14 new Relying Parties were integrated, including ELIXIR Beacons in the Human Data Communities.
• The delivery of two ELIXIR webinars focused on ELIXIR AAI: Access to Sensitive Human Data with the ELIXIR AAI, 23 May 201828, and de.NBI Cloud Integration to the ELIXIR AAI, 7 March 201829.
• The publication of two key research papers, which present the ELIXIR AAI in general30 and the registered access tier developed as part of the ELIXIR Beacon project31.
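Two of the KPIs mentioned above are simple ratios. As a minimal illustrative sketch (the function names and the downtime figure are assumptions made for illustration and do not come from the AAI operations report), the monthly availability percentage and the average number of logins per month implied by the 2018 total could be computed as follows:

```python
def availability_percent(downtime_minutes: float, days_in_month: int) -> float:
    """Percentage of the month during which the service was reachable."""
    total_minutes = days_in_month * 24 * 60
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must lie between 0 and the length of the month")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def mean_logins_per_month(total_logins: int, months: int = 12) -> float:
    """Average monthly logins over the reporting period."""
    return total_logins / months


# 38,038 is the 2018 login total quoted in the report.
average_2018 = mean_logins_per_month(38038)

# Hypothetical example only: 45 minutes of downtime in a 30-day month.
example_availability = availability_percent(45, 30)
```

With the total quoted in the report, the average works out at roughly 3,170 logins per month.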

Storage and data transfer

Researchers performing bioinformatics analysis require access to key datasets in compute resources where code and workflows can be deployed as seamlessly as possible. To facilitate high-level data orchestration between ELIXIR Nodes, the ELIXIR Compute Platform developed a demonstrator service for managed data transfers. In particular, a test bed was deployed for the EMBL-EBI Reference Data Set Distribution Service, which provides a centralised registry of datasets together with their management and distribution. This was enabled through the expansion and testing of the ELIXIR data transfers heartbeat service.

At the technical level, FTS3 file transfer services can be used with ELIXIR identities via the RCAuth credential translation service operated by the Dutch National Institute for Subatomic Physics. Users can log in to a web portal and obtain X.509 proxy certificates that can be used to authenticate with the transfer service, which in turn can use the credentials to read and write data to ELIXIR AAI-enabled storage endpoints. In addition to the command-line file transfer service, ELIXIR also integrated ELIXIR AAI with a web interface to FTS3. Via the web interface, users can log in with their ELIXIR ID, and then use the web application to set up, submit, and monitor data-transfer jobs. In order to facilitate deploying FTS3-enabled endpoints, the Compute Platform improved container-based efforts initiated at CERN. The changes were then back-ported to the CERN software repository32.

Since September 2018, the ELIXIR AAI and EUDAT AAI have been integrated, which means that ELIXIR members potentially have access to EUDAT services, such as B2DROP, a Dropbox-like service33, and B2SHARE, a repository and data publishing service34.

An EXCELERATE demonstrator35 was shared on GitHub to demonstrate the technical integration needed to achieve data transfers from geographically distributed sites onto the ELIXIR Compute Platform. This demonstrator included instructions, terminal session recordings and scripts for setting up a cloud resource using an ELIXIR ID, for deploying a storage endpoint Virtual Machine, and for using the ELIXIR Data Transfer Service to move a set of files to cloud instances. It also employed a plant use case to show how to transfer public data to the de.NBI (ELIXIR-DE) and cPouta (ELIXIR-FI) clouds.

22 23 24 25 26 27 28 29 30 D1.1: Report on the operations and support of the ELIXIR AAI (ELIXIR-CZ, M12), D2.1: Report on the migration to the Life Science AAI (ELIXIR-FI, M12) Section WP3 of D1.2: Project final report delivered with outcomes and impact of the study (ELIXIR-CZ, M12) D4.1: Report on the collaboration with the GA4GH (ELIXIR-FI, M12) Linden M, Prochazka M, Lappalainen I et al. Common ELIXIR Service for Researcher Authentication and Authorisation. F1000Research 2018, 7 (ELIXIR):1199, doi:
31 Dyke S O M, Linden M, Lappalainen I et al. Registered access: authorizing data access. European Journal of Human Genetics 2018, doi:

32 33 34 35 36 37 38 39


Cloud resource integration

The ELIXIR Compute Platform coordinated the GA4GH-compatible cloud implementation study. This study engaged with, co-developed and implemented the GA4GH Cloud Work Stream (WS) standards, namely Workflow Execution Service (WES)36, Task Execution Service (TES)37 and Tool Registry Service (TRS)38, along with ELIXIR AAI integration, as an end-to-end Platform demonstration across the ELIXIR Nodes. The study’s technical aims were to:
• Engage actively with the GA4GH Cloud Work Stream and contribute to the specification development and to the implementation of the standards
• Integrate ELIXIR AAI with technical implementations of the GA4GH standards
• Deploy a central WES service along with a distributed collection of TES endpoints at ELIXIR Nodes
• Operate the GA4GH-based analysis Platform as a prototype service for a set of scientific use cases with Common Workflow Language workflows.

The highlight of 2018 was the end-to-end demonstration39 of the GA4GH Cloud WS standards implemented as part of the ELIXIR Compute Platform at the 6th GA4GH Plenary in Basel, Switzerland.

The Platform also engaged in a Workshop as a Service (WaaS) Implementation Study in collaboration with the Training Platform. This study investigated the options for providing cloud resources and technical help for ELIXIR bioinformatics courses. Three test-case courses were supported with the WaaS funding, and experiences were collected from these, and from three other courses, in order to deduce best practice. The resulting recommendations will be published in 2019 and support material for trainers will be available in TeSS.
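To give a feel for the GA4GH Cloud Work Stream interfaces named in this section, the following minimal sketch builds the two kinds of request a client would prepare: a TES task document and the form fields of a WES run request. It is an illustration under stated assumptions, not the ELIXIR implementation: the field names follow the published GA4GH TES and WES schemas as commonly documented, while the helper functions, the container image, the workflow URL and the parameters are hypothetical. Nothing is sent over the network, and authentication (for example a token obtained via ELIXIR AAI) is left out.

```python
import json


def make_tes_task(name, image, command, cpu_cores=1, ram_gb=1.0):
    """Build a minimal TES task document as a plain dict (hypothetical helper)."""
    return {
        "name": name,
        # Each executor runs one command inside one container image.
        "executors": [{"image": image, "command": list(command)}],
        "resources": {"cpu_cores": cpu_cores, "ram_gb": ram_gb},
    }


def make_wes_run_request(workflow_url, params, type_version="v1.0"):
    """Build the form fields of a WES run request for a CWL workflow (hypothetical helper)."""
    return {
        "workflow_url": workflow_url,
        "workflow_type": "CWL",
        "workflow_type_version": type_version,
        # WES expects the workflow parameters as a JSON-encoded string.
        "workflow_params": json.dumps(params),
    }


# Hypothetical values, for illustration only.
task = make_tes_task("say-hello", "alpine:3", ["echo", "hello"])
run_request = make_wes_run_request(
    "https://example.org/workflows/align.cwl",
    {"reads": {"class": "File", "path": "reads.fastq"}},
)

print(json.dumps(task, indent=2))
```

In a deployment like the one described above, a document such as `task` would be posted to one of the TES endpoints at a Node, while `run_request` would be submitted to the central WES service, which in turn schedules tasks on those endpoints.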


Outlook

In 2019-20, a key milestone will be achieved: the launch of ELIXIR AAI as the first ELIXIR Infrastructure Service. ELIXIR AAI will run in parallel to the forthcoming Life Science AAI (LS AAI), the core components of which will be operated by e-infrastructures through EOSC-Life funding. The transition from ELIXIR AAI to LS AAI will be an ongoing activity for at least the next two years and will serve as a template for the transition of existing AAI solutions from the other Biomedical Science RIs.

The 2019-23 Compute Platform programme builds on ELIXIR AAI access and on the secure transfer of sensitive datasets from site-to-site and site-to-user, allowing access to multiple cloud compute endpoints in the Nodes. Reproducibility, as well as new modes of analytics, can be realised by orchestrating the deployment of workflows and of containers at scale across the ELIXIR Hybrid Cloud. The co-development of standards will be emphasised through a new Strategic Partnership on Cloud and AAI for Human Data in the Global Alliance for Genomics and Health (GA4GH).

At the global level, experts from the ELIXIR Compute Platform actively engage with the GA4GH workstreams, focusing on cloud and researcher identity, and on data use and access, including links to data security and discovery. The goal is for ELIXIR and for Europe to be drivers and early technological implementers of the global standards. With Europe being a global leader in tackling major societal challenges in health, the support from the ELIXIR Compute Platform will enable computational research on data from human subjects.




Integration of data and services

ELIXIR’s Interoperability Platform (EIP) is driven by four guiding principles: (1) FAIR: services and practices to Find, Access, Interoperate, and Reuse Data; (2) Interoperability for a Purpose: driven by need rather than idealism; (3) Interoperable Interoperability: adopting emerging practices and technologies aligned with global standardisation efforts and communities; and (4) Reuse not reinvent: identifying pre-existing ELIXIR and external services.

The ELIXIR Interoperability Framework defines the components and processes that comprise the ELIXIR Interoperability Platform. It provides a roadmap that guides the Platform’s organisation and structure. The ELIXIR Interoperability Platform also develops in collaboration with other projects and initiatives; the ELIXIR Interoperability Framework, for example, provides the foundation on which the interactions among the ELIXIR Communities, other ELIXIR Platforms, and ELIXIR Nodes are based.

In 2018, the Interoperability Platform focused on the following areas: (1) ELIXIR Recommended Interoperability Resources; (2) Bioschemas markup of the ELIXIR Core Data Resources; and (3) promoting the growing adoption of compact identifiers, as provided by Identifiers.org, with continued international collaboration to further define and recommend identifier best practice.

ELIXIR Recommended Interoperability Resources

In 2018, the ELIXIR Interoperability Platform established a process by which to identify ‘ELIXIR Recommended Interoperability Resources’ (RIRs): services that facilitate downstream interoperability between data, tools, and compute infrastructures. The first call for applications was announced in June 2018 and the first set of RIRs was presented in December.

An ELIXIR Recommended Interoperability Resource is an ELIXIR service that supports the description, reporting, annotation, sustainability and provision of biological data, facilitating interoperability and (re-)usability of knowledge and supporting the FAIR Principles40. The selection of RIRs makes non-binding suggestions (i.e. their use is not mandatory) of interoperability services that can aid ‘FAIRification’ activities both prospectively (e.g. by streamlining standardised process workflow and tooling), and retrospectively (e.g. by curating existing data). The selection of RIRs both exposes and promotes a mature service as good practice that the scientific community is encouraged to adopt41. The benefits of becoming an RIR include the increased visibility of the service within ELIXIR, a notable status of service maturity and functionality that is fit-for-purpose, and recognition as a recommended, externally reviewed service as part of the ELIXIR Interoperability Platform.

The first round of ELIXIR RIR selection identified the following initial ten services:
• FAIRsharing: A FAIR-supporting resource, which includes a registry on data standards, repositories and policy, alongside informative and educational tools and services.
• g:Profiler: A gene-centric data integrator with web UI and API services.
• Identifiers.org: A provider of persistent and compact identifiers for data objects in the life sciences through a curated registry and associated resolver.
• InterMine: A framework to integrate life-science data based on an extensible data model, providing a web interface and RESTful web services.
• ISA Framework: The ISA (Investigation > Study > Assay) framework provides formats and tools to manage experimental descriptions throughout the research life cycle, from collection, curation and deposition in public repositories, to analysis with existing tools, through to publication in data journals.
• Ontology Lookup Service: A repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions through a web UI or RESTful API.

40 Wilkinson, MD et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 2016, 3, doi:
41 ELIXIR Consortium. ELIXIR position paper on FAIR data management in the life sciences. F1000Research 2017, doi: 10.7490/f1000research.1114985.1



Figure: Interoperability Platform Resources Framework. Labels shown in the figure: Standards and Ontology registries; Identifiers Registry; Workflows; Applications; Extract Transform Load; Metadata Annotation Markup; Ontology Management; Identifier minting; Ontology Lookup; Ontology Mapping; Metadata Validation; Resources (Bioschemas); Search; Identifier resolution; Type specific integration; Type specific mapping and resolution; Identifier mapping; Data type specific Ontologies, formats, reporting guidelines, APIs; Identifier Authorities; Tool and APIs; Workflows (CWL).

The ELIXIR Interoperability Platform guides the criteria for the use of current services and the inclusion of new Interoperability services, and develops a framework aggregating services for data discovery, data integration and data analysis. The selection process of the Recommended Interoperability Resources aims to identify tools to populate the ELIXIR Interoperability Resources Framework.

• 3DBIONOTES API: A reusable platform-independent API call component for protein and gene metadata alignment, annotation, and integration across major bioinformatic data resources.
• BridgeDb: A software framework combined with an API for mapping identifiers for related objects in life sciences.
• DisGeNET RDF API: An RDF SPARQL Endpoint for DisGeNET, a knowledge platform on human disease genes and variants.
• MOLGENIS: A flexible data integration platform to facilitate FAIR research data and accelerate scientific collaborations.

The Interoperability Platform aims to run the RIR selection process on a regular basis as new tools and requirements emerge from the communities of users and developers.

Bioschemas

The Bioschemas community focuses on developing and promoting the adoption of metadata specifications for describing life-science datasets and other information. Bioschemas is a simple and powerful way to support resource discovery (Find), resource citation and indexing, and inter-resource metadata exchange (Reuse), and to strengthen collaborations and integration among ELIXIR Data Resources. It builds on top of schema.org42, providing a long-term, lightweight and sustainable approach to metadata markup, embedded in webpages and indexable by web search engines, registries and aggregators. By the close of 2018, 58 resources had successfully marked up their content with Bioschemas, accounting for over 6 million web pages43.
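As an illustration of what markup embedded in a webpage can look like, the sketch below generates a JSON-LD description of a dataset and wraps it in the script tag that search engines and aggregators look for. It is a minimal example under stated assumptions: it uses only generic schema.org `Dataset` properties and does not claim conformance to any particular Bioschemas profile, which may require additional properties. The function names and all example values are hypothetical.

```python
import json


def dataset_jsonld(name, description, url, identifier, keywords=()):
    """Describe a dataset with generic schema.org properties (hypothetical helper)."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
    }
    if keywords:
        doc["keywords"] = list(keywords)
    return doc


def as_script_tag(doc):
    """Serialise the description for embedding in the HTML of a resource page."""
    # Escaping "</" keeps a value from closing the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical values, for illustration only.
doc = dataset_jsonld(
    name="Example protein annotation set",
    description="Illustrative dataset record used to show embedded markup.",
    url="https://example.org/datasets/1",
    identifier="https://example.org/datasets/1",
    keywords=["proteins", "annotation"],
)
tag = as_script_tag(doc)
print(tag)
```

Because the markup is ordinary JSON-LD placed in the page itself, a resource can adopt it without changing its underlying database or API.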

42 43



In 2018, the Bioschemas community participated in the ELIXIR Biohackathon, held in Paris in November, which resulted in the adoption of the Bioschemas markup by UniProt, Bgee/neXtProt, SynBioHub, ChEMBL, CATH-Gene3D, SIB HAMAP-Proteomes, Ensembl, BUSCA, BioSamples, and EGA. Additional ELIXIR Interoperability services also engaged with Bioschemas to benefit from markup, including InterMine, MOLGENIS and OmicsDI.

Compact identifiers through Identifiers.org

Compact identifiers facilitate the citation of scientific data in the scientific literature and on the web. They are easy to read and easy to process, combining a unique prefix that indicates an individual archive with a locally assigned identifier (e.g. uniprot:P04150). Identifiers.org, in collaboration with the California Digital Library, developed a global approach for resolving compact identifiers by establishing a namespace registry with clear governance and maintenance rules. The new system was presented in Nature Scientific Data44, which also adopted it as its new citation practice45. Since the adoption of compact identifiers by Nature Scientific Data, the focus has been to promote this system and to engage with international projects (such as FREYA, NIH Data Commons, and through the RDA) and with publishers. Guidelines for the use of compact identifiers have now been rolled out across all Springer Nature journals, and are reflected in their data citation guidance46.
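The prefix-plus-local-identifier structure described above is straightforward to process programmatically. The following minimal sketch splits a compact identifier at its first colon and forms a resolver URL by appending the identifier to the Identifiers.org base address. The function names are hypothetical, and the sketch deliberately ignores refinements such as provider codes or validation of the prefix against the namespace registry.

```python
def split_compact_identifier(compact_id):
    """Split 'prefix:accession' into its two parts (hypothetical helper)."""
    prefix, sep, accession = compact_id.partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return prefix, accession


def resolver_url(compact_id, resolver="https://identifiers.org/"):
    """Form a resolvable URL by appending the compact identifier to the resolver base."""
    prefix, accession = split_compact_identifier(compact_id)
    return f"{resolver}{prefix}:{accession}"


# The example identifier quoted in the text.
parts = split_compact_identifier("uniprot:P04150")
url = resolver_url("uniprot:P04150")
print(parts, url)
```

Keeping the resolver base as a parameter reflects the idea of a shared namespace registry: the same compact identifier can be handed to any resolver that recognises the prefix.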

Looking ahead

In 2019, the Interoperability Platform will start to implement the plans detailed in the ELIXIR Scientific Programme for 2019-23. The community engagement trends set out in the programme include a focus on agriculture and food security, medicine, synthetic/systems biology, and microbial biotechnology. The new tools to be developed will support retrospective and prospective FAIRification.

The key expected outcomes for the Interoperability Platform for 2019-2023 are to:
1. Identify valuable interoperability services within ELIXIR and support their improvement
2. Identify key existing services outside of ELIXIR and develop strategic partnerships with them to promote the reuse-not-reinvent principle
3. Identify gaps in interoperability services that need to be addressed
4. Encourage the use of schema.org and Bioschemas markup within life sciences web resources to improve the findability of information
5. Harmonise existing practices in persistent identifier assignment and resolution, support resources in implementing community standards, and promote the creation of identifier services
6. Design, test, and implement a metadata strategy for ELIXIR, using the Communities’ Use Cases as drivers
7. Deliver best practices for the implementation and use of Linked Open Data (LOD) in ELIXIR Communities and use these best practices to inform the future development, deployment, and sustainable production of LOD resources as ELIXIR resources
8. Create specifications that enable data scientists to describe analysis tools and workflows to make them interoperable across Use Cases
9. Create an online knowledge hub and use it to: inform infrastructure and tool developers about current ELIXIR Interoperability Platform activities; assess current best practices; and determine which interoperability services and standards to implement under specific circumstances.

In 2019, the Interoperability Platform will continue working with the Tools Platform on container interoperability as part of the EOSC-Life project and will also participate in the FAIRplus project as a key driver for the work packages on FAIR guidelines and FAIRification tools.

44 Sarala M. Wimalaratne et al. Uniform resolution of compact identifiers for biomedical data. Sci. Data 2018, 5, doi: https://doi.org/10.1038/sdata.2018.29
45 Editorial, On the road to robust data citation, Sci. Data 2018, 5, doi:
46 Submission Guidelines for data citation, Nature Scientific Data, citations




Professional skills for managing and exploiting data

The ELIXIR Training Platform builds a community of trainers across the ELIXIR Nodes and offers a portal to a collection of training materials and events on all topics for the life sciences. It also strengthens national training programmes and helps to develop training capacity in ELIXIR Nodes. In 2018, the Training Platform entered the final phase of ELIXIR-EXCELERATE and helped to shape the ELIXIR Programme 2019-23. The Platform’s activities consist of three major components: (1) ELIXIR Courses for researchers, developers and trainers; (2) Training quality and impact evaluation; (3) A training infrastructure, including the ELIXIR Training Portal (TeSS), the e-learning platform by ELIXIR Slovenia, and the Virtual Coffee Room. These activities are implemented in the ELIXIR Nodes by ELIXIR Training Coordinators, who function as contact points between the Training Platform and the ELIXIR Nodes.

ELIXIR Training Courses

In 2018, the Training Platform organised 28 events as part of the ELIXIR-EXCELERATE project. Additionally, 19 ELIXIR Nodes held a total of over 300 events at national level, reaching over 7000 participants. The Training Platform organised eight Train-the-Trainer courses, and all course materials are publicly available47. The Platform’s Exchange Programme supported six participants in ELIXIR and Galaxy courses.

After a successful pilot48, ELIXIR agreed in August 2018 with the Carpentry Foundation49 to organise 20 training courses and two instructor training sessions50 in the ELIXIR Nodes. One instructor training session and five Carpentry courses had already taken place by the end of 2018. An additional 27 workshops related to Galaxy51 took place in various ELIXIR Nodes. The Training Platform also contributed to the development of training materials for software developers, in collaboration with The Carpentries, the Software Development Best Practices Group52 and other communities. All materials are now openly accessible online53,54.

Training quality and impact evaluation

Following a wide consultation, the strategy55 for Key Performance Indicator (KPI) data collection was endorsed by the ELIXIR Heads of Nodes Committee and was adopted by the ELIXIR Training Coordinators. This strategy will be implemented at all ELIXIR training events and nationally through ELIXIR Nodes’ national training programmes. The data collected through short-term surveys (conducted directly after training) covers 307 courses, organised between September 2015 and August 2018, and corresponds to 5040 responses from 18 participating ELIXIR Nodes. Most respondents (90%) rated the courses ‘Excellent’ to ‘Good’ and 76% would recommend the course to others. Long-term survey responses (collected 6 months to 1 year after training) for 72 of these events indicate that attending an ELIXIR training course (either European or national) has improved respondents’ ability to handle data, to work more quickly, or to communicate with a bioinformatician (78%). In addition, 88% of respondents indicated that they are now able to use bioinformatics tools/resources independently or by using the training materials.

47 48 Pawlik A, van Gelder CWG, Nenadic A et al. Developing a strategy for computational lab skills training through Software and Data Carpentry: Experiences from the ELIXIR Pilot action. F1000Research 2017, 6:1040, doi: f1000research.11718.1
49 50 51 52 53 54 Martinez PA, Wilson G et al. SoftDev4Research/4OSS-lesson: first lesson release (Version v1.0). Zenodo 2019, doi: http://doi.org/10.5281/zenodo.2565040
55 ELIXIR Training - Quality and Impact Assessment:



ELIXIR and FAIR Training

The FAIR Training Working Group was formed after the “How to make training FAIR” workshop56, which was held at the All Hands in Berlin in 2018. The Working Group defined two objectives and two corresponding task forces. The first Task Force explored the availability and findability of training resources on FAIR Data Stewardship on the TeSS platform; the second Task Force looked into how training resources can comply with FAIR metrics.

Virtual Coffee Room

The Virtual Coffee Room is an online platform that facilitates the exchange of information between ELIXIR developers and trainers and that identifies training needs among ELIXIR Node staff57. Technical updates in 2018 enabled better integration with widely used communication platforms, such as Slack and Fleep.

Training Infrastructure

ELIXIR Training portal

The TeSS portal allows scientists to browse, discover and organise life-science training events and materials that have been aggregated from ELIXIR Nodes and from third-party providers (e.g. RI-Train, BioExcel, GOBLET). By the end of 2018, TeSS had promoted over 325 upcoming events and provided access to 1100 training materials collected from 65 content providers. 2018 also saw a 78% rise in TeSS users, as compared to 2017.

ELIXIR eLearning

ELIXIR Slovenia supported local and distributed courses, as well as self-taught e-learning courses, with the ELIXIR eLearning Platform (EeLP). ELIXIR Slovenia, in collaboration with ELIXIR Spain, also tested the ‘trainground principles’ for High Performance Computing (HPC) courses, as part of the ELIXIR Staff Exchange project. The trainground principles were developed to enable participants to access the bioinformatics tools and services necessary for training via a web browser. They embed HPC, cloud-based resources, containers and bioinformatics tools into the EeLP, easing the access of course participants to different resources and overcoming the usual problems with technical installations. In 2018, 19 courses and over 500 course participants benefited from EeLP.

ELIXIR-funded Implementation studies:
1. “Towards Data Stewardship in ELIXIR: Training and Portal”: The Wizard58 now integrates FAIRSharing, TeSS, Over 160 users are registered online and several institutions are running their own instance of the wizard.
2. “Beacon 2018”: Training materials based on the Beacon workshop59 are available in the ELIXIR-SI eLearning Platform (EeLP)60.
3. “Mapping the landscape of Biocuration in ELIXIR: Practice, capability and training requirements”: This study started with a pilot survey to map human resources in biocuration, skills, working practices and training.
4. “Community Adoption of Bioschemas”: As part of this project, a set of practical “How to do” guidelines and training materials were generated61, which will in the future be included in the Carpentries curriculum. Three events supported these activities: the France/EMBL-EBI Staff exchange program; a tutorial on Nettab62; and the Biohackathon 201863.
5. “Biocontainers”: Several rules were set up to include Training materials on and user recommendations were published64.
6. “Workshop as a Service”: This project aims to define the best mechanisms to request and to use ELIXIR Node cloud resources for bioinformatics training. It conducted a survey of cloud providers, provided cloud resources and technical help to three ELIXIR bioinformatics test-case courses, and is preparing best practice recommendations and learning materials.
7. “ELIXIR integration from a user perspective”: This study delivered an interactive training resource that allows users to learn about typical data analysis pipelines and the available tools, databases and standards. The ELIXIR-UK and ELIXIR-EE Nodes developed an editor for constructing workflows using TeSS resources, in which each step is annotated with an ontological classification and retrieves matching resources across ELIXIR registries.
8. “Learning paths”: This study defined a set of core competencies and a curriculum for bioinformatics and data science65. Learning paths are being built for researcher, developer and industry personas, profiling their bioinformatics roles, skills and training challenges.

56 57 58 59 60 61 62 63 64 Gruening B, Sallou O, Moreno P et al. Recommendations for the packaging and containerizing of bioinformatics software. F1000Research 2018, 7(ELIXIR):742, doi:

Outlook

In the next ELIXIR Programme (2019-2023), the Training Platform will further develop its portfolio of training courses and will contribute to the Community-led Implementation Studies funded by ELIXIR. The Platform’s work programme includes four major tasks: (1) quality and impact assessment; (2) gap analysis, training materials development, and training delivery by the co-production model; (3) development and maintenance of the training infrastructure using TeSS as the ELIXIR Training portal; and (4) building capacity by training the trainers and the managers.




People in ELIXIR Communities Human Data Communities

Federated Human Data

Rare Diseases

Jordi Rambla ELIXIR Spain

Marco Roos ELIXIR Netherlands

ELIXIR Head of Human Genomics and Translational Data

Human Data Communities Coordinator

Serena Scollen

Gary Saunders

Thomas Keane EMBL-EBI

Sergi Beltran ELIXIR Spain

Human Copy Number Variation

Christophe Béroud ELIXIR France

David Salgado ELIXIR France




Community leaders

Community leaders

Community leaders

Christoph Steinbeck ELIXIR Germany

Juan Antonio Vizcaíno EMBL-EBI

Thomas Hankemeier ELIXIR Netherlands

Oliver Kohlbacher ELIXIR Germany

Björn Grüning ELIXIR Germany

Frederik Coppens ELIXIR Belgium

Claire O’Donovan EMBL-EBI

Lennart Martens ELIXIR Belgium

John Hancock ELIXIR Communities Coordinator

Merlijn Van Rijswijk ELIXIR Netherlands

Community leaders

Marine Metagenomics




Nils Peder Willassen ELIXIR Norway

Gildas Le Corguillé ELIXIR France

Community leaders

Plant Sciences

Cyril Pommier ELIXIR France

Celia Miguel ELIXIR Portugal

ELIXIR Communities


ELIXIR Communities bring together experts from across Europe to develop standards, services and training within specific life science domains. They are ELIXIR’s means to capture the needs of a particular research community into formal requirements that drive the work of the ELIXIR Platforms. This close collaboration ensures that the services developed by the ELIXIR Platforms are fit for purpose and serve the needs of their research communities. The Communities were established in 2017 as successors to the ELIXIR Use Cases, which were defined more narrowly and linked to the ELIXIR-EXCELERATE project. The Communities began to be fully operational in 2018, with their role defined in the Scientific Programme for 2019-2023, and will allow ELIXIR to develop collaborations with new research areas that were not included in the first ELIXIR Scientific Programme.


The second Scientific Programme establishes a formal process for setting up new ELIXIR Communities (see the figure below). New ELIXIR Communities receive funding in the form of an ELIXIR Implementation Study to integrate their activities into ELIXIR and to forge links with existing Communities and with ELIXIR Platforms. Communities can also lead or participate in Community-led Implementation Studies, which are selected through a regular Request for Proposals (RFP) mechanism.

Figure: The process of establishing an ELIXIR Community. Labels shown in the figure: Expression of interest; Possible attendance by Platform Coordinators and Platform Leaders; Mature community: annual meetings, eligible for RFPs; Implementation study to establish the community; Approval by HoNs; Hold inaugural face-to-face meeting; F1000 paper to describe plans for the community.

ELIXIR Communities in 2018

ELIXIR recognises three different kinds of user communities: (1) those dealing with a specific research area (e.g. rare diseases); (2) those dealing with a major technology (e.g. proteomics); and (3) those providing specialist user support (e.g. Galaxy). In 2018, there were eight active communities in ELIXIR. As well as the original four Use Cases, the Galaxy, Metabolomics and Proteomics Communities are now well established and are executing their initial Implementation Studies. The Human Copy Number Variation Community was approved by the ELIXIR Heads of Nodes in December 2018 and will initiate its first Implementation Study in 2019.

The Human Data Communities umbrella covers the Federated Human Data, Rare Diseases and Human Copy Number Variation (CNV) Communities and will provide integrated solutions for working with access-controlled human data. The remaining five communities are Marine Metagenomics, Plant Sciences, Proteomics, Metabolomics, and Galaxy.

The current ELIXIR Communities and their goals are as follows:
• Federated Human Data: Developing long-term strategies for managing and accessing sensitive human data
• Rare Diseases: Supporting the development of new therapies for rare diseases
• Human Copy Number Variation: Coordinating CNV detection, data interpretation, and sharing
• Marine Metagenomics: Developing a sustainable metagenomics infrastructure to nurture research and innovation in marine science
• Plant Sciences: Developing an infrastructure to facilitate genotype-phenotype analyses for crop and tree species
• Proteomics: Supporting research in the expression and interaction of proteins
• Metabolomics: Infrastructure services for metabolite identification
• Galaxy: Integrating Galaxy with ELIXIR resources and services

A further two Communities, Microbial Biotechnology and Intrinsically Disordered Proteins, held their initial workshops in 2018 and submitted their roadmaps in early 2019. Three more emerging Communities (Structural Bioinformatics (3DBioInfo), Food and Nutrition, and Toxicology) are at earlier stages of development and are expected to advance towards fully recognised status during 2019. ELIXIR will review its portfolio of Communities in the lead-up to the mid-term review of the 2019-23 Scientific Programme in 2021. During this review, it will aim to identify any gaps in the portfolio and to assess the progress of the existing Communities.



Human Data Communities

In 2018, the ELIXIR Human Data Communities (HDCs) were established as an umbrella structure containing the three ELIXIR Communities that focus on human data: Federated Human Data, Rare Diseases, and Human Copy Number Variation. The HDCs create a forum within which these Communities can work with the ELIXIR Platforms and Nodes, alongside key collaborative projects and partners, to co-develop a long-term European strategy for the sharing of sensitive human data consented for research (Figure 1).

The HDC structure is deliberately flexible and agile, and is intended to remain modular over time. This is crucial for ensuring that the Human Data Communities as a whole can construct and operate a sustainable infrastructure that can support life science research and its translation into medicine at a population scale (4-5 million participants). The overall aim of the HDCs is to ensure that data can and will be shared responsibly and in line with the General Data Protection Regulation (GDPR) and ethical policies.

ELIXIR Beacons

The ELIXIR Beacon is a flagship project of the Human Data Communities. In 2018, the Beacon Implementation Study established the ELIXIR Beacons as a GA4GH Driver Project that was fully aligned with the GA4GH Technical Work Streams, and it consolidated the process of “Lighting a Beacon” for any ELIXIR Node, based on the ELIXIR reference implementation.66

At the 2018 GA4GH Plenary meeting, the Beacon v1.0.0 specification67 was approved as an official standard. Additionally, the ELIXIR Beacon Implementation Study Beacon & Beacon Network As A Service68 designed and prototyped Beacon Network interoperability, while evaluating related security concerns and preparing for future scalability. It also began evaluating how to procure the ELIXIR Beacon Network and the Node Beacons as an ELIXIR Infrastructure Service.
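For readers unfamiliar with Beacons, the sketch below illustrates the kind of question a Beacon answers: whether a given allele has been observed in its datasets, with a yes/no reply. It is a minimal illustration only, not the ELIXIR reference implementation. The base URL is a hypothetical placeholder, and the parameter names and the `exists` response field reflect our reading of the Beacon v1.0.0 specification and should be checked against it.

```python
from urllib.parse import urlencode


def build_beacon_query(base_url, reference_name, start,
                       reference_bases, alternate_bases, assembly_id):
    """Build the URL for an allele query against a Beacon.

    Parameter names are assumed from the Beacon v1.0.0 specification;
    base_url is a hypothetical placeholder, not a real ELIXIR endpoint.
    """
    params = {
        "referenceName": reference_name,    # chromosome, e.g. "1"
        "start": start,                     # position of the allele
        "referenceBases": reference_bases,  # e.g. "A"
        "alternateBases": alternate_bases,  # e.g. "T"
        "assemblyId": assembly_id,          # e.g. "GRCh37"
    }
    return f"{base_url.rstrip('/')}/query?{urlencode(params)}"


def allele_exists(response):
    """Read the yes/no answer from a decoded Beacon JSON response (assumed shape)."""
    return bool(response.get("exists"))


# Example with a made-up Beacon address and a made-up response body.
url = build_beacon_query("https://beacon.example.org/", "1", 1000, "A", "T", "GRCh37")
answer = allele_exists({"beaconId": "org.example.beacon", "exists": True})
```

The design point is that the requester learns only whether the allele is present, which is what makes the approach suitable for discovering access-controlled human data.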

Figure 1: Overview of the ELIXIR Human Data Communities. Labels: ELIXIR Platforms; ELIXIR Node services; Nodes projects; Flagship projects; EXCELERATE; Key partners.

66 67 68

Federated Human Data

The vision for the ELIXIR Federated Human Data Community is to create a federated ecosystem of interoperable services that makes population-scale genomic and biomolecular data accessible across international borders, thereby accelerating research and improving the health of European citizens. The Federated Human Data Community is the continuation of the ELIXIR-EXCELERATE Human Data Use Case. The transformation from the Use Case into the new Community started in 2018 and continues until autumn 2019, when the ELIXIR-EXCELERATE project finishes. The focus of the ELIXIR Federated Human Data Community is to coordinate an infrastructure that meets the challenge set in the EU Declaration69 to share transnationally at least 1 million human genomes by 2022. This Declaration has, to date, been signed by 19 European Union Member States, 13 of which are also ELIXIR Members. Throughout 2018, the ELIXIR Federated Human Data Community coordinated a proposal for a 30-month Implementation Study (June 2019 - December 2021). The proposal was submitted to the ELIXIR Heads of Nodes for approval in early 2019.

Rare Diseases

In 2017, the International Rare Diseases Research Consortium (IRDiRC) announced its 10-year vision for rare disease (RD) research (2017-2027): to enable all people living with a rare disease to receive an accurate diagnosis, care, and available therapy within one year of coming to medical attention. The ELIXIR RD Community aligns its strategy and specific objectives with these IRDiRC goals in order to support the RD community in meeting its challenges. The ELIXIR Rare Disease infrastructure will underpin the required increase in the efficient and effective use of data in rare disease research.

Like the ELIXIR Federated Human Data Community, the ELIXIR Rare Diseases Community has been formed as an extension of the ELIXIR-EXCELERATE Rare Diseases Use Case. This transformation began in 2018 and will continue into 2019, culminating in September when the ELIXIR-EXCELERATE project ends. In 2018, the ELIXIR Rare Diseases Community established its working structure, which includes a key collaboration with the European Joint Programme for Rare Diseases. The focus of the ELIXIR Rare Diseases Community is to work with the International Rare Diseases Research Consortium (IRDiRC) to meet its 10-year vision (2017-2027). The Rare Diseases Community also concluded an ELIXIR Implementation Study on Remote real-time visualisation of human rare disease genomics data70. The project focused on designing a robust system for real-time visualisation of genomic data through the RD-Connect genomics-phenome analysis platform to help clinical researchers interpret genomics data. The study defined and assessed the technical requirements for accessing genomics data stored in the European Genome-phenome Archive (EGA). As a proof of concept, one dataset at EGA (a trio of exomes and a genome from the HapMap project) is available for testing alignment visualisation71.

Human Copy Number Variation

While current Next-Generation Sequencing technologies, especially Whole Genome Sequencing, are increasingly the primary choice for genomic screening analysis, their ability to detect CNV efficiently is still heterogeneous and remains to be developed. The aim of the ELIXIR hCNV Community is to provide a framework for the human CNV community’s contributions to ELIXIR, thereby positioning Europe at the forefront of this field, with implications beyond human disease diagnostics and population genomics. As such, this Community is expected to provide broader benefits to other ELIXIR Communities.

69 Towards access to at least 1 million sequenced genomes in the European Union by 2022 en/news/eu-countries-will-cooperate-linking-genomic-databases-across-borders [Accessed 20 February 2019]
70 Matalonga L, Serf A, Martinez O et al. ELIXIR Implementation Study End Report: Remote real-time visualization of human rare disease genomics data (RD-Connect) stored at the EGA. F1000Research 2018, 7(ELIXIR):1216 (document), doi: https://doi.org/10.7490/f1000research.1115872.1
71



The ELIXIR Human Copy Number Variation (hCNV) Community was formed in 2018 and is the newest of all the ELIXIR Communities. Led by ELIXIR France, the Community worked in a bottom-up fashion throughout the year: it first surveyed the ELIXIR Nodes for interest in joining and forming the Community, and then, in the latter part of the year, coordinated a white paper outlining the roadmap for Community development. The ELIXIR Heads of Nodes Committee reviewed and approved the ELIXIR hCNV Community roadmap in December 2018, officially endorsing the establishment of the Community within ELIXIR. The ELIXIR hCNV Community was invited to submit an Implementation Study proposal in early 2019.

Galaxy

The Galaxy platform72 is an open, web-based platform for computational biomedical research. It allows researchers without programming experience to run data analysis workflows on their data, and to share their analyses with others. This makes science reproducible, facilitates the sharing of data and results, and removes the need for users to compile and install software tools. The ELIXIR Galaxy Community combines outreach and training activities, which increase the uptake of the Galaxy platform across ELIXIR, with technical developments and service provision that support its community of users. The Community kicked off by hosting the Galaxy User Conference in Freiburg in March 2018 and was awarded a dedicated Implementation Study that started in September 2018. The Implementation Study supports the organisation of meetings and workshops (including the European Galaxy Days and training workshops), has established a dedicated Community manager role, and supports the usegalaxy network, a network of openly accessible Galaxy servers that are available in an increasing number of countries. The Community also launched the European Galaxy server, which is hosted and maintained by ELIXIR Germany.

Through its Implementation Study, the Community is also introducing new technical developments, including: (1) integration with the ELIXIR authentication and authorisation infrastructure (AAI), to ensure compliance with the EU data privacy rules (GDPR); (2) improved integration of ELIXIR data resources into Galaxy; and (3) support for visualisation packages (BioJS, Shiny) and for the ISA-Tab metadata exchange format, which is supported by the ELIXIR Interoperability Platform.

Marine Metagenomics

The Marine Metagenomics Community continues to be active in the ELIXIR-EXCELERATE project, developing a sustainable metagenomics infrastructure to enhance research and industrial innovation within the marine domain, and is also involved in a number of other activities supported by ELIXIR Implementation Studies. The Community provides expertise, training and resources for the broader marine metagenomics community. Its data resources and tools encompass the Marine Metagenomics Portal (which contains three manually curated sequence and contextual marine genome/metagenome databases); the marine section of ITSoneDB, which supports metabarcoding surveys of marine eukaryotic communities; and Ocean Gene Atlas, a resource that can be used to explore the biogeography of genes from marine planktonic organisms. The Community also provides the analysis pipelines MGnify and META-pipe. The MGnify platform is a continuation of the EBI Metagenomics Resource. The ELIXIR Marine Metagenomics Community played an important role in upgrading this platform, working closely with the ELIXIR Interoperability Platform to describe the metagenomics analysis workflows in the Common Workflow Language (CWL). In 2018, the Community began to develop links with other ELIXIR Communities, in particular with three newly established Communities (Galaxy, Metabolomics and Proteomics) and two emerging ones (Microbial Biotechnology, and Food and Nutrition). In addition, it received funding to integrate partners from the Italian Node.

72 Afgan E, Baker D, van den Beek M, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update. Nucleic Acids Res. 2016; 44(W1):W3-W10. doi:




Plant Sciences

The ELIXIR Plant Sciences Community aims to establish a technical infrastructure and associated social practices to allow plant genotype-phenotype analysis, based on the widest available public datasets. It continues to participate in the ELIXIR-EXCELERATE project (as a Use Case), while broadening its engagement both with other Communities within ELIXIR and externally.

The Community acts as ELIXIR’s key contact point with the EMPHASIS ESFRI infrastructure75, which develops and provides access to facilities and services to address multi-scale plant phenotyping. The Community developed and announced a joint ELIXIR-EMPHASIS strategy in May76, which identified the development of the MIAPPE standards77 for plant phenotyping as a priority area for ELIXIR-EMPHASIS collaboration.

In particular, the collaboration will support the improved description of environmental variables and of the data structures used to collect different phenotyping datasets. The strategy also highlighted the need to improve interoperability between information systems. This joint strategy is expected to lead to a more formal document in the future. The Plant Sciences Community supports the development of a plant data lookup service that will link phenotype and genotype information, making use of the Breeding API (BrAPI). It also participates in the development of relevant ontologies, such as the Plant Phenotype Experiment Ontology, and helps to organise the PhenoHarmonIS workshop on plant phenotyping ontologies. In March 2018, Cyril Pommier from ELIXIR France took up the role of Plant Sciences Community Co-Leader, replacing Paul Kersey, who left his position at EMBL-EBI to join the Kew Botanic Gardens as Deputy Director of Science.

Metabolomics

The Metabolomics Community aims to work with experimental scientists and with developers to provide the resources, analysis tools and infrastructure that will assist metabolite identification. The Community will also establish an infrastructure of services, standards and datasets to help researchers discover, annotate and analyse metabolomics data from around Europe.

The major reference database of the Metabolomics Community is MetaboLights73, which holds information about individual metabolites, their chemistry and their spectral data (MS, NMR), as well as their role in pathways and biological systems. The Community began its dedicated Implementation Study in June 2018, with a kick-off meeting in September. The Implementation Study focuses on metabolite identification, the area that the Community believes will have the greatest impact on computational metabolomics and metabolomics data management. This area will benefit most from interactions with the five existing ELIXIR Platforms, and its progress will contribute the most to other ELIXIR Communities.

The Metabolomics Community has a strong collaboration with the Galaxy Community, owing to the importance of a number of metabolomics workflows (such as Workflow4Metabolomics, PhenoMeNal and Metaboflow) in Galaxy, and to the utility of integrating ISA-Tab into Galaxy to aid submissions to the MetaboLights database. The Community also has close links with the PhenoMeNal infrastructure for medical metabolomics74.

73 Haug K, Salek RM, Conesa P, Steinbeck C et al. MetaboLights - an open-access general-purpose repository for metabolomics studies and associated meta-data. Nucl. Acids Res. 2013, doi: 10.1093/nar/gks1004
74
75
76 Data standards and Information Systems: strategies of the European infrastructures EMPHASIS and ELIXIR, ELIXIR, May 2018. [Accessed 13 March 2019]
77 MIAPPE: Minimal Information About a Plant Phenotyping Experiment,



Proteomics

The Proteomics Community aims to develop and maintain sustainable proteomics tools and data resources with an emphasis on FAIRification. It was awarded a dedicated Implementation Study that began in August 2018. The study aims to develop open, robust, scalable and reproducible proteomics data analysis workflows, based on OpenMS (an open-source software platform for mass spectrometry analysis78) and directly connected to the PRIDE database (an ELIXIR Core Data Resource), and to deploy these pipelines in the EMBL-EBI “Embassy Cloud” as a proof of concept. This work builds upon a previous Implementation Study, Mining the Proteome: Enabling Automated Processing and Analysis of Large-Scale Proteomics Data79, which concluded in June 2018. The Community has an open strategy of Implementation Study development, summarised in the diagram below:

The goals of the Proteomics Community. Green cylinders indicate ELIXIR Core Data Resources.

78 Röst HL, Sachsenberg T, Aiche S, Kohlbacher O et al. OpenMS: A flexible open-source software platform for mass spectrometry data analysis. Nat. Methods 2016, 13, doi:
79 Mining the Proteome: Enabling Automated Processing and Analysis of Large-Scale Proteomics Data - End report, http://bit.ly/2KFR9mb



ELIXIR Members

ELIXIR Nodes in 2018

Members: Belgium, Czech Republic, Denmark, EMBL-EBI, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Israel, Italy, Luxembourg, Netherlands, Norway, Portugal, Slovenia, Spain, Sweden, Switzerland, UK

Observer: Cyprus

ELIXIR Nodes updates 2018

Belgium
• Secured a regional grant of €2.7 million for the coming two years. The project “ELIXIR Infrastructure for Data and Services to strengthen Life-Sciences Research in Flanders” focuses on data management, data analysis platforms, services in Human Data and Plant Sciences, and training.
• Sciensano – the Belgian Institute for Health – joined as a partner in ELIXIR Belgium.
• Organised 14 training workshops covering “Basic Computing Skills” (x10), “ELIXIR standards and recommendations: Galaxy” (x2), “Node Services specialized training” (x1) and “Train the Trainer” (x1).
• Established the ELIXIR Belgium Interest Group, connecting Belgian research institutes that are active in the life sciences.
• Initiated the following Implementation Studies (7 in total): Development of Architecture for Software Containers at ELIXIR and its use by EXCELERATE Communities; Data validation; Apple as a Model for Genomic Information Exchange; Extending open proteomics data analysis pipelines in the cloud; Crowd-sourcing the annotation of public proteomics; Learning Paths; and Exploration work about the use of Beacons for proteomics data.
• Initiated a collaboration with the Flemish Supercomputer Centre to develop services on its cloud infrastructure.

Czech Republic
• Initiated the purchase of hardware for dedicated services, funded by the national funding programme project, to support a small molecule database and proteomics community support tools.
• Put into operation a computing cluster from OP RDI. Two parts of this cluster are integrated into MetaCentrum’s national infrastructure. To manage access to ELIXIR resources, a separate ELIXIR VO (virtual organisation) was set up to enable the registration of users authenticated via identity federation and via ELIXIR AAI, operated by the ELIXIR-EXCELERATE project.
• Became a founding member of the 3D Bioinfo community, with a representative on the steering committee.
• Participated in the DataStewardshipPortal and AAI Implementation Studies, in the organisation of thematic hackathons and workshops, and in AAI training.
• Presented ELIXIR CZ at a day on national Research Infrastructures, organised by the Czech Ministry of Education, Youth and Sport.
• Organised an annual ELIXIR CZ conference for infrastructure partners and users.
• Organised several thematic workshops for users of the Galaxy platform, targeting the proteomics community, drug development and nucleic acids model building.
• Developed new bioinformatics tools for the annotation of plant genomes and improved existing tools, based on in-house analysis and on feedback from users.
• Implemented algorithms in GROMACS, a programme for MD simulation, to accelerate and refine the description of the conformational behaviour of proteins and nucleic acids.
• Completed the development of three frequently used tools, two of which have been published (HotSpot Wizard 3.0 and Caver Analyst 2.0); the third tool is being developed from completely new software (Calfitter 1.0).
• The FireProt tool for the design of stable protein mutants, developed by ELIXIR Czech Republic, was added to the portfolio of ELIXIR services. In 2018, the FireProt tool was cited in 191 scientific publications.
• Created a beta version of the Metabolic Pathway Visualisation Tool and pilot tested it among a selected group of users.
• Developed a user interface for another tool, SPCI, for structural and physico-chemical interpretation of data.
• Launched the HCVIVdb database and the P2Rank tool for ligand-binding prediction as ELIXIR CZ services.
• Put into service two versions of the HERVd database (MERVd and ChERVd) for mouse and chicken genomes.
• Installed the metadata analysis tool “Metadyminer” in the CRAN repository, where it became freely available to the scientific community.
• Launched the bioinformatics and cheminformatics tools 3DPAtch for structural biology and EmbedSOm for data from flow cytometry. IDSM, the small molecule database, was also put into operation.
• Published 44 articles in impactful scientific journals, which describe the tools developed for the ELIXIR CZ infrastructure.



Denmark
• Expanded the ELIXIR registry of bioinformatics tools and data services to over 12,000 entries, with an annual increase of over 3,000.
• Released the registry under an open licence (GPL 3.0), allowing anyone to re-use and build upon the portal and extend its functionalities.
• Organised the fourth annual Danish Bioinformatics Conference (August).
• Completed an ELIXIR Implementation Study to integrate ELIXIR portals from a user perspective (by cross-linking the information from the ELIXIR Training portal and the tools registry).

EMBL-EBI
• Implemented global resolving services for compact identifiers (in collaboration with California Digital Library).
• Implemented a system for replicating data sets between ELIXIR Nodes.
• Integrated the ELIXIR Authentication and Authorisation Infrastructure at EMBL-EBI.
• Contributed to the development of the MIAPPE data standards for plant phenotyping data, to BrAPI (the plant breeding API), and to the integration of phenotypic data with EMBL-EBI data resources.
• Deployed ELIXIR AAI for the BioSamples database.
• Introduced the Human Cell Atlas project to the ELIXIR Human Data Use Case.
• Collaborated with the Ontology Lookup Service (OLS) to resolve CURIEs used for ontology terms.
• Implemented the ELIXIR Beacon network for relevant EMBL-EBI data resources.
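As an illustration of the compact identifiers and CURIEs mentioned in the EMBL-EBI update, the sketch below splits an identifier of the form prefix:accession and composes a resolver address from it. It is a minimal sketch under stated assumptions, not the EMBL-EBI implementation: the function names are our own, and the resolver base URL is a hypothetical placeholder rather than the address of any real resolving service.

```python
def parse_compact_identifier(compact_id):
    """Split 'prefix:accession' into its two parts.

    The split is made at the first colon, so an accession that itself
    contains colons is kept intact.
    """
    prefix, separator, accession = compact_id.partition(":")
    if not separator or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return prefix, accession


def resolver_url(compact_id, resolver_base):
    """Compose a resolvable address; resolver_base is a hypothetical placeholder."""
    prefix, accession = parse_compact_identifier(compact_id)
    return f"{resolver_base.rstrip('/')}/{prefix}:{accession}"


# Example with an illustrative identifier and a made-up resolver address.
parts = parse_compact_identifier("taxonomy:9606")
address = resolver_url("taxonomy:9606", "https://resolver.example.org/")
```

The appeal of this form is that a short string names both the collection and the record, so a single resolving service can redirect to whichever provider hosts the data.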

Estonia
• Remained on the Estonian Infrastructure Roadmap (updated in 2018).
• Celebrated ELIXIR’s five-year anniversary with a series of events in Tartu in December (four training events and an ELIXIR Info Day conference), with external presenters from ELIXIR-UK, ELIXIR-EBI, ELIXIR-FI and ELIXIR-NL.
• Participated successfully in H2020 project proposals, including CINECA and IMI-EHDEN.
• Launched PAWER, a web tool for protein microarray analysis.
• Held the first Data Carpentry workshop in Estonia, run by local instructors.
• Participated in the Learning Paths ELIXIR Implementation Study.



Finland
• Achieved an advanced status on the national research infrastructure roadmap and secured funding for Node coordination for 2019-2023.
• Participated in a number of successful H2020 project proposals, including CINECA, EOSC-Life and EJP-RD, in collaboration with partners.
• Collaborated with the NeIC-funded Tryggve project to support Nordic cross-border research use cases involving sensitive data, and brought the first genome submission system into operation within the ELIXIR Finland production system. This is based on the Nordic collaboration on EGA technology that aims to support the future federation of EGA.
• CSC IT Center for Science is building a new environment for data management in collaboration with Atos. This new computing environment is dedicated to Finnish research use cases, including, for example, the management of sensitive data and machine learning for bioimaging.
• Finalised the renewal and extension of the ePouta cloud service for sensitive data processing, funded by the Academy of Finland and the Ministry of Culture and Education.
• Restructured data management services in order to support national research organisations in their requirements for submitting data to the ELIXIR Core Data Resources.
• Participated in national efforts that aim to enable the extensive use of genome data in research.
• Started developing the remote desktop service for sensitive data processing on the ePouta cloud.
• Participated in developing cloud, AAI and discovery standards in collaboration with other ELIXIR Nodes in the GA4GH community. The Head of Node, Tommi Nyrönen, was nominated as co-leader of the GA4GH DURI workstream. CSC is also actively contributing to related specifications on researcher identification in GA4GH.
• Participated in the standardisation of the Beacon API 1.0 specification as a key contributor. The specification was approved as the official GA4GH standard in June 2018. Participated in the standardisation of the Beacon Network specification, which remains a work in progress. Implemented the first demonstrator of the ELIXIR Beacon Network user interface.
• Expanded the single-cell RNA-seq data analysis functionality of the Chipster platform to cope with new types of data and with more complicated experimental setups. In addition to running face-to-face courses, created learning modules containing lecture videos, exercises and documented example analysis sessions, which enable self-study.
• Led the ELIXIR-EXCELERATE capacity building effort in single-cell transcriptomics.
• Led the Workshop as a Service Implementation Study, which conducted a survey of cloud providers, provided cloud resources and technical help for three ELIXIR bioinformatics courses selected as test cases, and is currently preparing recommendations and learning material based on these experiences.
• Published highlights of Finnish life-science stories from the past five years ( en/cases/).

France
• Orphanet’s Orphadata platform was designated as an ELIXIR Core Data Resource.
• Completed the NNCR infrastructure (National Network of Computing Resources), a cloud and cluster federation that consists of ~20,800 vCPUs, 9,400 TB of storage and 173,000 GB of RAM.
• Co-led two ELIXIR Communities (Galaxy, Plant Sciences). Launched and coordinated the new hCNV Community (approved in December 2018).
• Coordinated the emergence of the Microbial Biotechnology Community.
• Held the first ELIXIR Software Carpentry Workshop in Paris.
• Organised an ELIXIR Train the Trainer course at the Pasteur Institute.
• Involved in seven new ELIXIR Implementation Studies.
• Launched a national MOOC on metabolomics on the FUN (France Université Numérique) platform, with over 2,000 registrations so far.
• Organised the second session of the Workflow4metabolomics workshop at the Pasteur Institute.
• Launched eight projects on Integrative Bioinformatics that involve 10 INBS national research infrastructures (IFB, France Génomique, MetaboHub, ProFI, FRISBI, FBI, FLI, Phenome, EMBRC-Fr, F-CRIN), together with two animal model and human cohort facilities (PHENOMIN, BIOBANQUES).
• Organised an ELIXIR Plant Genome Assembling and Annotation Workshop in Montpellier. This workshop is a pilot of the WaaS (Workshop as a Service) Implementation Study.
• Ran a Bioschemas staff exchange and prepared the tutorial at Nettab 2018: Bioschemas, a lightweight approach to enable FAIRer data resources.
• Involved in the ELIXIR-EXCELERATE capacity building effort in single-cell transcriptomics.
• Initiated work to develop a machine-actionable Data Management Plan (maDMP) based on the GenOuest tool, Cesgo, in collaboration with OPIDoR.
• Hosted and co-organised the first ELIXIR BioHackathon (12–16 November 2018) at aux Berges de Seine (south of Paris, France). The event drew 150 participants and 29 hacking projects.

Germany
• Played a significant part in the establishment of three new ELIXIR Communities: Proteomics, Metabolomics and Galaxy.
• Involved in six new Implementation Studies that are connected to the topics of all five ELIXIR Platforms.
• Organised the ELIXIR Cloud Platform face-to-face meeting in Heidelberg in January 2018. The main topics of this meeting were the progress of multiple ongoing Implementation Studies, further collaboration on different cloud-specific tasks, and the possible participation of the de.NBI Cloud in further ELIXIR Implementation Studies.
• Launched the new European Galaxy server during the Galaxy User Conference at the University of Freiburg.
• Presented all existing and projected ELIXIR Communities during the CeBiTec Symposium ‘Big Data in Medicine and Biotechnology’ at Bielefeld University in March 2018.
• Organised 77 training courses with a total of 1,520 participants.
• Supported the ELIXIR Hub in organising the ELIXIR All Hands meeting in Berlin in June.
• Hosted and co-organised an ELIXIR Innovation and SME Forum on ‘Data Driven Innovation in Industrial Biotechnology’ in Frankfurt in October 2018.
• Extended the storage and compute components of the de.NBI Cloud, and successfully completed the integration with ELIXIR AAI. All cloud sites are registered as service providers in ELIXIR and are accessible through ELIXIR AAI.



Greece
• The ELIXIR Consortium Agreement was signed by the Secretary General of Research and Technology.
• Transitioned from Observer to Provisional Member.
• Organised a kick-off event for 16 ELIXIR Greece partners and for the Greek life-science community.
• Organised a kick-off meeting for four national pilot actions (Marine bioinformatics, Computational Metabolomics/Protein interactomics, NcRNA biomarker identification, Pathogen metagenomics).
• Participated in the Metabolomics and Galaxy Communities and the emerging Microbial Biotechnology Community.
• Coordinated the Implementation Study Standardizing the fluxomics workflow.
• Completed the configuration of ELIXIR Greece compute resources and published a Request for Proposals.
• Organised two bioinformatics events, held in Heraklion and Lamia.
• Prepared and co-organised tutorials, talks and poster presentations at ECCB2018 in Athens.
• Released a new tool for the transcriptome-wide detection of miRNA-target interactions80.

Israel
• Hosted the ELIXIR Board meeting in Tel Aviv in April 2018.
• Moved the physical location of the Node to the Nancy and Stephen Grand Israel National Center for Personalized Medicine.

Ireland
• Continued developing the ELIXIR Service Delivery Plan and setting up the ELIXIR Ireland Node.
• Defined three priorities for ELIXIR Ireland: (1) bioinformatics (including bioinformatics operations at the University College Dublin (UCD) Conway Institute, and at the more than 50 bioinformatics groups around Ireland); (2) systems biology (at CSET Systems Biology Ireland, UCD and the Centre for Systems Medicine, Research Centre INSIGHT); and (3) data linkage, including coordination, management and analytics (Research Centre INSIGHT).

Italy
• Welcomed six new member institutions, which joined ELIXIR-IT in December 2018:
  – University of Turin
  – University Federico II of Naples
  – Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA)
  – Istituto Superiore di Sanità
  – Telethon Institute of Genetics and Medicine (TIGEM)
  – Stazione Zoologica Anton Dohrn
• Launched, or significantly updated, several databases, including: DISNOR, MyoREG, GLOSSary, TARA BLAST Server, ComParaLogS, Genoma, Pseudonitzschia multistriata genome portal, Marine ITSoneDB, Bioschemas specification for ITSoneDB, PRGdb 3.0, Galactosemia Proteins DB 2.0.
• Launched, or significantly updated, software tools: Laniakea, BUSCA, DeepSig, ClusterScan, Barcoding gap, Multi-query system and Primer design tools integrated into ITSoneDB Galaxy Workbench, CoVaCS.
• The updated Node Application was approved by the ELIXIR Board.
• Secured H2020 funding as a partner in EOSC-Life, CIRCLES, MASTER, and JPI-HDHL.
• Started the beta test of the Laniakea@ReCaS Galaxy on-demand pilot service.
• Expanded the list of ELIXIR-IT tools registered on bio.tools to 190.
• Organised 12 training-related events across Italy (including in Rome, Naples, Bari, Milan, Cagliari, Turin, and Genoa): 5 practical courses, 1 tutorial, 3 dissemination events, 1 workshop for 300 high-school students, 1 Train the Trainer course for high-school teachers, and 1 ELIXIR-EXCELERATE Train the Trainer course.
• Organised 3 ELIXIR-EXCELERATE Train the Trainer courses in collaboration with other ELIXIR Nodes (Lausanne, Paris, Stockholm).
• Held the 6th Summer School on Rare Disease and Orphan Drug Registries, including the BYOD Workshop, in Rome.
• Co-organised a six-day EMBO Practical Course on Computational Analysis of protein-protein interactions in Rome.
• Developed training course materials and made them available in an open repository81.

80 Paraskevopoulou MD, Karagkouni D, Vlachos IS, Tastsoglou S, Hatzigeorgiou AG. microCLIP super learning framework uncovers functional transcriptome-wide miRNA interactions. Nature Communications 2018, 9: 3601, doi: 81



• Organized the first Italian Bioschemas Workshop. • Planned the ELIXIR-Industry event in Italy, with related starting activities, in agreement with the ELIXIR Hub Industry officer. • Activated a Bachelor degree in Bioinformatics at Sapienza University of Rome. • Led or participated in 14 ELIXIR implementation studies initiated in 2018. • Engaged in most existing ELIXIR communities (Galaxy, Rare Disease, Marine metagenomics, Plant sciences, Metabolomics, Human data), and actively participated in and supported the proposed 3D-bioinfo and IDP communities. The IDP community kick-off was held in Padua on 31 October 2018. • Provided HPC resources to 15 projects through the ELIXIR-IT HPC@CINECA initiative.

Luxembourg • Established the ELIXIR-LU Scientific Advisory Board (SAB) and held its first annual meeting in May 2018. • Established a National Committee of Stakeholder Representatives and exchanged information and feedback on ELIXIR-LU activities and on national needs at ELIXIR-LU’s first annual assembly in June 2018. • Formed a National ELIXIR-LU User group and a Luxembourgish GDPR Working Group with our stakeholders for regular interactive exchanges. • Joined the AAI implementation study 2018 to co-develop AAI-REMS scenarios for the hosting of sensitive human data. • Completed the project work for the implementation study ‘Integrating ELIXIR-Luxembourg’. • Continued project work on two Training implementation studies (Data management and Learning Paths). • Lit the ELIXIR-LU Beacon in March. • Sustained our productivity on Data Privacy Management and Tool Development for GDPR-compliant data hosting. • Produced and delivered one ELIXIR-Webinar, three ELIXIR-Workshops, and one paper on GDPR-related topics. • Organised and hosted the 2nd instance of ELIXIR-LU Training in Data Processing with R tidyverse (February 19-22, University Campus Belval).

• Organised and taught the EMBO Practical Course on Phenotyping Neurogenerative Symptoms (October 4-10, Luxembourg City). • Co-organised and ran the international COSI track session on Translational Medicine at ISMB2018 (July 6-10, Chicago USA). • Secured EC project funding in consortia with other ELIXIR and non-ELIXIR partners from academia and industry, for the further development of the Translational Medicine Metadata Catalogue and Clinical/Health Data Hosting: FAIRPlus (IMI2) and Smart4Health (H2020). • Engaged with the 1Mio Genomes group, as Luxembourg is a signatory of the 1Mio Genomes Declaration.

Norway • Integrated the SEEK system from FAIRDOM UK with our NeLS platform, enabling users to store metadata and systems biology models in SEEK and to link them with associated omics data in NeLS. • Continued to collaborate with other Nordic ELIXIR Nodes and with NeIC in the Tryggve2 project, on the topic of secure infrastructure for human data supporting cross-border research use cases, including local EGA and the Nordic imputation server project. • Published a white paper on the NeLS platform in F1000Research82. • Participated in six Implementation Studies: Towards data stewardship in ELIXIR: Training & portal; Establishment of an ELIXIR contextual data clearinghouse; FAIRification of genomic tracks; Crowd-sourcing the annotation of public proteomics datasets to improve data reusability; Annotation and curation of human genomic variations; and Beacon and Beacon network as a service. • Organised the European Galaxy Administrator Workshop in Oslo. • Co-organised two ELIXIR-EXCELERATE workshops on Marine Metagenomics, in Lisbon and Tromsø. • Co-organised an ELIXIR workshop on genome assembly and annotation in Montpellier. • Organised four user-focused hands-on workshops on the NeLS platform, NGS data analysis, meta-analysis and data storage. • Contributed to the data management hands-on courses organized by Digital Life Norway.

82 Tekle KM, Gundersen S, Klepper K et al. Norwegian e-Infrastructure for Life Sciences (NeLS). F1000Research 2018, 7(ELIXIR):968, doi:



• Initiated a collaboration with major Norwegian data generating research infrastructures, which aims to help users to perform the FAIR management of data, from its generation to its archiving. • Initiated strategic work to establish a sustainable funding model for ELIXIR Norway, beyond the current funding period. • Continued the operation of a national help desk, which supports users across the country, spanning approximately 100 research projects in 2018.

Netherlands • Via four workshops co-organised with ELIXIR-CZ, taught researchers and research supporters about the Data Stewardship Wizard, and extended the knowledge model of the wizard with new questions and with new functions, such as automated FAIRness metrics. • Held five Carpentries workshops, in a new partnership with 4U, and grew the instructor pool to 20, with 6 more to come. Participated in the Top10 FAIR data things sprint. Organised the 4OSS lesson sprint in Utrecht. • Continued work on various tools, including FAIRifyer, FAIR data point, ORKA, MyFAIR and EMPUSA, which might lead to future RIR submissions, and on FAIR tooling, with a national focus on interoperability. Two tools were accepted as ELIXIR Recommended Interoperability Resources: MOLGENIS and BridgeDb. • Began an implementation study between the Interoperability and Data Platforms to identify ways to further improve the FAIRness of the ELIXIR Core Data Resources. • Boosted ELIXIR-NL participation in communities with initiatives in metabolomics, proteomics, Galaxy, structural bioinformatics, biotechnology and microbiology, human data communities, food and nutrition, and toxicology. • Dutch groups studying rare diseases, who have worked on the European Joint Programme for Rare Diseases proposal, will come together within ELIXIR in the next five years. The proposal includes plans to continue the annual Rare Disease Bring Your Own Data workshops, which initiated the Dutch work on FAIR in the Interoperability Platform. • Updated the processes for service selection and quality monitoring described in the Node service delivery plan; changes were approved by the ELIXIR SAB.



• In spring 2018, ELIXIR-NL participated with its own track in the Dutch Bioinformatics conference BioSB2018; in autumn 2018, the DTL five-year anniversary conference Communities @ Work included the annual face to face meeting of the ELIXIR technical and training coordinators, which also had a thematic session about Data Stewardship in ELIXIR. • Organised a session on the addition of Digital Sequence Information to Access and Benefit Sharing at International Data Week in Botswana. • Organised a focus meeting on Electronic Data Capture, in which many academic hospitals participated; organised 5 programmers’ meetings for bioinformaticians from academia and industry; and 5 meetings of our Data Stewards Interest Group, which shows growing international participation. • Published a notable paper: A design framework and exemplar metrics for FAIRness. Mark D. Wilkinson, Susanna-Assunta Sansone, Erik Schultes, Peter Doorn, Luiz Olavo Bonino da Silva Santos & Michel Dumontier. Scientific Data volume 5, Article number: 180118 (2018). doi: 10.1038/sdata.2018.118

Portugal • Began the renovation of the Bioinformatics Training Room to extend the capabilities of the learning and teaching space. • Organised 10 training courses and two workshops (Marine metagenomics and Galaxy). • Participated in the release of MIAPPE version 1.0, including specification of a data model, interoperability between MIAPPE and other resources, and enrichment of MIAPPE with clear definitions and examples for all fields. • Implemented the Portuguese BrAPI endpoint, which currently includes information on cork oak and rice datasets. • Further developed the Plants RNA portal, including the creation of a database model for sRNA data, development of a RESTful API service, and development of a web interface to expose the database using the API service. • Extended the Woody Plant Ontology, in collaboration with ELIXIR France, to cover additional variables used for woody plant observations. • Organized the final meeting of the Staff Exchange Programme “Enhancing the implementation of the Breeding API standard in ELIXIR and semantifying it” in IGC and iBET, Oeiras in April 2018.

• Continued the development of the Cork Oak Genome Database web portal to integrate information from the first draft of the cork oak genome83. The first draft of the portal is in the test stage and already includes various functions, such as tools to search and explore data, to import related publications, to visualise RNAseq data via JBrowse integration, to provide straightforward access to pre-loaded InterPro and Blast analysis results, and to cross-reference links, as well as Blast tools that allow users to compare fasta sequences against Quercus suber proteins and/or nucleic acids. • YEASTRACT84 (Yeast Search for Transcriptional Regulators And Consensus Tracking) is a curated repository of ~163,000 regulatory associations between transcription factors and target genes in Saccharomyces cerevisiae. In 2018, YEASTRACT updated gene information from multiple databases (Saccharomyces Genome Database, Candida Genome Database), and updated Gene Ontology terms. • Improved the scalability and visualisation options of PHYLOViZ Online, a Web-based tool for phylogenetic inference, analysis and visualization. A new phylogenetic inference algorithm that can deal with missing data was developed in 2018 and was integrated in the service in early 2019. • Developed GOEnrichment, a Galaxy tool for performing GO enrichment analysis on gene sets, and created a tutorial on its use. Both resources are available in the European Galaxy.

Slovenia • Established a new collaboration with the University of Maribor. • Continued the development of the ELIXIR-SI eLearning Platform (EeLP), which complemented bioinformatics courses and webinars delivered across Europe. • Organised nine training events in collaboration with several ELIXIR Nodes in Belgium, Czechia, Spain, Finland, France, Luxembourg, Netherlands, Norway and Sweden. • Created a new e-learning activity group to collect information about the Nodes’ current and planned e-learning activities and to prepare future e-learning plans.

• Started preparing an application to a large national call for ESFRI projects on behalf of ELIXIR-SI. • Co-authored the paper Training bioinformaticians in High Performance Computing85. • Partnered with ELIXIR-ES in a Staff Exchange project (Enhanced Cloud Computing with Resource AutoScaling for Educational Software).

Spain • Two resources from ELIXIR-ES have been named ELIXIR Recommended Interoperability Resources. • Co-organized the XIV Symposium on Bioinformatics in Granada in November with more than 200 participants. • Hosted a workshop in Seville in September 2018, which provided extensive training on GATK Best Practices and a disease variant prioritization tutorial. • Co-organized the 3rd European Conference on Translational Bioinformatics (ECTB) in Barcelona in April 2018, which brought together more than 100 participants over two days and was attended by numerous ELIXIR partners. • Co-authored an ELIXIR F1000R paper on ‘Ten steps to get started in genome assembly and annotation’ as part of ELIXIR Capacity Building activities5. • Hosted the first two meetings of TransBioNet in May 2018 and November 2018 in Barcelona and Granada, respectively. TransBioNet is the dedicated network of bioinformatics groups and units working in healthcare settings, which aims to accelerate the adoption of personalized medicine in the Spanish National Health System. • Co-sponsored the VI Bioinformatics and Genomics Symposium held in Barcelona in December 2018. Roderic Guigó (CRG) and Patrick Aloy (IRB Barcelona) were part of the organizing committee. • Hosted the 2018 Annual meeting of the ELIXIR Tools Platform in January 2018.

83 CorkOakDB: A web portal dedicated to the Quercus suber scientific research data integration. Ramos D et al. 2018 84 85 Perez-Wohlfeil E, Torreno O, Bellis L, Fernandes P, Leskosek B, Trelles O. Training bioinformaticians in High Performance Computing. Heliyon 2018, 4, 12, 1-18. doi: https://doi.org/10.1016/j.heliyon.2018.e01057.



Sweden • Highly active in developing systems needed for Local/Federated EGA, in collaboration with other Nordic ELIXIR Nodes and ELIXIR Spain. According to our plans, the EGA-SE will be set up during Q2-2019. • Organised one Capacity Building Workshop on Genome Annotation and Assembly in France, and a planning workshop in Uppsala for Capacity Building in Single-cell Transcriptomics. • Wrote a report, “Community of practice”, which highlighted efforts to support advanced capacity building. These include building a community of practice for genome assembly and genome annotation, and enabling expert groups across several countries to communicate within this community, which has led to high-level discussion and the wider sharing of tools and knowledge. • In Jan 2018, we published our best practices document “10 steps to get started in genome assembly and annotation” in the F1000 ELIXIR gateway. The publication has been well received, with the highest number of downloads of all publications in the ELIXIR gateway, and the second highest number of views. • The national infrastructure NBIS contributed to Bioconda, a widely used system for the automatic installation of bioinformatics software. Bioconda greatly simplifies software installations, in particular when setting up bioinformatics pipelines that require numerous other programs to be installed. An article on Bioconda was published in 2018 featuring several NBIS experts as consortium authors (Nature Methods 15:475-6). • Participated in three ELIXIR Implementation Studies: Data movement, ELIXIR Beacons, and Local Ensembl. • Continued all other activities in the EU-funded project ELIXIR-EXCELERATE. • Organised ~20 advanced training events in bioinformatics with over 400 participants. • Organised ~250 drop-in sessions at all major sites in Sweden, where researchers could meet bioinformaticians to discuss projects. • Provided project support, through NBIS, to ~150 PIs in Sweden.



Switzerland • Connected with ELIXIR partners in the Czech Republic, Germany, Norway, Luxembourg, Sweden and The Netherlands concerning various issues related to the Swiss Personalised Health Network, where SIB leads the Data Coordination Centre and BioMedIT. ELIXIR-Switzerland has a strong focus on data protection and IT security with respect to sensitive human data and interoperability. • Added one additional resource to the Service Delivery Plan (SDP): MetaNetX. • Participated in the evaluation process of ELIXIR’s “Interoperability Services”. • Participated in several new implementation studies, strengthening the collaboration with ELIXIR partners in the domains of cloud computing, interoperability and human variation analysis. • Initiated and launched the Implementation Study “Mapping the landscape of Biocuration in ELIXIR: Practice, capability and training requirements” in collaboration with ELIXIR-UK and EMBL-EBI. • Organised 56 courses in bioinformatics-related topics, spanning 105 days of teaching, involving 86 trainers and training 1,265 researchers. • Led and organised the workshop “How to make training FAIR” held at the ELIXIR All Hands meeting in June in Berlin, in collaboration with ELIXIR-UK and ELIXIR-NL, which led to the creation of the ELIXIR Working Group FAIR Training. • Hosted and taught an ELIXIR Train the Trainer workshop in Lausanne (January 2018), contributed to developing TtT training materials, and contributed to teaching in workshops hosted by ELIXIR-FR and ELIXIR-IT. • Contributed to defining ELIXIR metrics for measuring the quality and impact of training, and to integrating ELIXIR-CH training metrics with the ELIXIR metrics collection. • Celebrated our 20th Anniversary, honouring all those who have shaped SIB since its inception. Launched several SIB awareness projects, including a short institutional video, an animation on the history of the SIB, a photography book, and an educational mobile game “Genome Jumper”.

United Kingdom • Three ELIXIR-UK Node Resources (Intermine, FAIRsharing, ISA Tools & Commons) were identified as ELIXIR Recommended Interoperability Resources. • The ELIXIR Training Portal (TeSS) saw a 79% increase in users this year and acquired new automatic feeds from nine new content providers, including the Galaxy Training Network, eNanoMapper, and Open Targets. The team developed a new feature called Concept Maps through an ELIXIR implementation study, and released a library of embeddable widgets for communities to share relevant training within their sites86. • Secured new funding for Node resources: FAIRsharing (Wellcome Trust), CATH and The Dundee Resource for Sequence Analysis and Structure Prediction (BBSRC). • Co-led the 3DBioInfo Community for Structural Bioinformatics (launched in Basel, Oct 2018) and led the Implementation Study on Interoperability between CATH and SWISS-MODEL. • Co-authored the Microbial Biotechnology community white paper (launched in Athens, Sept 2018). • Led the Bioschemas Community, including an Implementation Study and a Staff Node Exchange award. During 2018, the 15 Bioschemas recommendations were refined and 17 new types were identified, with properties developed for inclusion. Eight training events were run throughout Europe, some co-located with conferences and events, such as ECCB, ELIXIR All Hands, and the Biohackathon. Bioschemas was a major hacking topic at the Biohackathon, led by the UK, resulting in 11 new deployments of Bioschemas markup. • Launched the Bioschemas generator tool to support the development of initial Bioschemas markup. • Developed cross-Node co-operations in Common Data Management toolkits with ELIXIR Norway, ELIXIR Belgium and ELIXIR Germany (de.NBI). • Hosted the ELIXIR Innovation and SME Forum: Enabling Discoverability in Bio-Data Innovation in Cambridge in January 2018. • Sponsored two international events: Integrative Bioinformatics 2018 (June 2018) and NETTAB 2018 (Oct 2018). Co-organised NETTAB 2018, which featured ELIXIR in the theme of “Building a FAIR Bioinformatics Environment”.

• Neil Hall (Earlham) became acting Joint Head of Node, Nicola Soranzo (Earlham) replaced Justin Casey-Clarke (UCAM) as Technical Coordinator, Gabriella Rustici became Training Coordinator, and Aidan Budd was appointed Node Coordinator. • Contributed to, and co-led, the Biosciences Case study of the BEIS UK Govt report by the Open Research Data Task Force (completed June 2018, released Feb 2019). • In 2018, ELIXIR-UK training resources trained ~3,315 researchers through 108 face-to-face training courses and 4 online courses – equivalent to ~40 days of training. This training covered various bioinformatics topics, as well as the ELIXIR train-the-trainer curriculum. • Co-led work on developing and implementing the strategy for ELIXIR Training Quality and Impact assessment. • Collected ~900 feedback survey responses, regarding audience demographics and training quality, from 46 ELIXIR-UK-led training events run between January and August 2018. • Sponsored the ELIXIR Workshop for Galaxy training material and skills improvement, in collaboration with the ELIXIR Training Platform and the Galaxy Training Network. • The Birmingham Metabolomics Training Centre ran fourteen face-to-face courses and two Small Private Online Courses from November 2017 to November 2018, with a total of 178 attendees. It also ran two Massive Open Online Courses through an external platform (FutureLearn).




Highlights from 2018

ELIXIR Innovation and SME Forum promotes discoverability of biomedical data

To facilitate discussion of the impact and implications of FAIR data in academia and industry, the ELIXIR Innovation and SME Forum, held on 24-25 January 2018 in Cambridge, UK, brought together industry representatives and bioinformatics service providers to present their perspectives and experiences on how best to facilitate discoverability in data-driven innovation. The theme that ran throughout the programme was cultural change. The vast majority of speakers agreed that the main barrier to the wide adoption of FAIR principles is not technical or political, but cultural. Carole Goble, the Head of ELIXIR UK, summarised the discussion: “We need to change our attitude towards data sharing and data curation, and to change the culture of data management.”


February

ELIXIR Equal Opportunity Strategy

ELIXIR launches its Equal Opportunity Strategy to highlight ELIXIR’s commitment to fairness across all areas and ELIXIR Nodes. It lays down guidelines and recommendations on how to ensure equal opportunities in the ELIXIR Hub and Nodes, and also in ELIXIR’s advisory and governance committees, working groups and other formal and informal bodies within ELIXIR. The Strategy also outlines the data and metrics that the Hub and Nodes are recommended to collect relating to equality and diversity, and covers the cycle of collecting data, monitoring it and feeding the results back into practice.

“ELIXIR is fully committed to equality, diversity and inclusion. The organisations that make up ELIXIR work to promote and to reinforce the values of diversity and inclusiveness. They wish to attract and develop a diverse talent, and to leverage the rich variety of experience, skills and potential of all their employees, users and communities. The opportunity to bring together differing perspectives is one of ELIXIR’s greatest assets.”



The Galaxy Community meeting and the Galaxy User Conference

More than 120 participants joined the first Galaxy User Conference in Freiburg on 14-16 March 2018, organised with the support of ELIXIR. Besides a programme full of presentations, workshops and discussions, the conference also saw the launch of the Galaxy European server, managed by the University of Freiburg (part of ELIXIR Germany). The European server is the biggest Galaxy instance in Europe, and one of the biggest worldwide. It offers free compute and storage resources, more than 1000 different, well-documented and constantly maintained bioinformatics tools and 250 GB of space per user (500 GB for ELIXIR members). The event also served as the kick-off meeting of the ELIXIR Galaxy Community, which discussed the Community’s work plan for 2018 and 2019.



[Figure: report cover, “ELIXIR: Public data resources as a business model for SMEs”]

ELIXIR maps for the first time how SMEs rely on public bioinformatics data for their business

ELIXIR publishes a report that describes the use of public bioinformatics data by SMEs in Europe. Looking at business models of different types of companies that work with open biological data, the report explores how open data contributes to innovation and generates business value.


ERC Scientific Council guidance for Open Research Data and Data Management Plans87 recommends ELIXIR Recommended Deposition Archives for storage of life science data and ELIXIR Interoperability Platform for guidance on metadata resources.

This report focuses on the growing segment of SMEs that have built their business around public bioinformatics databases and software. Their value proposition fundamentally relies on their deep scientific understanding of open data or open software in relevant domains, combined with an ability to build innovative value-add services and analysis capabilities on top of the public resources.

“Europe has a vibrant and rapidly-developing ecosystem of life-science SMEs with business models that rely on unrestricted access to public bioinformatics data. These companies all use public databases such as the ELIXIR Core Data Resources and demonstrate the fundamental value of open life science data in the knowledge economy. This report showcases the opportunities for open innovation in life science and health and hopefully inspires others to follow.” Niklas Blomberg, ELIXIR Director

87 Open Research Data and Data Management Plans: Information for ERC grantees by the ERC Scientific Council, online: default/files/document/ file/ERC_info_documentOpen_Research_Data_ and_Data_Management_ Plans.pdf [accessed 20 March 2019]



ELIXIR All Hands in Berlin 2018

The fourth ELIXIR All Hands meeting was held in Berlin, Germany, on 4-7 June and brought together more than 300 participants from all ELIXIR Nodes and ELIXIR partners. The keynotes by Ellen McDonagh from Genomics England and Peer Bork from EMBL showcased concrete cases of how ELIXIR resources support life science research and the management of life science data.

ELIXIR Special Track at ISMB 2018 in Chicago, USA

The Global Biodata Coalition and the sustainability of bioinformatics resources were the main topics of the ELIXIR Special session organised as part of the ISMB Conference in Chicago, USA on 6-10 July 2018. The session first introduced the ELIXIR Core Data Resources, the selection criteria and two ELIXIR Core Data Resources (UniProt and CATH). The second part focused on the Global Biodata Coalition and how the experience from establishing the ELIXIR Core Data Resources can be used in building global core data resources. The session included presentations from funders and international data resources, and panel discussions to introduce the concept of core data resources at the global level and the challenges of sustaining the global life sciences infrastructure with a time horizon measured in decades rather than years.


BRENDA and SILVA named ELIXIR Core Data Resources

BRENDA and SILVA, two biological databases operated by ELIXIR Germany, have joined the list of ELIXIR Core Data Resources that are critically important for life science research. Previously, the two databases used a dual licensing model whereby users from industry had to purchase a licence for any commercial use of the data. By opening their data to all users, BRENDA and SILVA will follow the licence recommendations for the ELIXIR Core Data Resources. BRENDA is the world’s largest and, with more than half a million users per year, one of the most widely used information systems on all aspects of enzymes, including function, structure, mutants and properties. SILVA is the European ribosomal RNA gene (rDNA) reference database. It is a comprehensive web resource for up-to-date, quality-controlled databases of small and large subunit aligned rDNA sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services.




Six Data Platform Implementation Studies launched

The ELIXIR Data Platform launched six new Implementation Studies for 2018, selected via a rigorous peer-review process by a panel of independent experts in bioinformatics and bioinformatics service provision. Although each Implementation Study has a different focus, they all have the common goal of increasing the sustainability of the ELIXIR Data Resource landscape. The overall objective of the call for proposals was to strengthen the existing ELIXIR Data Resources in ELIXIR Nodes by (1) increasing the degree of sustainability, coordination, and integration between them, (2) promoting good practice in resource management, and (3) facilitating ease of use and the capacity for re-use of the data in these resources. The six Implementation Studies involved 13 ELIXIR Nodes and started in July or August 2018.

ELIXIR teams up with The Carpentries to boost its training programme

ELIXIR and The Carpentries initiative agreed to extend their collaboration in organising and delivering bioinformatics training in ELIXIR Nodes. As part of a new collaboration, The Carpentries aims to organise close to 20 Software and Data Carpentry training workshops across ELIXIR Nodes, and two Instructor Training workshops to train a number of specialised instructors. The agreement allows researchers in ELIXIR Nodes to fully benefit from the popular and proven model of hands-on teaching of research computing and data manipulation. In 2018, The Carpentries already organised five training workshops and one instructor training.

ELIXIR at ECCB 2018 in Athens

ELIXIR was the main organising sponsor of the 17th European Conference for Computational Biology (ECCB) held in Athens on 8-12 September 2018. ELIXIR showcased its activities and achievements in a dedicated ELIXIR Application Track which featured 12 talks covering projects across ELIXIR Nodes, Platforms and Communities. The ELIXIR Poster session presented over 50 individual posters showing the full breadth of ELIXIR services and resources.




ELIXIR steps up its activities in global bioinformatics collaboration

ELIXIR updates its International Strategy to set out the international (i.e. beyond Europe) activities of ELIXIR and define the key partners for global collaboration in bioinformatics service provision. For the first time, ELIXIR’s activities are articulated around UN Sustainable Development Goals. Research infrastructures play a critical role in supporting bioinformatics applications in the fields of health (personalised medicine), food security (e.g. aquaculture) and the environment (e.g. pollution), which have significant societal and economic benefits. The International Strategy was presented at the International Conference for Research Infrastructures (ICRI) in Vienna, Austria, on 12-14 September 2018, organised as part of the Austrian Presidency of the Council of the EU.

Greece signs the ELIXIR Consortium Agreement

Greece is the latest country to sign the ELIXIR Consortium Agreement. The signature by Professor Kostas Fotakis, Alternate Minister of Research and Innovation, was announced during the European Conference for Computational Biology hosted in Athens. Greece has been involved in ELIXIR from its inception and has already built a national ELIXIR Node linking together 16 institutions in the country. The Greek bioinformatics community will now be able to fully integrate into ELIXIR activities and collaborate with other ELIXIR Nodes more extensively.




GA4GH and ELIXIR release v1.0.0 of the Beacon API with increased security measures

ELIXIR and the Global Alliance for Genomics and Health (GA4GH) release the Beacon API 1.0, the first genomic data interoperability standard from the GA4GH 2018 Strategic Roadmap. The Beacon API is a data discovery protocol that allows users to determine the presence or absence of a particular allele in a dataset, without disclosing any further data differentiating the individuals it contains. The accompanying ELIXIR Beacon reference implementation integrates the ELIXIR Authorization and Authentication Infrastructure with Beacons, which allows data owners to light Beacons at different tiers of data access: open, registered, or controlled.
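The presence-or-absence query and the three access tiers described for the Beacon API can be illustrated with a small in-memory sketch. This is not the Beacon API 1.0 schema or the ELIXIR reference implementation: the names `ToyBeacon`, `Allele` and `AccessTier` are hypothetical, and the assumption that a higher tier can also see the datasets of the lower tiers is made only for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class AccessTier(IntEnum):
    """Hypothetical ordering of the three tiers named in the text."""
    OPEN = 0
    REGISTERED = 1
    CONTROLLED = 2


@dataclass(frozen=True)
class Allele:
    """Minimal allele description (illustrative fields only)."""
    chromosome: str
    position: int
    reference: str
    alternate: str


class ToyBeacon:
    """Answers only yes or no; it never returns records about individuals."""

    def __init__(self):
        # dataset name -> (tier required to query it, alleles it contains)
        self._datasets = {}

    def add_dataset(self, name, tier, alleles):
        self._datasets[name] = (tier, frozenset(alleles))

    def query(self, allele, user_tier=AccessTier.OPEN):
        """True if any dataset visible at user_tier contains the allele."""
        return any(
            allele in alleles
            for tier, alleles in self._datasets.values()
            if tier <= user_tier
        )


# Illustrative data, not taken from any real dataset.
snv_open = Allele("1", 1000, "A", "G")
snv_controlled = Allele("2", 2000, "C", "T")

beacon = ToyBeacon()
beacon.add_dataset("public_cohort", AccessTier.OPEN, [snv_open])
beacon.add_dataset("clinical_cohort", AccessTier.CONTROLLED, [snv_controlled])

print(beacon.query(snv_open))
print(beacon.query(snv_controlled))
print(beacon.query(snv_controlled, AccessTier.CONTROLLED))
```

The point of the design is that the response is a single boolean, so a caller learns whether an allele has been observed without receiving any data that would distinguish the individuals in the dataset.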



ELIXIR organises the first Biohackathon in Europe

Following a decade of successful Biohackathons organised by the National Bioscience Database Center in Japan, ELIXIR adopted the Biohackathon concept and organised the first European Biohackathon in November 2018 in Paris. Over 100 participants worked together on nearly 30 projects covering Data, Tools, Interoperability and Training topics. The themes proposed were aligned to challenges identified by the ELIXIR Platforms and Communities. ELIXIR awarded a number of travel grants to each of the selected projects. In addition, the ‘Bioschemas’ travel exchange programme funded participants coming to work on Bioschemas-related tasks. In total, ELIXIR supported 70 participants from across ELIXIR Nodes to take part in the Biohackathon.



Recommended Interoperability Resources announced

ELIXIR announced its first portfolio of Recommended Interoperability Resources (RIRs). RIRs are a set of resources that facilitate the interoperability and reusability of life science data and support the principles of FAIR data management. They were selected by external reviewers, based on how they facilitate scientific research and how they improve the FAIRness of life science data. The portfolio includes ten tools and registries from across ELIXIR Nodes and covers resources for standards and APIs, applications, integrators and pipelines.


5th Anniversary event in Brussels

ELIXIR celebrated its fifth anniversary in December 2018 by hosting a one-day conference on ‘Open Data for Impact and Innovation’ in Brussels. Aimed at funders, policy-makers, industry representatives, and ELIXIR partners, the programme presented ELIXIR’s achievements from the past five years and outlined the priorities for the next Scientific Programme (2019-2023). Complementary talks from high-profile external speakers included Wolfgang Burtscher (Deputy Director General, Directorate General Research and Innovation, European Commission), Dame Janet Thornton (member of the Scientific Council, European Research Council) and Gabriela Pastori (BBSRC, and Chair of ESFRI Food and Health Strategic Working Group).

EU Grants


ELIXIR-EXCELERATE is a €19 million Horizon 2020 project to fast-track the implementation of ELIXIR by coordinating national data infrastructures and by ensuring the delivery of life-science data services through its Platforms and Use Cases.

Basic facts
• €19.8 million
• Four years (2015-2019)
• 48 partners in 18 countries

Overall goals

ELIXIR-EXCELERATE is implementing key scientific and organisational aspects of ELIXIR and facilitates the integration of bioinformatics resources that are offered and managed by ELIXIR Nodes. The goals of ELIXIR-EXCELERATE are to:
• Deliver world-leading data services for academia and industry
• Increase bioinformatics capacity and competence across Europe
• Complete the management and organisational processes for an efficient distributed infrastructure

The ELIXIR-EXCELERATE project is fully embedded into ELIXIR’s operations. This means that all activities and objectives of the project reflect and complement the objectives of ELIXIR’s Scientific Programme 2014–2018. The project is organised into five platforms and four scientific use cases, each represented as Work Packages (WPs), complemented by dedicated Work Packages on Capacity Development, Operations, Communications and Ethics. These Work Packages are as follows:
• WP1: Tools Platform: Tools Interoperability and Service Registry
• WP2: Tools Platform: Benchmarking
• WP3: Data Platform: Data Resources and Services
• WP4: Compute Platform: Compute, Data access and exchange services
• WP5: Interoperability Platform: The ELIXIR Interoperability Backbone
• WP6: Marine Metagenomics Use Case: Marine metagenomics infrastructure as a driver for research and industrial innovation
• WP7: Plant Sciences Use Case: Integrating Genomic and Phenotypic Data for Crop and Forest Plants
• WP8: Rare Disease Use Case: ELIXIR infrastructure for Rare Disease research
• WP9: Human Data Use Case: Secure archiving, dissemination and analysis of human access-controlled data
• WP10: ELIXIR Node Capacity Building Programme Training
• WP11: Training Platform: ELIXIR Training Programme
• WP12: Excellence in ELIXIR Management and Operations
• WP13: Communications, Industry and Community Engagement
• WP14: Ethics requirements

The EXCELERATE Platforms are fully integrated into ELIXIR, and will continue to work outside of and beyond the ELIXIR-EXCELERATE project. However, in 2018, the four Use Cases defined within ELIXIR-EXCELERATE (WP6, WP7, WP8 and WP9) were transitioned into ELIXIR Communities (see section on User Communities). As such, these Use Cases will continue until the end of the ELIXIR-EXCELERATE project (September 2019), after which time they will be completely subsumed by their respective Communities. The ELIXIR-EXCELERATE project held its third Annual General Meeting in Berlin on 4-7 June 2018, in conjunction with the ELIXIR All Hands meeting.

The main ELIXIR-EXCELERATE outputs in 2018

Data Platform (WP3)
• Published a revised list of ELIXIR Core Data Resources; these are data resources from ELIXIR Nodes that are fundamental resources for life-science research and for the long-term preservation of biological data.
• Published a document in Zenodo88, entitled ‘Plan for collation of metrics and quality data at the ELIXIR Hub’, that discusses the indicators used to select and monitor Core Data Resources.
• Extended the annotations infrastructure based on Europe PMC, in terms of the technical capabilities, the contributions and the usage. Most significantly, the Annotations database can now ingest text-mined concepts, relationships and phrases automatically from any number of providers. These are then openly available via the annotations API and are viewable by users in the Europe PMC interface via the SciLite application.

Plant Sciences Use Case (WP7)
• Provided a set of plant phenotypic data exposed through an integrating API (BrAPI) and conforming to the MIAPPE standard.
• Developed and announced a joint ELIXIR-EMPHASIS strategy, which identified the development of the MIAPPE standards for plant phenotyping as a priority area for ELIXIR-EMPHASIS collaboration.

Rare Disease Use Case (WP8)
• Launched real-time visualisation of genomic data archived at the EGA through the RD-Connect genome-phenome analysis platform.

Human Data Use Case (WP9)
• Demonstrated a working setup in a production-like environment of the code components that will enable sensitive data archiving instances to be established as part of the Federated European Genome-phenome Archive (EGA).

Interoperability Platform (WP5)
• Selected and announced the initial list of ELIXIR Recommended Interoperability Resources that support the description, reporting and annotation of biological data.

Tools Platform (WP1 and 2)
• Started developing a three-level architecture of OpenEBench to support benchmarking by scientific communities.

Training Platform (WP11)
• Defined and implemented a strategy for Key Performance Indicator (KPI) data collection.

Marine Metagenomics Use Case (WP6)
• Delivered a set of tools, pipelines and a search engine for the interrogation of marine metagenomic data.

88 Stockinger H, Barlow M, McEntyre J et al. Plan for collation of metrics and quality data at the ELIXIR Hub. Zenodo 2018, doi: https://



eTRANSAFE is a €40 million project funded by the Innovative Medicines Initiative (IMI). The five-year project began in September 2017 and aims to develop an advanced data integration infrastructure and new computational methods to improve drug safety. ELIXIR is part of the project consortium of 8 academic institutions, 6 SMEs and 12 pharmaceutical companies. It leads two tasks: (1) creating a policy framework that allows industry and other organisations to share drug safety data and to adhere to consistent guidelines for predictive toxicology models; and (2) data interoperability and integration. The technical and scientific work is carried out by three ELIXIR Nodes: EMBL-EBI, ELIXIR Denmark (Technical University of Denmark) and ELIXIR Spain (through the Barcelona Supercomputing Centre and IMIM).

In 2018, ELIXIR successfully delivered the Initial Data and Knowledge Management Plan. This plan gives a general overview of the best practices and recommendations for how data and knowledge are to be shared among key project stakeholders and publicly disseminated outside the consortium. It will serve as the foundation for a complete set of recommendations concerning best practices on the sharing of knowledge and data, and on toxicology model validation guidelines, which will be produced at a later stage of the project.

In the domain of data integration, ELIXIR leads the coordination of the normalisation and formalisation of incoming data via standards and ontologies. This work prompted an effort to identify the activities involved in a user’s journey through a sequential analysis process, which led to the establishment of the Workflows Taskforce in 2018 under ELIXIR’s coordination. The Taskforce aims to specify a test case of task-based, step-by-step activities (such as ontology mapping and text mining) that toxicology scientists perform when querying the eTRANSAFE knowledgebase for drug safety concerns. Since the Taskforce was established, another test case has looked at driving biological questions in the cheminformatics domain; in 2019, this test case aims to cover such queries from an adverse events perspective.




CORBEL is a collaboration of 13 ESFRI biomedical research infrastructures funded through the EU’s Horizon 2020 programme. The project’s goal is to establish a framework of shared services between the participating infrastructures. The CORBEL consortium is led by ELIXIR as the coordinator, with the Biobanking and Biomolecular Resources Research Infrastructure (BBMRI) as co-coordinator.

In March 2018, CORBEL launched its second Open Call for research projects, which offered all academic and industrial scientists in Europe access to technologies and services from ten European research infrastructures (CORBEL project partners). The call consisted of four submission rounds (in April, June, August and October) and attracted 23 project applications from academia and industry, covering all access tracks. Thirteen applications were accepted as CORBEL user projects following a thorough scientific and technical review. All projects will receive continuous support from the CORBEL Open Call project managers during the access phase, as needed.

The CORBEL project will conclude in 2020. However, the existing consortium will continue its work within the EOSC-Life project. The EOSC-Life proposal was prepared and submitted in 2018, led by ELIXIR as the project coordinator. The project starts in March 2019 with an overall budget of over €23 million and will run until 2023. EOSC-Life represents the life-science component of the European Open Science Cloud (EOSC). The overall goal of the project is to ensure that life scientists can find, access, and integrate life-science data for analysis and reuse in academic and industrial research. EOSC-Life will provide an open, continent-scale, collaborative, and interdisciplinary environment for data science within all domains of life science and will provide researchers with direct access to FAIR data and tools in a cloud environment that is available throughout the European Research Area.


The European Open Science Cloud for Research pilot project (EOSCpilot) is supporting the first phase in the development of the European Open Science Cloud (EOSC), as described in the European Commission Communication on European Cloud Initiatives. It is a consortium of 33 pan-European organisations and 15 third parties, which aims to reduce fragmentation, and to improve interoperability, among European data infrastructures. The objectives of the project are to: (1) develop and trial the governance framework for the EOSC and to contribute to the development of European open science policy and best practice; and (2) launch several demonstrators that can function as high-profile pilots, and that integrate services and infrastructures, to show interoperability and its benefits in a number of scientific domains. ELIXIR is a partner in the Data Interoperability Work Package and in the Governance Work Package. In 2018, the project delivered the EOSC Dataset Minimum Information (EDMI) Model as a set of guiding principles for FAIR data and common metadata guidelines. This document recognises Bioschemas as the standard for exposing scientific metadata. To help users, services, data resources and metadata catalogues to find metadata catalogues, EOSCpilot also highlights FAIRsharing, which is managed by ELIXIR UK. FAIRsharing participated in a demonstrator to help find data resources and catalogues compliant with the EDMI guidelines. The project Governance Work Package delivered a report outlining a minimal set of Rules of Participation for Service Providers and Users in EOSC89. These proposed organisational rules document the main participation rules for all EOSC service providers, which are complemented by a series of seven specific requirements that can be applied, depending on the needs of each scientific field. These proposed Rules of EOSC Participation also build on the principles of openness, transparency and inclusiveness.


The AARC2 (Authorisation and Authentication for Research and Collaboration) project is an e-infrastructure project that aims to authenticate researchers and to manage their access rights to services. The AARC2 project builds on the results of its predecessor (AARC), and develops reference architectures that enable research and e-infrastructures to adopt similar approaches to user authentication and authorization (AAI), and removes obstacles from cross-infrastructure interoperability. ELIXIR participates in AARC2 through ELIXIR Finland (CSC – IT Center for Science) and ELIXIR Czech Republic (CESNET). In 2018, AARC2 focused on running a pilot on Life Science AAI, which is a common service for the ESFRI Life Science research infrastructures that authenticates researchers and helps the relaying services to decide their permissions. The Life Science AAI pilot was operated by e-infrastructures and it paves the way for the production deployment in the upcoming EOSC-Life project.


ENVRIplus is a Horizon 2020 project that links Environmental and Earth System Research Infrastructures, projects and networks together with technical specialist partners to create a more coherent, interdisciplinary and interoperable cluster of environmental research infrastructures. ELIXIR is represented by EMBL-EBI and provides expertise and resources in the ‘Biodiversity and Ecosystem’ field and in the ‘Data for science’ theme.

89 Kahlem P, Jimenez R, Smith A, Lecarpentier D, Castelli D, Zoppi F Recommendations for a minimal set of Rules of Participation 2018, online: [accessed on 20 March 2019]





The EMBRIC project connects marine biotechnology initiatives that focus on science, industry and regional growth. EMBRIC brings together six research infrastructures (EMBRC, MIRRI, EU-OPENSCREEN, ELIXIR, AQUAEXCEL and RISIS) to build value chains of services for the exploration of marine bioresources and their sustainable exploitation as sources of biomolecules and/or as whole organisms for food.

Represented by EMBL-EBI, ELIXIR focused on providing consultancy services for marine data planning, management, analysis and interpretation through the EMBRIC configurator90. It presented the service in a workshop organised as part of the EuroMarine 2018 General Assembly meeting in January 2018 in Porto.

RI-PATH (Charting Impact Pathways of Investment in Research Infrastructures) is a Horizon 2020 project that aims to develop a model to describe the socio-economic impact of research infrastructures and of their related financial investments. This €1.5 million project started in January 2018 and will run until June 2020.

ELIXIR is part of a consortium of 8 organisations and serves as a test case, together with CERN, ALBA and DESY. Throughout 2018, the project organised participatory workshops to build a shared understanding of when, how and under what conditions investment in research infrastructures brings about various types of impacts. The workshops also aimed to establish the largest common denominator, that is, the impact pathways that are most relevant and that can be feasibly accounted for in an impact assessment. ELIXIR hosted the third participatory workshop on 4 December 2018, focusing on distributed and virtual data infrastructures (see also the section ‘Impact and Sustainability’).

FAIRplus is an industry-academia collaboration that is funded through the Innovative Medicines Initiative (IMI) and is led by ELIXIR and Janssen. The project aims to increase the discovery, accessibility and reusability of data from selected IMI-funded projects, and of internal data generated by pharmaceutical industry partners. It will also organise training for data scientists in academia, SMEs and pharmaceutical companies to enable the wider adoption of best practices in life-science data management. In 2018, ELIXIR coordinated the development of the project proposal and formed the project consortium (which consists of 22 partners from academia and industry). The project itself started in January 2019 and will run until June 2022.




Supporting Activities

ELIXIR 2019-23 Scientific Programme

ELIXIR’s scientific programmes define the technical and scientific development of ELIXIR and the infrastructure services it offers. The first Scientific Programme ran from 2014–2018; the next spans 2019–2023. The Scientific Programme is developed alongside the 2019–23 Financial Plan, which sets out how the ELIXIR Budget will be used, both for coordination and for Commissioned Services activities. The process of developing the new Scientific Programme began in 201, when the first draft of the Programme was prepared. This draft then underwent extensive discussion throughout 2018. As a strategic document that sets forth the future direction of ELIXIR, it has to reflect the priorities and activities of ELIXIR Nodes, while at the same time meeting the highest scientific and technical standards. The development of the Scientific Programme was driven by the ELIXIR Heads of Nodes, and involved regular consultation with the ELIXIR Scientific Advisory Board. More than 100 scientists from the ELIXIR Nodes, Platforms and Communities contributed to the development of the Programme, which details the scientific and technical roadmaps that link together national services into a Europe-wide infrastructure.

The Programme first outlines the main challenges for data-driven biology in 2019-23 and then presents its technical and scientific plans to address them.

Figure: The main challenges and opportunities in life science research defined by the ELIXIR Scientific Programme. Labels: Open biological data drives research and the bio-economy; landscape for secure data sharing; Access to FAIR data enabling reuse; Describing rich biological data at all scales; Genomes linked to phenotypes for large populations; Translating open data through bioinformatics for the bioeconomy.

The goal for ELIXIR in 2019-23 is to extend the Europe-wide portfolio of databases, data services, tools, workflows and clouds into a federated infrastructure for the access, analysis and reuse of life-science data. ELIXIR’s ambition is to operate a truly pan-European system of federated life-science data services for use by national and international projects in all fields of life science, with widespread support for data accessibility, reproducibility and data reuse. This ambition is reflected in the five strategic objectives of the Scientific Programme, which are presented as a set of key results to be achieved by 2023:
• ELIXIR will operate a portfolio of integrated services that meet the data needs of life scientists at a European scale;
• ELIXIR Communities will drive service uptake, support standards development, and connect ELIXIR’s experts in life-science disciplines;
• ELIXIR Core Data Resources will be the global standard for bioinformatics resource management and will form the foundation of an international funding and life-cycle management strategy that secures the long-term sustainability of those resources;
• ELIXIR will be the recognised and trusted life-science foundation of the European Open Science Cloud;
• All ELIXIR Nodes will connect life-science users in academia and industry to our open, federated service network.

The Programme also contains detailed work plans for ELIXIR’s technical Platforms and Communities, which translate the five Strategic Objectives into concrete and measurable goals with clear deadlines. This will enable the Programme to be reviewed annually and will form the basis for ELIXIR’s annual work plans and project proposals.

The final version of the Scientific Programme was reviewed and approved by the ELIXIR Board at their 2018 autumn meeting in November, in Ljubljana, Slovenia. It was then officially presented at the ELIXIR Fifth Anniversary conference on 11 December in Brussels, Belgium.


Capacity Building and Node development

Staff Exchange Programme

The purpose of the ELIXIR Staff Exchange programme is to support capacity building in ELIXIR Nodes, as well as the exchange of best practice in bioinformatics service provision. The programme also strengthens the links between ELIXIR Nodes and supports the interoperability and sustainability of ELIXIR services and data resources. The first seven Staff Exchange projects started in late 2017 and continued in 2018. The staff exchange projects involved 10 ELIXIR Nodes. Within these exchanges, eight individual researchers spent a total of 17.5 months working on joint projects in other ELIXIR Nodes. Two additional staff-exchange projects were approved in 2018 (Bioschemas and the Train-the-Trainer programme) and are being implemented in 2019.

ELIXIR Finland–ELIXIR Spain: Collaboration on sensitive data management

The staff exchange project between ELIXIR Finland and ELIXIR Spain enabled Juha Törnroos from ELIXIR Finland to spend seven months working in ELIXIR Spain (at the Center for Genomic Regulation and the Barcelona Supercomputing Center) and to participate in day-to-day activities relevant to the project. The goal of this exchange project was to strengthen the collaboration between ELIXIR Finland and ELIXIR Spain in the management of sensitive data and to support the mutual uptake of technological solutions developed in the two respective countries. The collaboration enabled the adoption of key technological solutions from the European Genome-phenome Archive (EGA) to Local EGA, and vice versa, and supported the integration of ELIXIR AAI by EGA and RD-Connect. Overall, the staff-exchange programme boosted knowledge and technology exchange between ELIXIR Finland and ELIXIR Spain and paved the way to new project proposals in Federated Human Data 2019-2021, ELIXIR Beacon 2019-2021, and others.

EMBL-EBI–ELIXIR Czech Republic: web application for fast 3D structure visualisation with residue conservation

The collaboration between EMBL-EBI and ELIXIR Czech Republic enabled the development and release of a reusable, web-based tool for linking sequence conservation information with biomolecular structural data. As part of the project, the team at ELIXIR Czech Republic (Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, IOCB) pooled their expertise in linking sequence data to structure stability, interactions and functions with EMBL-EBI’s experience in providing sequence similarity searches (via the HMMER service). The project allowed David Jakubec from IOCB to spend three months at EMBL-EBI working on the development of the 3DPatch web application, which was complemented by a suite of 3DPatch tools, to extend the current functionality of the EMBL-EBI resources PDBe, InterPro and Pfam. The results were presented in a research paper that was published in the Bioinformatics journal91 and at the XV Discussions in Structural Molecular Biology conference in Nové Hrady, Czech Republic92.

Node development

With the second run of the Executive Master Programme for the Management of Research Infrastructures (EMMRI), which was developed as part of the RItrain project93, ELIXIR invited ELIXIR Node staff members to take one or several modules of the programme to develop and improve Nodes’ expertise in different aspects of the management and operation of research infrastructures. The course fees, travel, and accommodation costs related to the programme are funded by the ELIXIR Hub. Following a formal call for interest launched in the summer of 2018, seven members of staff from five ELIXIR Nodes registered for a total of 25 different learning modules. The first modules began in November 2018 and the programme will continue until 2020.

91 Jakubec D, Vondrášek J, Finn RD. 3DPatch: fast 3D structure visualization with residue conservation. Bioinformatics 2019, 35, 332–334, doi:
92 Jakubec D, Vondrášek J, Finn RD. 3DPatch: fast sequence and structure residue-level information content annotation in a web browser, online:
93



Operations and programme management

ELIXIR operates as a virtual organisation and executes joint technical programmes, funded through its core budget. These operations are underpinned by two legal agreements: the international ELIXIR Consortium Agreement (ECA) between countries and the bilateral Collaboration Agreement between a Node and the ELIXIR Hub. The ELIXIR Consortium consists of 21 Member States and EMBL (March 2019). In 2018, Greece became a full member of ELIXIR by signing the ECA and Cyprus joined as an Observer.

Collaboration Agreements

The Collaboration Agreements provide a national community with the status of an ELIXIR Node. They specify which activities ELIXIR Nodes will carry out, which services they will provide to ELIXIR, and under which conditions. For some services, called “Commissioned Services” (see below), the ELIXIR Nodes receive funding from the ELIXIR Hub.

In 2018, a Collaboration Agreement was concluded with ELIXIR Slovenia. This brings the total number of concluded Collaboration Agreements to 17 out of the 22 ECA signatories. ELIXIR will continue to work with the remaining Nodes in 2019 to conclude Collaboration Agreements with all ELIXIR members. In 2018, the ELIXIR Hub developed an updated Collaboration Agreement template, which was approved by the ELIXIR Board in December 2018. The first Board-approved Collaboration Agreement template has been in use since 2015, so ELIXIR has gained an understanding of where the template could be improved to better fit the needs of the growing virtual infrastructure. The most significant change introduced in 2018 allows Nodes to be more easily integrated into EU research grants as Linked Third Parties. The European Commission increasingly encourages research infrastructures to use the linked third parties model, wherein the Hub joins a project as the beneficiary and Nodes join as third parties linked to the Hub.

Figure: The number of Collaboration Agreements concluded per year from 2013–2018 (legend: ECA signatories, Collaboration Agreement).


Service Delivery Plans

ELIXIR Node Service Delivery Plans (SDPs) describe the scientific services that each ELIXIR Node provides in ELIXIR. In 2018, the Hub prepared an SDP with ELIXIR Portugal and ELIXIR UK, bringing the number of approved SDPs to a total of 13 Nodes. All services included in approved SDPs are presented in the portfolio of ELIXIR services and are listed on the ELIXIR website94. In 2018, we introduced a new way to access service information. Services have now been categorised by scientific domain and by type of service, allowing users to browse the portfolio more easily (see also the Communication section).

Commissioned Services Contracts

Commissioned Services are funded through the ELIXIR Budget to drive the integration of ELIXIR Node-funded services provided through national funding schemes. The development of robust processes for Commissioned Services was a strategic objective of the first ELIXIR Programme, and the establishment of a portfolio of around twenty concurrent service projects was one of ELIXIR’s main deliverables in the first programming cycle. In total, 170 Commissioned Services Contracts were concluded with Nodes during the first ELIXIR Programme. In the ELIXIR 2019–23 Programme, Commissioned Services will involve investments over a number of years and will provide the long-term, essential Infrastructure Services upon which other services can be built.

Implementation Studies

Implementation Studies are short-term projects that are carried out by ELIXIR Nodes and that address key scientific and technical issues within ELIXIR. The outcome of an Implementation Study might be a description of service requirements, a piece of software, or a technical deliverable with an accompanying report. Implementation Studies are funded through the budget of the ELIXIR Hub and form part of ELIXIR’s ongoing activities in a particular Platform or Community. They are proposed by Platforms or Communities and agreed with the ELIXIR Heads of Nodes Committee, and their contracts are approved by the ELIXIR Board.

In 2018, ELIXIR ran 35 Implementation Studies. 12 Implementation Studies were ongoing and were completed in 2018, 21 Implementation Studies started in 2018 and will continue into 2019, and 3 Implementation Studies started and finished in 2018. A total of 161 teams, across 21 ELIXIR Nodes, were involved in these studies. In comparison to 2017, the number of Implementation Studies that started in 2018 increased by over 40%. Proportionally, these Implementation Studies represent a total funding of 3.2 million Euros allocated for 2018, and 327 Person Months.




ELIXIR Implementation studies in numbers

• 35 Implementation studies running in 2018
• 10 Implementation studies finished in 2018
• 23 Implementation studies started in 2018 and will continue in 2019
• 3.2 million Euros of funding from ELIXIR Hub
• Teams across 21 Nodes were involved in ELIXIR Implementation studies in 2018

Figure: Growth of Implementation Studies 2014-2018 (series: studies initiated per year, studies concluded per year, COS budget (1000k€)).


All ongoing and completed Implementation Studies are listed at about-us/implementation-studies


ELIXIR Annual Report 2018

6 2











Figure: Timeline of ELIXIR Implementation Studies in 2018 (time axis: 2017 to 2018). Implementation Studies are grouped by lead: ELIXIR Communities, ELIXIR Platforms, and Node driven studies. Studies shown:

• Data identification and interoperability
• Sustainability of data resources
• Visualisation of rare diseases data
• Remote compute infrastructure
• Distributed Ensembl
• Mining the proteome
• Bioschemas: specifications and demonstrators
• A microbial metabolism resource for systems biology
• ELIXIR integration from a user perspective
• ELIXIR Luxembourg integration
• GA4GH Compatible Cloud Analysis Platform
• Using clouds and VMs for bioinformatics training
• ELIXIR Italy integration
• Establishment of an ELIXIR Contextual Data Clearinghouse
• Integration and standardization of intrinsically disordered protein data
• Integrating reference taxonomic databases
• Bioschemas: Community Adoption and Training
• FAIRness of the current ELIXIR Core Resources
• Beacon & Beacon Network as a service
• Development of Architecture for Software Containers
• Reuse, extension, scaling, and reproducibility of scientific workflows
• Data Validation
• Towards Data Stewardship
• Learning paths for users of ELIXIR services
• FAIRification of genomics tracks
• Genomic Variation Annotation and Curation
• Metabolite identification
• Apple as a Model for Genomic Information Exchange
• Supporting the growth of public proteomics data
• Interoperability between protein data resources
• Extending open proteomics data analysis pipelines in the cloud
• Scalable approach to personal FAIR data management
• The Galaxy Community
• Biocuration in ELIXIR



Implementation studies started in 2018 Name

Nodes involved






Mikael Linden Michal Procházka

Beacon & Beacon Network As A Service


Jordi Rambla Ilkka Lappalainen Michael Baudis Dylan Spalding

Development of Architecture for Software Containers at ELIXIR and its use by EXCELERATE use-case communities



Yasset Perez Riverol Björn Grüning

A GA4GH Compatible Cloud Analysis Platform on ELIXIR Compute Platform (ECP) resources



Steven Newhouse

Enabling the reuse, extension, scaling, and reproducibility of scientific workflows



Rob Finn

Towards Data Stewardship



Celia Van Gelder Rob Hooft

ELIXIR Implementation Study on Learning Paths



Gabriella Rustici Celia Van Gelder

ELIXIR Implementation Study on Data Validation



Frederik Coppens Melanie Courtot

Apple as a Model for Genomic Information Exchange



Alessandro Cestaro

Bioschemas: Community Adoption and Training



Alasdair Gray

Establishment of an ELIXIR Contextual Data Clearinghouse



Nils Peder Willassen

FAIRness of the current ELIXIR Core resources: Application (and test) of newly available FAIR metrics, and identification of steps to increase interoperability.


FAIRification of genomics tracks



Eivind Hovig

Integration and standardization of intrinsically disordered protein data



Silvio Tosatto Norman Davey

Metabolite identification



Michel Dumontier Data

Thomas Hankemeier

Extending open proteomics data analysis pipelines in the cloud: Additional tools and focus on scalability, supporting the dramatic growth of public proteomics data



Juan Antonio Vizcaíno

Increasing Interoperability between ELIXIR Protein Structure and Sequence Resources (CATH, SWISS-MODEL, PDBe and InterPro) and Expanding these Resources with 3D-Models of CATH Domains, built by SWISS-MODEL



Christine Orengo Torsten Schwede

Integrating reference taxonomic databases for metabarcoding and metagenomics identification



Monica Santamaria

Genomic Variation Annotation and Curation



Ilkka Lappalainen Fiona Cunningham

Extending open proteomics data analysis pipelines in the cloud



Lennart Martens

The Galaxy Community



A scalable approach to Personal FAIR Data Management and Analysis


Mapping the landscape of Biocuration in ELIXIR: Practice, capability and training requirements




HDCs Training

Frederik Coppens Gildas Le Corguillé Björn Grüning Andrew Stubbs Peter McQuilton

Handbook of Operations

The ELIXIR Handbook of Operations continues to be the authoritative source of information on ELIXIR procedures, recommendations and guidelines, strategies and reference documents. It is aimed at the whole ELIXIR community, including all staff in ELIXIR Nodes, ELIXIR Hub staff, ELIXIR Board members, and national funders. In 2018, the Handbook was updated with detailed information for event organisers in ELIXIR Nodes.

ELIXIR Equal Opportunity Strategy

An important update of ELIXIR operation guidelines was the ELIXIR Equal Opportunity Strategy, which highlights ELIXIR’s commitment to diversity and equality across all areas and ELIXIR Nodes. The strategy lays down guidelines and recommendations for how to ensure equal opportunities in the ELIXIR Hub and Nodes, in ELIXIR’s advisory and governance committees, working groups, and other formal and informal bodies. The strategy was presented in an ELIXIR webinar in March 2018 and at the ELIXIR All Hands meeting in Berlin in June.



Collaboration beyond Europe

Global collaboration is a cornerstone of ELIXIR’s implementation: users access ELIXIR services from across the world, and many databases are run as part of collaborations involving partners beyond Europe. As a result, and to ensure an effective, integrated infrastructure, relevant initiatives need to cooperate internationally.

ELIXIR’s collaborations beyond Europe are outlined in its International Strategy95, which underwent a major revision in 2018. The Strategy was launched in the margins of the International Conference on Research Infrastructures (ICRI) 2018, held in Vienna in September 2018.

For the first time, ELIXIR’s activities were articulated around the UN Sustainable Development Goals96. Five Goals in particular were identified in which bioinformatics research infrastructures play a critical role: (1) Zero hunger; (2) Health and wellbeing; (3) Industry, innovation and infrastructure; (4) Life below water; and (5) Life on land. This version of the ELIXIR International Strategy presents updated objectives for engaging and working with international partners:

1. Scale up the international user base of ELIXIR’s Services;
2. Improve bioinformatics services and promote global standards;
3. Expand Membership in ELIXIR beyond Europe; and
4. Ensure that ELIXIR is recognised as an infrastructure of global relevance, and as a partner of choice by intergovernmental organisations.

In 2018, ELIXIR was also invited to present at the 12th meeting of the G7’s Group of Senior Officials (GSO) held in Oxford, UK, on 5–7 November. ELIXIR presented how its activities align with the GSO’s Framework for Global Research Infrastructures, with a particular focus on measuring its socio-economic impact, a key theme of the meeting.

ELIXIR also continued working within the Global Biodata Coalition (GBC) on establishing a sustainable model for the funding of life-science data resources worldwide. In July, ELIXIR organised a special session at the ISMB conference in Chicago, USA, which presented the selection process for ELIXIR Core Data Resources and how it can be leveraged to build a global system that preserves research data resources for the long term, reduces redundancy, strengthens international coordination, and improves research collaborations worldwide (see also the Data Platform section).

UN Sustainable Development Goals, with those relevant to bioinformatics infrastructures highlighted.

95 96



Impact and sustainability

Throughout 2018, ELIXIR undertook a range of activities that aimed to demonstrate the benefits and impact of public life-science resources to researchers and organisations. Showcasing the value of investments in ELIXIR lies at the core of ensuring its long-term sustainability and hence its capacity to support the users of its services in responding to scientific, industrial and societal challenges.

In January 2018, a Horizon 2020-funded project called RI-PATHs (Research Infrastructure imPact Assessment paTHways) began, with the aim of developing a common methodology for assessing the socio-economic impact of research infrastructures. Working with economists and impact evaluators, the project brings together four research infrastructure case studies (ELIXIR, CERN, DESY and ALBA), with ELIXIR representing the life-science domain. The project aims to develop a modular handbook for impact assessment and also takes into account the efforts led by the Organisation for Economic Co-operation and Development (OECD) in creating a suite of impact indicators for research infrastructures97.

Through a participatory approach, a number of workshops have been held, including one hosted by ELIXIR in December 2018, which focused on distributed and virtual research infrastructures. These workshops aimed to establish which impact pathways are the most relevant and feasible to account for in an impact assessment; in other words, the largest common denominator. ELIXIR aims to roll out the approach developed under RI-PATHs in the form of a Toolkit, for Nodes to demonstrate, to national-level and other funders, the valuable benefits of pan-European coordination.

In addition, ELIXIR gave presentations at an OECD workshop on impact assessment in Paris in March 2018, highlighted its portfolio of impact work during ICRI (International Conference on Research Infrastructures) in September in Vienna, and presented to ESFRI during workshops in both Vienna, in September, and Milan, in November.

This year saw numerous other activities taking place to support impact and sustainability. In March, ELIXIR attended the Bulgarian Presidency conference on the Long-term Sustainability of Research Infrastructures, publishing in parallel its position paper on Horizon Europe, which calls for an increase in the budget to support research infrastructures, more appropriate funding schemes, and better links to thematic areas, such as health, where ELIXIR’s services will be used by consortia. As mandated by its Heads of Nodes, ELIXIR has also begun to work on estimating the “cost of ELIXIR”, to build a picture of the public investments (at European and national level) that are needed to create and operate ELIXIR.

ELIXIR’s positioning and advocacy efforts with funders and key initiatives continued to be successful. The Declaration on access to a million European genomes by 2023, which was signed by several Member States in spring 2018, specifically mentions the importance of utilising ELIXIR’s infrastructure. Topics in Horizon 2020, such as a Personalised Medicine topic between the EU and Canada and various ERA-NET funding calls, actively encouraged applicants to work with ELIXIR.

RI-PATHs workshop hosted by ELIXIR in December 2018 in Hinxton, UK.




Industry Engagement

Research and scientific discoveries are increasingly data driven. As a provider of data infrastructure services, ELIXIR acts as a bridge between the public and private sectors. ELIXIR’s overall aim is to drive open access and open data practices in life-science research and enable companies and SMEs to integrate public data into their businesses and to build new services on top of them.

ELIXIR creates opportunities for knowledge exchange and provides networking spaces for pre-competitive collaborations, to remove bottlenecks in industry and academia, such as the adoption of standards and interoperability.

ELIXIR SME and Innovation Forum

A key component of the ELIXIR industry strategy is the ‘Innovation and SME Forum’. This programme hosts a series of specialised events for companies, with a particular focus on SMEs, to enable attendees to learn about ELIXIR services and to forge strong links with the local ELIXIR Node representatives who run these services.

In 2018, two successful Innovation and SME Forums took place:
• Enabling Discoverability in Bio-Data Innovation, 23–24 January 2018, Cambridge, UK
• Data Driven Innovation in Industrial Biotechnology, 15–16 October 2018, Frankfurt, Germany

The Innovation and SME Forum held in Cambridge in January 2018 focused on the discoverability of bioinformatics resources that help to advance the life sciences. It included interactive sessions to present challenges in the implementation of FAIR principles, as well as tools development and standards in the life sciences.

The Innovation and SME Forum in Frankfurt (October 2018) on data-driven innovation in industrial biotechnology was aimed at companies active in the field and at all research institutes active in data science. The topics included big data challenges in industrial biotechnology, perspectives on using public resources across domains and disciplines, and the re-use of biomedical data in research. In total, over 160 people attended the two events, with about half of them coming from the private sector.

Figure: Attendees from the Public & Private Sector (categories: Private Sector, Public Sector).

Figure: How well did the conference meet your needs? (categories: Extremely well, Very well, Somewhat well, Not so well).

The impact of open data: Public data resources as a business model for SMEs

Based on a review of business models and the different types of companies that use open biological data, ELIXIR published a report in early 2018 that explored how open data resources contribute to innovation and generate business value.98 The report features case studies of seven innovation-driven SMEs in Europe, each with a different way of utilising open bioinformatics data. The value proposition of these seven SMEs fundamentally relies on their deep scientific understanding of open data and/or of open software in relevant domains, combined with an ability to build innovative, value-added services and analysis capabilities on top of public resources. The companies differ in size, location and business model but they all use open data from public scientific resources.

By looking at how these companies derive value from open data, the report describes three different types of company that use public data: Customisers, Aggregators, and Enablers. These company types are linked to different business models and to the services provided to clients.

Work on a new, follow-up report began in autumn 2018. This new report will present an in-depth study of the ecosystem of open life-science data, with the research and analysis carried out in collaboration with Prof. Hannes Rothe, the co-founder of the Digital Entrepreneurship Hub established at the Freie Universität Berlin. As part of this collaboration, the ELIXIR industry strategy was presented in Berlin, in November 2018, at a one-day conference called ‘Digital Startup Ecosystem on Health & Bio Data’, which was organised by the Digital Entrepreneurship Hub during the Berlin Science Week.

ELIXIR’s Industry Advisory Committee

The ELIXIR Industry Advisory Committee (IAC) met for its face-to-face meeting in early 2018. New Members appointed in 2018 included Maria Rodriguez Martinez (IBM) and Klaus Maisinger (Illumina). The IAC’s public recommendations for ELIXIR were published for the fourth time in early 2018, and included a reflection on ELIXIR’s Industry Strategy. The recommendations included the need to increase the reach of the Innovation and SME programme by helping Nodes to form strong links and partnerships with local industries. ELIXIR was also further encouraged to think about industry involvement in ELIXIR Communities to provide a pre-competitive space for informal knowledge exchange and to consider schemes to incentivise staff exchanges between industry and ELIXIR members.

98 Roman Garcia P, Smith A and Blomberg N. Public data resources as a business model for SMEs. The Role of Public Bioinformatics Infrastructure in supporting innovation in the life sciences. F1000Research 2018, 7(ELIXIR):590 (document), doi: https://doi.org/10.7490/f1000research.1115445.1




ELIXIR at ISMB and ECCB 2018

ELIXIR sponsored the two main conferences in bioinformatics held in 2018: the Intelligent Systems for Molecular Biology (ISMB) Conference held in July in Chicago, USA, and the European Conference on Computational Biology (ECCB), held in September in Athens, Greece.

The programme of both conferences featured dedicated ELIXIR sessions, showcasing ELIXIR’s activities and achievements. The ISMB conference in Chicago included an ELIXIR special track entitled ‘European and Global Life Sciences Core Data Resources: Managing Funding for Big Data’, at which the ELIXIR Core Data Resources were presented and the emerging Global Biodata Coalition was discussed. ELIXIR was the main organising partner of the ECCB conference in Athens. As part of the ECCB-ELIXIR partnership, the conference programme featured a dedicated ELIXIR Application track to showcase ELIXIR activities and services. The track presented twelve projects from across the ELIXIR infrastructure, which were selected via a call for abstracts that was open to all ELIXIR Nodes, Platforms and Communities.

The full breadth of ELIXIR activities was showcased in nearly 50 posters in the ELIXIR Poster session, at the ELIXIR booth, and in eight ELIXIR hands-on workshops. ECCB 2018 was organised by the Hellenic Society for Computational Biology and Bioinformatics.

ELIXIR Fifth anniversary communication

To mark the 5th anniversary of ELIXIR and to highlight its achievements and outcomes in the past five years (2014–2018), ELIXIR prepared a social media campaign, leading up to the ELIXIR 5th Anniversary Conference, which was held in Brussels, Belgium, on 11 December 2018.

Using the hashtag #ELIXIRImpact, the campaign focused on how researchers in the life sciences could benefit from ELIXIR activities and services, and on how ELIXIR contributes to improving the findability, accessibility, interoperability and reusability of life-science data. The 45 messages posted during the campaign were shared or liked nearly 700 times (an average of over 15 interactions per post). ELIXIR also featured in the communication campaign by the European Commission, on the official Twitter account of Horizon 2020 and on the Twitter account of the European Commission’s Research and Innovation DG.

Niklas Blomberg, the ELIXIR Director and Wolfgang Burtscher, Deputy Director General for Research and Innovation at the European Commission, at the ELIXIR Fifth Anniversary conference.



Fifth anniversary conference

To celebrate its fifth anniversary, ELIXIR organised a one-day conference entitled ‘Open Data for Impact and Innovation’ for funders, policy-makers, industry representatives, and ELIXIR partners.

The event highlighted the outcomes and major milestones achieved since ELIXIR was established in December 2014. Through a series of talks, the conference presented ELIXIR’s achievements and the priorities of the next Scientific Programme (2019–2023).

Complementary talks from high-profile external speakers included: Wolfgang Burtscher, Deputy Director General for Research and Innovation at the European Commission; Dame Janet Thornton, member of the Scientific Council at the European Research Council, who played a key role in establishing ELIXIR; and Gabriela Pastori, Chair of the ESFRI Food and Health Strategic Working Group.

ELIXIR website and ELIXIR intranet

The development of the ELIXIR website in 2018 focused on adding new content and on re-organising the site’s existing information, so that information is easier to find and the site easier to navigate.

A key change can be seen in the Services section, where services offered by different ELIXIR Nodes are now grouped according to Scientific domain, Type of service, and Service collection.

The number of unique visitors to the ELIXIR website rose by over 65% in 2018 compared with 2017, and exceeded 85,000. By 31 December 2018, the ELIXIR intranet had 699 registered users, of whom 144 registered in 2018. The number of intranet page views in 2018 was 7,500, which represents an increase of over 10% compared with 2017.

The Service page interface on the ELIXIR website (



ELIXIR Gateway on F1000Research

The ELIXIR Gateway on F1000Research was launched in December 2015, as a platform to collect and capture ELIXIR’s research and technical outputs. In 2018, the ELIXIR Gateway published nine articles, all transparently peer-reviewed through the F1000Research post-publication peer-review process. An additional seven strategy documents, technical reports and white papers were also published on this Gateway as non-peer-reviewed documents.

The most successful article published in the ELIXIR Gateway in terms of the number of readers was by Dominguez Del Angel, Hjerde, Sterck et al.99, which presents general guidelines for genome assembly and genome annotation. These guidelines are intended to be stable over time and to cover all aspects of general assembly and annotation projects, from start to finish. The paper has been viewed by nearly 15,000 readers and downloaded over 2,600 times. Other important ELIXIR publications include an article by Gabella, Durinx and Appel, which presents the outcome of an ELIXIR Implementation study that reviewed different funding models for biological knowledgebases100, and an article by Gruening, Sallou, Moreno et al., which presents recommendations developed by the Biocontainers community (part of the ELIXIR Tools Platform) to produce standardized bioinformatics packages and containers101.

The editorial oversight of the ELIXIR Gateway is provided by an Advisory Board, which reviews all papers submitted to the Gateway to ensure all materials are relevant to the ELIXIR community. The members of the Advisory Board of the ELIXIR Gateway in 2018 were:
• Niklas Blomberg, ELIXIR Director
• Inge Jonassen, University of Bergen, Head of ELIXIR Norway
• Arlindo Oliveira, Instituto Superior Técnico, Head of ELIXIR Portugal
• Bengt Persson, Uppsala University, Sweden, Head of ELIXIR Sweden
• Graziano Pesole, The University of Bari Aldo Moro, Head of ELIXIR Italy

99 Dominguez Del Angel V, Hjerde E, Sterck L et al. Ten steps to get started in Genome Assembly and Annotation. F1000Research 2018, 7(ELIXIR):148, doi:

100 Gabella C, Durinx C and Appel R. Funding knowledgebases: Towards a sustainable funding model for the UniProt use case. F1000Research 2018, 6(ELIXIR):2051, doi:

101 Gruening B, Sallou O, Moreno P et al. Recommendations for the packaging and containerizing of bioinformatics software. F1000Research 2018, 7(ELIXIR):742, doi:



ELIXIR Hub staff

In 2018, the ELIXIR Hub significantly strengthened its technical coordination capacity by appointing four new staff members to the Hub Technical Coordinators’ team. By creating the new roles of ELIXIR External Relations Officer and Principal Legal Advisor, the ELIXIR Hub also strengthened its governance, legal and external relations competencies.

The four new members of the ELIXIR Hub Technical Coordinators’ team joined in spring 2018 and took up the coordination of three ELIXIR Platforms (Tools, Compute and Interoperability) and of the ELIXIR Human Data Communities. The new staff members appointed in 2018 are as follows:

Jen Harrow, who took up the post of ELIXIR Tools Platform Coordinator. Working closely with the Tools Platform leadership team, Jen’s tasks are to align the development of the ELIXIR Tools and Services Registry with other ELIXIR services, to drive ELIXIR’s benchmarking strategy and to facilitate collaboration with other ELIXIR Platforms and Communities. Jen came to ELIXIR from Illumina, where she was Program Manager for Population Genomics, primarily involved in collaborating with Genomics England.

Sirarat Sarntivijai, who was appointed as the ELIXIR Interoperability Platform Coordinator. She drives the development and implementation of the Platform portfolio of interoperability services (including the Recommended Interoperability Services) and manages collaboration between the five ELIXIR technical Platforms and the Communities. Sirarat is a domain expert in ontologies and in linked data in biomedical sciences and translational healthcare. In her previous post at EMBL-EBI, she worked on the Experimental Factor Ontology, and was involved in the development of ontology-assisted data integration frameworks in many projects, including the Human Cell Atlas, OpenTargets, and PredicTOX.

Gary Saunders, who joined ELIXIR in the role of Human Data Coordinator, leading the implementation of the ELIXIR-wide strategy to enable the responsible sharing of human data consented for reuse in scientific research. A major focus of this role is to work with the existing Human Data Communities to ensure that all the data generated are compliant with the FAIR data principles, and to coordinate these efforts with the Global Alliance for Genomics and Health. Prior to his appointment, Gary worked at EMBL-EBI as the data manager for the European Variation Archive and for the Database of Genomic Variants Archive.

Members of the ELIXIR Hub at the All Hands meeting in Berlin (from left to right): Joana Wingender, Pascal Kahlem, Ricardo Arcilla, Premysl Velek, Susanna Repo, Jerry Lanfear, Kathi Lauer, John Hancock, Dana Cernoskova, Andrew Smith, Jen Harrow, Melissa Balzano, David Lloyd, Rachel Drysdale, Gary Saunders, Friederike Schmidt-Tremmel, Corinne Martin and Martin Cook. (not pictured: Niklas Blomberg, Jonathan Tedds, Laura Mangan, Phyllida Hallidie, Vera Herkommer and Rafael Jimenez).



Jonathan Tedds, who was appointed in April 2018 and took up the post of Technical Coordinator for the ELIXIR Compute Platform. Jonathan drives the implementation of the Platform’s technical strategy and supports the coordination of the Platform. A major focus of Jonathan’s role is to work with ELIXIR Node partners to ensure that compute resources run by ELIXIR Nodes are integrated into an effective portfolio across ELIXIR. Jonathan joined ELIXIR from the University of Leicester, where he worked as a Senior Research Fellow to develop integrated database solutions for biomedical research informatics.

Three new appointments were also made to the ELIXIR Hub’s External Relations Team.

Kathi Lauer joined the ELIXIR Hub as Industry Officer in March, replacing Pablo Roman, who moved on to join the Institute of Molecular and Oncological Medicine of Asturias in his native Oviedo, Spain. Kathi leads the development and implementation of ELIXIR’s Industry Strategy, including the ELIXIR Innovation and SME programme and outreach to users across industry. Kathi joined ELIXIR from the University of Cambridge, where she worked as a research associate looking at the interaction of DNA-damage-repair proteins and viruses.

Corinne Martin joined the ELIXIR Hub as External Relations Officer in April. She is responsible for ELIXIR’s International Strategy and for supporting ELIXIR Nodes in demonstrating the impact of their activities. Corinne came to ELIXIR from the UN Environment Programme World Conservation Monitoring Centre (UNEP-WCMC), in Cambridge, UK, where she worked as Senior Programme Officer, providing oversight, management and scientific leadership to technical projects at the science–policy interface.

Joana Wingender joined ELIXIR in April to take up the post of Administrative Officer, supporting the management and coordination of ELIXIR governance bodies and supporting internal communications. Previously, Joana worked as a trainee in EMBL’s International Relations department.



In November 2018, the ELIXIR Board approved the appointment of Vera Herkommer as ELIXIR Principal Legal Advisor. Vera joined ELIXIR from EMBL, where she worked as the Head of Legal Services. In her previous role, Vera had supported the establishment and development of ELIXIR’s legal structures since its inception. In her new post at the ELIXIR Hub, Vera will lead the legal functions within ELIXIR and provide strategic advice on the governance, regulatory and compliance activities of ELIXIR.

At the end of 2018, following the return of Melissa Balzano from maternity leave, Dana Cernoskova finished her contract at the ELIXIR Hub and returned to her native Czech Republic to establish an events management company.

The ELIXIR Hub also hosted Leyla Garcia as a visitor in August 2018 to work as a Knowledge and Semantic Web Coordinator on Bioschemas and the EOSCPilot project. Leyla works in EMBL-EBI’s UniProt team.

Governance Committees and Financial Data

ELIXIR Committees

ELIXIR Board Chair

Vice Chairs

Prof Rein Aasland, Norway

Dr Ruben Kok, Netherlands Prof Rita Casadio, Italy


Scientific delegate

Administrative delegate


Laurence Lenoir

Michele Oleo Didier Flagothier

Czech Republic

Jaroslav Koča

Jan Burianek


Anders Krogh

Troels Tvedegaard Rasmussen


Pärt Peterson

Toivo Räim Priit Tamm


Per Öster

Riina Vuorento Sirpa Nuotio (since October 2018) Jarmo Wahlfors (until October 2018)


Frédéric Boccard

Eric Guittet


Rolf Backofen Alexander Goesmann

Johannes Mohr


Prof. Christos Ouzounis Prof. Artemis Hatzigeorgiou (since November 2018)

Maria Gkizeli (since November 2018)


László Patthy

Gábor Tóth


Maria Nash Garry Purcell (both from October 2018) Marion Boland (until October 2018)



Yossi Kalifa

Ilana Lowi


Rita Casadio

Salvatore La Rosa


Rudi Balling Regina Becker

Lynn Wenandy Pierre Misteri


Ruben Kok

Bea Pauw


Rein Aasland Stig Omholt


Isabel Rocha (since March 2018) Ana Teresa Freitas (until March 2018)

Andreia Feijão (until March 2018) Tiago Saborida


Damjana Rozman

Albin Kralj


Ferran Sanz

Cristina Bauluz and Dr Rafael de Andres-Medina


Björn Andersson

Karl Gertow


Christian von Mering

Isabella Beretta


Chris Rawlings

Mark Palmer Amanda Collis


Iain Mattaj Janet Thornton

Silke Schumacher



Heads of Node Committee

Chair: Niklas Blomberg, ELIXIR Director


Head of Node


Yves Van de Peer

Czech Republic

Jiří Vondrášek


Søren Brunak


Jaak Vilo


Tommi Nyrönen


Claudine Médigue and Dr Jacques van Helden

Germany Greece

Alfred Pühler Babis Savakis


Balázs Győrffy


Walter Kolch


Michal Linial


Graziano Pesole


Reinhard Schneider


Jaap Heringa


Inge Jonassen


Arlindo Oliveira


Brane Leskošek


Alfonso Valencia


Bengt Persson


Ron Appel and Christine Durinx


Carole Goble and Neil Hall


Rolf Apweiler and Ewan Birney



Scientific Advisory Committee

Chair: Dr Francis Ouellette, Ontario Institute for Cancer

Vice Chair: Dr Janet Kelso, Max Planck Institute for Evolutionary Anthropology, Germany

Members:
Prof Pascal Borry, University of Leuven, Belgium
Dr Robert Gentleman, 23andMe, USA
Prof Melissa Haendel, Oregon Health and Science University, USA (since June 2018)
Prof Larry Hunter, University of Colorado, USA
Prof Elina Ikonen, University of Helsinki, Finland
Dr Janet Kelso, Max Planck Institute for Evolutionary Anthropology, Germany
Prof Nicola Mulder, UCT Computational Biology Group (NBN), South Africa
Dr Francis Ouellette, Origin Bioinformatics, Canada
Prof Juni Palmgren, Karolinska Institutet, Sweden
Dr Susan E. Wallace, University of Leicester, UK
Dr Doreen Ware, USDA ARS, Cold Spring Harbor Laboratory, USA (since June 2018)

Industry Advisory Committee

Chair: Elizabeth Reynolds, General Bioinformatics, UK

Vice Chair: Abel Ureta-Vidal, Eagle Genomics, UK

Members:
Ian Barrett, AstraZeneca, UK
Iain Hrynaszkiewicz, Springer Nature, UK
Andreas Kremer, ITTM, Luxembourg
Natalia Jiménez Lozano, Atos, Spain
Filip Pattyn, Ontoforce, Belgium
Sara Paulina de Oliveira Monteiro, The Navigator Company, Portugal
Christian Paulitz, Bayer CropScience, Germany
Elizabeth Reynolds, General Bioinformatics, UK
Philippe Sanseau, GlaxoSmithKline, UK
Sándor Szalma, Takeda Pharmaceuticals, USA
Abel Ureta-Vidal, Eagle Genomics, UK

Financial data

At its 2013 summer meeting, the Council unanimously approved ELIXIR’s legal framework, including its status within EMBL as a ‘Special Project’ and EMBL’s membership of ELIXIR (EMBL/2013/16/Rev 1). The legal framework of ELIXIR is based on the ELIXIR Consortium Agreement (ECA), which has been concluded among the member countries and EMBL. With the ECA’s entry into force in 2013, ELIXIR evolved into an independent, internationally operated infrastructure. The budget of ELIXIR is set annually by the ELIXIR Board, and all funds related to its activities, including its surplus, are ring-fenced within EMBL’s accounts.


Income and expenditure statement (column heading: 2018 budget; row labels only, no figures)

Income
ELIXIR Member state contributions
  Ordinary contributions (a)
  Foreign exchange (loss)/gain on sterling contributions (b)
Grant income (c)
Other income
Net income

Expenditure
Technological activities
  Running costs
  Commissioned services
  Equipment and depreciation
  Total expenditure Technological Activities
Directorate and Administrative expenditure
  Running costs
  Equipment and depreciation
  Total expenditure Directorate and Administration
Support and Admin Infrastructure costs
Grant expenditure incurred
Total Expenditure

Surplus/(Deficit) (d)



(a) ELIXIR Member state contributions

Belgium             198      145
Czech Republic       73       54
Denmark             127       93
Estonia               8        6
Finland              99       72
France            1,153      846
Germany           1,447    1,061
Greece               49       26
Hungary              47       34
Ireland              76       56
Israel               91       66
Italy               852      625
Luxembourg           14       10
Netherlands         329      241
Norway              181      133
Portugal             86       63
Slovenia             19       14
Spain               597      438
Sweden              200      147
Switzerland         256      188
United Kingdom    1,011      717




(b) The ELIXIR Board approved that, from January 2016, the UK will pay its member state contributions in sterling (ELIXIR/2015/28). The difference between the euro value of these contributions at the date of payment and at the date the 2018 budget was approved was a loss of €71k (2017: loss of €3k).

(c) Grant income

                                                  2018 Actual   2017 Actual
                                                  €000          €000
Grant funding awarded                             5,851         4,096
Grant income earned in the current year           1,221         984
Grant expenditure incurred in the current year    (1,317)       (987)
Unutilised grant income

(d) This surplus is included in the EMBL general reserve, but has been ring-fenced for use by ELIXIR.

(e) The following countries have amounts due or prepaid at 31 December 2018 (values in €000):

               Contribution 2018   Interest 2018   Prepayments for 2019
Denmark                                            134
Netherlands                                        338
Total                                              472

Credits and Acknowledgements

Produced on the direction of the ELIXIR Board by the External Relations team at the ELIXIR Hub. With special thanks to all those who contributed to the development of ELIXIR infrastructure in 2018, most notably Heads of Nodes, Platform and Community Leads, Technical and Training Coordinators and members of the various Working Groups.

© May 2019. Published under the CC BY 4.0 licence.

ELIXIR is building a sustainable European infrastructure for biological information, supporting life science research and its translation to medicine, environment, bioindustries and society.

Contact:
Niklas Blomberg, Director
ELIXIR
Wellcome Genome Campus
Hinxton, Cambridgeshire CB10 1SD, United Kingdom
+44 (0)1223 492 670
+44 (0)1223 494 468
