
Demonstrating Results of Big Data initiatives in Healthcare

ABSTRACTS



Index

Introduction to AEGLE and presentations relating to AEGLE
An Integrated Profile of Chronic Lymphocytic Leukemia Patients: fusing High-Throughput Sequencing and Clinical Data to Build an Integrated Profile of Chronic Lymphocytic Leukemia Patients
Big data analytics and processing of health data for research purposes: state of play in the EU Member States before and after the GDPR
Early health technology assessment of big data analytics
Ethico-legal issues with biomedical Big Data and some solutions
R&D platform for Chronic Lymphocytic Leukemia
R&D platform for type 2 diabetes
Solution for a Clinical Decision Support System for the Intensive Care Unit
Solution for a Research and Development User Interface for the Intensive Care Unit



Introduction to AEGLE and presentations relating to AEGLE

The primary aim of the AEGLE project has been to build an innovative ICT solution addressing the whole data value chain for health, based on cloud computing enabling dynamic resource allocation, high-performance computing (HPC) infrastructures for computational acceleration, and advanced visualization techniques. This conference includes several presentations relating to AEGLE. Many of the achievements made during the AEGLE project are illustrated using three domains in healthcare: chronic lymphocytic leukaemia (CLL), type 2 diabetes, and the intensive care unit (ICU).



Four presentations are directly related to these domains.

1. One presentation highlights developments seen in the CLL domain. CLL is a chronic, incurable haematological cancer that can be very costly to treat. The focus of the CLL work has been on the genetic and microenvironmental factors in the early stages of CLL, which could help to understand the mechanisms that underlie CLL onset and help to identify novel therapeutic targets.
2. Another presentation describes the work done in the type 2 diabetes domain. This presentation includes a demonstration of how AEGLE analytics could improve predictive modelling for complications of diabetes, which would help to improve targeted interventions.
3. Two presentations describe the work done in the ICU domain, which enables physicians to utilise the algorithms and predictive analytics developed within AEGLE to improve patient management in the areas of assisted mechanical ventilation, nutrition and patient deterioration. The first one describes the AEGLE ICU-CDSS user interface, which focuses on supporting care-related activities, while the second ICU presentation describes the AEGLE ICU-R&D user interface, which focuses on supporting research-related activities.

In addition, there are three presentations that describe some special challenges:

1. The first presentation focuses on the legal issues surrounding the processing of personal data and the new European legal framework (GDPR).
2. The second presentation discusses the ethico-legal issues and presents two aids to guide decision-making on information governance in biomedical research to promote proportionate governance.
3. The third presentation describes how early health technology assessment methodology can be used to decide which applications of big data analytics are worth examining in more detail using cost-effectiveness modelling. These early health technology assessments can help developers to determine which applications are worth further development.


An Integrated Profile of Chronic Lymphocytic Leukemia Patients: fusing High-Throughput Sequencing and Clinical Data to Build an Integrated Profile of Chronic Lymphocytic Leukemia Patients

PROBLEM: Chronic Lymphocytic Leukemia (CLL), a hematologic neoplasia, is a complex disease. Next Generation Sequencing (NGS) has revolutionized the volume and variability of information (big data dimensions) that might be relevant to CLL. Examples of NGS data include immuno-repertoire data, whole genome/exome sequencing data, RNA sequencing data, etc. These constitute a group of heterogeneous data sources whose individual value cannot be questioned. However, in isolation they are not sufficient to “tell the whole story” about the disease, its origin, progression, prognostication, etc. Added value in health care can come from combining these sources, or the information produced by them. They need to be combined in a meaningful and smart manner to build a picture that is comprehensive and as complete as possible.

Alexandra Kosvyra, Christos Maramis, Ioanna Chouvarda CERTH



SOLUTION:

This is the design of a novel integrated view that is able to combine in a meaningful manner all the aforementioned sources of information, towards an integrated profile of the CLL patient. The introduced profile relies on a properly designed data model centered around the CLL patient. This model links the various NGS sources with the patient and their clinical data (concerning diagnosis, prognosis, response to treatment, etc.). The proposed profile efficiently summarizes the large-scale datasets using meaningful high-level indicators/outlines and, when necessary, introduces simple yet powerful visualizations for all analysis outputs. Associations between indicators (features) arising from different data analyses (e.g. SNPs from WES vs gene expression from RNA-seq) are supported. The association with clinical outcome, intervention and progression is offered via classification/prediction models, based on machine learning approaches and employing the above-mentioned features. The profile facilitates 1) presentation and exploration, 2) intra- and inter-patient comparison, and 3) new knowledge discovery. It constitutes an approach with qualitative and quantitative characteristics, contributing towards bridging data-driven research with medical research, and towards data-driven translational research for personalised medicine.

SPECIAL CHALLENGE:

In the design of the patient profile, we had to overcome certain challenges pertaining to the necessary level of integration (at data, feature or outcome level) and to data quality and standardization in this wealth of heterogeneous data and analytics outputs. In addition, a main challenge has to do with the usability of the introduced profile/view and the credibility of predictive analytics. For such approaches to be widely adopted, the translational value of this approach needs to be leveraged.
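The patient-centred data model described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual schema, so every class name, field name and value below is a hypothetical stand-in; the sketch only shows the idea of linking several NGS analyses and clinical data to one patient and flattening their indicators into features for a prediction model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NgsAnalysis:
    """One NGS-derived analysis output, reduced to high-level indicators."""
    source: str                   # e.g. "WES" or "RNA-seq" (labels are assumptions)
    indicators: Dict[str, float]  # hypothetical summary features


@dataclass
class PatientProfile:
    """Patient-centred record linking clinical data with NGS analyses."""
    patient_id: str
    clinical: Dict[str, str] = field(default_factory=dict)
    analyses: List[NgsAnalysis] = field(default_factory=list)

    def feature_vector(self) -> Dict[str, float]:
        """Flatten all indicators into one dictionary, namespaced by source."""
        features: Dict[str, float] = {}
        for analysis in self.analyses:
            for name, value in analysis.indicators.items():
                features[f"{analysis.source}:{name}"] = value
        return features


# Hypothetical patient with two NGS sources attached.
profile = PatientProfile(
    patient_id="P001",
    clinical={"diagnosis": "CLL", "treatment_response": "partial"},
    analyses=[
        NgsAnalysis("WES", {"snp_count": 42.0}),
        NgsAnalysis("RNA-seq", {"gene_x_expression": 7.5}),
    ],
)
features = profile.feature_vector()
```

Namespacing each feature by its source keeps indicators from different analyses distinguishable when they are later compared or fed to a classifier.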


Big data analytics and processing of health data for research purposes: state of play in the EU Member States before and after the GDPR

‘Big Data analytics’ in the field of health provides many scientific opportunities but also raises many legal challenges. In particular, for scientific research the data processed are considered special categories of data and therefore should be particularly protected. Moreover, with the implementation of the new European legal framework applicable to the processing of personal data, also known as the GDPR, the rules for the processing of health data, genetic data and other special categories of personal data are about to change Europe-wide. Additionally, because health remains mostly in the hands of the European Union’s Member States, national legislation may provide additional rules applicable to scientific research in the field of health. This may affect the coherence of the European regime applicable to scientific research. The concept of Big Data is commonly referred to and used, yet it is not mentioned in the GDPR. Big Data is also particularly interesting because it challenges the principles of data minimisation and purpose limitation. This is why ascertaining the applicable rules is so important to scientific research.

In the context of the AEGLE Horizon 2020 Innovation Action, Timelex collects, via its network of national legal correspondents, up-to-date reports describing the legal framework applicable to the processing of health data for research purposes in every EU Member State. All Member States are currently in the process of revising their legal framework in this domain, looking ahead to the GDPR becoming applicable on 25 May 2018.

The presentation will provide some important preliminary results of the survey undertaken by the Timelex team and give an initial insight into the impact of the GDPR on the national legal rules applicable to the processing of health data for research purposes in Europe.

Jos Dumortier, Time.Lex


Early health technology assessment of big data analytics

PROBLEM:

One major promise of big data and big data analytics (BDA) is that they can be used to solve all sorts of healthcare problems and thereby improve health outcomes and healthcare efficiency. However, the huge variety of problems that big data and BDA could theoretically solve is actually the cause of an important new problem for BDA researchers, namely: which problems should they first try to solve? Once these problems have been identified, industry-standard cost-effectiveness analyses could be performed to help BDA researchers make decisions about further development of a particular application.

SOLUTION:

Case-by-case experience and discussions with others (including clinicians in the AEGLE consortium) helped to develop a set of criteria that should be met before starting with the early cost-effectiveness analyses. While these criteria are relatively simple to apply, they can help to ensure that the time spent on early cost-effectiveness analyses is efficiently spent. Once an application meets these criteria, its impact on health outcomes and costs can be estimated using the industry-standard cost-effectiveness modelling that is currently used to assess health technologies (like diagnostic tests) that are ready to use in healthcare. If the impact is deemed important, the BDA researchers can continue developing the application. However, if the impact is limited, the BDA researchers should contemplate abandoning that application and switching to another application.

SPECIAL CHALLENGE:

Several challenges were encountered during our research. One particular challenge was that the description of the healthcare problem sometimes changed over time, since clinicians needed to reflect on how BDA could lead to better patient care. This experience only reinforced the notion that a set of easy-to-use criteria is a valuable way to ensure that time-consuming analyses to support decision-making about further development by BDA researchers are only performed after those criteria are met.

Jos Aarts, Ken Redekop
Erasmus School of Health Policy and Management, Erasmus University Rotterdam
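The abstract does not say which quantities its cost-effectiveness models report. As an illustration only, the sketch below uses two quantities that are standard in cost-effectiveness analysis generally: the incremental cost-effectiveness ratio (ICER) and the net monetary benefit. The function names, the willingness-to-pay threshold and all numbers are hypothetical.

```python
from typing import Optional


def icer(delta_cost: float, delta_effect: float) -> Optional[float]:
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if delta_effect == 0:
        return None  # the ratio is undefined when effects do not differ
    return delta_cost / delta_effect


def net_monetary_benefit(delta_cost: float, delta_effect: float,
                         threshold: float) -> float:
    """Value of the health gain at the given threshold, minus the extra cost."""
    return threshold * delta_effect - delta_cost


# Hypothetical early estimate for one BDA application versus usual care:
# 2000 currency units of extra cost per patient for 0.25 extra QALYs.
ratio = icer(delta_cost=2000.0, delta_effect=0.25)
nmb = net_monetary_benefit(delta_cost=2000.0, delta_effect=0.25, threshold=20000.0)
continue_development = nmb > 0
```

Working with net monetary benefit rather than the raw ratio avoids the ambiguity of negative ICERs, which can arise both when an application is cheaper and better and when it is costlier and worse.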


Ethico-legal issues with biomedical Big Data and some solutions

PROBLEM:

Medical research ethics are established as a vital requirement for the protection of research subjects from various harms. However, the paradigms that have developed in the iterations of the Declaration of Helsinki require adaptation to ensure the optimal exploitation of data science. Research ethics governance bodies may not always fully understand data protection law, particularly its limitations in the face of fast-moving technological change.

SOLUTION:

This paper presents two aids to guide decision-making on information governance in biomedical research to promote proportionate governance. The first is a finer categorisation of the sensitivity of data. There are many different levels of sensitivity of data, including different levels of medical data. Proportionate governance can only be achieved by recognizing this. The second is an anonymisation matrix to guide the appropriate level of anonymisation according to sensitivity of data and the context of processing.

SPECIAL CHALLENGE:

Public expectations of data processing differ according to four factors: who is processing the data; why are they processing it; what are they processing; and how are they processing it. The use of tools to acknowledge these distinctions will ensure that research governance has legitimacy and proportionality. The use of tools like this will facilitate international data science projects whilst helping to maintain the trust of the public in biomedical Big Data research. These tools require further study to confirm their value in Big Data projects.

Barbara Pierscionek and John Rumbold
School of Science and Technology, Nottingham Trent University
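The anonymisation matrix described in this abstract can be pictured as a lookup from data sensitivity and processing context to a required level of anonymisation. The abstract does not publish the matrix itself, so the categories, levels and cell values below are invented purely for illustration and should not be read as the authors' recommendations.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity categories (row index of the matrix)."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


class Context(IntEnum):
    """Hypothetical processing contexts (column index of the matrix)."""
    INTERNAL_RESEARCH = 0
    EXTERNAL_COLLABORATION = 1
    PUBLIC_RELEASE = 2


# Hypothetical anonymisation levels, from weakest to strongest.
LEVELS = ["pseudonymised", "de-identified", "aggregated only"]

# Rows follow Sensitivity, columns follow Context; every value is illustrative.
ANONYMISATION_MATRIX = [
    [0, 0, 1],  # LOW
    [0, 1, 2],  # MEDIUM
    [1, 2, 2],  # HIGH
]


def required_level(sensitivity: Sensitivity, context: Context) -> str:
    """Look up the anonymisation level for a sensitivity/context pair."""
    return LEVELS[ANONYMISATION_MATRIX[sensitivity][context]]


level = required_level(Sensitivity.HIGH, Context.INTERNAL_RESEARCH)
```

Encoding the matrix as data rather than as branching logic makes it easy for a governance body to review and amend individual cells.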


R&D platform for Chronic Lymphocytic Leukemia

PROBLEM:

Chronic lymphocytic leukemia (CLL) is considered a prototypic disease in which genetic and microenvironmental factors drive disease pathogenesis and progression. “CLL-like” Monoclonal B-cell Lymphocytosis (MBL) is characterized by the presence of clonal B cells with the characteristic CLL phenotype, yet in lower numbers compared to CLL. MBL is divided into high-count and low-count MBL (HCMBL and LCMBL, respectively) based on the number of clonal cells. HCMBL may evolve to CLL requiring treatment and is considered a preleukemic condition, while LCMBL has an unclear, if any, link to CLL. The precise mechanisms underlying MBL onset and/or progression into overt CLL remain unknown. Arguably, these could entail the acquisition of particular genetic lesions and/or extrinsic signals delivered to the clonal B cells, including those emanating from T cells.

SOLUTION:

We focused on the impact of both genetic and microenvironmental factors in the early stages of CLL, i.e. cases with MBL, and compared the results with CLL. Specifically, we performed: (i) whole exome sequencing (WES) in individuals with MBL and patients with CLL, (ii) RNA-seq in the same sample groups and, finally, (iii) in-depth immunoprofiling of the T cell compartment through deep sequencing of the T cell receptor (TR). Results will provide for the first time a comprehensive view of the mechanisms that underlie CLL onset and identify novel therapeutic targets in the era of personalized medicine.

SPECIAL CHALLENGE:

The main challenge of this study was the design of an analytics platform that could integrate all relevant bioinformatics tools to analyse all the different types of data in a robust, systematic, quick and user-friendly way. The AEGLE platform has all the features required to analyse the aforementioned types of data, providing the user with the ability to address complicated scientific scenarios.

Andreas Agathangelidis
Institute of Applied Biosciences (INEB) and Centre for Research and Technology Hellas (CERTH)


R&D platform for type 2 diabetes

PROBLEM:

Type 2 diabetes currently affects over 6% of the world’s population. Although more associated with the developed world in the past, it is increasingly a global problem. Despite this prevalence, management is often based on clinical experience, supported by clinicians' databases of their own patients. UK databases have no agreed minimal standard dataset, whether commercial or NHS. This makes Big Data analytics, and predictive modelling for complications as well as responses to therapy, difficult to achieve.

SOLUTION:

The AEGLE platform provides high-performance computing within a web-based platform. This enables the provision of advanced analytics to all healthcare providers. AEGLE also has the capacity to merge databases to create enlarged datasets, thus enabling Big Data analytics within a collaboration. The aim of these analytics is to improve predictive modelling for complications of diabetes, thereby enabling targeted interventions. It can also enable the development of targeted drug therapies based on phenotypic stratification. Analytics can be developed for the pharmaceutical industry, as well as for clinical and academic research teams.

SPECIAL CHALLENGE:

Several challenges have been encountered. Because the databases are clinically driven, the data they hold may have gaps and errors; algorithms were developed to manage these deficiencies. Merging of disparate databases requires clinical and technical expertise in order to preserve the utility of the data.

John Chang
Croydon Health Services – NHS Trust
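Merging clinical databases that share no agreed standard dataset, as this abstract describes, usually starts with mapping each source's field names onto a common schema and making gaps explicit. The sketch below illustrates that step only; the source names, field names and records are hypothetical, and the algorithms AEGLE actually uses are not described in the abstract.

```python
from typing import Dict, List, Optional

# Hypothetical per-source mappings from local field names to a common schema.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "clinic_a": {"id": "patient_id", "hba1c_pct": "hba1c", "bmi": "bmi"},
    "clinic_b": {"pid": "patient_id", "HbA1c": "hba1c", "BMI": "bmi"},
}
STANDARD_FIELDS = ("patient_id", "hba1c", "bmi")


def harmonise(record: Dict[str, object], source: str) -> Dict[str, Optional[object]]:
    """Rename fields to the common schema; missing or empty values become None."""
    mapping = FIELD_MAPS[source]
    out: Dict[str, Optional[object]] = {name: None for name in STANDARD_FIELDS}
    for raw_name, value in record.items():
        std_name = mapping.get(raw_name)
        if std_name is not None and value not in ("", None):
            out[std_name] = value
    out["source"] = source  # keep provenance so errors can be traced back
    return out


def merge(databases: Dict[str, List[Dict[str, object]]]) -> List[Dict[str, Optional[object]]]:
    """Concatenate harmonised records from every source database."""
    merged = []
    for source, records in databases.items():
        merged.extend(harmonise(record, source) for record in records)
    return merged


databases = {
    "clinic_a": [{"id": "A1", "hba1c_pct": 7.2, "bmi": ""}],  # BMI not recorded
    "clinic_b": [{"pid": "B1", "HbA1c": 6.8, "BMI": 31.0}],
}
merged = merge(databases)
```

Representing gaps as explicit `None` values, instead of dropping the record, leaves the decision on imputation or exclusion to a later, clinically informed step.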


Solution for a Clinical Decision Support System for the Intensive Care Unit

PROBLEM:

The Intensive Care Unit (ICU) is, as its name implies, a place where intensive monitoring and treatment are provided to the sickest of patients. All monitoring and life-supporting devices continuously produce a large amount of data. Physicians in the ICU need to process all this information to provide care to the patients. The physician is at a disadvantage, having to deal with more than one patient at the same time, each with more than one problem. Through safe processing of the available data and use of automated algorithms, Clinical Decision Support Systems (CDSS) can assist overwhelmed physicians.

SOLUTION:

The AEGLE project has developed a platform which enables safe patient data handling. The AEGLE ICU-CDSS user interface allows physicians to benefit from the algorithms and predictive analytics developed within AEGLE to improve patient management in the areas of assisted mechanical ventilation, nutrition and patient deterioration. Specifically, the ICU-CDSS permits the analysis of ineffective effort monitoring data, identifies the clinically significant Ineffective Efforts Events, and provides information on their characteristics and an analysis of preceding events, to facilitate targeted interventions. Additionally, the ICU-CDSS analyzes data from commercially available ventilators and permits the computation of driving pressure, as well as the identification and prediction of sustained high driving pressure. Moreover, the ICU-CDSS analyzes the patient’s electronic medical record data and provides information and alerts for nutrition deficits based on both physician-selected and algorithm-estimated thresholds, on an 8-hour and 24-hour basis. Finally, the ICU-CDSS provides alerts for deteriorating trends based on the patient’s electronic medical record data.

SPECIAL CHALLENGE:

Best practices for data security, anonymization and privacy have been applied throughout the platform development. The applied algorithms can be further explored with the AEGLE ICU-R&D user interface, and alert thresholds can be modified to best match the needs of a specific patient population.

Katerina Vaporidi
Department of Intensive Care Medicine, University of Crete, School of Medicine, and Intensive Care Unit, University Hospital of Heraklio, Crete, Greece
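The driving pressure computation mentioned in this abstract can be sketched under the usual clinical definition, driving pressure = plateau pressure minus PEEP. The abstract does not state which threshold or duration the ICU-CDSS uses to flag sustained high driving pressure, so the threshold of 15 cmH2O, the minimum run length and the sample values below are assumptions made only for the example.

```python
from typing import List, Sequence, Tuple


def driving_pressure(plateau: float, peep: float) -> float:
    """Driving pressure in cmH2O: plateau pressure minus PEEP."""
    return plateau - peep


def sustained_high_periods(values: Sequence[float], threshold: float,
                           min_length: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of runs where every
    sample exceeds `threshold` for at least `min_length` consecutive samples."""
    periods: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(values):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                periods.append((start, i))
            start = None
    # Close a run that is still open at the end of the recording.
    if start is not None and len(values) - start >= min_length:
        periods.append((start, len(values)))
    return periods


# Hypothetical ventilator samples (cmH2O), one per measurement interval.
plateau = [22, 26, 27, 28, 24, 27]
peep = [10, 10, 10, 10, 10, 10]
dp = [driving_pressure(p, e) for p, e in zip(plateau, peep)]
alerts = sustained_high_periods(dp, threshold=15, min_length=3)
```

Requiring a minimum run length suppresses alerts for isolated spikes, which is one simple way to distinguish "sustained" from transient elevations.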


Solution for a Research and Development User Interface for the Intensive Care Unit

PROBLEM:

In the Intensive Care Unit (ICU), all monitoring and life-supporting devices continuously produce a large amount of data. All this information is used for the delivery of everyday care to the patients, while only part of it is stored in individual patient medical records. Yet it is being increasingly recognized that secondary use of electronic medical record and monitoring data can provide new knowledge and advance our understanding of disease. Exploitation of this large amount of data, using big data analytics, can lead to improved patient stratification and delivery of individualized patient care. Safety of sensitive personal information is always a major concern that has to be adequately addressed for the user interface.

SOLUTION:

The AEGLE project has developed a platform which enables safe patient data storage and handling. The AEGLE ICU-R&D user interface allows physicians to benefit from the analytics and workflows developed within AEGLE to conduct research in the areas of assisted mechanical ventilation, nutrition and patient deterioration. Specifically, the ICU-R&D-UI permits the analysis of ineffective effort monitoring data, identification and feature extraction of Ineffective Efforts Events, and analysis of ventilation variables in the preceding time period, to investigate causality. Additionally, the ICU-R&D-UI provides analytics for data from commercially available ventilators, alone and in combination with medical record and monitoring data, and permits the computation of driving pressure and the prediction of periods of sustained high driving pressure. Moreover, the ICU-R&D-UI enables the analysis of electronic medical record data, with a specific focus on nutrition parameters, and permits the investigation of the multiparametric medical records.

SPECIAL CHALLENGE:

Best practices for data security, anonymization and privacy have been applied throughout the platform development. The existing algorithms and workflows can be further enriched by uploading custom code and modified to best match the needs of the individual researcher.

Katerina Vaporidi
Department of Intensive Care Medicine, University of Crete, School of Medicine, and Intensive Care Unit, University Hospital of Heraklio, Crete, Greece
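Analysing the ventilation variables that precede each Ineffective Efforts Event, as this abstract describes, amounts to cutting a fixed-length window out of the signal just before every event. The sketch below shows that slicing step only; the window length, the sample values and the event positions are hypothetical, and it is not the AEGLE workflow itself.

```python
from typing import Dict, List, Sequence


def preceding_windows(samples: Sequence[float], event_indices: Sequence[int],
                      window: int) -> Dict[int, List[float]]:
    """For each event index, return up to `window` samples immediately before it."""
    return {
        index: list(samples[max(0, index - window):index])
        for index in event_indices
    }


# Hypothetical ventilation variable sampled at regular intervals,
# with events detected at sample positions 2 and 5.
signal = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
windows = preceding_windows(signal, event_indices=[2, 5], window=3)
```

Windows near the start of a recording are simply shorter, so downstream feature extraction should check the window length before comparing events.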



Other Health Big Data Projects and Initiatives Abstracts

Index

A Behavioral Informatics Platform for Health Problems Prevention through Personalized Data Collection and Social Comparison Feedback
An e-Health platform for coronary artery disease management: The SMARTool H2020 project
Bayesian Inference about Optimal Treatment Regimens using Observational Data: An Application to Head and Neck Carcinoma Data collected in BD2DECIDE
Big Data in Psychiatry: Collaborating in a compute visits data model
CrowdHEALTH: Collective wisdom driving public health policies (2017-2020)
ELIXIR Platform
Establishing a linked European Cohort of Children with Congenital Anomalies
European Medical Information Framework
Predicting Frailty Condition in Elderly Using Multi-Dimensional Socio-Clinical Databases
Proposal for Demonstrating Results of Big Data initiatives in Healthcare


A Behavioral Informatics Platform for Health Problems Prevention through Personalized Data Collection and Social Comparison Feedback

PROBLEM:

The P4 (predictive, preventive, personalized and participatory) medicine paradigm is slowly transforming healthcare towards sustainability. Concerning prevention and personalization, several personal behavioral patterns have been associated with potentially preventable health problems (e.g., excessive snacking with obesity, tobacco smoking with COPD). The emerging field of Behavioral Informatics (BI) offers ICT solutions for monitoring and modifying such harmful behavioral patterns. Evidently, the monitoring alone of a range of behaviors for a population of individuals in their daily life is a big data-scale task.

SOLUTION:

Focusing on behaviors with proven associations to health problems, the present research is developing a novel BI platform for the day-by-day collection and analysis of a variety of behavioral data from a large number of healthy individuals. Data collection is performed through a newly developed smartphone app that is made freely available; the collected data are sent to the platform back-end to be stored in a scalable database solution. The proposed approach treats the participating individuals as peers of a “wellbeing social network” and capitalizes on the established behavior change technique of social comparison. To this end, the platform periodically runs a series of descriptive analytics on the ever-increasing collected behavioral data to draw personalized comparisons with other individual peers or even the entire population. Configurable personalized feedback concerning these comparisons is provided to the individuals through the same smartphone app.

SPECIAL CHALLENGE:

The main non-technical challenge that had to be addressed in the design of the introduced platform is the protection of the participants’ identity and personal data. Best practices for data security, anonymization and privacy have been applied throughout the platform development. Moreover, the feedback module has been designed so as to withhold any identifiable information about the peers (real names are not collected, a unique nickname is employed as the peer ID, etc.).

Christos Maramis and Ioanna Chouvarda
Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, Dept of Medicine, Aristotle University of Thessaloniki
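One simple descriptive statistic that could back the social comparison feedback described in this abstract is a percentile rank against the other peers. The abstract does not say which analytics the platform runs, so the function names, the step-count behaviour, the nicknames and the message wording below are all hypothetical.

```python
from typing import Dict, Sequence


def percentile_rank(own: float, peers: Sequence[float]) -> float:
    """Percentage of peer values that lie strictly below the individual's value."""
    if not peers:
        raise ValueError("at least one peer value is required")
    below = sum(1 for value in peers if value < own)
    return 100.0 * below / len(peers)


def feedback(nickname: str, daily_steps: Dict[str, float]) -> str:
    """Build a comparison message that refers to peers only in aggregate."""
    own = daily_steps[nickname]
    peers = [value for name, value in daily_steps.items() if name != nickname]
    rank = percentile_rank(own, peers)
    return f"{nickname}: you walked more than {rank:.0f}% of your peers."


# Peers are keyed by nickname only; no real names are stored.
daily_steps = {
    "blue_fox": 9000,
    "red_owl": 4000,
    "green_elk": 7000,
    "grey_cat": 11000,
    "tan_bee": 6000,
}
message = feedback("blue_fox", daily_steps)
```

Reporting only an aggregate rank means the message reveals nothing about any single peer, which fits the privacy constraint stated in the abstract.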


An e-Health platform for coronary artery disease management: The SMARTool H2020 project

PROBLEM: The clinical manifestations of coronary artery disease (CAD) are triggered by local processes (the high risk plaque) in the context of a systemic disease (the high risk patient). Multiple factors contribute to plaque formation / progression / complication, by a complex interaction between biological and mechanical components, and several artery-specific or patient-specific factors have been proposed as diagnostic and/or prognostic biomarkers of atherosclerosis (ATS) formation and progression. The identification of subjects at increased risk for coronary ATS progression and related clinical events is still far from optimal. Coronary imaging and, in particular, non-invasive coronary computed tomography angiography (CTA) has been shown to outperform non-imaging conventional risk factors, clinical risk scores as well as the majority of molecular biomarkers currently adopted in clinical practice. Although processing CTA by advanced analyses, such as Computational Fluid Dynamics (CFD), allows further detailed examination of local morphological and biomechanical characteristics able to assist a clinical decision support system (CDSS), a comprehensive site-specific and patient-specific platform is not currently available as a unique tool for clinical decision support in stratification, diagnosis, prediction and treatment through a personalized approach.

F. Barbon (7*), P. Stofella (7), A. Sakellarios (2), A.J.H.A. Scholte (3), D. Neglia (4), R. Buechel (5), J. Knuuti (6), J. Correia (8), S. Meucci, M. Schuette (10), O. Parodi (1), M. Rial (1), S. Rocchiccioli (1), D. I. Fotiadis (2), G. Pelosi (1) on behalf of SMARTool Study Investigators

(1) Institute of Clinical Physiology CNR, Pisa, Via Moruzzi 1 56124, Italy
(2) Institute of Molecular Biology and Biotechnology, Dept. of Biomedical Research Institute – FORTH, University Campus of Ioannina, GR 45110, Ioannina, Greece
(3) Academisch Ziekenhuis Leiden, Albinusdreef 2, Leiden 2333 ZA, Netherlands
(4) Fondazione Toscana Gabriele Monasterio, Pisa, Via Moruzzi 1 56124, Italy
(5) Universitaet Zuerich, Raemistrasse 71, ZURICH 8006, Switzerland, CH233525
(6) Turun Yliopisto, Yliopistonmaki, Turun Yliopisto 20014, Finland
(7) Exprivia S.p.A., Via Alcide Degasperi 77, Trento 38123, Italy
(8) Biotronics 3D Limited, Park Place 1, London E14 4BE, United Kingdom
(9) Micronit Microtechnologies BV, Enschede, The Netherlands
(10) Alacris Theranostics GmbH, Max-Planck-Strasse 3, D-12489 Berlin

* Correspondence: fabio.barbon@exprivia.it Exprivia S.p.A., Via Alcide Degasperi 77, Trento 38123, Italy.



SPECIAL CHALLENGE:

- Cost-effectiveness

The PIM risk stratifier has a very good negative predictive value (98.7%), so it can act as a solid gatekeeper to expensive imaging tests such as CTA, avoiding unnecessary radiation exposure. On the other hand, CTA-based virtual fractional flow reserve computation (SmartFFR) is a gatekeeper to expensive and potentially harmful CathLab diagnostic procedures, acting as a low-risk and low-cost substitute for invasive FFR, according to an estimated FFR/ICA ratio of <4.0% and an estimated saving of up to 90% of the financial cost.

A 5-year forecasting business case has been developed, taking into consideration the platform's fixed/variable costs and a revenue simulation model, and resulting in a break-even point of around 10K cases/year.

- Clinical Usability

The PIM and PRM components of the SMARTool CDSS show acceptable performance in terms of CAD risk stratification and invasive treatment decision. Preliminary results of PIM demonstrate accuracy, specificity and sensitivity of 85.1%, 98.7% and 44%, respectively, using a support vector machine as classifier. SmartFFR has been compared to invasively measured FFR with a correlation coefficient of 0.90, while plaque growth has a 75% prediction accuracy in identifying regions prone to plaque formation.

Moreover, growing health data availability and the evidence-based medicine paradigm itself are posing a new information overload challenge for clinicians. The SMARTool platform aims at easing CAD-related clinical decisions and interventions at the point of care by providing a data-driven, evidence-based Clinical Decision Support System (CDSS) of tools. The CDSS includes an orchestration component that aims at gluing this set of tools into a visual CAD staging for the clinical practitioner through a CAD management workflow.

- Acknowledgements:

This project has received funding from the EU H2020 research and innovation programme under grant agreement No 689068.
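The abstract quotes accuracy, sensitivity, specificity and negative predictive value for PIM. These are distinct quantities derived from the same confusion matrix, and the sketch below shows how each is computed. The counts are hypothetical and are not the SMARTool study data.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of diseased cases detected
        "specificity": tn / (tn + fp),  # share of healthy cases cleared
        "npv": tn / (tn + fn),          # share of negative results that are correct
    }


# Hypothetical counts for 100 patients, chosen only to illustrate the formulas.
metrics = classification_metrics(tp=12, fp=4, tn=76, fn=8)
```

A gatekeeper test depends mainly on its negative predictive value, because a negative result is what allows a patient to skip the more expensive examination; unlike sensitivity and specificity, that value also changes with disease prevalence in the tested population.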


Bayesian Inference about Optimal Treatment Regimens using Observational Data: An Application to Head and Neck Carcinoma Data collected in BD2DECIDE

Thomas Klausch, Peter van de Ven, Johannes Berkhof
Department of Epidemiology and Biostatistics, VU University Medical Centre Amsterdam, The Netherlands

PROBLEM:

Using patient characteristics for personalizing treatment decisions is an increasingly important objective in evidence-based medicine and statistics. The potential of applying big data, both in terms of number of patients and patient characteristics, is large. However, little is known about how to identify sub-sets of patients that may benefit from a treatment more strongly than others.

SOLUTION:

We present new methodology for learning optimal treatment regimens from observational data, which allows (a) determining an optimal assignment rule for patients while penalizing suboptimal decisions, (b) estimating the expected patient and population benefit under this optimal regimen, and (c) quantifying the degree of certainty about the correctness of treatment decisions and benefit estimates.

SPECIAL CHALLENGE:

A common problem in health data collected from patient files and electronic registries is that patient assignment to treatment is not random, but tailored to the needs of the patient. This problem, called confounding, requires that all confounding variables are available for modelling. Using simulation, we assessed the impact of different strengths of confounding on the accuracy of the optimal treatment assignment using a learning technique called Bayesian Additive Regression Trees. An advantage of the technique is that it can handle large numbers of predictors and unknown functional forms, both often encountered in big data settings. The technique showed good frequentist coverage of the true optimal population benefit and good treatment assignment properties.

APPLICATION:

We applied the method to high-resolution larynx carcinoma data collected within the European BD2DECIDE consortium and containing all patients treated in three EU countries in 2009-2012. We determined an optimal rule for assignment of patients to surgery or radiotherapy maximizing 2-year survival. The observed survival probability was 0.716, which under optimal assignment increased to 0.780 (95% credible interval: 0.734, 0.831). In 34.8% of cases the optimal treatment assignment was different from the treatment actually received.
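As a rough illustration of steps (a) to (c), the sketch below derives an assignment rule, a per-patient certainty and a credible interval for population survival from posterior draws. It is not the authors' implementation: the draws are synthetic stand-ins for the output of a fitted model such as Bayesian Additive Regression Trees, the function names `synthetic_draws` and `optimal_regimen` are hypothetical, and the penalty for suboptimal decisions and the handling of confounding are left out.

```python
import random
import statistics


def synthetic_draws(n_draws=200, n_patients=50, seed=0):
    """Synthetic stand-in for posterior draws (an assumption, not study data).

    Returns draws[s][i][t]: draw s of the 2-year survival probability of
    patient i under treatment t (0 = surgery, 1 = radiotherapy).
    """
    rng = random.Random(seed)
    base = [(rng.uniform(0.5, 0.9), rng.uniform(0.5, 0.9)) for _ in range(n_patients)]
    return [
        [[min(1.0, max(0.0, p + rng.gauss(0, 0.05))) for p in base[i]]
         for i in range(n_patients)]
        for _ in range(n_draws)
    ]


def optimal_regimen(draws):
    n_draws, n_patients = len(draws), len(draws[0])
    # Posterior mean survival per patient and treatment.
    mean = [
        [statistics.fmean(draws[s][i][t] for s in range(n_draws)) for t in (0, 1)]
        for i in range(n_patients)
    ]
    # (a) Rule: each patient gets the treatment with the higher posterior mean.
    rule = [0 if m[0] >= m[1] else 1 for m in mean]
    # (c) Certainty: share of draws in which the chosen treatment is at least as good.
    certainty = [
        sum(draws[s][i][rule[i]] >= draws[s][i][1 - rule[i]] for s in range(n_draws)) / n_draws
        for i in range(n_patients)
    ]
    # (b) Population survival under the rule, one value per posterior draw.
    pop = sorted(
        statistics.fmean(draws[s][i][rule[i]] for i in range(n_patients))
        for s in range(n_draws)
    )
    # Simple empirical 95% interval from the sorted draws.
    interval = (pop[int(0.025 * (n_draws - 1))], pop[int(0.975 * (n_draws - 1))])
    return rule, certainty, statistics.fmean(pop), interval


draws = synthetic_draws()
rule, certainty, benefit, interval = optimal_regimen(draws)
```

Comparing `rule` with the treatment each patient actually received would give the share of patients whose optimal assignment differs from practice, the kind of figure reported in the application.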



Big Data in Psychiatry: Collaborating in a compute visits data model


Drs. Karin Hagoort and Prof. Dr. Floortje Scheepers
UMC Utrecht

PROBLEM:

Psychiatric disorders are complex in aetiology, dynamic over time and influenced by the interaction between the individual patient and their environment. Big data deals with this complexity by using real-life data in real time from a real patient population. As such, big data provides opportunities to better understand the complexity and improve mental health care for individual patients. However, the challenge is how to benefit from these opportunities in daily clinical practice. For this, a transformation to a data-driven organization is needed, with close collaboration between data experts and health care professionals.

SOLUTION:

The UMC Utrecht Big Data Psychiatry project focuses on stimulating a data-driven organization. By building flexible decision support systems for professionals, based on clinical data from the electronic patient file, professionals gain direct insight into daily practice and can benefit from it. We use anonymised data from real-life electronic patient files. After preparation, these datasets are analysed using big data analytics, like predictive modelling. In frequent agile scrum sessions with a small team of professionals and patients, value and knowledge of different domains (data and health) are combined and added to the data, whereupon further analytic steps can be initiated. This creates models that make sense for daily practice. Two examples are predictive models on the effects of antidepressants and on aggressive behavior during admission.

SPECIAL CHALLENGE:

In order to validate predictive models and to disseminate the knowledge of a big data approach in Psychiatry, collaboration with other organizations is essential. However, privacy and hardware issues make it difficult to connect data sources. This is why the compute visits data consortium with Antes GGZ, GGZ Eindhoven, Antonius hospital Psychiatry and UMC Utrecht Psychiatry was built. In a carousel design, where training, validating and testing theory can all take place at different organizations, the computing ‘visits’ the data sequentially. Algorithms, output and software are exchanged in the consortium and results can be replicated and validated without sharing the patient data.
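The carousel idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the consortium's software: the `Site` class, the one-parameter threshold model, the site names and the toy records are all hypothetical. What it aims to show is that only jobs and their outputs (a model parameter, an accuracy figure) move between organizations, while the records stay where they are.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    records: list  # (score, outcome) pairs; these never leave the site

    def run(self, job):
        """Run a visiting job on the local data and return only its output."""
        return job(self.records)


def train_threshold(records):
    # Toy one-parameter model: midpoint between the class means of the score.
    pos = [x for x, y in records if y == 1]
    neg = [x for x, y in records if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy_job(threshold):
    def job(records):
        hits = sum((x > threshold) == (y == 1) for x, y in records)
        return hits / len(records)
    return job


def carousel(sites):
    """Rotate roles: each site trains once, the next two validate and test."""
    results = []
    n = len(sites)
    for k in range(n):
        trainer, validator, tester = sites[k], sites[(k + 1) % n], sites[(k + 2) % n]
        threshold = trainer.run(train_threshold)  # only the parameter leaves the site
        results.append({
            "trained_at": trainer.name,
            "threshold": threshold,
            "validation_accuracy": validator.run(accuracy_job(threshold)),
            "test_accuracy": tester.run(accuracy_job(threshold)),
        })
    return results


# Hypothetical toy data for three sites.
sites = [
    Site("site_a", [(0.2, 0), (0.8, 1), (0.3, 0), (0.9, 1)]),
    Site("site_b", [(0.1, 0), (0.7, 1)]),
    Site("site_c", [(0.35, 0), (0.65, 1)]),
]
results = carousel(sites)
```

Because every site takes each role once, a model trained at one organization is validated and tested on data it has never seen, which is the replication the abstract describes.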



CrowdHEALTH: Collective wisdom driving public health policies (2017-2020)


Thomas Klausch, Peter van de Ven, Johannes Berkhof Department of Epidemiology and Biostatistics, VU University Medical Centre Amsterdam, The Netherlands

PROBLEM: Rather than focusing on a specific disease or health problem, CrowdHEALTH intends to integrate high volumes of heterogeneous health-related data from multiple sources with the aim of supporting policy-making decisions.

SOLUTION: CrowdHEALTH is delivering a secure ICT platform to collect and aggregate high volumes of health data from multiple information sources in Europe. CrowdHEALTH also proposes the evolution of patient health records towards citizen Holistic Health Records (HHRs), which capture clinical, social and contextual factors and are federated into “Social HHRs”. An early prototype integrating several technical components will be available and presented mainly by means of screenshots.



SPECIAL CHALLENGE: The first main challenge is related to the different sources of health-related data, which include data from sensors, social and contextual data. Another issue is related to the privacy of data and anonymization, implying that it is not possible to go back to the source. Accessibility of datasets is an important issue, depending on the organization that stores the data and the country regulations.

- Legal/ethical issues concerning data:

The project faces the ethical challenge related to subjecting individuals and populations to policies and interventions based on non-transparent data sources, processing and analyses. Although the project is developing components to clean data and ensure quality and reliability, it is not fully clear who will be accountable for the quality of the data input, or the algorithms applied.

- Dataset standardisation:

The project is working with several data standards; in particular, the HHR model is mainly based on the emerging FHIR standard, which is being extended to add information for aspects not yet covered.

- Measurable impact (i.e. cost-effectiveness) of your solution:

The project is developing KPIs and policy evaluation components to be able to assess if policies have been effective or not.

- Dataset standardisation:

CrowdHEALTH is developing a Policy Development Toolkit that should allow policy makers to interact with big data without having to deal with complicated calculations, formulae or unreadable data. The interface will allow the selection of a few parameters and meaningful visualizations.

- Contact: Lydia.montandon@atos.net CrowdHEALTH project coordinator



ELIXIR Platform

ELIXIR - http://elixir-europe.org - is an intergovernmental organisation that brings together life science resources from across Europe. These resources include databases, software tools, training materials, cloud storage and supercomputers. The goal of ELIXIR is to coordinate these resources so that they form a single infrastructure. This infrastructure makes it easier for scientists to find and share data, exchange expertise, and agree on best practices. Ultimately, it will help them gain new insights into how living organisms work. ELIXIR includes 21 members and over 180 research organisations. It was founded in 2014, and is currently implementing its first five-year scientific programme.

We will present an overview of the platforms and focus on the Compute platform and its application to real-world use cases, including rare disease and human data, based on collaborations including UK National Health Service Trust partners. The Rare Disease Use Case extends and generalises the system of access authorisation and high-volume secure data transfer developed within the EGA project. The goal of the Use Case is to create a federated infrastructure that will enable researchers to discover, access and analyse different rare disease repositories across Europe. It is doing this in partnership with other European infrastructure projects, namely RD-CONNECT, BBMRI-ERIC and E-Rare.

We will include an introductory 3-minute video presentation and, in the talk, highlight how we are addressing the Big Data challenges of legal/ethical issues via standardised authentication and dataset approaches. The impact of ELIXIR to date is also captured and analysed in a final 3-minute video presentation.

Dr Jonathan A. Tedds Compute Platform Coordinator, ELIXIR Europe Hub, Cambridge, UK & Hon Research Fellow: Health And Research Data Informatics


Establishing a linked European Cohort of Children with Congenital Anomalies

Prof J K Morris, Scientific Coordinator; Dr Ester Garne, Clinical Coordinator; Dr Maria Loane, Data Coordinator

Background:

- Congenital anomalies (or birth defects) are a major cause of infant mortality, childhood morbidity and long-term disability.

- Over 130,000 children born in Europe every year will have a congenital anomaly.

- EUROlinkCAT will use the existing EUROCAT infrastructure to support 22 registries in 14 European countries to link their congenital anomaly data to mortality, hospital discharge, prescription and educational databases.

Aims:

- To investigate the health and educational outcomes of children with congenital anomalies for the first 10 years of their lives.

- To facilitate the development of a more reciprocal relationship between families with children with congenital anomalies, health and social care professionals and researchers by developing an online forum: “ConnectEpeople”

[Figure: linked data domains: Health, Survival, Education, Prescriptions]

Participating registries: Paris, France; Ile de la Réunion, France; Northern, Netherlands; OMNI-NET, Ukraine; Saxony-Anhalt, Germany; Zagreb, Croatia; Antwerp, Belgium; Basque, Spain; Tuscany, Italy; Emilia Romagna, Italy; Finland; South, Portugal; WANDA, UK; SWCAR, UK; Odense, Denmark; CAROBB, UK; EMSYCAR, UK; NorCAS, UK; Poland; Norway; Malta; CARIS, UK

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 733001. Start Date: 1 Jan 2017. Duration: 5 years

http://www.eurolinkcat.eu/

enquiries@eurolinkcat.eu



Objectives:

- To establish a European network of standardised datasets containing information on the mortality, health, educational achievements and needs of children with congenital anomalies born from 1995-2014 up until 10 years of age.

- To provide an e-platform “ConnectEpeople” for public and professional engagement in setting and disseminating relevant research priorities and their outcomes, focusing on four specific anomalies:

  - Heart surgery in children

  - Cleft lip

  - Down syndrome

  - Spina Bifida

- To expand the knowledge on the survival, health, disease determinants and clinical course of children according to their specific anomaly.

- To investigate socio-economic health inequalities.

- To evaluate the costs of hospitalisation during the first five years of life for children with a congenital anomaly.

- To expand the knowledge on the educational achievements and needs of children with specific congenital anomalies.

- To evaluate the accuracy of existing electronic health care databases and make recommendations on their use and on improving their accuracy.

- To engage with the relevant international/national/regional health authorities by establishing an Action Advisory Panel to ensure that relevant findings are implemented and translated into health policy.

- To enable the established infrastructure and methodology for this unique research platform to be available for local research and future European-wide analyses beyond the end of the project.

Work Packages:

- Work Package (WP) 1: Co-ordination and management

- WP2: Creation Central Results Registry

- WP3: Mortality

- WP4: Morbidity

- WP5: Education

- WP6: Data quality

- WP7: ConnectEpeople

- WP8: Dissemination



European Medical Information Framework


EMIF, the European Medical Information Framework (2013-2018), is a multidisciplinary research and development project. Its objectives have been to develop and implement a robust and scalable architecture and tools to connect health data from a variety of sources across Europe to facilitate large-scale clinical and life sciences research. EMIF is funded through the Innovative Medicines Initiative (IMI), a public-private research and development partnership between the European Commission and the European Federation of Pharmaceutical Industries and Associations (EFPIA).

The EMIF Platform provides an efficient integrated information framework for this large-scale re-use of health and life sciences data. The Platform enables research users and data custodians to collaborate throughout the research lifecycle from data discovery to data sharing and data analysis. It enables approved users to securely analyse multiple, diverse data sources via a single portal, thereby mediating research opportunities across a large quantity of research data. The EMIF project includes two specific research topics that have helped to guide the development of the Platform: the identification and validation of protective and precipitating factors for conversion to Alzheimer’s Disease, and predictors of metabolic complications of obesity.

There are many challenges associated with federating heterogeneous data sources, and with enabling research users to discover and assess the suitability of different data sources to their research needs. It is important to establish a trustworthy governance framework to encourage data sources to share their data with external researchers, and to protect the privacy of the data subjects represented in the data.

This demonstration will explain the EMIF concept and outline the way in which it supports the data discovery, data assessment and data use lifecycle. Some of the EMIF tools will be presented. The EMIF Code of Practice will be summarised, and discussed with the audience.

Dipak Kalra University College London



Predicting Frailty Condition in Elderly Using Multi-Dimensional Socio-Clinical Databases


Danilo Montesi
Alma Mater Studiorum Università di Bologna

PROBLEM:

In the last decades life expectancy has increased globally, leading to various age-related issues in almost all developed countries. Frailty affects elderly people who are experiencing daily life limitations due to cognitive and functional impairments, and represents a remarkable burden for national health systems.

SOLUTION:

In this presentation, we show our work on two different predictive models for frailty by exploiting 12 socio-clinical databases. Emergency hospitalization or all-cause mortality within a year are used as surrogates of frailty. The first model is able to assign a frailty risk score to each subject older than 65 years, identifying 5 different classes for tailor-made interventions. The second prediction model assigns a worsening risk score to each subject in the first non-frail class, namely the probability of moving to a higher frailty class within the year.

SPECIAL CHALLENGE:

The retrospective cohort studies are conducted on the whole elderly population of the Municipality of Bologna (380,181 residents in 2010) by using 12 different data sources that include information on clinical and socio-economic aspects and health service resources utilization. The over 65 years old category (25.93% of the overall population) represents the baseline cohort. We use a six-year (2011-2016) and a four-year (2013-2016) observation period to assess the predictive ability of the frailty risk and worsening risk model, respectively. The strengths of our study include the possibility to guide appropriate planning of health resource utilization and develop patient-oriented preventive strategies. The use of routinely collected socio-clinical data reduced the potential risk of missing data and allowed us to collect a wide variety of predictor variables including clinical and socio-economic aspects.
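The step from a risk score to five classes can be pictured as a simple binning of the predicted probability. The sketch below is only an illustration: the cut-points in `CUTS` are hypothetical (the abstract does not report them), and the names `frailty_class` and `worsening_candidates` are introduced here for the example.

```python
from bisect import bisect_right

# Hypothetical cut-points on the predicted one-year risk (not from the abstract).
CUTS = (0.05, 0.15, 0.30, 0.50)


def frailty_class(risk):
    """Map a predicted risk in [0, 1] to class 1 (non-frail) .. 5 (highest risk)."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    # bisect_right counts how many cut-points are <= risk.
    return bisect_right(CUTS, risk) + 1


def worsening_candidates(risks):
    """Subjects in class 1, the group to which the worsening model is applied."""
    return [subject for subject, risk in risks.items() if frailty_class(risk) == 1]
```

For example, with these assumed cut-points a subject with predicted risk 0.2 falls in class 3, while only subjects with risk below 0.05 would be passed on to the worsening risk model.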



Proposal for Demonstrating Results of Big Data initiatives in Healthcare


Clem McDonald, Seo Baik, Fabricio Kury, Jason Lau

PROBLEM:

Single observational studies claiming surprising benefits or harms from specific medical treatments tend to influence practices despite lack of independent confirmation.

SOLUTION:

Confirm or refute such reports by analyzing Medicare’s “Big Data” with Cox survival analyses. Medicare is huge, with 28 million fee-for-service patients, and carries diagnoses, procedures, medications, demographics, vital status, and more.

We are following up on reports that antiandrogen treatment of prostate cancer causes Alzheimer’s disease (1*), use of proton pump inhibitors increases death, short-term use of quinolone antibiotics causes/aggravates aortic aneurysms, metformin prolongs life, and more. Our early answers are no, no, yes, and no, respectively.

CHALLENGES:

CMS solves legal issues: it reviews the project, provides access to linked, de-identified data, allows no patient-level downloads, and only allows analyses to be run inside their machines. The biggest challenge is finding stable answers.

Cox regression is very powerful, and has many implementation choices and traps. We now know some of the choices: use time-varying covariates for all non-constants, use a competing risk analysis when the outcome is not death, use left truncation for late entry, use propensity matched controls (to speed runs), and include wash outs. We found one trap: drugs can look as if they were associated with death if initiated because of ICU admissions. We are still learning the best approach for some issues.

(1*) Risk of Alzheimer’s Disease Among Senior Medicare Beneficiaries Treated With Androgen Deprivation Therapy for Prostate Cancer. Seo Hyon Baik, Fabricio Sampaio Peres Kury, and Clement Joseph McDonald. Journal of Clinical Oncology 2017 35:30, 3401-3409
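One of the choices listed above, propensity matched controls, can be sketched as greedy 1:1 nearest-neighbour matching. This is an illustrative assumption about how such matching might be done, not the authors' procedure: the propensity scores are taken as already estimated, and the function name `match_controls`, the caliper of 0.05 and the patient ids are hypothetical.

```python
def match_controls(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on a precomputed propensity score.

    treated, controls: dicts mapping a patient id to a score in [0, 1].
    Returns {treated_id: control_id}. A treated patient whose nearest unused
    control is further away than the caliper stays unmatched.
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs = {}
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Nearest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs[t_id] = c_id
            del available[c_id]  # match without replacement
    return pairs


# Hypothetical scores for illustration only.
treated = {"t1": 0.30, "t2": 0.70, "t3": 0.95}
controls = {"c1": 0.31, "c2": 0.68, "c3": 0.10, "c4": 0.33}
pairs = match_controls(treated, controls)
```

Restricting the survival analysis to the matched pairs shrinks the control group to patients comparable with the treated ones, which is also what makes the runs faster on a dataset of this size.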



Partners of the AEGLE Consortium

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644906
