
European Biotechnology News Science & Industry

April 2014

Big Data & IP in Life Sciences SPECIAL

Comprehensive industry-specific advice in Life Sciences. Dentons’ Life Sciences experts in Germany advise on project-related transactions, or alternatively as an “outsourced legal department”, with deep industry-specific knowledge, creativity and years of expertise to ensure their clients’ success. Whether licensing contract deals or regulatory issues relating to the drug advertising law – as part of a team of over 80 consultants in Germany, Dentons provides companies in the areas of pharmaceuticals, diagnostics, biotechnology and medical devices with future-oriented and interdisciplinary legal advice. Dentons is a new global law firm with more than 2,500 lawyers and professionals in 79 locations in 52 countries offering creative, actionable business and legal solutions. Created by the combination of Salans LLP, Fraser Milner Casgrain LLP (FMC) and SNR Denton, Dentons is built on the solid foundations of three highly regarded law firms. Your contact for Life Sciences: Peter Homberg, T: +49 69 45 00 12 311, Dentons Frankfurt, Pollux, Platz der Einheit 2, 60327 Frankfurt am Main

Dentons Berlin Markgrafenstraße 33 10117 Berlin

Meet Dentons. The new global law firm created by Salans, FMC and SNR Denton.

Nº 4 | Volume 13 | 2014



Big Data & IP Intro

Big Data & smart data

There has been a huge rise in data collection related to medicine in the ten years since the Human Genome Project was declared finished. Big datasets that link electronic medical records with patient-specific diagnostic data or disease-associated genetic risk factors (biomarkers) now promise to improve the efficacy of patient pre-selection for clinical trials and therapy. But the growing flood of information in the life sciences is also presenting huge challenges when it comes to secured storage, processing, data visualisation and analysis. In clinical medicine, especially oncology, “Big Data” is expected to play an increasingly important role in identifying causality of symptoms, predicting hazards of disease incidence or recurrence, or in improving quality of care. But is the investment in projects like the “Big Data to Knowledge” (BD2K) initiative, the HANA Oncolyzer project or the free-access digital cancer treatment library “Eviti” worth the money? Or is the hype surrounding Big Data masking a field with big promise, but little real current value?

IBM’s Watson Foundation presenting new visualisation tools for Big Data at CEBIT.

At BIOCOM’s recent 7th IP conference “Big Data – Big Drugs” in Berlin, Sachin Soni – the Director of Equity Research (Life Sciences) at Kempen & Co Merchant Bank – made it clear that integrating Big Data could have a massive impact on healthcare. He showed data suggesting that improvements in pharmaceutical R&D productivity alone could create value of US$40-70bn annually. Evidence-based care could add another US$90-110bn – and that seems to be just the tip of the iceberg. Linking existing information on drug profiles with a growing body of omics data and patient records can, for example, identify new applications for drugs that have already been approved. Integrating sequencing and outcome data can help determine new drug targets and compounds, according to Dr. Michele Wales from InHouse Patent Counsel LLC. That’s shown by drugs such as Benlysta, Raxibacumab, Albiglutide or Darapladib, which were derived from Human Genome Sciences’ sequence databases. Other speakers at the conference saw further potential in areas like identifying drug responders, or finding reasons for non-compliance with therapies.

Industry is already aggressively embracing new Big Data approaches, according to a survey of 70 life sciences organisations issued in March by the IMS Institute for Healthcare Informatics. “Riding the Information Technology Wave in Life Sciences: Priorities, Pitfalls and Promise” showed that cloud-based business intelligence applications and storage in non-relational parallel-processing databases, embedded analytics and systems integration have the potential to drive transformational change over the next three years – both in overall healthcare system efficiency and the efficacy of treatments. The study’s authors believe that the availability and adoption of secure, healthcare-specific tools and services are key to accelerating opportunity and deriving greater value from new, expanded sources of health information to optimise patient outcomes. Cost pressure is also driving solutions that promise to improve the efficiency of pharmaceutical development. According to IMS Health, Big Pharma will need to reduce combined operating costs by US$36bn annually through 2017 to maintain operating margins and current levels of R&D activities.

Challenges and bottlenecks

As competition among life sciences companies intensifies – and the mix of new medicines skews toward those with relatively small target patient populations – analytic systems that help bring medicines to the right patients and their physicians have become essential. The implementation of decision support algorithms can accelerate the improvement of health outcomes, while also bringing more efficiency to the entire health system. Nearly 60% of survey respondents additionally rated patient apps as extremely or very important to addressing commercial challenges, while 69% similarly rated investments in physician apps.

However, Big Data itself can be highly diverse and uninformative without preprocessing. Current limitations include selection bias, sample size, missing values, accuracy, completeness and the nature of reporting resources. Although a few first successes are surfacing, many challenges remain. “Data protection is key,” says Andrew Litt from Dell. “Confidence in genome research cannot be undermined by inadequate data protection.”




Sponsored Article In Focus

Integrating Big Data

The analysis and interpretation of massive amounts of biological data poses a significant bottleneck for researchers and clinicians seeking to understand and diagnose diseases. Prof. Dr. Reinhard Büttner explains how his laboratory addressed these challenges, and where he sees the value of pre-tailored commercial data processing systems.

? Prof. Büttner, your lab was one of the first in Europe to implement next-generation sequencing (NGS) into routine clinical diagnostics. How do you use this technology today?

! Buettner: We use NGS in cancer diagnostics and initially started with lung cancer. Today, we have established additional tests for melanoma, chronic lymphocytic leukemia (CLL) and gastrointestinal (GI) tumors. However, with about 3,500 cases a year, lung cancer is still the most frequently performed test in our lab, followed by GI, melanoma and CLL. In total, we perform almost 5,000 NGS-based tests annually.

? How long did it take to set it up?

! Buettner: The initial set-up of the entire workflow to its routine clinical implementation took us about a year. This is because we needed some time to gain experience with the NGS technology and validate our approach. In fact, we went through a very extensive validation phase to ensure the accuracy of our results. But still, we’re not yet where we want to be. I think of this as a process during which we’re continuously implementing new technologies and also new gene sets. It’s really about continuous development, and the current system we use is already our second-version lung cancer test.

Reinhard Buettner studied medicine at the Universities of Mainz, Munich, London, and Cologne, and received his MD at the Pettenkofer Institute for Virology in Munich. After postdoctoral fellowships at the Gene Center Munich and the MD Anderson Cancer Center in Houston, he became a staff pathologist at the University of Regensburg. From 1999–2001 he was a full Professor for Pathology at the RWTH Aachen. In 2001, Buettner was appointed Professor and Chairman for Pathology at the University Hospital Bonn, and since 2011 he has been a Professor and Chairman for Pathology at the University Hospital of Cologne’s Center for Integrated Oncology.

? Are you looking into the entire genome, or a particular set of genes?

! Buettner: We’re working with gene panels that cover a set of around 20-40 genes, depending on the type of cancer. As of today, I don’t think that it makes sense to cover the exome or even entire genome in routine clinical diagnostics. This is because you have only a limited number of potential drugs at hand linked to biomarkers. In addition, it is still quite challenging in terms of sequencing costs and the amount of data you generate.

? How much data do you have to process for one patient?

! Buettner: It’s usually one gigabyte of data per test. This is because we sequence every gene in our panel at a very high coverage, in our case 5,000-fold.

? This must be a significant challenge for your workflow?

! Buettner: I think that we have adapted quite well. When we initially set up the workflow, we developed our own proprietary data analysis pipeline, which first helps us to automatically filter the data to identify relevant mutations based upon different algorithms reflecting our quality standards – for instance, the quality of data or coverage depth. The subsequent clinical interpretation is still manual work. This is the most time-consuming part of the entire analysis process – about a day if you deal with 3-5 mutations – but we found the automatic algorithms we tried to implement weren’t reliable enough.

? Is the entire pipeline based upon a proprietary system?

! Buettner: We ended up with a mix of analytical tools that are freely available, such as the Integrative Genomics Viewer, and our own tools developed in collaboration with our IT department.
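The automatic filtering step described in the interview, in which candidate mutations are kept or discarded according to data quality and coverage depth, can be illustrated with a minimal Python sketch. This is not the Cologne lab's pipeline, which the interview describes as proprietary. The record layout (`VariantCall`), the three thresholds and the example gene calls are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """One candidate mutation (hypothetical record layout)."""
    gene: str
    coverage: int             # number of reads covering the position
    mean_base_quality: float  # Phred-scaled quality of the supporting reads
    allele_frequency: float   # fraction of reads carrying the variant


# Hypothetical quality thresholds; a real laboratory would validate its own.
MIN_COVERAGE = 500
MIN_BASE_QUALITY = 30.0
MIN_ALLELE_FREQUENCY = 0.05


def passes_filters(call: VariantCall) -> bool:
    """Return True only if the call meets every quality criterion."""
    return (
        call.coverage >= MIN_COVERAGE
        and call.mean_base_quality >= MIN_BASE_QUALITY
        and call.allele_frequency >= MIN_ALLELE_FREQUENCY
    )


def filter_calls(calls: list[VariantCall]) -> list[VariantCall]:
    """Keep the calls that should go forward to manual interpretation."""
    return [call for call in calls if passes_filters(call)]


# Made-up example data, not taken from the article.
example_calls = [
    VariantCall("EGFR", 5200, 36.0, 0.21),  # passes all three criteria
    VariantCall("KRAS", 310, 35.0, 0.18),   # coverage too low
    VariantCall("BRAF", 4800, 22.0, 0.30),  # base quality too low
    VariantCall("TP53", 5100, 37.0, 0.01),  # allele frequency too low
]

kept = filter_calls(example_calls)
for call in kept:
    print(f"{call.gene}: coverage {call.coverage}x, "
          f"allele frequency {call.allele_frequency:.0%}")
```

Only the calls that survive this automatic step would be passed on to the manual clinical interpretation that Buettner describes as the most time-consuming part of the analysis.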




? Was it a major cost factor for your lab, or is the investment in bioinformatics rather negligible?

! Buettner: It is a significant cost factor. For instance, we had to exchange all the switches and the connections to the existing IT system because the regular connection was too slow to transfer the amount of data we’re working with. So we had to invest in hardware. But you also have to invest in people who generate your algorithms to make sense out of the data. However, I believe these are the typical starting costs associated with getting into NGS.

? Do you think this approach would also be the way to go for other labs?

! Buettner: We have a very large discovery unit around lung cancer, so it made sense for us to invest in bioinformatics infrastructure and personnel. However, there are also vendors out there like QIAGEN that work on pre-tailored commercial bioinformatics solutions, and if you’re a pure diagnostic lab that wants to do NGS-based tests, I think it’s probably sounder to buy such a solution. In fact, I believe this is ideal for larger diagnostic labs that are focused on routine diagnostics and simply want to have a streamlined analysis pipeline with state-of-the-art technology.

? But some argue that relying too much on integrated systems might harm professional standards …

! Buettner: There is little difference from other areas of modern medicine. Just think of an intensive care unit – you have no chance to check the technical credibility of all the parameters provided by various instruments. We are dependent on technology. This does not mean that you shouldn’t perform regular checks of your workflow. But the idea that someone can visually check the plausibility of gigabytes of data is simply not practicable; this would lead to many more errors. So I rather see this as a purely psychological factor.

? So where do you see the role of the pathologist in this setting?

! Buettner: Applied to cancer, I believe that you need a solution that delivers to the pathologist comprehensive and credible information about all validated mutations and genomic alterations in the sample. But then, the pathologist needs to integrate this information into a comprehensive diagnosis – a conclusive report – which connects the genomic information with data from microscopy and immunochemistry as well as other clinical information about the patient and the particular disease. This is something that you shouldn’t do automatically, in my view.

? Speaking of integrating information, how do you handle the exchange with other institutions or external databases?

! Buettner: Other institutions aren’t directly connected to our genome data. The external institutions we work with – around 100 hospitals and pathologists in Germany – receive a standardized report containing a table that summarizes which genes have been sequenced and at which coverage, the allele frequency and the potential functionality of the mutation, and the potential clinical interpretation in terms of treatment or inclusion in a study.

? How do you ensure that your findings and recommendations are up-to-date?

! Buettner: This is a very critical point. When it comes to the clinical interpretation of mutations, we refer to guidelines that make statements about which genes should be tested for in what tumor entities. However, guidelines don’t include lists of drivable mutations, since this knowledge is continuously expanding at a very rapid pace. This can be quite challenging if you find novel or rare mutations. You might have analyzed the tumor according to guidelines, but still don’t know what to tell the patient. So I think that over time, we need a database that connects even novel and rare mutations to clinical outcomes. This brings us to clinical studies. Today, we have a good overview of all studies performed here at the University of Cologne, and we know precisely the inclusion criteria and can direct patients to the studies if necessary. However, it is much more challenging when you start looking for external studies, especially when new compounds are in an early stage of development. This can be extremely time-consuming. So I think that in both cases there is room for improvement.

? Which other challenges do you see in the application of personalized approaches like these?

! Buettner: We are generating tremendous numbers of new biomarkers for all kinds of different cancers. These biomarkers literally flood clinical practice, but I’m still observing some skepticism towards targeted treatment approaches – in Germany and elsewhere. We need a broader understanding that we’re truly entering into a completely new age of medicine – an age of rational oncology that examines the molecular alterations driving a tumor and matches them with valid therapeutic concepts. However, cases in which healthcare professionals ask for a full mutation scan only to decide to stick with standard chemotherapy are still commonplace. This is the wrong path, and one which puts our entire healthcare system at risk. We need to invest in modern diagnostic technologies to determine which patients can benefit from particular treatments and act accordingly. If you try to save money in diagnostics, you will lose this money through imprecise therapies. There is still a lot of work that needs to be done here.

More information: QIAGEN, QIAGEN-Str. 1, 40724 Hilden, Germany


Companion Diagnostics

IP considerations associated with CDx

Ramin Ronny Amirsehhi, Amirsehhi Intellectual Property Law; Martinsried

Many pharmaceutical companies are considering a biomarker strategy for their drugs as advances in diagnostic technologies, the growth of biosimilars, a rising number of expiring patents, and concerns about profit margins drive them to adjust their approach.

Join the European Biotechnology Network! The European Biotechnology Network is dedicated to facilitating co-operation between professionals in biotechnology and the life sciences all over Europe. This non-profit organisation brings research groups, universities, SMEs, large companies and indeed all actors in biotechnology together to build and deliver partnerships. Do you want to know more about the advantages of a (free) membership? Just have a look at our website:

European Biotechnology Network, Avenue de Tervueren 13, 1040 Brussels, Belgium, Tel: +32 2 733 72 37, Fax: +32 2 64 92 989

In the race to discover and develop companion diagnostics successfully, many pharmaceutical companies are partnering with biotech companies that are expert in the appropriate diagnostic or technology. Working with a diagnostic partner to develop a companion diagnostic, which ultimately also requires obtaining market authorisation, complicates the already complex drug development and approval process. An intellectual property (IP) strategy is part of this complex process, and plays a key role in the development and commercialisation of companion diagnostics.

IP strategies for companion diagnostics require a deep understanding of the type of discovery, business objectives, patent eligibility and infringement considerations. While it is beyond the scope of this article to examine all IP considerations for all major jurisdictions, it is important to briefly discuss how patent law is evolving in the US and Europe. In this first installment of a series of articles, we look at types of discovery and US patent eligibility issues.

Type of discovery

The strength of patent protection is most often correlated with the type of discovery. Patent claims directed at compounds are the strongest type of protection. Such protection could be used to exclude others from using the compound or novel biomarker in any type of diagnostic for any type of drug.

In most cases, biomarkers are known proteins. A discovery involves their correlation with a particular disease and drug that is under study. This type of discovery may result in method-of-treatment claims, which can be very valuable – especially if such method claims read on a drug and/or diagnostic label.

Method-of-treatment claims need to be drafted with both the drug label and diagnostic label in mind. The labels can be in general terms or include a specific assay. A claim directed at a specific diagnostic assay or technology would include limitations associated with how the presence of the biomarker is determined. Such claims can be vital in the long run, as they may extend the exclusivity of the drug against generics and biosimilars, since to meet FDA requirements, any generic or biosimilar equivalent would also be required to include reference to the diagnostic assay.

The discovery and patenting of new platform technologies can also be valuable to a diagnostic company. It is important to consider in advance how a potential competitor may attempt to avoid patent infringement while at the same time demonstrating the equivalence of its device to a legally marketed device by submitting a 510(k) premarket submission to the FDA. Therefore, it is important to consider other feasible methods or techniques for measuring the biomarker and include these in the specification and/or claims.

Discoveries that do not impact labels are less valuable, and are usually easier to get around. These include, for example, the discovery of specific assay reagents.

Patent eligibility

By now, most people in the life science community have become aware of the US Supreme Court’s decision in Mayo Collaborative Services vs. Prometheus Laboratories, Inc., 132 S.Ct. 1289 (2012; the “Mayo” decision), as well as in Association for Molecular Pathology et al. vs. Myriad Genetics, et al., 133 S.Ct. 2107 (2013; the “Myriad” decision). It is important to highlight the difference between these two rulings. The Mayo decision rests on the “law of nature” exception and is focused on process claims. The Myriad decision did not review process claims, and focused on the genes.

There are numerous publications analysing the two decisions. This article will therefore propose strategies for obtaining claims that relate to diagnostic methods or biomarkers. It is important to remember that the following approaches are completely applicable only in the US. For other jurisdictions – such as Europe – a different scope of claims may be obtainable. When drafting applications, applicants should therefore consider these differences and include them in the specification and claims.



Proposed solutions

According to the Mayo decision, correlations between a biomarker and efficacy are a natural law, and are therefore not patentable. Possible approaches in light of this are claims 1) directed at a method of diagnosing a disease by detecting a novel biomarker, 2) reciting the use of a specific reagent (as mentioned above, these types of claims can be easy for competitors to circumvent and should be considered carefully), 3) adding a treatment step (the treatment step can be a particular drug or therapy), or 4) detecting a combination of biomarkers.

Genomic DNA is considered a product of nature, and based on the Myriad decision is therefore not patentable. Possible strategies for bypassing this roadblock are claims directed at cDNA, non-naturally occurring DNA (such as mutagenized or chemically modified DNA), synthetic RNA or synthetic proteins (it is important to avoid using the term “isolated”), as well as their respective probes and primers.




Leveraging Big Data for new medicines

Manuela Müller-Gerndt, Healthcare & Pharma/Life Sciences, IBM Germany; Dr. Wolfgang Hildesheim, Watson Group, IBM Europe; Fatema Maher, IBM Germany

Almost US$200bn is spent annually on R&D in science-driven sectors such as healthcare, life sciences, consumer products or chemicals. An estimated 60% of pharmaceutical R&D investments, however, are spent on products that will never reach the market. This article looks at how best practices for R&D workspaces could be applied to ensure successful drug development. It all starts with a step-by-step approach towards Big Data, for example based on IBM’s Watson Foundations.

Cancer is the second most common cause of death in the developed world. In fact, one in three people will be diagnosed with carcinoma at some stage in their life. To tackle this global problem, pharma firms are investing billions of dollars every year in the development of cancer treatments. Even though R&D costs for drug development are exploding, there are currently about 1,900 cancer medicines in the pipeline. At the same time, the volume, velocity, variety and veracity of data are rising exponentially in medicine. We know today that cancer is often related to genetic factors. To sequence insightful DNA and achieve better treatment options, billions of samples need to be compared. That is generating huge sets of data, so that drug development for oncology has increasingly become a data-driven science bringing together physicians, pharmacologists, molecular biologists, computer scientists and mathematicians to solve complex problems that none of these disciplines could solve individually.

The Big Data challenge

This complexity is also reflected in the data itself, in what are often referred to as the characteristics of “Big Data”:

– Volume of data: Research data is spread across huge databases in a number of different areas, including patents, compounds, journals, biomarkers, structure activity relationships, medical records and genomics. For example, around one million human genomes were sequenced in 2013, and we expect around five million human genomes to be sequenced worldwide by the end of 2014, with around 15 million new cancer patients each year worldwide. Over 26 million unique molecules are available for new drug development from 400+ different sources in the ChemSpider chemical database.
– Velocity of data: Real-time analysis of device data, images and alerts will change the role of monitoring devices in healthcare outcomes and patient well-being. Bedside monitoring devices today capture more than 1,000 vital signs per second.
– Variety of data: Around 80% of all data today is unstructured, and this percentage will increase dramatically over time. As health and personalised medicine make advances in the population, even more data inputs can be expected from medical records, notes and dictation, public health reports, scientific papers, social media and the Internet.
– Veracity: How trustworthy is the data?

The ambition to deal with these four “Vs” can be understood as the definition of the Big Data trend, which we see in oncology, but also in healthcare and nearly all other industries.

A business need in pharma R&D

Oncology R&D researchers are challenged by huge amounts of data from different sources in heterogeneous formats that they have to digest and turn into new products more quickly. Given the sensitivity of health records and medical information, as well as the need to protect intellectual property, one of the most compelling issues is security. More and more R&D employees, administrators, managers and senior executives are asking for role-specific workplaces with instant access and proactive delivery of changes to many kinds of information from many internal systems – both within their own organisation and external databases/online libraries. They want access to patient information, physician opinions, clinical data, medical research studies, product and market information, and regulatory & compliance standards.

Getting up to speed

To stay at the forefront, R&D departments in oncology need to tackle the Big Data challenge and prepare to take advantage of the upcoming era of Cognitive Computing. To get up to speed, typically, they go through three phases:

– The first phase starts with Information Exploration & Discovery to enable complete visibility into new and historical internal data sets, as well as external research results and publications.
– The second phase concentrates on Content Analytics by applying text mining, pattern recognition and predictive analytics (e.g. for target discovery and validation) in order to find correlations between genes and diseases.
– Phase three introduces Cognitive Computing, adding natural interaction with oncologists and systems that learn through interactions, deliver evidence-based medical responses and drive better outcomes in the near future.
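To make the second phase more concrete, the following Python sketch shows one very simple form of text mining: counting how often gene names and disease names are mentioned in the same document. It is a deliberately naive stand-in and does not represent how IBM's products work. The term lists, the example documents and the function name `count_cooccurrences` are hypothetical, and plain substring matching like this ignores synonyms, negation and context.

```python
from collections import Counter
from itertools import product

# Hypothetical vocabularies; a real system would use curated ontologies.
GENES = {"egfr", "kras"}
DISEASES = {"lung cancer", "melanoma"}


def count_cooccurrences(documents: list[str]) -> Counter:
    """Count (gene, disease) pairs that are mentioned in the same document."""
    counts: Counter = Counter()
    for text in documents:
        lowered = text.lower()
        genes_found = {gene for gene in GENES if gene in lowered}
        diseases_found = {disease for disease in DISEASES if disease in lowered}
        # Every gene found is paired with every disease found in this document.
        counts.update(product(sorted(genes_found), sorted(diseases_found)))
    return counts


# Made-up example sentences, not taken from any real publication.
documents = [
    "EGFR mutations were observed in lung cancer samples.",
    "KRAS and EGFR were both screened in the lung cancer cohort.",
    "KRAS status was recorded for the melanoma cohort.",
]

pair_counts = count_cooccurrences(documents)
for (gene, disease), n in pair_counts.most_common():
    print(f"{gene} / {disease}: {n} document(s)")
```

Co-occurrence counts of this kind only suggest candidate correlations; whether a gene is actually linked to a disease still has to be established by the pattern recognition, predictive analytics and validation work the phase description mentions.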

Since most oncology R&D departments are currently in the first phase, or even a pre-stage of it, this article introduces how organisations can get started with information exploration. Usually the first step is selecting a defined pilot, working with an innovative group and tying together a few internal and external information sources.

State-of-the-art data exploration

Modern R&D workplaces ideally include intelligent information exploration that virtually integrates multiple sources into one single access point, provides personalised content, uses best-of-breed search and unique, automatic clustering and categorisation capabilities, and allows users to browse all kinds of historical research information within the organisation and beyond. Dynamic clustering helps users to gain new insights that have not been discovered before. Having the ability to tag, bookmark and comment on documents helps foster collaboration and improve treatment outcomes. Proactive alerts and “push intelligence” let people know when new or updated information is available. Rapid application-building capabilities help to quickly present customised, role-specific views to the end user. This kind of information exploration is available today, allowing organisations to maximise the value of their intellectual property, reduce their product development cycle and significantly decrease R&D costs.

One real-life example

A leading pharmaceutical company that has been developing innovative medicines for decades needed to focus on competitiveness. That involved transforming information and data sharing capabilities to support faster time to market and new product introductions. R&D needed instant access to patient and research information across the company’s intranet, to many different applications and file systems, as well as information from several external toxicity databases and subscription sources. The company’s executives created initiatives to increase data transparency and productivity throughout the enterprise, worldwide.

Data Exploration Initiative The Watson Data Explorer software was deployed within 10 weeks and gave 30,000 employees worldwide an authorised access to intranet portal-like applications. These portals were built rapidly to cover real-time internal and external information, including networked file systems, the enterprise content management system, files in Microsoft SharePoint, an employee directory

Unlock the value of information when users need it the most

Watson Explorer

gation Discovery & navigation Data access & integration Clustering rization • C lustering & categorization Providing unified, real-time access • Index structured & unstructured data—in place ence • Contextual intelligence and fusion of big data unlocks • Support existing security ications • Easy-to-deploy applications greater insight and ROI • Federate to external sources oday’s big data • All at the scale required for today’s big data • Leverage MDM, governance, and taxonomies challenges

Create LL unified view of ALL Create unified view of ALL Improve customer service & me information for real-time information for real-time reduce call times monitoring monitoring

n risk nce


Increase productivity & leverage past work increasing speed to market

Analyze customer data to unlock true customer value

Identify areas of information risk & ensure data compliance

Watson Explorer – a first step towards intelligent 360º view workplaces including structured and unstructured data.

38-39_EBSIN_4_14_Special_IBM_tg.indd 39

and external internet pages and subscription sources. The solution preserved existing security parameters, so that employees could only access content they were authorised to view. Security is supported at group, user, document and the even more granular field levels.

Watson Explorer has enabled touch points throughout the drug discovery process, from initial research to clinical trials. R&D staff obtained an easy navigation tool and a way to filter data quickly. Scientists can now retrieve an overview, drill down into specific topics and discover content that might otherwise remain undiscovered. They can also scan multiple synonyms for medical terms and phrases, while critical R&D employees receive alerts of changes in support of compliance efforts. Collaboration capabilities allow users to point colleagues to relevant content by commenting on and tagging results.

By globally leveraging past research around the history of compounds for formulation and noted effects, R&D was able to drastically reduce duplicate efforts. In addition, sales representatives became more productive by gaining access to the newest policies, external news, marketing collateral and a doctor’s latest purchases. Finally, the company knowledge base supported hiring and retraining transitioning employees. A culture of knowledge sharing and collaboration was created, along with subject-matter experts who maximise one of the company’s most valuable assets – its data.

Watson Data Explorer has already delivered measurable business value to many pharma companies. In the example described above, the efficiency of R&D was improved by 90%. Cutting search by 50% saved the company millions of dollars in the first year alone. In addition, sales rose by 4.1%, training costs were reduced by 10% and new staffing requirements decreased by 1.2%, saving US$13.4m per year.

References
[1] Jaruzelski, B., Loehr, J., Holman, R.: The Global Innovation 1000: Making Ideas Work. B. Company, Editor. 2012.
[2] IBM Institute for Business Value analysis.
[3] DKFZ 2013



BIG DATA & IP Interview

Big Data can’t make gold from bad data

Life sciences companies are growing increasingly interested in Big Data as they seek to combine information about disease pathways with patient data and critical inclusion criteria for clinical trials or electronic medical records. EuroBiotechNews spoke with SAP’s Peter Langkafel, the software giant’s General Manager Public Sector/Healthcare MEE, about the impact of Big Data and its current limitations in life sciences applications.



Dr Langkafel, is the idea of Big Data being overhyped? LANGKAFEL:



“Big Data” in healthcare is an El Dorado for some people, but it is still a very diffuse term. There is a huge potential in some areas – such as better integrating research data and clinical day-to-day data. But Big Data can also mean “going beyond borders”, which means the integration of ambulatory and in-house care and the analysis of what is happening there. This is at times a complex topic due to organisational and legal issues. But there’s no doubt there is huge potential. That said, there is hype as well. For some people, Big Data is pretty much only associated with personalised medicine – with genomic analysis, which “promises” cures for all kinds of diseases in the future. For me, the so-called “potential” here is hype.



In which areas does SAP see the greatest market potential for Big Data solutions within the healthcare industry? LANGKAFEL:


We’re involved in projects with providers like the Charité (Europe’s biggest university clinic), healthcare providers like the AOK (Germany’s biggest health insurer), research organisations like the cancer research centres in Heidelberg and Stanford, and Health Maintenance Organisations like Kaiser Permanente in the US. I would describe three phases for Big Data against this backdrop. First, it is about understanding and maybe better visualising data. Then you have to align and match data sources – which has not been possible to this extent before. The third phase is to really create new data or add some data outside of the traditional ecosystem. And I’d like to point out that just a few years ago we still had technical difficulties, but today technology hurdles like memory represent unleashed potential. Technology is no longer necessarily the obstacle.



Dr. med. Peter Langkafel is SAP AG’s General Manager Public Sector/Healthcare MEE (Middle and Eastern Europe) and the President of the Berlin/Brandenburg German Association of Medical Informatics (BVMI e.V.). He has over 20 years of experience in healthcare and information technology. With a PhD in medicine, a degree in medical informatics, and an executive MBA, Langkafel was a clinician and researcher himself for many years. He has written a long list of publications, and has appeared as a speaker at over 100 healthcare conferences.

What is Big Data about, and how can data integration help improve patient recruitment for clinical trials? Or the identification of drug responders in areas like oncology, the identification of new uses for already approved drugs, establishing electronic medical records, etc.? LANGKAFEL:


There are many existing cases that can illustrate this process. We’ve tried to understand what has happened in the past with traditional reporting. Currently we are on our way to understanding the situation in real-time. And Big Data technology will help us to predict and look into the future…so it’s not just about what has happened, but what is happening and what will happen.



What tools are needed and used to manage inherently imprecise data types like those involving stratification biomarkers? LANGKAFEL:


The golden rule of any business intelligence project is: garbage in, garbage out. If you have bad data, you can’t turn it into gold dust through some miraculous process. We do provide tools for data cleansing or master data management, but these are mainly technical tools that can’t “understand” the content of imprecise data. There is a lot of effort being put into trying to get unstructured data (like a doctor’s discharge letter) into the system. Most of these projects promise a lot, but only reach accuracy levels of 60% to 80%. Something is missing, but it’s not clear what. So we can’t use this approach. My recommendation is always to try to feed structured data into the system!


»Larger is good. Smarter is better.«


What are current drivers, and where do you see limitations? LANGKAFEL:


Technology is certainly a driver, as well as the growing volume of digital data in healthcare. We’ve started to view “data” as something very valuable – but we are in an early phase of really understanding its power. Data security is for me not a limitation, but an absolute prerequisite. We should be asking our data security officers as well about who will protect our data from NOT being used. Our main limitations are the silos in the organisation and a lack of statistical understanding. Or let me put it this way: Big Data gives a lot of answers, but we have to ask the right questions first.



What has been already achieved by SAP, and what are your plans for the future? LANGKAFEL:


SAP is the frontrunner in in-memory technology: with SAP HANA, we have an extremely powerful tool in our portfolio, which is now even available in a cloud offering. The SAP Healthcare platform brings together new ways of using and integrating data. We are delivering solutions in that area, and we are co-innovating with customers – like those I mentioned before – around the world.



Data from academic institutions are often not compatible with the requirements of the biopharmaceutical industry when it comes to things like assay design, reproducibility or biomarker cut-off values. What kind of standardisation do you think is needed to support efficient translation into applications? LANGKAFEL:


Any discussion about standardisation should ask the same open question: Why is there no standard? Who has an interest in NOT having a standard? Maybe for economic reasons, to protect an area (or a market) or to protect fraud and abuse, or to protect AGAINST fraud and abuse? I am personally a big fan of the British Medical Journal (BMJ) Open Data Initiative. Everybody who has nothing to hide should allow others to have a look in – under a certain operation mode, of course (data security, IP, legal). Here we need new ways for academia and industry to collaborate.


U.S. and European Life Science Patent Prosecution

New office at the Innovation and Start-up Center for Biotechnology in Martinsried: Am Klopferspitz 19 | 82152 Planegg/Martinsried | Germany | Tel.: +49 (0)89 4516 9990 | Mobile: +49 (0)176 64 63 33 81
