
HHPR Harvard Health Policy Review

Generating Signal from Noise: Big Data’s Big Challenge

Volume 14 Issue 2


Editor’s Note
Muhammad Sarib Hussain

FEATURES

Population Analytics and Big Data
Martin S. Kohn, MD, MS, FACEP, FACPE

Conflict Between Adolescents’ Rights to Confidential Health Care and Meaningful Use Requirements for Personal Health Record Access
Jorge A. Gálvez, MD, Allan F. Simpao, MD, Mohamed A. Rehman, MD

The Role of Health Information Management Professionals in Data and Information Governance
Allan F. Simpao, MD, Jorge A. Gálvez, MD, Mohamed A. Rehman, MD

The Development of MyCancerJourney and the Incorporation of Predictive Analytics to Improve Cancer Patient Care
Jay F. Piccirillo, MD, Dorina Kallogjeri, MD, MPH, Sara Kukuljan, BS, RN, CCRC, Robert Palmer, MBA

Predictive Analytics: Advancing Precision and Population Medicine
George Savage, MD, MBA

HEALTH HIGHLIGHTS

Leaping the Data Chasm: Structuring Donation of Clinical Data for Healthcare Innovation and Modeling
Patrick L. Taylor, JD, Kenneth D. Mandl, MD, MPH

Academic Institutions’ Critical Guidelines for Health Care Workers who Deploy to West Africa for the Ebola Response and Future Crises
Hilarie Cranmer, MD, MPH, Miriam Aschkenasy, MD, MPH, Ryan Wildes, Stephanie Kayden, MD, MPH, David Bangsberg, MD, MPH, Michelle Niescierenko, MD, Katie Kemen, Kai-Hsun Hsiao, MBChB, Michael VanRooyen, MD, MPH, Frederick M. Burkle, Jr, MD, MPH, DTM, Paul D. Biddinger, MD

Using Data to Drive Innovation in Health and Biomedical Research
John Quackenbush, PhD

The United Countries of America, Benchmarking the Quality of US Health Care
Charles W. Slack, AB, Warner V. Slack, MD

Privacy: Does Identifiability of Data Mean that Privacy is Dead?
Isaac Kohane, MD, PhD, Patrick L. Taylor, JD

STUDENT CONTRIBUTIONS

Using Predictive Modeling to Reduce Readmissions: Policy Implications
Tanmoy Das Lala

Whole Genomic Data Integration Into the Electronic Health Record: Examining Current Practical and Ethical Challenges
Achal P. Patel

NAMCShiny: An Interactive Web Application to Explore Health Trends in 2003-2010 National Ambulatory Medical Care Survey Data
Jean Fan, Kamil Slowikowski

editor’s note

With the sequencing of the human genome at the turn of the millennium, the life sciences have belatedly come to command the traits traditionally associated with the physical sciences: the mantle of mystery, the aura of other-worldliness, the promise of discovery and advancement together with the funding to pursue it. With vast sums being poured into the field, it has expanded like never before, generating raw information at a breathtaking pace; so raw, in fact, that we have yet to divine the truth that lies in our genomes, for our genes offer but a glimpse of it. This has been accompanied by leaps in the technologies that supply data and the technologies that organize and process it, and the overlap of all three is to be found in the nascent field of bioinformatics.

Editor in Chief: Muhammad Sarib Hussain
Managing Editor: Eunah Lee
Business Manager: D.J. Brooks
Design Chair: Melinda Song
Publishing Chair: Kamran Jamil
Technology Chair: Young Mi Kwon

Senior Editors
Bridgette Slater, features
Edna Wang, student contributions
Natalie Cho, online content

Associate Editors
Eric Li, business
Marcus Gutierrez, business
Benjamin Zheng, business
Fatma Akcay, design
Hueyjong Shih, design
Sandy Wong, health highlights
Risham Dhillon, health highlights
Megan Gao, student contributions
Anthony Thai, student contributions
Chelsea Wang, online content
Raghu Dhara, online content
Shayla Partridge, online content
Hanna Amanuel, online content
Patric Cao, online content, web design
Jiahui Huang, web design

The Harvard Health Policy Review is an undergraduate publication of Harvard College. The Harvard name and Veritas Shield are trademarks of the President and Fellows of Harvard University.

Harvard Health Policy Review Harvard University Student Organization Center at Hilles Box #40 59 Shepard Street Cambridge, Massachusetts 02138

©2015 by President and Fellows of Harvard University. All rights reserved. No part of this publication may be reproduced in any form without express written consent from the publisher.

To say that policy has not kept pace with such advancements is to repeat a truism of our times. While these advancements doubtless herald the dawn of a new age, helping us gain traction on previously intractable health problems, they also present complex new ways in which society must interact with the data its constituent members generate. The matter is made more complex by the uniquely identifiable and inherently private nature of the data in question, a natural consequence of the fact that data derived from our biology is inherently tied to our biology and thus to ourselves. As such, the field raises profound questions not just about the most efficient and effective use of the data but also about the need to use it responsibly, reining in our scientific ambitions when they fly in the face of the ethics that must necessarily inform so delicate a matter as our health.

Our Spring 2015 issue attempts to showcase the field of bioinformatics in all its attendant complexity. Our Features section opens with a piece highlighting the importance of applying individual-level predictive analytics to strengthen evidence-based medicine; it argues against the population-level analysis that has been the classical focus of the field. This is followed by two articles discussing bioinformatics in the context of increasing reliance on electronic health records. The discussion here centers on privacy concerns stemming from the data and on management of the data itself so as to keep pace with ever-evolving technology and the regulation surrounding it. The final two pieces closing out our Features section offer a glimpse of the uses to which this new information can be put, through the lens of two separate data-logging technologies that can prove insightful for providers and the patients they serve.
In Health Highlights, we approach the subject from a different angle, with the opening piece discussing mechanisms that may be implemented to facilitate ‘data donation’ in service of innovation. Unrelated to bioinformatics but of great importance given the recent Ebola crisis in West Africa, we feature a piece by leading professionals who offer guidelines for health care workers deploying to Ebola-stricken regions. The third article in Health Highlights provides a timely reminder that data can divine only if it is in the right pair of hands and is subject to the most rigorous handling. This is followed by a fascinating discussion of why US healthcare challenges might benefit more from interstate comparisons than from cross-country ones. Our final entry in Health Highlights cautions against an overzealous concern for preventing the misuse of Big Data, which can inadvertently hinder the innovation that stems from its appropriate application.

Our Student Contributions section covers such diverse topics as the integration of genomic data into electronic health records, novel web-based tools that can provide the analysis necessary for evidence-based health policy, and the use of predictive modeling to reduce hospital readmissions.

Together, these carefully curated pieces in our Spring 2015 issue offer an informative overview of bioinformatics, a project that would have been all but impossible had it not been for the efforts of our contributors, our advisers, our managing editor and the rest of our dedicated staff on our editorial review, design, technology, publishing and business boards. We do hope you enjoy this latest edition of our publication.

Sincerely,
Muhammad Sarib Hussain
Editor-in-Chief 2014-15

about the cover Cover illustration modified from stock image via Shutterstock by Hueyjong Shih.

acknowledgements The Harvard Health Policy Review would like to thank the Harvard Undergraduate Council for their generous support of this publication.

Spring 2015 Volume 14, Issue 2



Population Analytics and Big Data Martin S. Kohn, MD

One-third of what is spent on healthcare in the US is for things of no value. Reducing that waste requires making better, more personalized decisions. For decades, clinical decisions were supported by reports of controlled research studies, involving comparisons of two populations of patients. While useful, such studies had limited impact because the result of such studies, stating that, on average, one group did better than the other, could not reliably be applied to an individual patient. In addition, there is evidence that many of those studies were flawed. With increasing demand for healthcare to be both clinically and economically more effective, we need to be able to make better decisions for each patient. That goal, sometimes called personalized healthcare, requires new forms of analysis, using huge volumes of real world data that can yield insights applied to individuals, not groups.

It is a sobering fact that perhaps a third of all the money spent on healthcare in the United States is wasted, for things of no value.1 A 2000 report from the Institute of Medicine states that avoidable hospital errors resulted in as many as 98,000 deaths.2 A follow-up report ten years later showed essentially no improvement in patient safety.3

Fixing these persistent problems will require making it easier to make evidence-based decisions. Evidence-based medicine (EBM) has been a goal in healthcare for at least 20 years, to reduce the reliance on unsystematic use of clinical information. The focus of EBM has been accessing, evaluating and interpreting medical literature.4 The nature of the literature means that EBM has been built on population-based metrics.5 The value and limitation of population analytics is contained in its name: it looks at populations, and gives insight into the circumstances or conditions of groups. However, it is now understood that we have to focus on the specific attributes of individuals if we are to make sound clinical decisions.

Population-based studies are relatively unhelpful in focusing on individual patients. Most published research trials involve outcome comparisons between a treatment group, receiving a new intervention, and a control group that either receives an older treatment or none at all. The gold standard for such studies is the randomized controlled trial (RCT), which tries to eliminate potential sources of bias either on the part of the researchers or the patient groups. But very little insight is provided about the individuals in each of the two groups. For example, if the treatment group outcome is statistically better than the control group, it is likely that some of the treatment group patients actually did worse, but the information indicating their deterioration was lost in looking at the group average. Similarly, if the treatment group did not show improvement, there may well have been a sub-group that did respond well to the new treatment, but, again, that impact was blunted by only looking at the overall response of the group.

The utility of published studies is further compromised by the fact that they are often not performed or reported well. It is an unfortunate reality that many published research reports are false and that many studies cannot be reproduced.6,7,8 A review of published reports on cancer therapy concluded that fewer than half of the studies used appropriate analyses, leading to unreliable results.9 Similar issues exist in reports of drug development trials.10 Further, publication bias, in which studies with positive or significant results are more likely to be published, leads to overestimation of treatment effects.11 Older patients tend to be underrepresented in studies and some statistically significant results may be clinically inconsequential.12 Thus, the credibility of EBM has been diminished by the limitations of the published evidence, mostly RCTs and meta-analyses.13 Nonetheless, such studies were the evidence that was used for clinical decisions for decades.

A current goal is personalized healthcare: making decisions that are more likely to be beneficial for the individual. In order to better understand the needs of an individual, we have to go beyond traditional population analytics to find cohorts that are very similar to the individual in question, well beyond the few characteristics that are used to argue that the treatment and control groups in RCTs are alike. The need to extend EBM beyond population-based studies to include other forms of evidence has been recognized for decades.14,15 However, it is only relatively recently that we have had both access to large amounts of different kinds of data and the necessary analytic tools.

Population analytics are often the basis for determining and mitigating risk factors. For example, patients with elevated cholesterol are more likely to have a heart attack than people with lower cholesterol. However, there are patients with highly elevated cholesterol who never develop cardiac disease. Thus, based on a population risk factor, one might well put all patients with elevated cholesterol on a cholesterol-lowering drug. However, that means that patients who would not have developed heart disease, despite high cholesterol, will receive a drug with no benefit. Since most drugs have adverse effects, sometimes severe, patients who receive a drug that does not help them risk complications or adverse events while gaining no benefit. The population statistic of “number needed to treat” (NNT) is the number of patients that have to receive a treatment (such as a drug) in order for one patient to avoid a negative consequence. That number allows us to see what fraction of patients would and would not benefit from the treatment, but does not help identify which patients would benefit. Consider statins, a very commonly prescribed medication to lower cholesterol. Some meta-studies conclude that 1 in 60 were helped in reducing heart attacks, 1 in 268 were helped in reducing incidence of stroke, and no improvement was observed in all-cause mortality, while 1 in 10 experienced muscle damage.16
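The arithmetic behind the NNT can be made concrete with a short sketch. The "1 in N" figures below are the statin figures quoted above; the function names and the event rates passed to `nnt_from_event_rates` are hypothetical and chosen only for illustration, and the formula used is the standard definition of NNT as the reciprocal of the absolute risk reduction.

```python
def nnt_from_event_rates(control_rate: float, treated_rate: float) -> float:
    """NNT = 1 / absolute risk reduction (control rate minus treated rate)."""
    arr = control_rate - treated_rate
    if arr <= 0:
        raise ValueError("treatment shows no absolute risk reduction")
    return 1.0 / arr


def expected_per_cohort(one_in_n: float, cohort_size: int = 1000) -> float:
    """Expected number of patients affected if 1 in `one_in_n` is affected."""
    return cohort_size / one_in_n


# "1 in N" figures quoted in the text for statins.
statin_figures = {
    "heart attack avoided": 60,
    "stroke avoided": 268,
    "muscle damage": 10,
}

for outcome, one_in_n in statin_figures.items():
    per_thousand = expected_per_cohort(one_in_n)
    print(f"{outcome}: about {per_thousand:.1f} per 1000 treated")

# Hypothetical event rates, for illustration only: 10% untreated, 5% treated.
print(f"NNT for the hypothetical rates: {nnt_from_event_rates(0.10, 0.05):.0f}")
```

Read this way, out of every 1000 patients treated roughly 17 avoid a heart attack and about 4 avoid a stroke, while about 100 experience muscle damage. The calculation says nothing about which patients fall into which group.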


How much better would it be, both clinically and economically, if we could be more precise and not have to treat 60 people to help one person? Improving on the NNT requires learning more about individual patients and possible interventions. If we can identify which patients will benefit from statins ahead of time, and, conversely, which patients will not benefit, then we can focus treatment more effectively while reducing adverse events.

Among the contributions to waste and inefficiency are overtreatment and the failure to implement care processes that are known to be helpful. The only way to reduce that waste is to make better decisions. That means using all the available information to make decisions that are more likely to be beneficial for the individual. It requires predictive analytics, to predict and augment diagnostic and treatment decisions.17 We must shift our approach from treating the disease to treating the patient. That shift will result in treatments optimized for the patient, rather than relying on averages from traditional controlled studies.18 Analyses of real world data, rather than the dichotomous view of controlled studies, are more likely to get us to the goal of personalized healthcare.19,20

We now live in a “big data” environment, with huge amounts of health related data of all kinds. Big data provides great opportunities for new insight, but also great challenges. Big data has four characteristics:21

1. Volume: vast amounts of data, such as from millions of patient records
2. Velocity: rapid increases in the amount of data
3. Variety: many different kinds of data (e.g. images, genomes, demographics, home monitoring measurements)
4. Veracity: much of the data is unreliable; 80% of it may be uncertain

Those characteristics (see Figure 1) mean that big data cannot be analyzed with older techniques. We need new, predictive analytic tools to transform that big data into insights that can be used at the clinician-patient interface to improve outcomes.22

Photo from Pixabay, Creative Commons Public Domain Dedication.

Figure 1: Characteristics of Big Data
Volume: Vast amounts of data, such as from millions of patient records
Velocity: Rapid increase in the amount of data
Variety: Many different kinds of data (e.g. images, genomes, demographics, home recording measurements)
Veracity: Much of the data is unreliable. 80% of it may be uncertain.

One approach to obtaining more personalized insight is sometimes called patient similarity analytics.23 Using potentially thousands of patient characteristics from existing, real world data, such as from electronic health records and claims data, for large numbers of patients, a cohort of very similar patients can be created. Whether one calls this an evolution in population analytics or a de-emphasis of population analytics in clinical decision support is an issue of semantics, but it is a clear change in roles.

Predictive analytics also provides the opportunity to move healthcare from reactive, as in waiting for a problem to develop, to preventive. Predictive analytics in big data allows the perception of patterns that can identify a public health issue before it becomes serious. Similarly, integrating home monitoring data from patients with chronic diseases with other health data can predict clinical deterioration before it occurs, when a modest intervention could prevent the deterioration, improving quality of life and reducing cost. It will be possible to guide patient treatment by predicting the clinical course following the treatment.24 Big data analytics will help with patients with multiple chronic diseases, rarely addressed in research trials, and with high-risk, high-cost patients.25

Although traditional population-based studies are becoming less important for clinical decision-making, there are issues where population metrics are relevant, such as public health assessments, determining risk factors for acquired diseases or needs assessment for a group of patients to be managed.26 An increasing number of provider organizations are adopting an accountable care model, in which they agree to provide high quality care, with improved outcomes, for a fixed amount of money. Such organizations need to thoroughly understand the health status of and requirements for the population, such as identifying high risk and high utilization patients they are considering taking on.27,28 If they cannot accurately predict the needs of those patients, or how to allocate services for those patients, they can readily perform poorly and lose large sums.29

Population analytics can be used to assess the impact of the implementation of personalized healthcare and other healthcare reform efforts.30 If healthcare is improved at the individual level, it should appear at the population level in terms such as quality of life, cost, etc. Population analytics will continue to be used for implementation and assessment of public health measures such as vaccination programs. But even public health issues will have to look deeper than the impact of some single factor across the whole population. We can explore some long-held assumptions about population health and identify challenges to improving healthcare, such as sodium restriction or racial and economic disparities.31,32,33 But to be truly valuable, the quality of population-based analytics will have to improve, avoiding common errors such as incorrect statistical analysis.22

Both population-based analytics and predictive big data personalized analytics will be important in healthcare. However, roles are changing as health policy and goals evolve. Population analytics will focus on broader issues and become substantially less important in making decisions about individuals. Predictive analytics, as both the volumes and kinds of data increase, will begin to rely on rapidly developing information technology tools. The challenge in, and importance of, health analytics has been recognized, leading to new medical subspecialties like clinical informatics. In all, improved use of data is the future of healthcare.
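The idea of patient similarity analytics discussed earlier in this article, building a cohort of patients who closely resemble an index patient, can be illustrated with a minimal sketch. This is not the supervised similarity measure of the cited work; it swaps in a plain nearest-neighbor search using Euclidean distance on standardized features. The patient identifiers, the three features (age, cholesterol, systolic blood pressure) and all values are hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature so that no single unit dominates the distance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, spreads)]
        for row in rows
    ]


def most_similar(patients, index_id, k=2):
    """Return the ids of the k patients closest to `index_id`."""
    ids = list(patients)
    scaled = dict(zip(ids, zscore_columns([patients[i] for i in ids])))
    target = scaled[index_id]
    distances = [
        (math.dist(target, scaled[other]), other)
        for other in ids
        if other != index_id
    ]
    return [other for _, other in sorted(distances)[:k]]


# Hypothetical records: [age, total cholesterol, systolic blood pressure].
patients = {
    "A": [54, 240, 130],
    "B": [55, 238, 128],
    "C": [30, 160, 110],
    "D": [56, 245, 135],
    "E": [72, 300, 160],
}

cohort = most_similar(patients, "A", k=2)
print(cohort)
```

A real system would draw on far more characteristics, handle missing and unreliable values, and learn which features matter for a given clinical question; the sketch only shows the basic shape of the computation.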

Dr. Martin Kohn, an alumnus of MIT, Harvard and NYU, is Chief Medical Scientist at Sentrian, which performs predictive analytics on home monitoring data to reduce avoidable hospitalization in patients with chronic diseases. He was previously Chief Medical Scientist at IBM Research. Dr. Kohn is board-certified in emergency medicine and clinical informatics.

1. Berwick D, Hackbarth A. Eliminating Waste in US Healthcare. JAMA 2012;307(14):1513-1516
2. Kohn LT, Corrigan JM, Donaldson MS, Editors; Committee on Quality of Health Care in America, Institute of Medicine. To Err Is Human: Building a Safer Health System. National Academy of Sciences 2000
3. To Err is Human – To Delay is Deadly. Consumer Reports Health. SafePatientProject.org
4. Evidence-based medicine: a new approach to teaching the practice of medicine. Evidence-based medicine working group. JAMA 1992;268:2420–2425.
5. Winters-Miner LA, Bolding PS, Hilbe JM, et al. Practical Predictive Analytics and Decisioning Systems for Medicine, page 36. Elsevier 2015
6. Naik G. Scientists’ Elusive Goal: Reproducing Study Results. Wall St Jl. Dec 2, 2011
7. Smith R. Where is the wisdom...? BMJ 303(5) October 1991 pp798-799.
8. Ioannidis JPA (2005) Why most published research findings are false. PLoS Med 2(8): e124
9. Murray DM, Pals SL, Blitstein JL. Design and Analysis of Group-Randomized Trials in Cancer: A Review of Current Practices. J Natl Cancer Inst 2008;100:483–491
10. Prinz F, Schlange T, Asadulla K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov. 2011 Aug 31;10(9):712. doi: 10.1038/nrd3439-c1.
11. Dwan K, Altman DG, Arnaiz JA, Bloom J, Chan A-W, et al. (2008) Systematic Review of the Empirical Evidence of Study Publication Bias and Outcome Reporting Bias. PLoS ONE 3(8): e3081. doi:10.1371/journal.pone.0003081
12. Boot CM, Tannock IF. Evaluation of Treatment Benefit: Randomized Controlled Trials and Population-Based Observational Research. JCO Aug 2013 328-3299
13. Feinstein AR, Horwitz RL. Problems in the “Evidence” of “Evidence-based Medicine”. Amer J Med. 103:529-535, Dec 1997
14. Sackett DL, Rosenberg WMC, Muir Gray JA, et al. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312:71–72.
15. Lathem W. Individual Medicine. New Eng J Med 1968;279(1):22
16. accessed 1/4/2015 2:05 PM EST.
17. Winters-Miner LA, Bolding PS, Hilbe JM, et al. Practical Predictive Analytics and Decisioning Systems for Medicine. Page 5. Elsevier 2015
18. Winters-Miner LA, Bolding PS, Hilbe JM, et al. Practical Predictive Analytics and Decisioning Systems for Medicine. Page xxxiv. Elsevier 2015
19. Van Bebber SL, Trosman JR, Liang S-Y, et al. Capacity building for assessing new technologies: approaches to examining personalized medicine in practice. Per Med. 2010 July; 7(4): 427–439
20. Bradley EH, Curry LA, Devers KJ. Qualitative Data Analysis for Health Services Research: Developing Taxonomy, Themes, and Theory. Health Serv Res. 2007 August; 42(4): 1758–1772
21. Global Technology Outlook 2012, IBM
22. Morris AH, Ioannidis JPA. Limitations of Medical Research and Evidence at the Patient-Clinician Encounter Scale. CHEST 2013; 143(4):1127–1135
23. Sun J, Wang F, Hu J, Ebadollahi S. Supervised patient similarity measure of heterogeneous patient records. ACM SIGKDD Explorations Newsletter 2011;14(1):16-24
24. Winters-Miner LA, Bolding PS, Hilbe JM, et al. Practical Predictive Analytics and Decisioning Systems for Medicine. Page 43. Elsevier 2015
25. Bates D, Saria S, Ohno-Machado L, Shah A, Escobar G. Big Data In Health Care: Using Analytics To Identify And Manage High-Risk And High-Cost Patients. Health Affairs, 33, no.7 (2014):1123-11
26. Winters-Miner LA, Bolding PS, Hilbe JM, et al. Practical Predictive Analytics and Decisioning Systems for Medicine. Page 55. Elsevier 2015
27. Felt-Lisk S, Higgins T. (2011). Exploring the Promise of Population Health Management Programs to Improve Health. Mathematica Policy Research Issue Brief. http://www. health/PHM_brief.pdf
28. Gotz D, Stavropoulos H, Sun J, Wang F. ICDA: A Platform for Intelligent Care Delivery Analytics. AMIA Annu Symp Proc. 2012; 2012: 264–273
29. Fisher ES, Shortell SM, Kreindler SA. A Framework For Evaluating The Formation, Implementation, And Performance Of Accountable Care Organizations. Health Affairs, 31, no.11 (2012):2368-2378
30. Roski J, McClellan M. Measuring Health Care Performance Now, Not Tomorrow: Essential Steps To Support Effective Health Care Reform. Health Affairs, 30, no.4 (2011):682-689
31. Strom BL, Yaktine AL, Oria M, Editors. Sodium Intake in Populations: Assessment of Evidence. The National Academies Press, 2013
32. Reinhardt UE. Does The Aging Of The Population Really Drive The Demand For Health Care? Health Affairs, 22, no.6 (2003):27-39
33. Hsia R Y-J, Asch SM, Weiss RE, et al. California Hospitals Serving Large Minority Populations Were More Likely Than Others To Employ Ambulance Diversion. Health Affairs, 31, no.8 (2012):1767–1776




Conflict Between Adolescents’ Rights to Confidential Health Care and Meaningful Use Requirements for Personal Health Record Access

Jorge A. Gálvez MD, Allan F. Simpao MD, Mohamed A. Rehman MD

Pediatricians participating in the Meaningful Use program face a particular challenge in adolescent clinics. Specifically, adolescents are able to seek care for reproductive services without obtaining explicit consent from their guardians. The information surrounding such encounters must be kept confidential in order to maintain the patient’s trust and access to essential health care services such as pre-natal care and treatment of sexually transmitted diseases. Electronic health records with patient portals pose a risk of inadvertent disclosure of confidential information to unintended parties, thereby breaching the patient-physician trust relationship and HIPAA regulations. This paper will evaluate:

• Stage 2 Meaningful Use attestation requirements for Eligible Providers’ and Eligible Hospitals’ reporting of patient access to a personal health record
• Strategies for implementation of patient portals to the Electronic Health Record system
• Strategies for healthcare providers to address access to the patient portal with adolescents and their parent(s)/guardians

Introduction

Adolescent medicine has the unique challenge of managing the care of an individual who is fully mature (or nearly mature), but still requires parental or guardian consent to receive medical care. There are certain situations when adolescents are able to seek diagnosis and treatment for medical conditions without consent of their parent or guardian; specifically, adolescents may independently give informed consent when considered a mature minor, an emancipated minor or in an emergency situation.1-3 Although requirements vary from state to state, policymakers generally uphold adolescents’ right to confidentiality when seeking care for reproductive services such as diagnosis and treatment of sexually transmitted diseases and pregnancy. HIPAA regulations extend to the protection of privacy and confidentiality for adolescent patients.3

The Meaningful Use program under the Affordable Care Act, however, introduces a new challenge to adolescent privacy and access to care. Stage 2 requires Eligible Providers (EP) and Eligible Hospitals (EH) to provide electronic access to a personal health record portal to patients or their guardians.A According to the requirements, 50% of patients are required to have access to their information within 4 days of it being available to the provider or hospital. Furthermore, 5% of patients must have accessed their information. This requirement poses a particular challenge to providers currently participating in meaningful use attestation, since there are regulatory requirements to keep specific information confidential. Furthermore, providers are able to exercise judgment to withhold information from the personal health record if they feel that a patient may be at risk of harm if the information is released in this manner. The meaningful use requirements for EPs and EHs include a statement to “restrict from disclosure due to any federal, state or local law regarding the privacy of a person’s health information, including variations due to the age of the patient or the provider believes that substantial harm may arise from disclosing particular health information in this manner.”B

Historically, adolescents have avoided medical care when they felt that their confidentiality could not be maintained.4 Some groups of adolescents who engage in unprotected sexual activity may be at higher risk of unplanned pregnancies and sexually transmitted diseases. Approximately 12 million adolescents are affected by sexually transmitted diseases each year.5 Furthermore, almost 70% of adolescents report being sexually active at some point by senior year of high school.5 Pediatricians must continue to extend services to adolescents to

A: B:



reduce the continued propagation of sexually transmitted diseases and assist in informed decision-making regarding family planning. The legal requirements for parental involvement in obstetric care have evolved over the years. Currently, adolescents are able to seek pregnancy testing and prenatal care confidentially. Furthermore, Title X of the Public Health Service Act allowed for nationwide system of health clinics to provide family planning to anyone interested in receiving the services, which included contraception.C Also, adolescents are able to seek contraceptive care confidentially within the confines of the law.2 On the other hand, there is increased variability in the notification and confidentiality requirements for individuals seeking termination of a pregnancy based on state requirements (i.e. require parental presence; no parental presence required; parental consent required). Meaningful Use requirements specific to confidentiality and minors EP’s and EH’s follow various strategies to address the meaningful use personal health record access for patients and their proxy. The Department of Health and Human Services responded to a public comment regarding the potential breach of HIPAA regulations by providing patients and their proxy access to personal health record as mandated by the Meaningful Use requirements in the September 4th issue of the Federal Register, 45 CFR Part 170 as follows: “Comment: Some commenters suggested that patients under the age of 18 should not have the same access to the same information to which adult patients have access and requested a separate list of required elements for patients under the age of 18. Response: An EP may decide that online access is not the appropriate forum for certain health information for patients under the age of 18. 
Within the confines of the laws governing guardian access to medical records for patients under the age of 18, we would defer to the EP’s judgment regarding which information should be withheld for such patients. In lieu of providing online access to patients under the age of 18, EPs could provide online access to guardians for patients under the age of 18, in accordance with state and local laws, in order to meet the measure of this objective. Providing online access to guardians in accordance to state and local laws would be treated the same as access for patients, and guardians could then be counted in the numerator of the measure.

We recognize that state and local laws may restrict the information that can be made available to guardians, and in these cases such information can be withheld and the patient could still be counted in the numerator of the measure. No requirement of meaningful use supersedes any Federal, State or local law regarding the privacy of a person’s health information.”6

From a practical standpoint, the EP can exercise judgment to determine which information is appropriate to release via the personal health record portal and to whom (the patient, their proxy, or both).

Strategies for implementing personal health record access

Patient has full access to PHR, proxy has no access

This solution eliminates the risk of a proxy accessing confidential information in the personal health record because they would not be issued credentials to access the information. However, there is a possibility that a patient’s credentials could be used by a proxy or another individual to gain access to the confidential information in the personal health record.

Patient has full access to PHR, proxy has limited access

This solution attempts to minimize the risk of inadvertent confidentiality breaches by limiting the proxy’s access. This strategy is likely to be imperfect because clinical documentation is itself imperfect. All of the providers and administrative staff who are engaged in reproductive care would need to follow standard documentation practices to ensure that the confidential information is not stored in a form that could be accessible by a proxy in a personal health record. Furthermore, this area would require constant oversight from the personal health record team to ensure that upgrades in the documentation structure do not result in unintended disclosures of confidential information.
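The limited-proxy-access strategy can be illustrated with a minimal Python sketch. Everything named here (the `ChartEntry` type, the `CONFIDENTIAL_CATEGORIES` set, the role names, and the `visible_entries` function) is hypothetical and does not correspond to any vendor's portal interface. The sketch assumes each chart entry carries a category tag assigned at documentation time, and it shows why the filter is only as reliable as that tagging.

```python
from dataclasses import dataclass

# Hypothetical categories an institution might treat as confidential for minors.
CONFIDENTIAL_CATEGORIES = {"reproductive_health", "sti_testing", "contraception"}


@dataclass(frozen=True)
class ChartEntry:
    description: str
    category: str  # tag assigned by whoever documents the entry (assumption)


def visible_entries(entries, viewer_role):
    """Return the entries a viewer may see in the portal.

    viewer_role is "patient" or "proxy" (assumed role names). Patients see
    every entry; proxies see only entries not tagged as confidential.
    """
    if viewer_role == "patient":
        return list(entries)
    if viewer_role == "proxy":
        return [e for e in entries if e.category not in CONFIDENTIAL_CATEGORIES]
    raise ValueError(f"unknown viewer role: {viewer_role!r}")


chart = [
    ChartEntry("Penicillin allergy", "allergy"),
    ChartEntry("Chlamydia screening result", "sti_testing"),
    # A mis-tagged entry: confidential content filed under a general category.
    ChartEntry("Pregnancy test ordered", "laboratory"),
]

proxy_view = visible_entries(chart, "proxy")
```

In this toy chart the correctly tagged screening result is hidden from the proxy, but the mis-tagged pregnancy test is not. That is the documentation problem described above: the access rule can be simple, while the tagging discipline it depends on is hard to guarantee.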
Patient has full access and can grant full access to parent

Some electronic health records may not have the capability to create different access profiles for a patient’s personal health record account. In other words, all user accounts that are linked to a patient’s personal health record account have access to the same information. In this scenario, the EP or EH could limit all of the information that is available on the personal health record profile to ensure exclusion of confidential information. After reviewing the EP and EH required items, the only items that would

pose minimal risk of disclosing confidential information are: patient name, current allergy list and medication allergy list, smoking status and demographic information. Regardless of which implementation strategy is used for the personal health record, providers must be aware when confidential information in the chart should remain confidential.7 In other words, providers must be able to determine whether a patient’s chart contains restrictions on the way that the information should be shared.

Confidentiality Breaches

Upon implementation of personal health record accounts, the EP or EH should continue to monitor for security breaches and inadvertent disclosure of confidential information to unintended parties. The EP or EH could identify all elements of the electronic health record that could be considered confidential and develop a strategy for tracking specific user access to this information. In the event that users other than the patient or direct health care providers access the information, the security teams should be notified for continued surveillance. Providers can be informed of such access events when they occur so they can evaluate the situation and react proactively. This risk mitigation strategy may reduce the number of inadvertent confidentiality breaches and allow the providers to engage, on a more personal level, the users that appear to log into the personal health records.

Discussion

Adolescent medicine can be highly rewarding for healthcare providers, as they can assist a youth through the transition to adulthood. Part of this transition requires that the patient becomes aware of the growing responsibilities for their actions and the importance of communication and a support system. The forum for discussing issues surrounding reproductive care needs such as contraception, safe sex, and prevention of sexually transmitted diseases can be a difficult point for some patients and their families.
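The access-tracking strategy described under Confidentiality Breaches can be sketched as a simple review of an access log. This is a minimal illustration under stated assumptions: the `AccessEvent` fields, the role names, and the `CONFIDENTIAL_ELEMENTS` and `EXPECTED_ROLES` sets are invented for the example, and a real electronic health record would expose its audit trail in its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    user_id: str
    role: str        # assumed roles: "patient", "provider", "proxy"
    patient_id: str
    element: str     # chart element that was viewed


# Hypothetical list of chart elements the institution has marked confidential.
CONFIDENTIAL_ELEMENTS = {"sti_result", "pregnancy_test", "contraception_rx"}
# Roles expected to view confidential elements (the patient and direct providers).
EXPECTED_ROLES = {"patient", "provider"}


def flag_events(events):
    """Return events where a confidential element was viewed by an unexpected role."""
    return [
        e for e in events
        if e.element in CONFIDENTIAL_ELEMENTS and e.role not in EXPECTED_ROLES
    ]


log = [
    AccessEvent("u1", "patient", "p100", "sti_result"),
    AccessEvent("u2", "proxy", "p100", "allergy_list"),
    AccessEvent("u2", "proxy", "p100", "pregnancy_test"),
    AccessEvent("u3", "provider", "p100", "pregnancy_test"),
]

flagged = flag_events(log)
```

Only the proxy's view of the pregnancy test is flagged here. In practice the flagged events would go to the security team and the treating provider for follow-up rather than being acted on automatically.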
While the personal health record has several potential pitfalls in this area, it also has the potential to initiate difficult discussions around reproductive care and safe-sex practices. Some patients and their families may become alarmed when considering who should have the login credentials for the patient’s personal health record: imagine a mother and her 12 year old child at a pediatrician’s office, and the child is given a password and instructed not to share it with anyone else, not even with his


Spring 2015 Volume 14, Issue 2


or her parents. An astute pediatrician may capitalize on the opportunity to provide a supportive environment to both the patient and his or her parent or guardian. After all, the truly difficult conversations will not be the conversations about the user access privileges to the personal health record, but rather the conversations about a positive test result for a pregnancy test or a sexually transmitted disease. Patients who engage their providers on discussions about reproductive care should be assured about the confidential nature of the interaction. Providers should review the implications of their encounter on the personal health record, particularly if the patient’s proxy has access to part of or all of the record. Providers may generally encourage patients to engage their family members in the discussion, and may even serve as a mediator in the disclosure of the confidential information. In some cases, the personal health record may serve as a springboard for these discussions and disclosures during an office visit or other healthcare encounter between a patient and provider. EP’s and EH’s need to pay close attention to their strategy to implement patient and proxy access to the personal health record portals. Providers who simply activate patient and proxy access to the personal health record may face situations where the patient’s confidentiality may be undermined. In these scenarios, patient’s trust of the health care

Table 1: Meaningful Use Stage 2 Criteria for Reporting Personal Health Record by EP’s and EH’s.D,E

provider may be damaged, and the patient may withdraw from care or withhold important information from the provider. On the other hand, parents may become upset with a provider who is perceived to be obstructing access to their child’s medical record. The features for personal health record portals available from electronic health record vendors are likely to vary, and may or may not allow customization to match the institution or provider’s planned requirements for adolescent and proxy access. Ideally, the personal health record access can be customized based on the user profile to allow

for the lowest risk of confidentiality breaches. Ultimately, the primary goal of healthcare providers is to provide services to patients who need them. In the case of adolescents who seek or require reproductive care services, all efforts to reduce barriers to access will be beneficial to ensure timely treatment. The personal health record can serve as the testing ground for the discussions about reproductive care between a provider and the patient and proxy. Finally, the personal health record should be configured in such a way as to minimize the risk of potential disclosure of confidential information.

Jorge A. Gálvez MD is an anesthesiologist at The Children’s Hospital of Philadelphia. His primary area of interest is the application of electronic health records for clinical decision support, particularly for children requiring surgery.

Allan F. Simpao MD is an anesthesiologist at The Children’s Hospital of Philadelphia. His primary area of interest is in developing data models for physiologic data and conducting research on perioperative surgical quality outcome metrics.

Mohamed A. Rehman MD is an anesthesiologist and director of perioperative information services at The Children’s Hospital of Philadelphia. Dr. Rehman continues to advocate for unified electronic health records nationally through the American Academy of Pediatrics, the American Society of Anesthesiologists, the Society for Pediatric Anesthesia and the Society for Technology in Anesthesia.

Eligible Provider: Patient name. Provider's name and office contact information. Current and past problem list. Current and past procedures. Laboratory test results. Current medication list and medication history. Current medication allergy list and medication allergy history. Vital signs (height, weight, blood pressure, BMI, growth charts). Smoking status. Demographic information (preferred language, sex, race, ethnicity, date of birth). Care plan field(s) including goals and instructions, and any known care team members including the primary care provider (PCP) of record.

Eligible Hospital: Patient name. Admit and discharge date and location. Reason for hospitalization. Care team including the attending of record as well as other providers of care. Procedures performed during admission. Current and past problem list. Current medication list and medication history. Current medication allergy list and medication allergy history. Vital signs at discharge. Laboratory test results (available at time of discharge). Summary of care record for transitions of care or referrals to another provider. Care plan field(s), including goals and instructions. Discharge instructions for patient.

1. Dickens BM, Cook RJ. Adolescents and consent to treatment. Int J Gynaecol Obstet [Internet]. 2005 May [cited 2015 Jan 31];89(2):179–84. Available from: http://www.
2. Maradiegue A. Minor’s rights versus parental rights: Review of legal issues in adolescent health care. J Midwifery Women’s Heal [Internet]. 2003 Jan [cited 2015 Mar 3];48(3):170–7. Available from: pubmed/12764301
3. Young MG. Medicine and the law: treatment of minors--consent requirements. Tex Med [Internet]. 1983 Jan [cited 2015 Mar 3];79(1):72–4. Available from: http://www.
4. Akinbami LJ, Gandhi H, Cheng TL. Availability of adolescent health services and confidentiality in primary care practices. Pediatrics [Internet]. 2003 Feb [cited 2015 Mar 3];111(2):394–401. Available from: http://
5. Blachman DR, Lukacs S. America’s Children: Key National Indicators of Well-Being. Ann Epidemiol. 2009;19:667–8.
6. National Archives and Records Administration, Office of the Federal Register. Federal Register, V. 77, No. 171, Tuesday, September 4, 2012. Department of Health and Human Services. Government Printing Office. 53769–64352.
7. Council on Clinical Information Technology. Policy Statement--Using personal health records to improve the quality of health care for children. Pediatrics [Internet]. 2009 Jul [cited 2015 Feb 26];124(1):403–9. Available from: pubmed/19564327


Photo from Wikimedia Commons, Creative Commons Attribution. FEATURES

The Role of Health Information Management Professionals in Data and Information Governance

Allan F. Simpao, MD, Jorge A. Gálvez, MD, Mohamed A. Rehman, MD

The federal government has incentivized physicians and health care institutions to adopt electronic medical record systems. Health information management professionals play a key role in electronic health record data and information governance, the use of data for reporting purposes, as well as the use of data analytics. Compliance standards are a dynamic environment that can significantly impact information governance.

Introduction

The U.S. federal government, through the HITECH Act of 2009, has incentivized physicians and health care institutions to adopt electronic health record (EHR) systems in order to improve health care quality and efficiency.1 This legislation, coupled with advances in computer and monitoring technology, has facilitated the collection and storage of vast amounts of patient data in electronic form.2 This health care “Big Data” has generated a need for health information management (HIM) professionals who can manage and curate the data while adapting to ever-advancing technology and navigating a dynamic legal and regulatory environment.3 The American Health Information Management Association defines HIM as “the practice of acquiring, analyzing, and protecting digital and traditional medical

information vital to providing quality patient care.”4 HIM professionals must connect clinical, operational and administrative functions, and have a broad knowledge base and skillset that includes managing information, utilizing data for reporting purposes, and understanding data analytics and compliance standards.5

Managing Data and Information Governance

The rise of Big Data in health care has brought with it expectations of big benefits and of solutions that let health care organizations personalize care and improve quality for patients.6 However, such great expectations carry significant responsibilities regarding understanding Big Data demands, such as data transport, data standards, and mapping processes.6 Organizations that wish to leverage information technology for quality and cost saving improvements must

ensure proper data and information storage, analysis, transmission, access, use, protection and governance.7 Data governance is the exercise of authority and control over the management of data assets, and HIM professionals need to manage where data come from, where they go, and who maintains their stewardship.8 Information governance is “the specification of decision rights and an accountability framework to ensure appropriate behavior in the valuation, creation, storage, use, archiving, and deletion of information.”9 Thus, HIM professionals play an integral role in capturing, maintaining, providing, and protecting quality data and information for use in clinical and administrative decision-making. HIM professionals should be familiar with best practices for EHR data capture and data validation principles.10 To fulfill their role of ensuring data quality, HIM professionals need to understand the limitations of the data at a granular level. For example, data irregularities may exist throughout an organization due to inconsistent naming conventions or definitions,11 or physiological data artifacts may occur due to phenomena inherent to monitoring devices (e.g. movement artifact with pulse oximetry).12 HIM professionals must be aware of potential sources of data inconsistency in order to address them where possible and must notify users of the potential risk of drawing faulty conclusions


from erroneous data. HIM professionals’ role as health care data and information stewards is particularly important, as health care has strict requirements for data privacy, confidentiality, and security. For example, HIM professionals in the U.S. should be familiar with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule,13 which was intended to address potential threats to patient privacy that are posed by the computerization and standardization of medical records,14 as well as best principles and practices for data security and confidentiality.15 HIM professionals’ specific responsibilities within an organization may include determining who can access EHR data, how data integrity will be maintained, and how privacy and security guidelines will be applied to EHR usage.16 For instance, if an EHR stores its data in a decentralized online “cloud”, then HIM professionals must ensure that the data is encrypted and secure yet also accessible to permitted users.17 Data and security breaches can be costly in terms of fines and lawsuits as well as loss of public trust in a given care provider or health care system.18,19 The U.S. Department of Health and Human Services lists numerous examples of resolution agreements and civil money penalties for HIPAA violations on its web site.20 Thus, HIM professionals’ role in managing data and information governance is to ensure quality and security throughout the entire data and information life cycle, from data generation and validation to information security and privacy. This all-encompassing

responsibility applies not only to data and information used internally within an enterprise, but also to data shared externally for reporting purposes.

The Secondary Use of Data for Reporting Purposes

As EHR data across multiple health care organizations has proliferated, so has the secondary use of patient data for reporting purposes. The goal of such efforts is to aggregate data across multiple institutions for the benefit of more robust (and relevant) subsequent data analysis and to band together to reap the benefits of Big Data. Local, state, or national governments may institute these data registries for disease surveillance, prevention, or treatment strategies at the population level.21 These collaborations may also take place on an international scale.22 Furthermore, research studies of rare diseases or outcomes may often be underpowered when conducted at single institutions, thereby necessitating collaborative efforts to obtain an adequately powered research study. HIM professionals must be aware of the implications and requirements of sharing patient data via registries and health information exchanges, including ensuring data quality, proper de-identification of patient data and standards for interoperability.23 As data stewards, HIM professionals should

Photo from Wikimedia Commons, Creative Commons Attribution. Dr. Daniel Sands (right) at Beth-Israel Deaconess Medical Center shares the electronic health record of his patient.



remain vigilant for the risk of HIPAA noncompliance when sharing secondary data and maintain data security and privacy.13

Health Care Data Analytics

The accumulation of health care Big Data at the enterprise level and beyond has fueled a demand for analytics applications to transform data into meaningful, actionable information.24 Taking actions involving patient care mandates strict data integrity management,25 as faulty data may result in erroneous conclusions that can potentially cause patient harm. HIM professionals should be familiar with the various types of analytics methods, which can be stratified into four levels. Level I analytics methods involve the capture and analysis of discrete clinical and operational measures, while level II analytics uses comparative data to understand process and outcome variation. Level III analytics involves analyzing population-level data and longitudinal databases.26 Predictive analytics, or level IV analytics, is the application of statistical techniques to determine the likelihood of certain events occurring together, and can be used to create models of likely future events.26 Patient readmission data can be analyzed for statistical patterns, and then predictive modeling methods can be employed to identify patients who are at risk for readmission in order to reduce the likelihood of (and the costs associated with) readmission.27,28 Predictive analytics methods can be applied to EHR Big Data to develop personalized treatments such as for hyperlipidemia.29 Predictive analytics can also be utilized for evidence-based decision making at the enterprise level, not only for operational reporting but also current and future system-wide management and planning.30,31 Because of its implications for the well-being of both patients and the enterprise, predictive modeling of readmission risk remains an area of active research, as model performance and generalizability can be suboptimal and vary depending on design factors.32,33
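To make the idea of level IV (predictive) analytics concrete, the following is a minimal Python sketch of a readmission-risk model. It assumes logistic regression fitted by plain gradient descent, which is one common choice and not necessarily the method used in the cited studies. The two features (number of prior admissions and length of stay in tens of days) and the eight training rows are invented toy data, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this patient under the current weights.
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Estimated probability of readmission for one patient."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy data (invented): [prior admissions, length of stay / 10 days].
X = [[0, 0.2], [0, 0.3], [1, 0.4], [1, 0.5],
     [2, 0.6], [3, 0.9], [3, 1.0], [4, 1.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = readmitted within 30 days

w, b = fit_logistic(X, y)
low_risk = predict_risk(w, b, [0, 0.2])
high_risk = predict_risk(w, b, [4, 1.2])
```

A model fitted to eight rows says nothing about real patients. The point is the workflow: historical outcomes are used to estimate weights, and the weights are then applied to a new patient's features to produce a risk estimate that can prompt follow-up. The sensitivity to data sources and cohort definitions noted above applies to every step of that workflow.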
Compliance Standards and Their Impact on Information Governance

As data and information stewards, HIM professionals also play a role in supporting compliance and legal efforts by organizing data for retrieval and retention.34 Hospitals

and health care organizations are responsible for maintaining secure, comprehensive longitudinal patient records while maintaining compliance with local, state, and federal regulations. This includes practices to maintain accreditation by regulatory bodies as well as documentation to ensure compliance with payors such as Medicare and Medicaid. HIM professionals must overcome these challenges while dealing with the deluge of structured and unstructured data and information that is created by thousands of people within an enterprise.23 Changes in legislation or payor practices can “move the goalposts” and change the rules

and requirements that institutions must follow. Thus, HIM professionals should remain abreast of the dynamic conditions in which they manage data and information. Furthermore, the consequences of poor data and information governance are not limited to regulatory warnings and fines or reductions in payor reimbursement. Medical identity theft and fraud are also potential problems that can occur when compliance standards are lax or nonexistent.35

Conclusion

The proliferation of computer and information technology throughout health care has produced massive amounts of

Jorge A. Gálvez MD is an anesthesiologist at The Children’s Hospital of Philadelphia. His primary area of interest is the application of electronic health records for clinical decision support, particularly for children requiring surgery.

Allan F. Simpao MD is an anesthesiologist at The Children’s Hospital of Philadelphia. His primary area of interest is in developing data models for physiologic data and conducting research on perioperative surgical quality outcome metrics.

patient data that is being mined and analyzed for meaningful information. Hospitals and health care systems are seeking and implementing ways to leverage their Big Data to improve myriad patient care, financial, data exchange and compliance processes. Consequently, HIM professionals must be cognizant and capable of a wide range of tasks and responsibilities while managing data and information governance, which requires the navigation of dynamic conditions while acting as responsible stewards of data and information.

Mohamed A. Rehman MD is an anesthesiologist and director of perioperative information services at The Children’s Hospital of Philadelphia. Dr. Rehman continues to advocate for unified electronic health records nationally through the American Academy of Pediatrics, the American Society of Anesthesiologists, the Society for Pediatric Anesthesia and the Society for Technology in Anesthesia.

1. Jamoom E, Beatty P, Bercovitz A, Woodwell D, Palso K, Rechsteiner E. Physician adoption of electronic health record systems: United States. NCHS Data Brief 2011;1-8. 2. Costa FF. Big data in biomedicine. Drug Discov Today. 2014; 19:433-40. 3. Eramo LA. Health care’s data revolution: How data is changing the industry and reshaping HIM’s roles. JAHIMA. 2013; 84:26-30, 32. 4. What is Health Information? Available from Last accessed March 3, 2015. 5. Kloss LL. Leading innovation in enterprise information governance. JAHIMA. 2013; 84:34-8. 6. Fernandes L, O’Connor M, Weaver. Big data, bigger outcomes. JAHIMA. 2012; 83:38-43. 7. Rode D. Leading the health IT revolution: HIM revolutionaries needed to ensure quality death data are utilized. JAHIMA. 2012; 83:18-20. 8. Reeves M, Bowen R. Governing health care’s most valuable asset. JAHIMA. 2012; 83:62-5. 9. IG 101: What is Information Governance? Available from http://journal.ahima. org/2013/12/04/ig-101-what-is-information-governance/. Last accessed March 3, 2015. 10. Health data analysis toolkit. AHIMA. Available from public/documents/ahima/bok1_048618.pdf. Last accessed March 3, 2015. 11. Osborne K, Spellman L, Warner D. Setting the norm. HIM increasingly involved in developing and using standards. J AHIMA. 2014; 85:52-3. 12. Fouzas S, Priftis KN, Anthracopoulos MB. Pulse oximetry in pediatric practice. Pediatr. 2011; 128: 740-52. 13. HIPAA Privacy Rule, 45 CFR 164.530; HIPAA Security Rule, 45 CFR 164.308(a)(5)(i). 14. Shuren AW, Livsey K. Complying with the Health Insurance Portability and Accountability Act. Privacy standards. AAOHN J. 2001; 49:501-7. 15. The 10 Security Domains. AHIMA. Available

from public/documents/ahima/bok1_049602.hcsp?dDocName=bok1_049602. Last accessed on March 3, 2015. 16. Defining the Personal Health Information Management Role. AHIMA. Available from public/documents/ahima/bok1_038473.hcsp?dDocName=bok1_038473. Last accessed on March 3, 2015. 17. Rodrigues JJ1, de la Torre I, Fernández G, López-Coronado M. Analysis of the security and privacy requirements of cloud-based electronic health records systems. J Med Internet Res. 2013; 15:e186. 18. U.S. Department of Health and Human Services Health Information Privacy. Available from Last accessed on March 3, 2015. 19. Agaku IT, Adisa AO, Ayo-Yusuf OA, Connolly GN. Concern about security and privacy, and perceived control over collection and use of health information are related to withholding of health information from healthcare providers. J Am Med Inform Assoc. 2014; 21:374-8. 20. U.S. Department of Health and Human Services Health Information Privacy. Available from enforcement/examples/. Last accessed on March 3, 2015. 21. Centers for Disease Control and Prevention Immunization Information Systems. Available from Last accessed on March 3, 2015. 22. Galluccio F, et al. Registries in systemic sclerosis: a worldwide experience. Rheum, 2011. 50, 60-68. 23. White SE. De-identification and the sharing of big data. JAHIMA. 2013; 84:44-7. 24. Ryu S, Song TM. Big data analysis in health care. Healthc Inform Res. 2014; 20:247-8. 25. White AE, Taylor, LB. Accountable care and data analytics emerging in health care. JAHI-

MA. 2012; 83:56-58. 26. Kloss LL. Quality improvement and data analytics in the era of the electronic health record. 2011 AHIMA Convention Proceedings. 2011. 27. Evans M. Health care’s ‘moneyball’. Predictive modeling being tested in data-driven effort to strike out hospital readmissions. Mod Healthc. 2011; 41:28-30. 28. Shams I, Ajorlou S, Yang K. A predictive analytics approach to reducing 30-day avoidable readmissions among patients with heart failure, acute myocardial infarction, pneumonia, or COPD. Health Care Manag Sci. 2015; 18:1934. 29. Zhang P, Wang F, Hu J, Sorrentino R. Towards personalized medicine: leveraging patient similarity and drug similarity analytics. AMIA Jt Summits Transl Sci Proc. 2014 Apr 7;2014:1326. eCollection 2014. 30. Fihn SD, Francis J, Clancy C, Nielson C, Nelson K, Rumsfeld J, Cullen T, Bates J, Graham GL. Insights from advanced analytics at the Veterans Health Administration. Health Aff (Millwood). 2014; 33:1203-11. 31. Garay J, Cartagena R, Esensoy AV, Handa K, Kane E, Kaw N, Sadat S. Strategic analytics: towards fully embedding evidence in healthcare decision-making. Healthc Q. 2015;17 Spec No:23-7. 32. Walsh C, Hripcsak G. The effects of data sources, cohort selection, and outcome definition on a predictive model of risk of thirty-day hospital readmissions. J Biomed Inform. 2014; 52:418-26. 33. Kansagara D, Englander H, Salanitro A, Kagen D, Theobald C, Freeman M, Kripalani S. Risk prediction models for hospital readmission: a systematic review. JAMA. 2011; 306:1688-98. 34. Nunn S. Driving compliance through data governance. JAHIMA. 2009; 80:50-1. 35. Mancilla D, Moczygemba J. Exploring medical identity theft. Perspect Health Inf Manag. 2009; 6:1e.




The Development of MyCancerJourney and the Incorporation of Predictive Analytics to Improve Cancer Patient Care

Jay F. Piccirillo, MD1,2, Dorina Kallogjeri, MD, MPH1,2, Sara Kukuljan, BS RN CCRC1, and Robert Palmer, MBA3

Newly diagnosed cancer patients seek relevant and accurate information regarding prognosis, treatment effectiveness, and quality of life. The current evidence base for cancer care is not designed to answer many of these questions. MyCancerJourney is a web-based portal to capture, organize, analyze, and present patient-specific information for newly diagnosed adult cancer patients at the time of diagnosis and for cancer patients throughout their entire survivorship experience. MyCancerJourney is designed to be the electronic hub of the cancer patient’s experience and can support the type of patient-centric model of cancer care that health care reform demands and cancer patient groups advocate.

As the population ages and more Americans are diagnosed with cancer, the challenges of cancer diagnosis, provision of accurate prognostic information, and treatment choices will become even greater than at present. In 2014, approximately 1.7 million Americans developed a new cancer and nearly 14.5 million Americans are living with a diagnosis of cancer; in 2024, it is expected that 19 million Americans will be living with a diagnosis of cancer.1,2 As a result of the large number of newly diagnosed cancer patients and cancer survivors, billions of dollars will be spent on cancer treatments and survivorship issues. Unfortunately, in many cases, these expenditures will not result in the expected outcomes.3,4 We believe a significant reason cancer care is not producing the benefits we expect is that physicians cannot access relevant evidence to permit personalized treatments and to provide evidence-based care.
Equally important, patients cannot benefit from the wisdom of the experience of millions of other cancer patients due to inadequate information systems, data capture, processing, and sharing. A cancer diagnosis is terrifying and the treatment choices are complex with profound survival, quality of life, and economic implications. Patients and their families are provided with complex and often unclear or contradictory information and are therefore left to make overwhelming treatment decisions without adequate data. Currently, available evidence-based information does not support a personalized treatment plan and the evidence base of cancer care is not sound. There are multiple reasons why the evidence base for cancer care is unsound. The prognostic estimates and treatment decision-making in cancer care are primarily based on the results from clinical trials and statistics published by the National Cancer Institute5 and the American Cancer Society.1 Often, these data relate mortality to site and morphologic spread of a tumor at the time of diagnosis and fail

to include patient-specific factors, such as patient age, gender, comorbidity, and cancer-related symptom severity.6 As a result, survival among participants enrolled in clinical trials is often found to be significantly different from patients with similar stage cancer enrolled in observational studies.7-12 The low enrollment of and participation by newly-diagnosed cancer patients in clinical trials, estimated to be approximately 4% of adult cancer patients, is described as a national health issue and undermines the usefulness and generalizability of clinical trial results.13-15 The low participation by ethnic and racial minorities is very concerning since members of these groups generally experience disproportionally higher mortality rates relative to the entire U.S. population. The lack of widespread participation in clinical trials thus leads to problems in the scientific quality of the research, generalizability of the results12, and speed of scientific discovery.16-18 Additional problems with clinical trials serving as the evidence base for cancer care result from the fact that clinical trials are not designed to provide prognostic information and rarely incorporate quality of life and other patient-reported outcome measures. And finally, the well-documented problem of publication bias prevents physicians and patients from understanding the true risks and benefits of various treatments.19 Today, healthcare providers and their patients do not have access to the trusted, relevant clinical information required for informed treatment decision-making.
As a possible solution to the lack of generalizability and applicability of clinical trial results to individual cancer patients, Elting et al12 encouraged the use of population-based trials of effectiveness among “all comers.” Giving suitable attention to the fundamental clinical and prognostic distinctions among different patients with similar cancers, the study of the outcomes of patients treated in their natural clinical setting will provide patients and

physicians with the critical information needed for truly informed clinical decision making.

Description of MyCancerJourney

MyCancerJourney is a web-based portal to capture, organize, and present patient-specific information for newly diagnosed adult cancer patients at the time of diagnosis and throughout the entire survivorship experience. MyCancerJourney consists of three separate and unique programs: MyInsights, MyJournal, and MyCommunity. Trained cancer navigators are available to assist users with supplying the cogent information and understanding the various outputs from MyCancerJourney.

MyInsights is an interactive predictive model that generates patient-specific survival curves for newly-diagnosed adult cancer patients. The overall survival estimates are based on the Cox proportional hazards analysis of the outcomes of patients derived from multiple leading cancer centers in the U.S. Personalized survival curves based on demographic, clinical, and tumor characteristics are the initial display of survival estimates. To initiate the personalized survival curve presentation, the user provides the appropriate values for the cogent prognostic factors (e.g., age, gender, race, comorbidity score, cancer site, cancer stage, tumor grade, and, where appropriate, tumor markers). Online guidance is provided throughout the portal for explanation and use of medical terms. The survival curve is presented as a typical x-y graph with survival duration, in years, on the x-axis and survival percentage on the y-axis. A comparison age- and gender-matched survival curve for the total population appears in each patient-specific survival graph. In a consecutive step, the impact on survival of up to four different treatment options is displayed.
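To make the Cox proportional hazards idea concrete, the following is a minimal sketch of how a patient-specific survival curve can be derived from a baseline curve. It is not the MyInsights implementation: the function name, the baseline survival values, the factor names, and the coefficients are all hypothetical and chosen only for illustration. Under the model, the patient's survival at each time point is the baseline survival raised to the power of the patient's hazard ratio.

```python
import math


def patient_survival_curve(baseline_survival, coefficients, covariates):
    """Return S(t | x) = S0(t) ** exp(beta . x) for each time point.

    baseline_survival: survival probabilities for the reference patient.
    coefficients: log hazard ratios per prognostic factor (hypothetical).
    covariates: the patient's values; missing factors are treated as 0.
    """
    linear_predictor = sum(
        coefficients[name] * covariates.get(name, 0.0) for name in coefficients
    )
    hazard_ratio = math.exp(linear_predictor)
    return [s0 ** hazard_ratio for s0 in baseline_survival]


# Hypothetical baseline survival at years 0 through 5 (not real registry data).
BASELINE = [1.00, 0.90, 0.82, 0.76, 0.71, 0.67]

# Hypothetical log hazard ratios; real values would come from fitting the model.
COEFFICIENTS = {
    "age_decades_over_60": 0.25,
    "comorbidity_score": 0.30,
    "stage": 0.50,
}

patient = {"age_decades_over_60": 1.0, "comorbidity_score": 2.0, "stage": 1.0}
curve = patient_survival_curve(BASELINE, COEFFICIENTS, patient)

for year, survival in enumerate(curve):
    print(f"year {year}: {survival:.1%}")
```

A patient whose covariates are all zero reproduces the baseline curve, which plays the role of the comparison curve in the sketch; any positive linear predictor pulls the curve below it.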
Upon each “click,” a pictograph appears and provides the survival estimate in a format different from the survival curve and a text box provides the script that explains, in a patient-friendly language, the information provided in the graph. The primary goal of the presentation is to allow the user to compare survival outcomes for different treatment options among patients with similar cogent prognostic factors. When combined with functional wellbeing, quality of life, and financial information for each treatment option, MyInsights will allow for more comprehensive assessment and appropriate patient-specific decisions. MyJournal is where patients provide basic demographic, clinical, tumor, treatment, functional well-being, and quality of life

1 Department of Otolaryngology- Head and Neck Surgery, Washington University School of Medicine, St. Louis, Missouri, USA. Phone: (314)362-4125, Fax:(314)362-7522 2 Clinical Outcomes Research Office, Washington University School of Medicine, St. Louis, Missouri, USA. Phone: (314)362-4125, Fax: (314)362-7522 3. PotentiaMed, 901 South Mopac Expressway, Plaza One, Suite 300, Austin, Texas, USA. Phone: (512)708-9000, Fax: (512)708-9018



information. Patients record their journey from diagnosis through survivorship and can compare their experience with patients with similar cancers and demographic and prognostic factors. In addition, patients participate in data collection that will help other cancer patients understand the benefits, risks, side effects, and outcomes of various cancer treatments. The conceptual model for the development of MyJournal is based on the framework for cancer comparative effectiveness research developed through the review of current models and semi-structured discussions with clinicians.20 The cogent data is collected through the use of discrete modules, consisting of a limited number of questions and utilizing specially designed drop-down and fill-in prompts, that were created to capture the cancer care continuum from the patient’s perspective. The stored data is then available for individual patient reporting or as part of group data. MyJournal was designed to capture, aggregate, and share large amounts of cancer information on the premise that collective wisdom can lead to profound improvements in cancer care.

MyCommunity is a supportive, online social networking community for users to meet, share ideas, gain resources, and manage and cope with the issues related to cancer and its treatment. MyCommunity allows the user to create personalized communities based on type of cancer, age, gender, and location along the cancer care continuum. In this way, the members of the community are better able to understand the user’s situation and offer valuable treatment advice and insights. MyCommunity transforms cancer patients and survivors into advocates and “peer coaches”; every member can ask questions and offer neighborly advice and support.

There are multiple challenges facing the development and widespread integration of MyCancerJourney into patient care. The first is obtaining continuous feeds of outcomes data from more providers across the country to enhance the power of our predictions. Second, consistently collecting complete patient-reported outcomes and treatment information has historically been a challenge. We will work on novel approaches to obtain patient-reported information in a consistent fashion. Third, modeling the cancer patient experience in MyCancerJourney takes considerable knowledge of clinical cancer care and requires remaining up-to-date with changes. Fourth, the information shared on MyCommunity will need to be monitored, as it is likely to be of a sensitive nature, and defining what is suitable content for public consumption will require ongoing thoughtful review and standards. And finally, we must ensure that newly diagnosed cancer patients and their families are aware of MyCancerJourney.

Conclusion

MyCancerJourney is a cloud-based advanced

analytic platform that provides a patient-friendly environment to utilize data already contained within hospital-based tumor registries, capture patient-specific demographic, clinical, treatment, and outcome information, and provide this information back to patients to use in a variety of ways, including comparative treatment effectiveness assessment. All collected data and patient-reported outcomes, including periodic follow-up assessments of general health, disease-specific functional status, quality of life, and satisfaction with care are stored “in the cloud” within secure servers and displayed in graphic formats that allow ease of understanding and interpretation. MyCancerJourney leverages the power of

collective wisdom along with the power of predictive analytics to create a paradigm shift in cancer care by empowering patients with unbiased treatment and outcome information to support more informed decisions. MyCancerJourney will provide the patient-centric model of cancer care and ensure the support that health care reform demands and cancer patient groups advocate. We believe the connectedness of cancer patients that results from the use of these web-based tools will empower patients during their treatment and survivorship experience. In addition, the information captured in MyCancerJourney will help and support other cancer patients in important and previously unimagined ways.

Dr. Piccirillo is Professor and Vice Chair for Research in the Department of Otolaryngology-Head and Neck Surgery at Washington University and Co-Director of the Cancer Quality Improvement Committee of the Siteman Cancer Center. He has conducted extensive research into the incorporation of comorbidity information into cancer statistics.

Sara Kukuljan, BS, RN, CCRC is the Research Compliance Coordinator/ Educator for the Department of Otolaryngology at Washington University School of Medicine in St. Louis. Ms. Kukuljan is a Registered Nurse, Certified Clinical Research Coordinator, and serves as President of the Greater Missouri Chapter for the Association of Clinical Research Professionals.

Dr. Kallogjeri is the Research Statistician in the Department of Otolaryngology-Head and Neck Surgery at Washington University. She has over 12 years of experience in clinical research. Dr. Kallogjeri is the instructor for the introductory and intermediate level statistical courses taught for pre- and post-doctoral students.

Mr. Robert Palmer, MBA is President and CEO of PotentiaMetrics and has over 20 years of CEO experience with startup, growth stage, and established companies. He conceived and led the development of the firm’s PotentiaMED analytics products, including comparative effectiveness analysis and predictive analytics for clinical and health economic outcomes.

1. Siegel R, Ma J, Zou Z, Jemal A. Cancer statistics, 2014. CA Cancer J Clin. 2014;64:9-29.
2. DeSantis CE, Lin CC, Mariotto AB et al. Cancer treatment and survivorship statistics, 2014. CA Cancer J Clin. 2014;64:252-271.
3. Brawley OW, Goldberg P. How We Do Harm. A doctor breaks ranks about being sick in America. New York, NY: St. Martin’s Griffin; 2011.
4. Leaf C. The Truth in Small Doses. Why We’re Losing the War on Cancer-and How to Win It. Simon & Schuster; 2014.
5. U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999-2011 Incidence and Mortality Web-based Report. United States Cancer Statistics. 2014. U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute.
6. Piccirillo JF, Feinstein AR. Clinical symptoms and comorbidity: significance for the prognostic classification of cancer. Cancer. 1996;77:834-842.
7. Davis S, Wright PW, Schulman SF et al. Participants in prospective, randomized clinical trials for resected non-small cell lung cancer have improved survival compared with nonparticipants in such trials. Cancer. 1985;56:1710-1718.
8. Antman K, Amato D, Wood W et al. Selection bias in clinical trials. J Clin Oncol. 1985;3:1142-1147.
9. Bertelsen K. Protocol allocation and exclusion in two Danish randomised trials in ovarian cancer. Br J Cancer. 1991;64:1172-1176.
10. Braunholtz DA, Edwards SJ, Lilford RJ. Are randomized clinical trials good for us (in the short term)? Evidence for a “trial effect.” J Clin Epidemiol. 2001;54:217-224.
11. Peppercorn JM, Weeks JC, Cook EF, Joffe S. Comparison of outcomes in cancer patients treated within and outside clinical trials: conceptual framework and structured review. Lancet. 2004;363:263-270.
12. Elting LS, Cooksley C, Bekele BN et al. Generalizability of cancer clinical trial results: prognostic differences between participants and nonparticipants. Cancer. 2006;106:2452-2458.
13. Brawley OW. The study of accrual to clinical trials: can we learn from studying who enters our studies? J Clin Oncol. 2004;22:2039-2040.
14. Murthy VH, Krumholz HM, Gross CP. Participation in cancer clinical trials: race-, sex-, and age-based disparities. JAMA. 2004;291:2720-2726.
15. Stewart JH, Bertoni AG, Staten JL, Levine EA, Gross CP. Participation in surgical oncology clinical trials: gender-, race/ethnicity-, and age-based disparities. Ann Surg Oncol. 2007;14:3328-3334.
16. Joffe S, Weeks JC. Views of American oncologists about the purposes of clinical trials. J Natl Cancer Inst. 2002;94:1847-1853.
17. Swanson GM, Bailar JC, III. Selection and description of cancer clinical trials participants--science or happenstance? Cancer. 2002;95:950-959.
18. Newman LA, Roff NK, Weinberg AD. Cancer clinical trials accrual: missed opportunities to address disparities and missed opportunities to improve outcomes for all. Ann Surg Oncol. 2008;15:1818-1819.
19. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet. 1991;337:867-872.
20. Carpenter WR, Meyer AM, Abernethy AP, Sturmer T, Kosorok MR. A framework for understanding cancer comparative effectiveness research data needs. J Clin Epidemiol. 2012;65:1150-1158.

Spring 2015 Volume 14, Issue 2



Predictive Analytics: Advancing Precision and Population Medicine

Dr. George Savage, Co-Founder and Chief Medical Officer, Proteus Digital Health

An enormous chasm exists in chronic disease management between treatment benefits demonstrated under controlled clinical circumstances (efficacy) and results seen in the real world (effectiveness). Predictive analytics methods have a key role to play in bridging this efficacy-effectiveness gap, helping physicians, nurses and health systems deliver superior outcomes to patients at acceptable cost. However, in order to enjoy the benefits of Big Data in health care we must first ensure that we are collecting good data. Good data is acquired routinely, wirelessly, and analyzed in real-time, enabling personalized and timely clinical interventions. Good data is scalable, relying on smartphones rather than specialized computer hardware. Big Data today can tell us that 50 percent of patients take their medicines improperly. Good data gives us the opportunity to do something to help individual patients, creating better population data in the process. We are on the verge of a new era of Digital Health, with the potential for improved clinical outcomes and lower healthcare costs.

An enormous chasm exists in chronic disease management between treatment benefits demonstrated under controlled clinical circumstances (efficacy) and results seen in the real world (effectiveness). With the prevalence of chronic conditions affecting 133 million people in the U.S.1 and accounting for 46% of the global burden of disease3, ensuring therapies are effective in the real world is a significant challenge and opportunity to make an impact. Predictive analytics techniques applied to data streaming 24x7 from a new class of wireless devices have a key role to play in bridging this efficacy-effectiveness gap, helping physicians, nurses and health systems deliver superior outcomes to chronic disease patients at acceptable cost.
Chronic disease management accounts for 75% of healthcare expenditures in the U.S., with one in four adults managing more than one chronic condition.1,2 Given the long-term nature of these conditions, effectiveness in the real world, outside of the clinical setting, determines whether the patient will derive the intended benefit out of their medicine. Despite the availability of numerous therapeutic options, the number of patients with uncontrolled chronic conditions is staggeringly high. For instance, only half of all hypertensive and diabetic patients have their blood pressure and blood glucose levels controlled at recommended levels.4,5 These figures indicate that there is a real problem in chronic disease management today and that we need to better bridge the gap between efficacy and effectiveness in the real world. The challenges of closing this chasm include the multiple patient and provider behavioral elements and other clinical and demographic considerations contributing to the gap: non-adherence to medications, health illiteracy, lack of patient engagement, provider inertia, suboptimal treatment regimens, and comorbidities.

Some may argue that medical data today is most meaningful when applied to populations, not individuals, and lags clinical events. A 40 percent tumor response rate in a population becomes a binary outcome, one way or the other, for an individual. In another example, a low medication possession ratio explains organ rejection for a particular transplant recipient only after the fact, when it is too late to intervene, without any indication of why adherence became an issue. Perhaps the population emphasis that is so essential for establishing therapeutic efficacy explains why only about 50 percent of patients use prescribed medications appropriately in the real world.6 The lack of timely and actionable information about medication use and response accounts for the difficulty physicians have in optimizing therapeutic regimens to conform to evidence-based standards.

In order to enjoy the benefits of big data in health care we must first ensure that we are collecting good data. Good data is provided in the service of physicians making therapeutic decisions and patients who are doing their best to manage their medicines and daily health. Bridging the efficacy-effectiveness gap requires shifting our analytic focus from drugs and diseases to individuals and families; understanding idiosyncratic preferences and behaviors in order to optimize individual therapy and response. Good data provides the opportunity for intervention and modification throughout a therapy, bridging information gaps between physician visits and encouraging

patient behavior change based on individual insights. Inverting the paradigm—building population data one individual course at a time—requires deployment of digital medicines and wellness sensor platforms to close the feedback loop between lifestyle, therapy and response, leveraging mobile connectivity and the Internet of Things. It is critical that these platforms are approved by regulators, who will validate the safety and efficacy of the device components and substantiate the quality of the resulting data. One example of bridging the gap between efficacy and effectiveness in the real world is the Proteus Digital Health service offering to physicians and patients. The Proteus offering is composed of the Proteus Patch, Ingestible Sensor, and the Proteus software application. The Proteus Patch is a miniaturized, wearable data-logger for ambulatory recording of physiological and behavioral metrics such as heart rate, activity, body angle relative to gravity, and time-stamped, patient-logged events, including events signaled by swallowing the Ingestible Sensor accessory. The software application displays the data received from the Proteus Patch in a logical, meaningful way. The Proteus offering may be used in any instance where quantifiable analysis of event-associated physiological and behavioral metrics is desirable. It has been FDA 510K cleared and CE marked and is available in the United States and United Kingdom. Accuracy and safety information submitted in the 510K includes eleven clinical studies in which the Proteus Patch was used by 492 subjects. During these studies, 412 of these subjects also collectively ingested a quantity of 20,993 ingestible sensors. The Ingestible Sensor has a positive detection accuracy (number detected during directly observed ingestions/number administered during directly observed ingestions) of 97.3%, with 100% correct identification when multiple different sensors are swallowed. 
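The positive detection accuracy quoted above is a simple ratio, and a short sketch makes the definition explicit. The function name and the counts in the example are hypothetical; the article reports the resulting 97.3% figure but not the underlying numbers of directly observed ingestions.

```python
def positive_detection_accuracy(detected, administered):
    """Share of directly observed ingestions that the sensor detected.

    detected: sensors detected during directly observed ingestions.
    administered: sensors administered during directly observed ingestions.
    """
    if administered <= 0:
        raise ValueError("administered must be a positive count")
    if not 0 <= detected <= administered:
        raise ValueError("detected must lie between 0 and administered")
    return detected / administered


# Hypothetical counts, chosen only so the ratio matches the reported 97.3%.
accuracy = positive_detection_accuracy(detected=973, administered=1000)
print(f"positive detection accuracy: {accuracy:.1%}")
```

Because the denominator counts only directly observed ingestions, the metric describes how reliably the sensor registers a dose that was truly taken; it says nothing by itself about how often patients take their doses.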
The Proteus offering can provide insights into medication-taking behaviors and a patient’s physiological response that will enable clinicians to make more informed therapeutic decisions and have evidence-based discussions with their patients on how best to approach management of their condition. This has the potential to stop the cycle of continuous medication review due to undetected non-adherence and accelerate the dose titration of medication where appropriate. By providing information to differentiate between non-response and non-adherence, the Proteus offering optimizes therapy costs (i.e. avoid unnecessary therapy changes and escalation to higher cost drugs) and use of medical resources (i.e. prevent unnecessary specialist referrals and costly complications), facilitating efficient progression through the recommended treatment pathway, and advancing patients towards their treatment goals.

The powerful impact of the Proteus data was apparent when patients with uncontrolled hypertension used the Proteus offering for two weeks. In this interim analysis of 46 patients, 37% of patients achieved blood pressure control (<140/90 mm Hg) after two weeks of product use. These patients’ adherence behavior in combination with blood pressure data informed the providers of the following key insights: (1) the current treatment regimen was appropriate, and (2) if blood pressure were to rise again, adherence counseling should be the focus rather than medication adjustment. Another 58% of patients were identified as adherent to their medications, but still not at treatment goals. With this information, the provider now knew that therapy adjustments were necessary. The remaining 4% were both non-adherent to medications and also not at treatment goals, indicating that additional adherence support should be provided. The physiologic data reports provided the physician with additional understanding of the stability of the patient’s lifestyle, and activity patterns, an important component of cardiovascular health.7 In another study of uncontrolled hypertensive patients, a primary care physician described how he used the information from the Proteus offering for individualized treatment decisions regarding dose adjustment, the addition or discontinuation of medications, prescribing of extended-release formulations, and/or adherence counseling to improve blood pressure.8 In just two weeks, the Proteus offering was able to provide the insights necessary to make tailored treatment decisions, eliminating guesswork and efficiently moving patients towards treatment goals.

Once the foundation of good data at the individual level is established, trends over time can be analyzed to predict the next time an individual patient may fall off course. The power of these trends is visually illustrated by Figure 1a depicting an 84 year old male with stable activity, regular rest and consistent daily medication-taking patterns. Figure 1b depicts a contrasting case of a 90 year old male with unstable activity, erratic rest and inconsistent daily medication-taking patterns. The care team and family of each patient can assess general condition and medication adherence at a glance, providing an opportunity for timely intervention ahead of hospitalization.

Figure 1: (a) Proteus data over 60 days in 84 year old male, (b) Proteus data over 40 days in 90 year old male

The true promise of predictive analytics is in advancing both precision and population medicine. There are significant advances toward this goal from new digital technologies that enable prospective, scalable real-time analytics services applied to each individual, permitting for the first time data-driven support and timely intervention to chronic disease patients.

Dr. George Savage is the chief medical officer and cofounder of Proteus Digital Health, formerly serving as vice president of research and development. Dr. Savage holds a B.S. in biomedical engineering from Boston University, an M.D. from Tufts School of Medicine and an M.B.A. from Stanford Graduate School of Business.

1. Ward BW, Schiller JS, Goodman RA. Multiple chronic conditions among US adults: a 2012 update. Prev Chronic Dis. 2014;11:130389.
2. Centers for Disease Control. Chronic Disease: The Power to Prevent, The Call to Control: At A Glance 2009. Accessed December 2014. Available from: chronicdisease/resources/publications/AAG/chronic.html
3. World Health Organization. The Global Burden of Chronic. Accessed December 2014. Available from: topics/2_background/en/
4. Go AS, Mozaffarian D, Roger VL et al. Heart disease and stroke statistics–2014 update: a report from the American Heart Association. Circulation. 2014 Jan 21;129(3):e28-e292.
5. Fitch K, Pyenson BS, Iwasaki K. Medical claim cost impact of improved diabetes control for Medicare and commercially insured patients with type 2 diabetes. J Manag Care Pharm. 2013 Oct;19(8):609-20, 620a-620d.
6. World Health Organization. Adherence to Long-term Therapies: Evidence for Action. Accessed December 2014. Available from:
7. Kim YA, Virdhi N, Raja P, DiCarlo L. Modeling the impact of a digital health feedback system in uncontrolled hypertensive patients. Value in Health. 2014 Nov;17(7):A479.
8. Godbehere P, Wareing P. Hypertension assessment and management: role for digital medicine. J Clin Hypertens (Greenwich). 2014 Mar;16(3):235.




Leaping the Data Chasm: Structuring Donation of Clinical Data for Healthcare Innovation and Modeling

Patrick L. Taylor, JD, Kenneth D. Mandl, MD, MPH

Many innovators transforming healthcare are not the usual suspects, academic researchers deriving generalizable knowledge from patient data. Instead, they are problem-solvers like software designers developing novel algorithms to connect medical science with patient-specific variation. Often, they lack data. To them, data would reveal methodical solutions for idiosyncratic but recurrent missteps in the intricate dance among patients, biomedical science and clinical delivery. Comprehensive, data-driven testing is essential prophylaxis against unanticipated disasters that manufactured datasets would not reliably include. While simulation methods are finding a role in evidence synthesis,1,2 and payer claims data are suggestive but all-too-limited by overriding financial rationales, genuine, comprehensive and accurate primary provider data are necessary for innovators to distinguish reality from projections, truth from fiction. Modeling innovations’ performance with real clinical data reveals latent flaws and perturbing interactions in both clinical care and IT “improvements” before catastrophe.

We propose a framework for voluntary data donation to capacitate systems innovation because every alternative means of access created by regulations is dysfunctional or problematic. The problem is not technological; both Federal regulations mandating the “meaningful use” of electronic health records (EHRs), and other initiatives,3 promise patients technical control over e-record copies. Incomplete policy imagination forestalled the legal architecture required to enable donative transactions. In designing the plumbing at the House of HIPAA, the plumbers forgot to install a pipe for innovation. No one asked: how will clinical data become accessible to innovators who are not doctors doing medical research, public health specialists, or clinicians doing quality improvement?
The absence of a structured answer causes two problems: (1) how do we legally provide for data donation, and (2) in the absence of mutual acquaintance and a rulebook for mutual expectations, how do we increase the chances that total strangers – innovators, patients and providers – will come together and reach a reasonable deal when there is no money or market to motivate their action? The new pipe we create must be leak-proof and credible to the householders.

We motivate our proposal with a canonical use case: “apps” that use and create EHR data. We have encountered barriers to providing patient data to app developers first hand in advancing SMART Platforms, a program4 funded by Health and Human Services under ARRA, to equip innovators to rapidly create substitutable “apps” for electronic health data.5 SMART provides interfaces that jump both strategic and inadvertent barriers to progress embedded in EHR vendor products; for example, long-delayed apps for pediatric specialists6 and synergized genomic medicine7 were implemented within weeks. The development of health apps that do not rely on health system data is already burgeoning: with over 30,000 health apps in the iTunes and Google Play stores, it is clear that this is one of the fastest growth areas for innovation and

will ultimately have a large impact on health.

The Unsatisfactory Status Quo

Patient data are confidential, requiring patient authorization for providers to disclose personal health information to innovators, unless app development meets one of the Health Insurance Portability and Accountability Act’s (HIPAA) special exceptions. Pertinent here are de-identification, limited data use agreements, quality assurance, and treating data donation for app testing as research. (Waiver of authorization would require the unsustainable finding that obtaining an authorization would be impossible.)

“De-identified” personal health data may be disclosed without patient authorization, but reidentification is plausible, turning disclosure into a HIPAA violation. Our contest8 involved developers receiving de-identified data on 30 patients. Preparation was manual and painstaking. Under a data donation program, reidentifiability by the sufficiently determined is virtually guaranteed by the details and breadth of data. Genomic data, inherently identifying, further entangle data release.9 Meanwhile, complete de-identification will impede development and satisfactory testing and validation of apps exploiting deleted data, such as admission dates pertinent to length of stay, and precision zip-codes pertinent to local disease outbreak surveillance. Both are pertinent, for example, to hospital-acquired conditions.

Because their de-identification is partial, these problems afflict limited data use agreements in varying degrees. Moreover, app development is not a permitted purpose for such agreements. App development is not typically, itself, public health oversight, research or healthcare operations to assure quality – although we may expect transformative apps in those areas.
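The trade-off between de-identification and app testing described above can be illustrated with a small sketch. It is a deliberate simplification, loosely modeled on the common practice of generalizing dates to the year and ZIP codes to their first three digits, and is not a compliant de-identification routine. The record, the field names, and both function names are hypothetical.

```python
from datetime import date


def length_of_stay_days(record):
    """Length of stay requires both exact dates."""
    return (record["discharge_date"] - record["admission_date"]).days


def deidentify(record):
    """Generalize dates to the year and ZIP codes to three digits (simplified)."""
    return {
        "diagnosis": record["diagnosis"],
        "admission_year": record["admission_date"].year,
        "zip3": record["zip_code"][:3],
    }


# Hypothetical record, invented for illustration only.
record = {
    "diagnosis": "hospital-acquired infection",
    "admission_date": date(2014, 3, 3),
    "discharge_date": date(2014, 3, 8),
    "zip_code": "02115",
}

print("length of stay from the full record:", length_of_stay_days(record))

shared = deidentify(record)
print("fields an app developer would receive:", sorted(shared))

# The exact dates are gone, so an app that models length of stay
# cannot be tested against the shared copy.
print("can compute length of stay:", "discharge_date" in shared)
```

An app for tracking hospital-acquired conditions would need exactly the fields this sketch discards, which is the practical reason the authors look beyond de-identification.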
For quality assurance, a provider may use identifiable patient information internally and can disclose identifiable patient information to another covered entity if both have a relationship with the patient and the disclosure is pertinent to that relationship.”10 Modeling a method, whether it is a checklist or an app, is not explicitly within the definition of health care operations; even if it were, modeling could examine only patients common to both systems, and the developer would have to be a covered entity. In our contest, none of the apps8 submitted was from a covered entity; all came from software developers eager to jump in to the field, but lacking more than a small sample of real patient data to test their applications’ likely safety and effectiveness. Finally, one might imagine treating developers as researchers, and development as research, particularly since Food and Drug Administration (FDA) approval of devices involves clinical investigations, and researchers regularly solicit consented data donation to their studies. But debugging apps is not the search for generalizable knowledge. Nor do developers face the conflict between seeking knowledge and providing care that underlies research ethics and Institutional Review Board (IRB) review.11 Apps require specialized algorithmic review, modeling, performance



optimization, and compliance with HIPAA security standards. They may require technical detection of malign functions. These are not IRBs’ expertise. Clinical investigations are not the natural method one would choose to perform them, and fundamental research ethical concepts, like equipoise, are inapt. Rarely will a developer have a medical or hospital appointment, or qualifications to do research; without these it is doubtful that any provider would let them, as strangers, do research on their patients. There is no legal marketplace for comprehensive, provider-originated patient medical records; as the composite and fiduciary creation of many people, with expectations of confidentiality varyingly defined by state and federal statutes finely protecting condition-specific sensitivities, the ethical and legal challenges to even just the warranted “title,” fungibility and compensated alienability required for a market are enormous. That leaves only voluntary donation. HHS’s “Blue Button” program3 empowers patients to download their EHR data and HIPAA amendments permit patients to direct these records to others – an easy mechanism for donations. But in contrast to charitable financial donations where template legal arrangements abound, data donation faces a vacuum. We believe that, if conditions were reasonably structured and well-known, both “information altruists”12 and disease-specific, self-assembling patient groups will donate data to speed social and direct benefit through innovation and research.13-17 The government could incent donation with deductions and credits without embracing the stark commoditization of market sales. HIPAA authorizations, the current transfer instrument, are under-regulated despite obvious risks. Regulations fail to ensure transaction integrity. A patient signature on a minimally explanatory template yields naked disclosure, without boundaries on risks, and without recourse if developers misuse data or disclose it to embarrass. 
An authorization, if competently executed, is in force, without regard to whether issues are present that would have prevented a comparable research use, and without regard to whether the data recipient has an iota of privacy or data confidentiality protection. As a regulatory solution to govern the relations of three parties – the patient, the provider and the entity disclosed to – authorizations fail completely because they do not address the behavior of the entity disclosed to. It can wheedle, misrepresent, coerce or hypnotize; and there is no HIPAA violation. HIPAA offers no answers to obvious donor questions: how to verify the data-seeker’s purposes and qualifications; how will data be protected from hackers; is there a de-identified option; this authorization form says that HIPAA will not protect my privacy once data are disclosed – so what will prevent this entity from misusing my data; the entity that wants my data isn’t even a party to this authorization – do I need a contract with the entity and if so what would it look like and can I get a free form online? Answers to such


Harvard Health Policy Review

questions are critical to public trust. In our view, donation for innovation modeling, presented fully below, is distinct from the only extant model for data donation we are aware of, the open consent model for genomic research,18 in three ways: (1) nature, purposes, methods and review; (2) we address the exclusion of innovators external to academic medicine from relying on research infrastructure to address barriers to contribution, e.g. qualifications; and (3) our conviction that the problems open consent models do not address, structurally assuring that social benefits and non-self-dealing reciprocate donor altruism despite the ultimate exploitation of discoveries by commercial entities for profit, are essential to identify and resolve to a public that made Henrietta Lacks’ story a continued best-seller, and two professions, medicine and law, that must keep awareness and prevention of conflicts of interest front and center in their endeavors.

No “Second Best” for Innovation: a Proposal for Ethical Solicitation and Responsible Use

We propose nesting donation in a voluntary process designed to make donation accessible, safe and effective. We avoid proliferating new standards by importing selected, workable, participant-protective elements whose familiarity confirms plausible implementation. The model is not tied to apps, but generalizable to uncompensated patient data donations for health systems innovation and modeling involving innovators outside organized medicine.

1. Go Beyond a HIPAA Authorization: Build Sound Internal Review and Import Standards.

Research requires attention to consent and independent review. HIPAA does not; an authorization suffices. The difference is philosophically significant.
For research, consent implements respect for autonomy; review reflects that beneficence is not reducible to choice - consent can be mistaken or flawed, and promises of benefit, warnings about risks, and their relative significance require confirmation by disinterested experts and community members. Protocol complexity and the researcher’s conflict of interest (between care and knowledge) are inadequately addressed by consent: an incomprehensible consent form and a care relationship exploited may both endanger consent’s power to guard against overreaching. IRBs scrutinize recruitment and evaluate study risks in part because an executed consent form, rather than documenting consent, may reflect successful manipulation and radical misunderstanding. Not so under HIPAA: Consent need not be cognitively assessed and the template form, if executed, is definitive. Consistently, HIPAA prescribes a process to revoke an authorization, but no process is offered or prescribed to assert an authorization’s invalidity. For many privacy issues, the difference between research and privacy regulations may reflect the relative transparency and comprehensibility of

privacy issues compared to clinical research, and the alignment of patient interests and patient authorization. Here, the patients will be asked, perhaps by their own provider, to act potentially against their interests, without direct benefit to themselves, but apparent benefit to commercial third-parties; and the risks depend on mitigation steps unknown to the patient and not under their control. We see only two options: (1) the Office of Civil Rights, and professional associations, prepare model agreements for various donative purposes that address anticipatable issues with mutual fairness; or (2) Independent review ensures that risks are minimized, donor purpose is achieved, there is no donee self-dealing, and the primary benefit of the donors’ altruism, from equipping commercial parties to undertake modeling otherwise avoided, is public. That review should include appropriate experts in privacy and security systems and standards, and consults with clinicians who can assess app utility. Other gaps in HIPAA are addressed as follows: (a) solicitation of donors should meet research standards for recruiting research participants, protecting donors according to familiar institutional practice; (b) Authorizations should be very specific. For example, “disclosure directly to qualified med app developers to confirm safety and efficacy of their developed app, and retest, subject to reporting back their results, and no other use or disclosure to others except secure retention and disclosure of the data for purposes only of independent validation by government or accrediting bodies”. The authorization should precisely describe the data donated, whether or not it has been scrubbed of errors and the opportunity, if any, the donor has had to review the data. We cannot avoid sensitive data without ruling out apps addressing their care, but disclosure of such data should meet high standards. 
(c) Independent review of the authorization, the developer’s written application, and the overall plan. (d) Developers must contractually agree with the donor to obligations like those of business associates, including purpose limitations, the privacy and security standards the developer implements, an obligation to report and cooperate in the mitigation of breaches and misuses, destruction of the data at end or secure preservation if required by law, prohibitions on secondary uses and redistribution, and extension of the obligations to any subcontractors. The agreements should also include consent to the regulatory jurisdiction of the federal Office of Civil Rights, and basic indemnification and insurance terms common for independent contracts. (e) Developers must have, in force, business insurance covering privacy and security breaches. While insurance was difficult to obtain in 2001, it is now common, inexpensive, and easily obtainable by solo developers. Requiring it meets the two-fold need for some on-site personal assessment of developers’ bona fides, and creating a remedy for injured donors. This model guarantees a financial

remedy for donors, which HIPAA omits and research permits but still fails to require.

2. Create a Nonprofit, Mission-oriented Intermediary focused on Procuring and Administering Data Donations, Evolving with Market and Regulatory Developments, and Exploring Stewarded and Other Approaches to Reducing Donor Risk.

Here we borrow from the laws governing the organization, operating standards and oversight of charities and organ procurement organizations to create a focused organization for which this is not a potentially neglected distraction outside its core provider competencies. Donee entities should be organized as nonprofit, federally tax-exempt organizations that primarily promote health care improvement. This will require a federally approved application for exemption that discloses ownership, control, and plans (including community membership and benefit) and commits them to values, oversight and processes that preclude self-dealing, private benefit, excessive salaries and profiteering. Net revenue must be spent on the charitable purpose. These entities should be separate from both the end-user of data and providers, to prevent end-users from skewing the solicitation process. Independence provides the basis for programs to encourage donation; annual reports back to donor alumni; and ongoing attention to market changes, risk management, technology and regulations. Donee organizations should explore related services, including stewarded inquiry models that reduce donor risk through staff services, conducted on data available only to the intermediary, minimizing donor privacy risks and potential selection biases while contributing to the organization’s self-support.19

3. Instead of Relying Solely on a HIPAA Authorization, Reinforce Donee and Developer Obligations to Donors and Providers through Uniform Revocable, Conditional, Data Use Licenses.
To preempt claims of ownership against patients by donees or developers, respect donors, and add legal strength to enforce parameters for use and disclosure, we would treat donors, in this context, as the quitclaim co-owners of their data, interpret donation as nonexclusive licensing-in to the donee nonprofit, and disclosure out to users, such as app developers, as nonexclusive licensing out subject to rigorous terms and conditions, breach of which would result in automatic revocation of their license, and the donor’s authorization with respect to them.

4. Convene an International Working Group to Work towards a Consensus International Framework on Privacy, analogous to the Declaration of Helsinki in Research.

A significant limitation of our analysis is that it involves only HIPAA, and privacy and security interests under U.S. law. An analysis of other countries’ laws would be enormous work of questionable reliability, since with laws, interpretation, enforcement and effect involve more than the smooth surfaces of printed pages, and there is no unifying set of international principles

to depend on. What is required for data donation to be internationally encouraged and protected is less a government-to-government protocol than an influential consensus framework analogous to the Declaration of Helsinki, protective of patient privacy while equally respectful of the safe progress on which patient welfare depends.

Conclusion

The proposed model enables data donation through creating enough of a “pipe” for informed donors to simply turn on the tap and say yes; they need not recreate the plumbing from scratch and secure design consensus anew for each donation. It is serviceable for a range of health system innovations, exposing the bad, and illuminating the safe and socially compelling. The design parsimoniously relies primarily on familiar research standards, HIPAA business associate obligations, and charitable organizational and operating standards with demonstrated practicality and utility. Donor remedies and developer scrutiny are enhanced by requiring developers to be insurable and maintain breach insurance. Not-for-profit intermediaries could help avoid profiteering, and ground further progress in reducing donor risks and strengthening altruistic reciprocity.

Acknowledgements: This work was funded by the Strategic Health IT Advanced Research Projects Award 90TR000101 from the Office of the National Coordinator of Health Information Technology and by NIH National Institute of General Medical Sciences grant R01GM104303.

The academic work of Patrick L. Taylor (Assistant Clinical Professor, HMS, Affiliate Faculty, Petrie-Flom Center at HLS) focuses on emerging issues in science and medicine where law and science poorly intersect. His policy contributions concerning bioethics, stem cells, conflicts of interest, privacy, and genomics have been recognized nationally and internationally.

Dr. Kenneth Mandl (Twitter: @mandl) is Professor at Harvard Medical School and the Boston Children’s Hospital Chair in Biomedical Informatics and Population Health. Through scholarship intersecting epidemiology and informatics, Mandl pioneered use of IT and big data for population health, discovery, patient engagement and care redesign.

1. Fusaro VA, Patil P, Chi CL, Contant CF, Tonellato PJ. A systems approach to designing effective clinical trials using simulations. Circulation. Jan 29 2013;127(4):517-526.
2. Harnisch L, Shepard T, Pons G, Della Pasqua O. Modeling and simulation as a tool to bridge efficacy and safety data in special populations. CPT Pharmacometrics Syst Pharmacol. 2013;2:e28.
3. Conn J. Mobile push: Blue Button focus of government IT strategy. Mod Healthc. May 28 2012;42(22):10.
4. Office of the National Coordinator for Health Information Technology. Strategic Health IT Advanced Research Projects (SHARP) Program. February 10, 2015.
5. Mandl KD, Kohane IS. No small change for the health information economy. N Engl J Med. Mar 26 2009;360(13):1278-1281.
6. SMART BP Centiles. Accessed February 10, 2015.
7. SMART Genomics Advisor. http://smartplatforms.org/smart-app-gallery/genomics-advisor/. Accessed February 10, 2015.
8. Chopra A. SMArt Prize for Patients, Physicians, and Researchers. Accessed February 10, 2015.
9. Gymrek M, McGuire AL, Golan D, Halperin E, Erlich Y. Identifying personal genomes by surname inference. Science. Jan 18 2013;339(6117):321-324.
10. Department of Health and Human Services. Uses and disclosures to carry out treatment, payment, or health care operations. 45 CFR §164.501 (Definitions) and 164.506. US Government Printing Office; 2002.
11. Taylor PL. Overseeing innovative therapy without mistaking it for research: a function-based model based on old truths, new capacities, and lessons from stem cells. J Law Med Ethics. Summer 2010;38(2):286-302.
12. Kohane IS, Altman RB. Health-information altruists--a potentially critical resource. N Engl J Med. Nov 10 2005;353(19):2074-2077.
13. Mandl KD, Kohane IS. Tectonic shifts in the health information economy. N Engl J Med. Apr 17 2008;358(16):1732-1737.
14. Zulman DM, Nazi KM, Turvey CL, Wagner TH, Woods SS, An LC. Patient interest in sharing personal health record information: A web-based survey. Ann Intern Med. Dec 20 2011;155(12):805-810.
15. Weitzman ER, Adida B, Kelemen S, Mandl KD. Sharing data for public health research by members of an international online diabetes social network. PLoS One. 2011;6(4):e19256.
16. Frost JH, Massagli MP. Social uses of personal health information within PatientsLikeMe, an online patient community: what can happen when patients have access to one another’s data. J Med Internet Res. 2008;10(3):e15.
17. Wicks P, Vaughan TE, Massagli MP, Heywood J. Accelerated clinical discovery using self-reported patient data collected online and a patient-matching algorithm. Nat Biotechnol. May 2011;29(5):411-414.
18. Ball MP, Thakuria JV, Zaranek AW, et al. A public resource facilitating clinical use of genomes. Proc Natl Acad Sci U S A. Jul 24 2012;109(30):11920-11927.
19. Taylor P. When consent gets in the way. Nature. Nov 6 2008;456(6):32-33.



Photo from Wikimedia Commons, Creative Commons Attribution. HEALTH HIGHLIGHTS

Academic Institutions’ Critical Guidelines for Health Care Workers who Deploy to West Africa for the Ebola Response and Future Crises

Hilarie Cranmer, MD, MPH1, Miriam Aschkenasy, MD, MPH2, Ryan Wildes3, Stephanie Kayden, MD, MPH4, David Bangsberg, MD, MPH5, Michelle Niescierenko, MD6, Katie Kemen7, Kai-Hsun Hsiao, MBChB8, Michael VanRooyen, MD, MPH9, Frederick M. Burkle, Jr, MD, MPH, DTM10, Paul D. Biddinger, MD11

The unprecedented Ebola Virus Disease (EVD) outbreak in West Africa, with its first cases documented in March 2014, has claimed the lives of thousands of people, and it has devastated the health care infrastructure and workforce in affected countries. Throughout this outbreak, there has been a critical lack of health care workers (HCW), including physicians, nurses, and other essential non-clinical staff, who have been needed, in most of the affected countries, to support the medical response to EVD, to attend to the health care needs of the population overall, and to be trained effectively in infection protection and control.

This lack of sufficient and qualified HCW is due in large part to three factors: 1) limited HCW staff prior to the outbreak, 2) disproportionate illness and death among HCWs caused by EVD directly, and 3) valid concerns about personal safety among international HCWs who are considering responding to the affected areas. To the first point, there are more than 90% fewer physicians per 1000 persons in affected West African nations than there are in the United States, and there are also insufficient numbers of nurses, even when compared to the critical threshold for resource poor settings, which is 2.3 doctors, nurses and midwives per 1000 population.1 These numbers are not met in the nations hardest hit by EVD (Table 1)2.

To the second point, during the Ebola Crisis, HCWs have exhibited the highest levels of crude fatality rates (CFR)3*, 59% (488 deaths out of 830 cases of recorded Ebola illnesses in HCW thus far), as compared with a CFR of 40% for the population overall4 (9380 deaths for 23253 cases overall thus far). Even more concerning is the fact that the two fatality rates appear to be diverging over time in a statistically significant manner.5 While no definitive studies or data are yet available, the very high CFR for national HCW staff may possibly be due, in part, to excessively long hours, inadequate working conditions especially in poorer clinics that may have increased the chances for greater exposure to and inoculation with the virus, lack of access to simple personal protective equipment like gloves and masks, minimal medical supplies, lack of appropriate medications or isolation facilities, and insufficient education and training in standard infection protection and control measures.

Lastly, while the mobilization of adequate numbers of well-trained and well-equipped international HCWs to assist the West African HCWs in both preparedness and response has been said to be essential to the response to the current Ebola epidemic, and to prevent its ultimate spread beyond existing national borders within the continent and around the world, there have been relatively few international medical workers responding to the EVD outbreak.

Prior studies have asserted that Academic Medical Centers (AMCs) and Institutions worldwide may have a unique and essential role to play in response to public health and humanitarian crises.6,7,8 However, as has been clear during the EVD outbreak, responding to this challenge is not without significant risk to both the HCW and to their sponsoring institution in their home country. Both individual medical responders and the AMCs they work for need to carefully and honestly consider the risks to personal health and safety while deployed, potential risks to the patients cared for at home after deployment, costs that will and may be incurred, continuity of staffing at home while an individual is deployed overseas, the credibility and capabilities of the organizations with whom they will deploy overseas, and other unique social and political considerations that may become relevant for the individual and/or the organization.

To date, although there has been some guidance for individuals to assist with this decision-making process, such as the advice provided in the CDC’s Advice for Humanitarian Aid Workers and in other Guideline Statements, there has not been published guidance designed to assist both the individual and their sponsoring AMCs or other affiliated institutions with the decision-making process when individuals wish to deploy to humanitarian crises.9,10,11

TABLE 1: Numbers of HCWs per 1000 Population in EVD Affected Countries in West Africa

* CFR is calculated by dividing the number of deaths that have occurred due to a certain condition by the total number of cases - but the outbreak in West Africa is still ongoing, so the proportion of fatal cases, PFC, is what is calculated, and often reported, and that is the number of deaths thus far divided by the number of cases to date.
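The proportions quoted in the text can be checked directly from the counts it gives. The short Python sketch below is only an illustration of the arithmetic described in the footnote (deaths to date divided by cases to date); the function name `proportion_fatal` is a hypothetical helper introduced here, not something from the cited sources, and the counts are the ones reported above.

```python
def proportion_fatal(deaths: int, cases: int) -> float:
    """Deaths to date divided by cases to date (the PFC described in the footnote)."""
    if cases <= 0:
        raise ValueError("cases must be a positive count")
    if not 0 <= deaths <= cases:
        raise ValueError("deaths must be between 0 and cases")
    return deaths / cases


# Counts as reported in the text (health care workers vs. the population overall).
hcw = proportion_fatal(488, 830)
overall = proportion_fatal(9380, 23253)

print(f"HCW: {hcw:.1%}")          # prints "HCW: 58.8%"
print(f"Overall: {overall:.1%}")  # prints "Overall: 40.3%"
```

Rounded to whole percentages, these match the 59% and 40% figures in the text. Because the outbreak was ongoing when these counts were taken, both values are interim proportions and not final fatality rates.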



Consequently, it is not uncommon that personal and institutional decisions and arrangements have been ad hoc and inconsistent in the heat of responding to crisis, even within the same institutions, which can frustrate and hinder potential volunteers.

The guidelines below were developed by the Global Ebola Task Force Working Group (GETF), at the request of Partners HealthCare, a nonprofit organization that owns several hospitals in Massachusetts and has over 60,000 employees, for institutions in their management and support of individual employees considering deployment to the Ebola response in West Africa and to future deployments. These guidelines have been developed to assist individuals, their home institutions, and the host aid organizations with whom they affiliate to better understand how to deploy domestic HCWs to international humanitarian crises and support their safe return to their place of work with full recovery, respect, and dignity. The decision to volunteer is an entirely personal one, but the individual’s home and host institutions hold a significant role, influence and crucial support in this process. The vital roles for institutional staff include:

• supporting individuals in making informed decisions on their readiness for deployment and on their choice of aid organization,
• potentially placing restrictions on institutional trainees (defined as those in medical school, internships and residencies),
• negotiating leave provisions and staffing coverage,
• supporting gaps in medical and benefit coverage for the individual and their beneficiaries,
• mitigating impacts on and concerns of other staff members in their units or departments,
• acknowledging limitations on ability and obligations in assisting the individual while on deployment, and,
• determining the role of the institution’s Occupational Health Services in post-deployment health screening and monitoring.

Framework Guideline

Pre-deployment Evaluation and Registration

The home institution should assist the individual in their own personal decision-making process by ensuring that their decision is fully considered and informed. The CDC’s Advice for Humanitarian Aid Workers7 provides a checklist, but three considerations are key: (1) Personal and professional readiness for deployment: Individuals must honestly assess their own knowledge, skills and experiences, not only in the clinical competencies required for their expected role in West Africa, but also in the personal capacities for coping with the stressors, unfamiliar environments, and basic living conditions likely to be faced on deployment. From the perspective of professional readiness, trainees (including medical students, residents and fellows) by default should not volunteer. This is due to the lack of necessary supervision in such contexts, lack of full medical licensing to practice independently, and limits on liability protection from their insurance carriers. This position intends not only to protect the professional well-being of trainees, but also to uphold the same standard of care in medical humanitarian aid as that expected in any developed country. Understandably, there are exceptions due to specific skill-sets, experience, language, or subject matter expertise.4 However, these must be considered on a case-by-case basis with explicit permission from the relevant

Photo from Wikimedia Commons, Creative Commons Attribution.

Outbreak distribution map of Ebola in West Africa as of February 2015.

director of training. (2) Personal health status: Individuals must consider their personal physical and mental health status, and obtain and complete a Travel Health Assessment, including relevant vaccinations and prophylactic medications, from their primary care physician or travel medicine specialist. (3) Risks to personal health and safety: Individuals must assess, understand and accept the risks associated with deployment. This must be coupled with the understanding that the home institution’s ability to assist in issues during deployment, including medical assistance and evacuation, is often extremely limited and cannot be guaranteed. The aid organization with which the individual deploys remains entirely responsible for that individual’s health, safety and security. Therefore, knowing the capabilities and medical contingency plans of the aid organization is an essential component of the individual’s assessment and decision-making process. In addition to these personal considerations, and before any decisions for deployment are made, individuals must have an explicit discussion with their unit manager or department chief, with specific coverage of the following four points: (1) anticipated duration of deployment, (2) possibility of a mandated 21-day personal quarantine on return, (3) implications on leave allowances, pay and benefits, and (4) available logistical support, if any, for the individual during deployment. Once the decision for deployment is confirmed, the individual must register their intended deployment with the institution’s travel risk management service, or equivalent, and submit relevant documents, such as travel itinerary, copy of passport biographical page, and “in case of emergency” form. Tracking of travellers is critical to enable travel risk management, especially for purposes of repatriation and return to work. It also enables the traveller to benefit from automated travel safety and security alerts during the course of their trip. 
Finally, a pre-deployment briefing should be conducted by the institution’s designated Point of Contact for humanitarian response volunteers. The purpose of the briefing is to (1) review assessment of readiness and risk, (2) discuss relevant critical issues with representatives from Occupational Health Services, Employee Assistance Program, Public Relations, and Finance and Administration, and (3) plan for repatriation, including reiterating the likelihood, after responding to a country with known EVD, of a 21-day personal quarantine (also known as home quarantine or self-quarantine) and/or an active fever and symptom watch (also known as active monitoring, or direct active monitoring when it involves at least one visit a day from a public health professional). Individuals should familiarize themselves with their potential exposure risk category, and associated monitoring and movement restrictions on return as per current CDC guidelines.1

Pre-return In-country Screening

Prior to departure from the affected country, individuals should have a check-in meeting with their Point of Contact. The purpose of this is to assess their exposure risk category, to discuss



anticipated monitoring and movement restrictions on return, and to plan for any contingencies that may arise en route. Potential contingencies include being barred from travel by commercial conveyance, getting legally quarantined at border crossings, or developing symptoms en route. The aid organization with which the individual deployed remains the primary contact should any contingencies arise, although the home institution may be able to offer secondary support, depending on capabilities, such as communications with their family. Post-deployment Monitoring and Return to Work The relevant public health authority will conduct post-arrival procedures, including screening, assessment of risk, assignment of public health officer, and determination of required public health actions. A 21-day active monitoring period is the likely minimum when returning from a country afflicted with EVD. For clinicians involved in frontline care of Ebola patients, a 21-day personal quarantine and active fever/symptom watch will be required before returning to work. Individuals must also be aware that personal and travel restrictions for returning volunteers also depend on the city and/or State, which may be more stringent than the CDC guidelines. Individuals must contact their institution’s Occupational Health Service to discuss risk category, required public health actions, any delay period before returning to work (which will be communicated to their unit manager) and their action plan should symptoms develop. The Occupational Health Service may be delegated with the responsibility for active monitoring of the individual by the public health authority. However, regardless of delegated responsibility, the Occupational Health Service will be the authority to provide written clearance allowing an individual to return to the workplace. The individual should have a clear, predetermined action plan in case symptoms develop. 
This will include direct communication with the Occupational Health Service. The home institution should have a designated person responsible for contacting and liaising with the relevant public health authorities and the individual’s aid organization to determine arrangements for clinical evaluation, disease testing, and isolation, if required. Finally, routine Occupational Health screening (including advice on completion of prophylactic

medications and considerations for tuberculosis testing), and a post-deployment de-briefing with the Point of Contact should be conducted. The purpose of the debrief is to (1) address the individual’s experience with the aid organization, which may be used to inform future volunteers, (2) review and offer available resources such as the Employee Assistance Program, mental health counselling, public relations advice, and financial and administrative supports, and lastly, (3) plan for re-entry into the workforce.

Limitations and Conclusion

These Guidelines were developed out of an internal Institutional need to provide guidance, structure, and process for home institutions to adequately respond to and manage individual employees requesting deployment to the Ebola response in West Africa, but they can easily be extended to institutions’ responses to future crises. These Guidelines must be interpreted and applied with consideration for the internal policies and capabilities of each individual institution or employer. However, these Guidelines are also considered to have broad scope with adaptability to other types of institutions, programs and corporations, and to future occurrences of other local, regional or international outbreaks, epidemics or pandemics of emerging infectious diseases of concern. These Guidelines represent our current Institutional best practice, and have been informed by prior experience with response to outbreaks and incidents including SARS (Severe Acute Respiratory Syndrome), Middle East Respiratory Syndrome (MERS), and Viral Hemorrhagic Fevers, including EVD, Lassa Fever, and Marburg Fever. However, there will likely be amendments and additions required as the current situation evolves and new information comes to hand.
Regardless, the Guidelines provide a framework to understand the essential issues, structures, and processes that should be considered by institutions that employ individuals requesting deployment to the Ebola or other humanitarian aid response. Overall, these Guidelines aim to optimize Institutions’ capabilities in their three broad roles within this process: (1) to protect their patients, services and staff, (2) to support those employees who seek to answer the call for help, and (3) to contribute to the quality, safety, and professionalism of international humanitarian aid.

Acknowledgements

The authors would like to acknowledge the following for their contributions to prior versions of the guidelines for Partners HealthCare System: Eric Goralnick, MD, MS, Medical Director of Emergency Preparedness, Brigham and Women’s Healthcare; Tim Murray, MS, MBA, ARM, AIS, CPHRM, RF, Director of Risk Management and Insurance, Partners Risk and Insurance Services; Gregg Meyer, MD, Chief Clinical Officer, Partners HealthCare, Inc.; Andrew Gottlieb, MGH, and Joanna Krasinski, BWH, of PHS Occupational Health; and Dean Hashimoto, MD, Chief of Occupational and Environmental Medicine, Partners Human Resources.

Hilarie H. Cranmer, MD, MPH, is the Director of Global Disaster Response at the Center for Global Health at Massachusetts General Hospital and is clinical faculty in the Department of Emergency Medicine. She is an Assistant Professor at Harvard Medical School and Harvard T.H. Chan School of Public Health. Her research focus is on the professionalism of humanitarian providers, including developing educational initiatives, minimum standards in safety and security, and providing effective response to those impacted by crises worldwide. She recently served as the Technical Advisor on Ebola for International Medical Corps, which is providing care to those affected in West Africa.

Dr. Paul Biddinger is the Vice Chairman for Emergency Preparedness in the Department of Emergency Medicine at Massachusetts General Hospital (MGH) in Boston. He is also the Medical Director for Emergency Preparedness at MGH and at Partners Healthcare. Dr. Biddinger additionally serves as the Director of the Emergency Preparedness and Response Exercise Program (EPREP) at the Harvard School of Public Health and holds appointments at Harvard Medical School and at the Harvard T.H. Chan School of Public Health.

1. WHO Health Workforce Programme. Achieving the health-related MDGs. workforce_mdgs/en/ Accessed 16 Feb 2015.
2. WHO Density of Physicians and Nurses – total number per 1000, latest available year, Global Health Observatory Data1: health_workforce/ Accessed 11 Feb 2015.
3. Maia Majumder. Estimating the fatality of the West African Ebola Outbreak. Outbreak News Research & Policy. 10 September 2014. http://www.healthmap.org/site/diseasedaily/article/estimating-fatality-2014-west-african-ebola-outbreak-91014#sthash.BJCldDcq.dpuf Accessed 18 Feb 2015.
4. WHO Ebola Roadmap Sitrep 18 Feb 2015. http:// situation-reports/ebola-situation-report-18february-2015 Accessed 20 Feb 2015.
5. Jack Linshi. Ebola health care workers are dying faster than their patients. Online Time Magazine Publication. October 3, 2014. http://time.com/3453429/ebola-healthcare-workers-fatality-rate/ Accessed 16 Feb 2015.
6. Burkle FM. Operationalizing Public Health Skills to Resource Poor Settings: Is this the Achilles Heel in the Ebola Epidemic Campaign? Disaster Med Public Health Preparedness. 2014;0:1-3.
7. Johnson K, Idzerda L, Baras R, Camburn J, Hein K, Walker P, Burkle FM. Competency-based standardized training for humanitarian providers: making humanitarian assistance a professional discipline. Disaster Med Public Health Prep. 2013 Aug; 7(4):369-72.
8. Burkle FM, Walls AE, Heck JP, Sorensen BS, Cranmer HH, Johnson K, Levine AC, Kayden S, Cahill B, VanRooyen MJ. Academic affiliated training centers in humanitarian health, Part I: program characteristics and professionalization preferences of centers in North America. Prehosp Disaster Med. 2013 Apr; 28(2):155-62.
9. Centers for Disease Control and Prevention (CDC). Interim U.S. Guidance for Monitoring and Movement of Persons with Potential Ebola Virus Exposure. Atlanta, GA: CDC, 2014. (Accessed December 27, 2014, at ebola/exposure/monitoring-and-movement-of-persons-with-exposure.html)
10. CDC. Advice for Humanitarian Aid Workers. Atlanta, GA: CDC, 2014. (Accessed December 27, 2014, at humanitarian-workers-ebola)
11. Wildes R, Kayden S, Goralnick E, Niescierenko M, Aschkenasy M, Kemen KM, VanRooyen M, Biddinger P, Cranmer HH. Sign me up: Rules of the road for humanitarian volunteers during the Ebola outbreak. Disaster Med Public Health Preparedness. 2014;0:1-2.
12. Rosenbaum L. License to serve – U.S. trainees and the Ebola epidemic. N Engl J Med 2014, December 17 (Epub ahead of print).

Harvard Health Policy Review


Photo from Flickr, Creative Commons Attribution.

Using Data to Drive Innovation in Health and Biomedical Research

John Quackenbush, PhD

Nearly every major scientific revolution in history has been driven by one thing: data. Data drives innovation by allowing us to test and potentially falsify our existing models, to revise and refine those models, and to create and validate new models that can better capture the properties and behaviors of the natural world. From the development of the Copernican heliocentric model of our solar system to the modern theory of the gene, data has fueled discovery.

In public health and biomedical research, new technologies are providing access to unprecedented quantities of data, accelerating the pace of research and discovery and, ultimately, changing how we approach a wide range of practical problems.

The clearest example of data-driven innovation is tied to the explosion of omics technologies spawned in the wake of the Human Genome Project. Following the announced completion of the first draft of the human genome sequence, the cost of sequencing a human genome has fallen from hundreds of millions of dollars in 2000 to a few thousand dollars today, with the time required to sequence shortening from years to about a day (Figure 1).1 In real terms, what was once beyond imagination is now possible with a credit card.

Figure 1. The cost to determine the sequence of the 6 billion bases of DNA in each human cell has dropped to nearly $1000, making the use of human DNA sequencing practical in clinical and research applications.

At the same time, sequencing technologies have reduced the required quantities of input DNA and RNA, allowing different types of omic data to be produced from a single biological sample. We can now generate previously unimaginable data sets that include the genetic sequence of the genome itself, estimates of which of its 25,000 genes are active in any given biological sample, direct measurements of which regulatory factors are activating each specific gene, and even the three-dimensional structure of the DNA as it is packed into the cell’s nucleus, enabling us, for the first time, to analyze and model complex regulatory processes in cells.

Amazingly, data are being collected not only on individuals but also on entire populations, and at an exponential rate that some people have characterized as “faster than Moore’s Law” (meaning that the quantity of data doubles in less than 18 months). Combining these data with complex information gleaned from the corresponding patients’ electronic medical records allows researchers to compare populations and explore complex family structures to discover new genetic factors responsible for causing disease. We have even begun to use sequence-based data to measure the composition of microbial populations that colonize our bodies to learn how these populations change in response to or, in many cases, contribute to changes in the state of our health.

The availability of omic data has led to many discoveries, particularly in cancer research, where we are increasingly recognizing that cancer is defined more precisely by the mutational state of the tumor cell than by the tissue in which the disease occurs. This has enabled new therapies that target specific mutations and the development of what is referred to as “precision medicine.” We now recognize at least four distinct molecular subtypes of breast cancer, each of which has its own unique treatment protocol. In melanoma and other cancers, the discovery of a particular mutation in a gene called BRAF and the development of drugs that specifically target this mutation have dramatically increased survival rates. We hope that by better directing each patient to the right therapy, we will not only improve overall outcomes, but also reduce the overall cost of care (another substantial public health benefit). And cutting-edge omics technologies apply not only to cancer; new biomarkers and targeted therapies are being developed for many other diseases as well.

It is important to note that data-driven discovery goes beyond the exploration of sequencing data. Advanced technologies are producing an explosion in the quantity and types of imaging modalities that can be collected on single individuals. This capability, combined with new analysis techniques that extract quantitative numerical features from these images, has birthed new scientific disciplines, including radiomics, which may help us identify non-invasive biomarkers for predicting responses to therapies or even genetic changes that occur in disease. Advances in imaging could dramatically improve health outcomes by enabling early diagnosis and better management of treatment.

Telecommunications and social networking data provide an unprecedented amount of information on social interactions. By combining this data with health information, and even search history data, researchers can learn more about the spread of disease as well as the social factors that influence health-related measures, such as obesity. Information on environmental exposures, gathered from remote-sensing devices or satellites, can be integrated with health records, direct experimental measurements, and genetic profiles for us to begin decoding the complex gene-environment interactions that influence disease risk and severity. Even the role of microbes in maintaining and disrupting human health is being redefined by cataloging microbial populations and their structures across multiple body sites and many individuals, then combining that information with host health and genetic information. These are just a few examples of how access to unprecedented data is opening up new avenues of scientific investigation.

This ongoing transformation in health and biomedical research, although exciting, has nevertheless introduced one of our greatest scientific challenges, as we shift from generating data to addressing the important problems of collecting, managing, analyzing, interpreting, and presenting data in ways such that the data can be effectively consumed. Health and biomedical research are fast becoming information sciences that must




face the challenge of working not only with “big data,” but with “messy data.” Indeed, the National Research Council’s (NRC’s) Committee on Massive Data Analysis concluded in its 2013 “Frontiers in Massive Data Analysis” report2 that the challenges associated with massive data go far beyond the technical aspects of data management (although they are not to be ignored). The consensus report noted that the key element in meeting Big Data’s challenges was the development of rigorous quantitative and statistical methods for interpreting the data. Failure to produce such methods invites the potential to “turn data into something resembling knowledge when actually it is not,” and the report warned that “overlooking this foundation may yield results that are not useful at best, or harmful at worst.”

At the Harvard T.H. Chan School of Public Health, we have found ourselves particularly well suited to address many of the concerns raised by the NRC report. Our quantitative faculty in the departments of Biostatistics and Epidemiology, in particular, have a long and distinguished history of dealing with empirical, data-driven research questions in public health and biomedical research. Motivated by both the opportunities and challenges presented by big data, I and a large group of colleagues recently met to identify areas where there was a pressing need for developing new methods capable of handling the increasing volume, variety, and velocity of the data to which we had access. After presenting research challenges across domains ranging from disease research to environmental health applications, and sources of data that included cell phone records, medical records, and laboratory assay data, we found that nearly every significant problem we individually identified fit into one of four major theme areas.

The first area we identified is data preprocessing (or normalization) and “hot spot” detection, the fundamental problem of which lies in comparing measurements and finding important components in the data. The issue may seem trivial; however, when performing quantitative analysis of data, especially in large data sets, it is actually significant. Imagine comparing CT scan images that were taken using two different scanners. Although one might expect the images to be identical, the truth is that they never are. In some instances, machine-specific noise (also referred to as “batch effects”) can be large enough to mimic actual biological signals. We must determine how to standardize and compare even simple data sets as well as develop robust methods that work for Big Data so that researchers can make meaningful comparisons and identify relevant features.

A second major area requiring investigation is the development of methods for integrative data analysis. Our instinct is to use as much data as we can, but what we would actually prefer are independent measurements on systems, not those so highly correlated that one system may be functionally equivalent to another. Indeed, in large, complex data sets there are often hidden correlations—some that are meaningful, some that represent redundancies, and still others that are simply coincidental. Different data types also carry different levels of precision and accuracy, and may also carry very different weights in how we predict the important properties of the systems that we study. Effective integrative analysis of data types will require both good preprocessing to remove batch effects and identify features, as well as methods that permit us to build adaptive predictive models that can capture the meaningful biology in the systems that we seek to understand.

A third area involves the need for better methods of ensuring reproducible research, meaning that our results must be reliable. Although hundreds of thousands of omics assays have been run and thousands of omic biomarkers have been described in scientific journals as predicting medically relevant endpoints, only a handful of these biomarkers have been translated into routinely applied clinical assays. A big part of the reason for this failure is that published biomarkers are often not reproducible in other, independent data sets. The increased rate at which we collect data has given us a unique opportunity to use new data in an ongoing fashion to validate, update, and refine our mathematical models and computational predictions. Of course, the challenge is to do it such that we can draw conclusions without biasing the very models we are seeking to make reproducible. This may seem simple and nearly obvious, yet the subtleties involved in normalizing data and building robust models are areas that require additional research.

The fourth and final area is the clear need for methods that model the complex networks represented in our data and then use the structure of these networks to address health and biomedical research questions. In modeling gene regulatory networks, for example, we want to capture the interactions between various regulatory elements and the genes they control. In principle, this is a combinatorially complex challenge, but one for which we can reduce the complexity by using a variety of sources of “prior knowledge” to posit a best guess as to the likely network structure and then to refine the structure given the available data. By comparing networks between states, we can identify alterations in their structures and suggest therapies that target a tumor cell but leave a healthy cell untouched. In exploring disease transmission networks, we can use mobile phone data to identify “communities” of people who are in contact with each other and overlay it with the genetic data we can collect on infectious agents to discover likely routes of transmission and to develop intervention strategies. The challenge that lies in these and other applications is in finding methods to infer networks from increasingly large and complex data sets.

Data alone is not a panacea for propelling health and biomedical research forward. As the NRC report points out, a lack of investment in research that addresses, among other issues, the fundamental quantitative problems outlined here, could lead to erroneous conclusions. A wonderful example that illustrates this problem was created by Tyler Vigen, a Harvard Law School student who compiled data from the US Census and the CDC and produced a website that allows visitors to pick a data element and search for correlations (http://www.tylervigen.com/). There you can find, for example, that the per capita consumption of cheese in the United States is highly correlated with the number of people who have died by becoming tangled in their bed sheets, or that honey bee populations are anti-correlated with the number of juvenile arrests for marijuana possession. Although these are obviously chance correlations that can be easily rejected as not being meaningful, in health and biomedical research data our incomplete knowledge about the systems we study can make rejecting correlations much more difficult.

Despite potential challenges, I remain very optimistic that our growing ability to generate data will drive the next revolution in scientific understanding. I would also wager that the greatest advances made will have public health and biomedical research applications. But it is also clear that the path forward requires cautious and healthy skepticism, the hallmark of exceptional scientific research.

John Quackenbush is Professor of Computational Biology and Bioinformatics in the Department of Biostatistics in the Harvard Chan School of Public Health. His research exploits the data revolution in health and biomedicine to explore the mechanisms that drive disease and to suggest new therapies.

1. Wetterstrand KA. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP). National Human Genome Research Institute, NIH [Internet]. 2014 Oct [cited 2015 Feb]. Available from: http://www.
2. Committee on the Analysis of Massive Data, Committee on Applied and Theoretical Statistics, Board on Mathematical Sciences and Their Applications, Division on Engineering and Physical Sciences. Frontiers in Massive Data Analysis. Washington, DC: The National Academies Press, 2013.
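The chance correlations described in the article's closing section (cheese consumption and bed-sheet deaths) can be reproduced with purely random numbers. The sketch below is an illustration under stated assumptions, not a reconstruction of Tyler Vigen's site. The number of series, the series length, and the 0.8 threshold are arbitrary choices, and `pearson` is a hand-written helper. It generates short, mutually independent series, computes every pairwise Pearson correlation, and counts how many pairs look "strongly" related even though no relationship exists by construction.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)  # fixed seed so the sketch is repeatable

N_SERIES = 200   # assumed number of unrelated "data elements"
N_POINTS = 10    # assumed length of each series, e.g. ten yearly values
THRESHOLD = 0.8  # arbitrary cutoff for calling a correlation "strong"

# Every series is independent noise, so any correlation is coincidence.
series = [[rng.gauss(0.0, 1.0) for _ in range(N_POINTS)]
          for _ in range(N_SERIES)]

correlations = [pearson(a, b) for a, b in itertools.combinations(series, 2)]
n_pairs = len(correlations)
strong = [r for r in correlations if abs(r) > THRESHOLD]

print(f"pairs compared: {n_pairs}")
print(f"pairs with |r| > {THRESHOLD}: {len(strong)}")
print(f"largest |r| observed: {max(abs(r) for r in correlations):.3f}")
```

The design point is the ratio of comparisons to observations: 200 series yield 19,900 pairs, each judged on only ten points, so some pairs cross any fixed threshold by luck alone. In omics data, where the number of measured features routinely dwarfs the number of samples, the same arithmetic applies, which is why the article stresses rigorous statistical methods and validation on independent data sets.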

Spring 2015 Volume 14, Issue 2


Photo from Flickr, Creative Commons Attribution. HEALTH HIGHLIGHTS

The United Countries of America: Benchmarking the Quality of US Health Care

Charles W. Slack, AB, Warner V. Slack, MD

Social scientists often cite international statistics to demonstrate that the United States lags woefully behind other developed countries in the quality of health care.1,2 According to a Commonwealth Fund study of 19 countries, for example, the United States had the highest number of deaths that could have been prevented using existing health care procedures.3 A January 2008 position paper in the Annals of Internal Medicine warned that the United States ranks last among 6 developed nations (the others were Australia, Canada, Germany, New Zealand, and the United Kingdom) in quality of, access to, and equity of care, among other measures.4

How, policy experts and journalists ask, can the richest country in the world rank so low? What can be done to elevate our national health care system to the level of countries such as Sweden, Denmark, and Finland? As the authors of the Annals study opined, “The United States has much to learn from these countries.”

We share the view that successful practices should be studied as potential models. One land, in particular, stands out to us as worthy of scrutiny, a land even better suited for study than those used in the Annals review. Located in the Northern Hemisphere, it has 5.3 million inhabitants notable for their overall health as well as a strong, comprehensive system of health care coverage. Best of all, to study and to learn from this “country,” US health care officials do not even need to pack their passports.

The land of which we speak is Minnesota. Not only is Minnesota consistently cited by the



Commonwealth Fund and others as among the best US states when it comes to the availability and implementation of health care, it compares favorably in that area with the very countries cited in surveys as having superior health care to the United States. Consider Finland, a country often and justifiably lauded for its good health care, with a population (5.3 million) and demographic characteristics similar to those of Minnesota.5 Among benchmarks of comparison, Finland’s rate of mortality amenable to health care was 93.0 per 100,000 (2002-2003 data),3 somewhat higher than Minnesota’s rate of 63.9 per 100,000 (2004-2005);6 Finland’s infant mortality rate (deaths of infants younger than 1 year per 1000 live births) was 3.43 (2011),7 somewhat lower than Minnesota’s, which was 5.22 (2005);8 and Finland’s life expectancy at birth was 79.27 years (2011),9 slightly shorter than Minnesota’s of 80.5 years (2005).10

Indeed, when an American state can be measured against a similar country or region elsewhere, it tends to compare favorably. A 1993 study reported in Pediatrics compared infant mortality in Michigan with the Lorraine region of France—chosen for their similar socioeconomic makeup. Although crude infant mortality rates overall in the United States were higher than for much of the developed world, the study found that the rates in Michigan were lower than those in Lorraine—thanks (the authors surmised) to better access to neonatal intensive care.11

Nonetheless, blanket comparisons between the United States and individual countries persist and extend well beyond

medicine. Recently, a column in the New York Times presented statistics that showed the United States to be trailing in areas ranging from life expectancy, to student performance in math, to “food insecurity” (the percentage of respondents who said there had been times in the past 12 months when they did not have enough money to buy food).12 The author concluded “Not only are we not No. 1... we are among the worst of the worst.” In the words of the column’s headline writer, the United States is an “Empire at the End of Decadence.”12 Our intention is not to minimize the serious and complex challenges facing US health care and society as a whole, nor to sound a jingoistic refrain that America has nothing to learn from the outside world. Indeed, those reflexive chants of “We’re Number One!” seem as shortsighted and misguided as the lamentations that we are bringing up the rear. And it is clear that we have much to learn from our neighbors, near and far. Rather, we question what seems an overemphasis on statistical comparisons between a country of more than 300 million citizens—a country with liberal immigration policies and wide geographic, cultural, and economic diversity—and select, individual countries that are at once smaller and have more homogenous populations. (New Zealand, which has good reason to be proud of its health care system and is often used in comparisons, has a population [4.3 million] a little over half the size of New York City’s [8.4 million] and one that is more demographically homogenous).

Although the United States is certainly a unified republic politically, it more closely resembles 50 separate countries when it comes to health care delivery, each with its own government, its own immigration patterns, and its own socioeconomic challenges. Sparsely populated, mostly rural Wyoming, with a half million residents, has very different health care needs from California, with its diverse, immigrant-rich population of over 37 million. The same holds for small, homogenous, prosperous Vermont, with excellent health care statistics, when compared with, say, Mississippi.6 (The federal taxes paid by the citizens of Vermont do, however, help pay for the medical costs of the citizens of Mississippi, whereas the European countries typically used for comparisons are not for the most part levied for each other’s medical expenses.) Clearly, health-related information from countries, provinces, states, and other regions throughout the world can serve a highly useful purpose in helping us to know where there is a pressing need for improvement and where there is excellence that can be achieved under the best of circumstances. Agencies such as the Commonwealth Fund are an extremely valuable source for this information. However, lumping Wyoming, California, Vermont, and Mississippi together with 46 other states, and then comparing the combined results against Belgium or Switzerland (or even more populous countries, such as Germany or France), seems to us to be less productive than comparisons within our own country. This is particularly true because it is so hard to find countries, and regions within countries, that have demographic characteristics similar to American states, Finland’s similarity to Minnesota and Lorraine’s similarity to Michigan being exceptions. 
Countries with populations similar in size to American states tend to be quite different in other demographic characteristics (Table).13,14 The United Kingdom (population, 62.7 million; infant mortality rate, 4.62; and life expectancy, 80.05 years) is a country with more ethnic diversity than Finland and one that is used regularly in the international comparisons.15 Yet it is hard to find a sufficiently similar American state for comparison. California (population, 37.3 million; infant mortality rate, 5.2; and life expectancy, 79.7 years)16,17, the state that most closely approximates the United Kingdom in size of population, has substantially more diversity (57.6% white, 2011 census)18 than the United Kingdom (92.1% white, 2001 census).15

Although we share a sense of urgency about the need to strive continually to provide better access to health care,19,20 education, and proper nourishment, we cannot share the pessimism that such statistical comparisons reveal us to be a nation on the decline, that we suffer from some intrinsic national character defect, and that the solutions to our problems lie primarily in the sociopolitical structure of another, ostensibly more humane, republic. Indeed, an analysis of disparities in mortality among 8 regions in the United States21 and an analysis of recently released data demonstrating a wide range in life expectancy among counties throughout the United States—from 65.9 to 81.1 years for men and 73.5 to 86.0 years for women22—lend credence to the argument that it is better to focus on disparities between states, and perhaps even regions (eg, counties within states), than on our country as a whole for purposes of comparisons.

Particularly in the area of health care, we think the answer lies not in becoming Luxembourg, Singapore, Denmark, or Austria—impossible, even if we were so inclined—but in asking why some American states lag so far behind other American states. Why, for example, do Mississippi, Louisiana, and Georgia have such a high rate of mortality amenable to health care when compared with Idaho, Oregon, and Washington?6 What can we do to eliminate these very real disparities, and, instead of focusing on what separates us from other countries, address the greatest challenge before us, in this vast, multifarious land—being the best at who we are and who we must become?

Reprinted from Mayo Clinic Proceedings, Volume 86, Edition 8, Dr. Warner Slack and Dr. Charles Slack, The United Countries of America: Benchmarking the Quality of US Health Care, pages 788-790, 2011, with permission from Elsevier.

Charles Slack, a graduate of Harvard College, is a writer whose fourth book, Liberty’s First Crisis: Adams, Jefferson, and the Misfits Who Saved Free Speech (Atlantic Monthly Press) was published in March, 2015. He works in New York as an editor at Time Inc. His father, Warner Slack, whose research is devoted to the use of technology to empower both patients and doctors for better healthcare, is professor of medicine at Harvard Medical School and a member of the Division of Clinical Informatics, Department of Medicine at Beth Israel Deaconess Medical Center.

1. 2010 Commonwealth Fund International Health Policy Survey. Survey months: March 2010-June 2010. The Commonwealth Fund Web site. http:// Surveys/2010/Nov/2010-International-Survey.aspx. Accessed June 17, 2011.
2. Krugman P. A healthcare system to die for. New York Times. January 9, 2008. http://krugman.blogs. -die-for/. Accessed June 17, 2011.
3. Nolte E, McKee CM. Measuring the health of nations: updating an earlier analysis. The Commonwealth Fund Web site. Published January 8, 2008. Content/Publications/In-the-Literature/2008/Jan/Measuring-the-Health-of-Nations--Updating-an-Earlier-Analysis.aspx. Accessed June 17, 2011.
4. Ginsburg JA, Doherty RB, Ralston JF, et al. Achieving a high-performance health care system with universal access: what the United States can learn from other countries [published correction appears in Ann Intern Med. 2008;148(8):635]. Ann Intern Med. 2008;148(1):55-75.
5. Center for the Study of Upper Midwestern Cultures, University of Wisconsin System Web site. About CSUMC. Accessed June 17, 2011.
6. Mortality amenable to health care by state: aiming higher: results from a state scorecard on health system performance, 2009. The Commonwealth Fund Web site. http://www.commonwealthfund.org/Content/Charts/Report/Aiming-Higher-2009-Results-from-a-State-Scorecard-on-Health-System-Performance/Mortality-Amenable-to-Health-Care-by-State.aspx. Accessed June 17, 2011.
7. Finland infant mortality rate. IndexMundi Web site. http://www infant_mortality_rate.html. Accessed June 17, 2011.
8. Infant mortality in Minnesota. Minnesota Department of Health (MDH) Web site. http:// Accessed June 17, 2011.
9. Finland life expectancy at birth. IndexMundi Web site. http://www.indexmundi.com/finland/life_expectancy_at_birth.html. Accessed June 17, 2011.
10. Minnesota: Life expectancy at birth (in years) 2005. Kaiser Family Web site. jsp?ind=784&cat=2&rgn=25. Accessed June 17, 2011.
11. Howell EM, Vert P. Neonatal intensive care and birth weight-specific perinatal mortality in Michigan and Lorraine. Pediatrics. 1993;91(2):464-469.
12. Blow CM. Empire at the end of decadence: it’s time for us to stop lying to ourselves about this country. New York Times. February 18, 2011. http://www. Accessed June 17, 2011.
13. US Census Bureau. International Data Base (IDB). http://www.census.gov/ipc/www/idb/rank.php. Accessed June 17, 2011.
14. US Census Bureau 2010. Resident population data. http://2010.census.gov/2010census/data/apportionment-pop-text.php. Accessed June 17, 2011.
15. United Kingdom demographics profile 2011. IndexMundi Web site. http://www.indexmundi.com/united_kingdom/demographics_profile.html. Accessed June 17, 2011.
16. California: infant mortality rate (deaths per 1,000 live births), linked files, 2004-2006. Kaiser Family Web site. http://www jsp?ind=47&cat=2&rgn=6. Accessed June 17, 2011.
17. California: life expectancy at birth (in years), 2005. Kaiser Family Web site. .jsp?rgn=6&cat=2&ind=784. Accessed June 17, 2011.
18. US Census Bureau: state and country quick facts, California. http:// states/06000.html. Accessed June 17, 2011.
19. Welch WP, Miller ME, Welch HG, Fisher ES, Wennberg JE. Geographic variation in expenditures for physicians’ services in the United States. N Engl J Med. 1993;328(9):621-627.
20. Song Y, Skinner J, Bynum J, Sutherland J, Wennberg JE, Fisher ES. Regional variations in diagnostic practices. N Engl J Med. 2010;363(1):45-53.
21. Murray CJ, Kulkarni SC, Michaud C, et al. Eight Americas: investigating mortality disparities across races, counties, and race-counties in the United States [published correction appears in PLoS Med. 2006;3(12):e545]. PLoS Med. 2006;3(9):e260.
22. Kulkarni SC, Levin-Rector A, Ezzati J, Murray CJL. Falling behind: life expectancy in US counties from 2000 to 2007 in an international context [published online ahead of print June 15, 2011]. Popul Health Metr. doi:10.1186/1478-7954-9-16. http://www. /1478-7954-916.pdf. Accessed June 17, 2011.

Spring 2015 Volume 14, Issue 2


HEALTH HIGHLIGHTS

Privacy: Does the Identifiability of Data Mean That Privacy is Dead?

Isaac Kohane, MD, PhD, Patrick L. Taylor, JD

Over the last decade, with increasing volume, data merchants, some search engine companies, and some data scientists have tried to persuade policy-makers and the public that personal privacy is dead, that privacy is an illusion, or that we live in a post-privacy era.1 It is no accident that these statements issue from those who have a strong economic interest in personal data being available to them for any purpose, good or bad, or from academic specialists with unusual skills not generally available to criminals, or from government agencies concerned about any public threat and uncertain how to weigh or address this one.

The political goal of data merchants, however, is undisguised: to urge a policy of permissiveness based on the asserted inevitability of any data in any context being identifiable. They count on people’s desire for technology and the value of good uses of data preventing a clampdown on data use overall. In doing so, they are playing a game of “informational chicken” with social policy, for their claims have fostered the creation of equally extreme, if well-intentioned, advocates to oppose them. So while one group might be caricatured as saying, “The price of technology today is data nakedness. Put up no fight. Let us use information as we wish, target you with ads as we wish, sell your information as we wish, categorize you and generalize about you as we wish,” in medical research, by contrast, there is an equally entrenched coalition which sees any social use of data as a personal threat of paramount importance, regardless of purpose, and might be caricatured as responding, “Let no one touch my data, for any purpose, without my consent. Not to trace SARS outbreaks or address hospital infections, not to improve heart surgeries for my contemporaries, develop precisely targeted therapies for cancer2 or improve care for tomorrow’s kids.”

Those who cut away that sort of data access, as if right and wrong were indistinguishable, want health, and are among the first to say they are entitled to it; but they want to spin straw into gold, forgetting that there is no Rumpelstiltskin alive who can do that, and that his price, in any event, was the life of the first-born child. The rest of us, feeling uncertain and ill at ease, nervously adjust the privacy settings on our phones and laptops. What should we think? What should we do? Those who argue that privacy is dead in Dataland should therefore take care, for what will happen if people really believe them?

That whole heart-breaking vision is an illusion. A better approach, instructive here, arose last year, after the Supreme Judicial Court of Massachusetts ruled that a man taking cellular phone-enabled photographs “up the skirt” of a police decoy on the public subway system did not violate a “peeping Tom” criminal law that prohibited photographs of nude and partially nude women in locations where they had reasonable expectations of privacy. The statute was aimed at people who peep into homes and photograph women as they undress. The court’s decision focused on whether dressed women were “partially nude.” But it was read as a declaration on whether, under any circumstances, society will protect women and men from being looked at under their clothes. “Every person, male or female, has a right to privacy beneath his or her own clothing,” Suffolk County District Attorney Daniel Conley said in a statement. “If the statute as written doesn’t protect that privacy, then I’m urging the Legislature to act rapidly and adjust it so it does.” The DA did not throw up his hands and say “Oh well, there goes privacy,” conceding that judicial Might makes moral Right. Nor did he suggest that the only solution is to shut down public transportation or ban cell phones. Legal power should reflect moral clarity. Our social and individual fate depends on it.

The decision prompted immediate public furor. Within two days the impossible happened: government, so often stalemated by the parties’ efforts to gain the most votes by accomplishing nothing, actually worked. The Legislature revised the statute and the Governor signed it.

The furor masks important principles. The Legislature did its job originally with a focused statute addressing the problem it saw. Principle: one should prohibit carefully. The court did its job. Principle: when a person will face two and a half years in jail for doing something, courts rightly compel government to warn them clearly, without ambiguity. The Legislature did its job again. Principle: distinguish good and bad acts and prohibit the bad uses of technology, not the good uses. And an engaged public, living in the “Commonwealth” (a profound word) of Massachusetts, expressed definitively that people want everyone’s privacy protected, not just “my own,” and can distinguish bad acts and actors from overreactions. The protection is in the form of prohibiting peeping Toms, not, for example, banning all men from carrying cell phones on the subway. People have no trouble distinguishing peeping Toms from men who use cell phones appropriately. We expect the law to reflect this distinction.

So the solution was reasonable and common sense. The Legislature extended the criminal prohibition. But was that enough? They should have banned all cell phones, right? More, they should have banned women from the public ways, or required them to be clothed from head to foot in large polymer sacks. The latter would be good for business so must be right: it would promote industrial competition around the extent to which even nanoparticle-size cameras could not penetrate its secrecy. Most effective of all, completely foolproof, would be removing the eyeballs of all men, so they were disabled from seeing – even doctors who were trying to assist the women within when they were ill, or scientists who were asked to discover ways to prevent or cure autism in their children. That discussion is heterosexist of course: it needs to be applied in all ways to all genders. So we would all shuffle around in polymer sacks, we would all be alone, unable to communicate even by cell phone, and we would all of us be blind and in the dark.

These approaches would be precise analogies to what our country is doing with theoretically identifiable health data. But legal change did not adopt those approaches. Instead the Legislature declared that peeping was egregiously bad behavior and should be regarded as a criminal invasion of someone’s privacy with consequent punishment.
This commonsense and broad-based response to the SJC ruling points, we believe, to the way we should address protection of data privacy and data identifiability, and the way we would address it if the debate were brought to an alerted public rather than conducted in arcane publications and ornate legislative lobbies. That is, we would keep in mind that we care about health and research and privacy3 – all three, not just any one; we would know that technology can be used for good or bad; we would shake our heads at the idea that good and bad cannot be told apart; and we would recognize that people can be told apart too – scientists who care for children can be told apart, with an ultra-high degree of confidence, from commercial data merchants who secretly mine and sell data – and we would not rely on a phony argument to prohibit both acts or send both to jail.

Just because patients can be identified hypothetically from extremely rich datasets by society’s small squad of technogeeks,4,5 like us, does not mean that an attempt to do so cannot be identified and punished. Just as women taking public transportation should have the expectation that they will not be subject to intrusive photographs even if cell phones make it possible, so too should altruistic volunteers of data for the acceleration of medical research have the expectation that anybody who attempts to achieve identification will be punished by law for unacceptable antisocial behavior.6

In truth, there is no legitimate reason or motivator for a scientist to do the substantial work of deriving a name, except as one way to make sure records really relate to a single person and, if the person so wishes, to contact the person at their request if a discovery relating to them is something they wish to know or a research trial might interest them. The effort creates no personal benefit for a scientist, only for the person who wants to sell you something or sell information about you to someone else. Scientists do neither. Moreover, as we have advocated over the last decade, individuals should be able to attach to their data a declaration of whether they wish to have their identity shared or wish their name to remain unknown to research.

Data in aggregated form, with identification completely impossible, and some identifiable but not identified person-level data are essential, underscore essential, to seeing how viruses move across the country, or invade the body depending on particular genes, or how proteins accumulate in the brain, or blood vessels deteriorate, or cancers spread. How could it not be? How could knowledge about us not depend on knowledge about me and you? Will cures for you and people who are like you come solely from people who are not you and not like you? Of course not. To express the idea is to expose it as magical and self-defeating thinking. To imagine personalized medicine for you arising from someone else’s data borders on the insane.

But personal data need not be identified data and almost never should be. The space between identifiable data and identified data is the same as the space between prohibiting peeping Toms and prohibiting cell phones or putting us all alone in polymer sacks. It is the space between protecting privacy by stopping research and protecting privacy while improving our prevention of illness and care for the ill, elderly and dying. Indeed the analogy is quite exact: do we wish to be blinded and living in worlds of our own, and must we pay that price to stop peepers? Or can we tell the difference between good and bad intentions and conduct, refuse to play the seductive game of informational chicken that data merchants propose, and have law reflect our ability to declare what is wrong and what is right?

At this moment, decisions about these matters are being made without the benefit of true public debate. Recent events in Massachusetts give us reason to believe that public discussion and government institutions can make the right choice, and not foreclose research and quality improvement in health care to stop misuse of personal data.

Isaac Kohane, MD, PhD is Chair of the Department of Biomedical Informatics at Harvard Medical School. He develops and applies computational techniques to address disease at multiple scales: from whole healthcare systems as “living laboratories” to the functional genomics of neurodevelopment.

The academic work of Patrick L. Taylor (Assistant Clinical Professor, HMS, Affiliate Faculty, Petrie-Flom Center at HLS) focuses on emerging issues in science and medicine where law and science poorly intersect. His policy contributions concerning bioethics, stem cells, conflicts of interest, privacy, and genomics have been recognized nationally and internationally.

1. Esguerra R. Google CEO Eric Schmidt Dismisses the Importance of Privacy. Electronic Frontier Foundation [Internet]. 2009.
2. Toward precision medicine: building a knowledge network for biomedical research and a new taxonomy of disease. Washington, DC: National Academies Press; 2011.
3. Grande D, Mitra N, Shah A, Wan F, Asch DA. Public preferences about secondary uses of electronic health information. JAMA Intern Med. 2013 Oct 28;173(19):1798-806. PubMed PMID: 23958803. PubMed Central PMCID: 4083587.
4. Sweeney L. Privacy and medical-records research. N Engl J Med. 1998 Apr 9;338(15):1077; author reply 1077-8. PubMed PMID: 9537887.
5. Gymrek M, McGuire AL, Golan D, Halperin E, Erlich Y. Identifying personal genomes by surname inference. Science. 2013 Jan 18;339(6117):321-4. PubMed PMID: 23329047.
6. Kohane IS, Altman RB. Health-information altruists: a potentially critical resource. N Engl J Med. 2005 Nov 10;353(19):2074-7. PubMed PMID: 16282184.



Using Predictive Modeling to Reduce Readmissions: Policy Implications

Tanmoy Das Lala

The United States is the highest healthcare spender in the world. High rates of preventable hospital readmissions contribute to the exorbitant spending. New incentives to lower costs have motivated healthcare systems to implement care management programs to reduce rates of preventable readmissions. However, identifying the best patient cohort for enrollment into care management remains a challenge. One method of identifying patients for care management is the use of predictive modeling techniques to determine which individuals are at high risk for readmission. I discuss three ways in which modeling of healthcare data can be strengthened: 1) using Natural Language Processing, 2) incorporating psychosocial and demographic data, and 3) balancing sensitivity and specificity values to determine an “optimal” risk-score threshold for program enrollment. Finally, I discuss policy provisions that may increase adoption and implementation of predictive modeling to reduce potentially preventable readmissions.

Recent estimates show that the United States has the highest per capita healthcare spending in the world.1 Hospital readmissions are one of the driving factors of high healthcare spending.2 Almost one in five Medicare inpatients are re-hospitalized within 30 days of discharge, accounting for almost $17 billion or 20% of Medicare’s total hospital payments.3 Of the $17 billion, Medicare spends over $12 billion (70.6%) annually on readmissions that are deemed potentially preventable.2 A readmission is potentially preventable if there is a reasonable expectation that it could have been prevented by: (1) the provision of quality care in the initial hospitalization, (2) adequate discharge planning, (3) adequate post-discharge follow-up, or (4) improved coordination between inpatient and outpatient healthcare teams.4

Several recent policy developments have focused on reducing potentially preventable 30-day readmissions. In 2009, the Centers for Medicare & Medicaid Services (CMS) began publicly reporting hospital-specific Risk-Standardized Readmission Rates to motivate hospitals to reduce readmissions.5 In 2010, introduction of the Medicare Hospital Readmissions Reduction Program, part of the Patient Protection and Affordable Care Act (PPACA), created new penalties to reduce readmissions for acute myocardial infarction, heart failure, and pneumonia. Through this program, hospitals with high 30-day readmission rates can lose up to 3% of their Medicare reimbursement by 2015.6 Consequently, many national, state-based, and community-level initiatives have emerged to help reduce readmissions.

In this article, I argue that predictive modeling (PM) techniques can be used to identify patients who are at high risk for readmission and who can then be enrolled in care management (CM) programs. First, I discuss the role of CM programs in reducing preventable readmissions along with the challenges associated with patient identification for enrollment. Next, I evaluate the pros and cons of using PM techniques for identification and discuss additional work that needs to be done to strengthen modeling performance. Finally, I discuss policy changes that can bolster the use of modeling in CM enrollment to reduce preventable readmissions.

Intervention using Care Management

In order to reduce preventable readmissions, many healthcare organizations have created CM teams.7 Care managers work directly with patients to coordinate care services associated with hospital discharge, provide information about their disease, reconcile medications, answer follow-up questions, and manage treatment-related services with other healthcare providers.7 Well-implemented care management programs can reduce hospital use, lower costs, and improve clinical outcomes for patients with chronic diseases. For example, patient-centered management of selected high-risk patients in California decreased inpatient hospitalizations by 38% and reduced costs by $18,000 per patient.8 As another example, care management programs in California and New York led to a 46% reduction in 30-day readmission rates among elderly patients with heart failure, and a 21% reduction among dually eligible Medicare and Medicaid beneficiaries with special needs.7 Despite these positive outcomes, identification of the best patient cohort for potential enrollment in CM programs remains a challenge due to insufficient published research and a lack of consensus among administrators, healthcare providers, and policy experts to guide enrollment strategy.

Predictive Modeling for Care Management Enrollment

Given the challenges associated with patient cohort identification, one proposed approach is to use PM techniques.9 A predictive model takes quantifiable data elements from a patient’s medical history and makes a prediction about future healthcare use. Predictive models may use classical methods like regression or more sophisticated machine learning techniques like neural networks to predict future outcomes, such as readmissions. A predictive model can generate a readmission risk score for each patient based on medical history. Subsequently, a risk score threshold can be used to determine a patient’s inclusion in CM programs. Patient readmission can also be predicted in a binary sense; that is, the model will predict the outcome as “yes” or “no”: the patient will either be predicted to be readmitted for a certain condition or not to be readmitted at all. Patients who are predicted to be readmitted may be enrolled in CM.

Investigators have used PM to identify several factors that are associated with elevated risks of readmission, such as demographics, healthcare utilization history, insurance status, and length of stay.10,11,12 However, a 2011 review found that many risk prediction models have only modest levels of accuracy.13 In the past, a majority of the models have used regression. While there are several benefits of using regression, the limitations are significant. For example, regression techniques have difficulties accounting for missing data, large numbers of variables, hidden relationships between variables, and unusual distributions commonly seen in healthcare data. Since then, significant efforts to optimize predictive performance have achieved much higher discriminative accuracies.
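
To make the risk-score and threshold mechanics described above concrete, the following is a minimal sketch, not the method of any cited study, of how a regression-style model can turn patient data into a readmission risk score and how a threshold then yields a yes/no enrollment decision. The coefficient values, the feature names (`prior_admissions_12m`, `length_of_stay_days`, `age_over_65`, `uninsured`), and the 0.30 threshold are all hypothetical and chosen only for illustration; a real model would be fitted to historical data.

```python
import math

# Hypothetical coefficients for illustration only; a real model would be
# estimated from historical admissions data, not hand-picked like this.
INTERCEPT = -3.0
COEFFICIENTS = {
    "prior_admissions_12m": 0.45,  # admissions in the past 12 months
    "length_of_stay_days": 0.08,   # length of the index hospital stay
    "age_over_65": 0.60,           # 1 if the patient is over 65, else 0
    "uninsured": 0.35,             # 1 if the patient is uninsured, else 0
}


def readmission_risk(patient):
    """Return a risk score between 0 and 1 from a logistic model."""
    linear = INTERCEPT + sum(
        weight * patient.get(name, 0.0)
        for name, weight in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def enroll_in_cm(patient, threshold=0.30):
    """Binary decision: enroll when the risk score meets the threshold."""
    return readmission_risk(patient) >= threshold


# Two made-up patients, one with few risk factors and one with many.
low_risk_patient = {
    "prior_admissions_12m": 0, "length_of_stay_days": 2,
    "age_over_65": 0, "uninsured": 0,
}
high_risk_patient = {
    "prior_admissions_12m": 4, "length_of_stay_days": 10,
    "age_over_65": 1, "uninsured": 1,
}

for label, patient in [("low", low_risk_patient), ("high", high_risk_patient)]:
    score = readmission_risk(patient)
    print(f"{label}: risk={score:.2f}, enroll={enroll_in_cm(patient)}")
```

In this sketch, raising `threshold` enrolls fewer patients with higher predicted risk, while lowering it enrolls more patients, some of whom have lower predicted risk.
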
The newer techniques, adopted from machine learning, can better handle the limitations associated with regression.14,15,16,17 With greater modeling accuracy achieved, several medical institutions, including the University of Pittsburgh Medical Center,18 Carolinas Healthcare System,19 and Parkland Hospital,20 have used these models to enroll patients in CM and have successfully reduced readmissions. The promise of predictive modeling technologies has led to the development of several commercially available products like The Johns Hopkins ACG System,21 IBM Predictive Modeler, SPM v7.0, Predixion, etc.19 Hospital systems are increasingly either using these software products or building their own models to assess the readmission risks of high healthcare utilizers. As healthcare enters the digital age, the continual improvement of predictive performance can help increase the use of these statistical tools in CM enrollment.

Modeling outcomes, however, are dependent not only on analytic methods but also on the quality of the data source and the extent of accessible data elements. Many studies use discharge and insurance claims data from statewide or national databases.22 This approach has several strengths: it has lower data collection costs than medical record abstraction, it includes information on uninsured patients that is not available from third-party payers, and it is more reliable than self-reported medical expenditures. However, there are inconsistencies across states and providers in the way they report specific data elements. Additionally, there are hospital-specific errors in data reporting that can lead to data quality problems. An alternative approach is to use data from electronic health records (EHRs). In addition to billing codes, EHRs contain detailed clinical information such as test results, severity of illness, behavioral risk factors, and physicians’ narratives. Creating more comprehensive data repositories that include EHR, claims, and discharge data may improve PM performance.23

Improving Predictive Modeling Outcomes to Reduce Readmissions

Despite the promise PM holds in reducing readmissions, additional work is needed. First, EHRs have a significant amount of free text associated with patient summaries that contains valuable clinical information.24 Strategic implementation of advanced Natural Language Processing is necessary to extract content from the free text and convert it to structured data. These data elements can then be used in modeling studies. Second, healthcare systems should collect and encode additional information from patients about broader social, environmental, and medical factors such as perceived health, self-care abilities, access to care, social support, and substance abuse. Some models in the past have found these factors to contribute to readmission risk, but their predictive value is widely understudied.13,25 Third, a better method is needed to determine risk score cut-off levels for CM enrollment.
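
One way to picture the balance between sensitivity and specificity when choosing a cut-off is sketched below. It is an illustration under stated assumptions rather than the approach used in the cited work: the scores and outcomes are made-up data, the function names are hypothetical, and Youden's J statistic (sensitivity + specificity - 1) is just one common criterion for picking a cut point.

```python
def sensitivity_specificity(scores, readmitted, threshold):
    """Compute (sensitivity, specificity) when flagging scores >= threshold."""
    tp = fn = tn = fp = 0
    for score, outcome in zip(scores, readmitted):
        flagged = score >= threshold
        if outcome and flagged:
            tp += 1  # readmitted and flagged for CM
        elif outcome:
            fn += 1  # readmitted but missed
        elif flagged:
            fp += 1  # flagged but not readmitted
        else:
            tn += 1  # correctly left out
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def best_threshold(scores, readmitted, candidates):
    """Pick the candidate cut point that maximizes Youden's J."""
    def youden_j(threshold):
        sens, spec = sensitivity_specificity(scores, readmitted, threshold)
        return sens + spec - 1.0
    return max(candidates, key=youden_j)


# Made-up risk scores and observed outcomes (1 = readmitted within 30 days).
SCORES = [0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.60, 0.70, 0.80, 0.90]
READMITTED = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

for cut in (0.25, 0.50, 0.75):
    sens, spec = sensitivity_specificity(SCORES, READMITTED, cut)
    print(f"cut={cut:.2f} sensitivity={sens:.1f} specificity={spec:.1f}")

print("chosen cut point:", best_threshold(SCORES, READMITTED, [0.25, 0.50, 0.75]))
```

On this toy data the lowest cut point catches every readmitted patient but flags some who were not readmitted, the highest cut point flags no one unnecessarily but misses most readmissions, and the middle cut point scores best on Youden's J. In practice the choice would also weigh program capacity and expected savings, which this sketch ignores.
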
At present, individuals are identified for CM if their risk scores exceed a predetermined cut point.21 Current thresholds are often set by program capacity: the cut point is placed high enough on the risk-score distribution that the share of members above it matches the proportion who can be managed.21 Resource availability and expected savings are also important factors in decision-making.9 Using a high threshold creates a situation where most patients enrolled are at high risk for readmission. However, many high-risk patients are missed. On the other hand, using a low threshold qualifies many more patients for enrollment in CM. However, not all of those patients are necessarily at high risk for readmission. Consequently, finding a balanced threshold is imperative. Research shows that using an “optimal” cut point tailored to each disease condition yields better predictive performance than an arbitrarily determined cut-off level.21 Thus, a more systematic statistical approach, combined with financial considerations, is desirable.

Finally, use of predictive modeling in healthcare must overcome the ‘impactibility’ problem.26,27 Even if sophisticated algorithms accurately identify patients at high risk for readmission, the extent to which providers can intervene is limited because of current disease management approaches and resource constraints.27 Organizations should prioritize managing patients who are more likely to respond to preventive care, such as patients who are sensitive to ambulatory care, rather than extremely high-risk patients for whom preventing re-hospitalization is unlikely because of factors such as disease severity, mental health diagnoses, demographic characteristics, etc.26 Assessing the efficiency of these approaches can provide novel insights into how predictive modeling can be better used to create greater impact in healthcare.

Policy Implications

To reduce preventable readmissions using PM, some policy provisions are warranted. First, increased funding opportunities for research are necessary. Compared to other industries like financial services, climate, and entertainment, the use of PM in healthcare is recent. In the past few years, although many technical advances have been made to improve algorithms and achieve better outcomes, resources have remained limited. There is significant need for greater funding from federal, state, and private agencies to explore effective applications of these algorithms. Also, very few informatics-related national societies exist where users and developers of PM can convene to discuss work in this field. More funding opportunities to adequately train interested personnel and form associations can enhance cross-fertilization of ideas and innovations.
Second, making provisions for greater EHR adoption can improve the quality of data for research. Earlier I argued that PM research might be improved by having more comprehensive data. To realize that goal, widespread EHR adoption is necessary. Under Medicare and Medicaid’s EHR Incentive Program, eligible professionals and hospital systems receive incentive payments if they adopt, implement, upgrade and eventually demonstrate “meaningful use” (MU) of certified EHR technology.28 It is hoped that compliance with MU can result in improved clinical and population health outcomes along with more robust research data on health systems.29 As of 2013, nearly six in ten (59%) non-federal acute care hospitals had adopted at least a basic EHR system, a 34% increase from the previous year.30 Additionally, nearly 78% of office-based physicians used any type of EHR system in 2013 compared with 18% in 2001.31 While EHR adoption rates continue to increase, a recent study shows that adoption varies widely by specialty.32 Psychiatry, dermatology, pediatrics, ophthalmology, and general surgery practices are significantly less likely to adopt EHRs compared to general practitioners. Increasing financial incentives to specialty practices as part of the EHR Incentive Program may boost EHR adoption rates. With greater EHR adoption, more comprehensive data can be collected about patient populations.

Third, greater reimbursement for the costs of CM program implementation is needed. After high-risk patients have been identified using risk assessment tools, implementing care management programs can be resource-intensive and expensive. To bear the costs, some institutions have used internal funding, while others have had partial or full capitation arrangements with Medicare or Medicaid.7 Medical centers like UCSF used grant funding to start and prove their models and then relied on a Medicaid pay-for-performance program for safety-net institutions to sustain their initiatives.7 However, not every medical institution, especially in underserved areas, has funding available to implement such programs, yet all remain at risk for readmission penalties. Additionally, given Medicare’s current “prospective payment” system of reimbursement, which has many characteristics of a traditional fee-for-service system, if a hospital reduces readmissions but cannot fill its beds, then revenues decrease. Consequently, some hospitals may find it more financially sustainable to pay CMS-imposed penalties than to expend resources on reducing readmissions.

Recent policy provisions may help defray some costs. For example, hospitals may form Accountable Care Organizations (ACOs), a provision made under the PPACA.33 In an ACO, doctors, hospitals, and other healthcare providers form integrated networks to coordinate better patient care. Even though ACOs follow a fee-for-service system, they are eligible for financial bonuses through programs like Medicare Shared Savings if the care reduces spending. CMS created a second incentive called the Pioneer Program where high-performing healthcare systems can keep more of the expected savings in exchange for greater investment in the implementation of cost-effective care frameworks.28 These, along with direct reimbursements for CM costs and appropriately structured pay-for-performance strategies, may incentivize hospitals to focus on the quality and efficiency of care to reduce readmissions rather than risk losing revenues.

Conclusion

Under the nation’s current financial landscape, reducing exorbitant healthcare spending is a priority. Recent policy changes are focusing on rapid development and implementation of novel strategies that can reduce preventable hospital readmissions, a key contributor to healthcare spending. In this article, I have discussed how well-implemented CM programs can potentially decrease readmissions. Using statistical tools like PM can address challenges associated with identifying patients who can most benefit from CM enrollment. As modeling performance improves, PM’s potential can be cultivated even further to address the issue of high-risk readmissions. Greater funding opportunities for research and CM implementation, along with incentives for EHR adoption, can increase the use of PM in this domain. Given the flexibility of data mining and machine learning tools, PM can find integral utility in future goals of precision medicine and early interventions in preventive care; it can also contribute to addressing other healthcare outcomes aimed at eventually improving overall population health and wellness.

1. The World Bank. Health Expenditure Per Capita. [Online] Available from: SH.XPD.PCAP [Accessed 19th January 2015].
2. Medicare Payment Advisory Commission. Payment Policy for Inpatient Readmissions. In: Report to the Congress: Reforming the Delivery System. 2008.
3. Jencks SF, Williams MV, Coleman EA. Rehospitalizations among patients in the Medicare fee-for-service program. New England Journal of Medicine. 2009; 360(14): 1418-1428.
4. Goldfield NI, McCullough EC, Hughes JS, Tang AM, Eastman B, Rawlins LK, et al. Identifying potentially preventable readmissions. Health Care Financing Review. 2008; 30(1): 75-91.
5. Centers for Medicare and Medicaid Services. Medicare Hospital Quality Chartbook 2010. [Online] Available from: http:// [Accessed 10th January 2015].
6. Centers for Medicare and Medicaid Services. Readmissions Reduction Program. [Online] Available from: http://www. AcuteInpatientPPS/Readmissions-Reduction-Program.html [Accessed 10th January 2015].
7. McCarthy D, Cohen A, Johnson MB. Gaining Ground: Care Management Programs to Reduce Hospital Admissions and Readmissions Among Chronically Ill and Vulnerable Patients. Case Study Series Synthesis, The Commonwealth Fund. 2013.
8. Sweeney L, Halpert A, Waranoff J. Patient-centered Management of Complex Patients Can Reduce Costs Without Shortening Life. American Journal of Managed Care. 2007; 13(2): 84-92.
9. Billings J, Blunt I, Steventon A, Georghiou T, Lewis G, Bardsley M. Development of a Predictive Model to Identify Inpatients at Risk of Re-admission Within 30 Days of Discharge (PARR-30). British Medical Journal Open. 2012; 2(4).
10. Hu J, Gonsahn MD, Nerenz DR. Socioeconomic Status and Readmissions: Evidence From an Urban Teaching Hospital. Health Affairs. 2014; 33(5): 778-785.
11. Hasan O, Meltzer DO, Shaykevich SA, Bell CM, Kaboli PJ, Auerbach AD, et al. Hospital Readmission In General Medicine Patients: A Prediction Model. Journal of General Internal Medicine. 2010; 25(3): 211-219.
12. Herrin J, St Andre J, Kenward K, Joshi MS, Audet AM, Hines SC. Community Factors and Hospital Readmissions Rates. Health Services Research. 2014.
13. Kansagara D, Englander H, Salanitro A, Kagen D, Theobald C, Freeman M, et al. Risk Prediction Models for Hospital Readmission: A Systematic Review. Journal of the American Medical Association. 2011; 306(15): 1688-1698.
14. Shulan M, Gao K, Moore CD. Predicting 30-day All-Cause Hospital Readmissions. Health Care Management Science. 2013; 16(2): 167-175.
15. Yu S, van Esbroeck A, Farooq F, Fung G, Anand V, Krishnapuram B. Predicting Readmission Risk With Institution Specific Prediction Models. In: Proceedings of the IEEE International Conference on Healthcare Informatics 2013. 2013. p. 551-556.
16. Mather JF, Fortunato GJ, Ash JL, Davis MJ, Kumar A. Prediction of Pneumonia 30-Day Readmissions: A Single Center Attempt to Increase Model Performance. Respiratory Care. 2014; 59(2): 199-208.
17. Natale J, Wang S, Taylor J. A Decision Tree Model for Predicting Heart Failure Patient Readmissions. In: Krishnamurthy A, Chan WKV. (eds.) Proceedings of the 2013 Industrial and Systems Engineering Research Conference. 2013.
18. Healthleaders Media. Readmissions ‘Drop Like a Rock’ With Predictive Modeling. [Online] Available from: http://www. [Accessed 13th January 2015].
19. Healthcare IT News. Predictive Analytics Lowers Readmissions. [Online] Available from: [Accessed 25th January 2015].
20. Amarasingham R. Applying Data Analytics and Information Exchange to Improve Care for Patients. Health Affairs. 2012; 31(12): 2785-2786.
21. Murphy SME, Castro HK, Sylvia M. Predictive Modeling in Practice: Improving the Participant Identification Process for Care Management Programs Using Condition-Specific Cut Points. Population Health Management. 2011; 14(4): 205-210.
22. Schoenman JA, Sutton JP, Kintala S, Love D, Maw R. The Value of Hospital Discharge Databases. Agency for Healthcare Research and Quality. Report number: 282-98-0024, 2005.
23. Health Catalyst. Why Predictive Modeling in Healthcare Requires A Data Warehouse. [Online] Available from: https:// Why-predictive-modeling-healthcare-requires-a-data-warehouse.pdf [Accessed 28th January 2015].
24. Ohno-Machado L. Realizing the Full Potential of Electronic Health Records: The Role of Natural Language Processing. Journal of the American Medical Informatics Association. 2011; 18(5): 539.
25. Meek JA. Predictive Modeling Challenges and Opportunities for Case Management. Professional Case Management. 2012; 17(1): 15-21.
26. Lewis GH. “Impactibility Models”: Identifying the Subgroup of High-Risk Patients Most Amenable to Hospital-Avoidance Programs. The Milbank Quarterly. 2010; 88(2): 240-255.
27. Wharam JF, Weiner JP. The Promise and Peril of Healthcare Forecasting. The American Journal of Managed Care. 2012; 18(3): e82-85.
28. Centers for Medicare and Medicaid Services. EHR Incentive Programs. [Online] Available from: Regulations-and-Guidance/Legislation/EHRIncentivePrograms/index.html?redirect=/ehrincentiveprograms/ [Accessed 25th January 2015].
29. Centers for Medicare and Medicaid Services. 2014 Definition Stage 1 of Meaningful Use. [Online] Available from: http:// EHRIncentivePrograms/Meaningful_Use.html [Accessed 21st January 2015].
30. Office of the National Coordinator for Health Information Technology. Adoption of Electronic Health Record Systems Among U.S. Non-Federal Acute Care Hospitals: 2008-2013. [Online] Available from: [Accessed 21st January 2015].
31. U.S. Department of Health and Human Services. Use and Characteristics of Electronic Health Record Systems Among Office-Based Physician Practices: 2001-2013. [Online] Available from: db143.pdf [Accessed 29th January 2015].
32. Grinspan ZM, Banerjee S, Kaushal R, Kern LM. Physician Specialty and Variations in Adoption of Electronic Health Records. Applied Clinical Informatics. 2013; 4(2): 225-240.
33. The Henry J. Kaiser Family Foundation. FAQs on ACOs: Accountable Care Organization, Explained. [Online] Available from: [Accessed 24th January 2015].

Acknowledgements The author would like to thank Dr. Zachary M. Grinspan, research adviser, of Weill Cornell Medical College (WCMC) for his helpful comments and edits. The author would also like to thank Dr. Mark Unruh (WCMC), Eric J. Kutscher (WCMC), and Brandy K. Richmond (DePauw University) for their advice.

Tanmoy Das Lala is currently a Master’s student at the Center for Health Informatics and Policy at Weill Cornell Graduate School of Medical Sciences in New York, NY. His career interests include practicing primary care and continuing health services research using machine learning tools.


Whole Genomic Data Integration Into the Electronic Health Record: Examining Current Practical and Ethical Challenges

Achal P. Patel

The success of the Human Genome Project in 2003 generated widespread interest in genomic, or personalized, medicine. Roughly a decade later, our ability to generate genomic data greatly outstrips our ability to interpret it. For such “big data” to be useful, it must be organized and incorporated into another emerging technology: the Electronic Health Record. Integration of genomic data and Electronic Health Records is crucial to the realization of personalized medicine. However, before that step is achieved, we must first address key practical and ethical challenges involved in the integration process, as well as its downstream implications. It is essential that we frame our solutions to these problems around the preferences of health care’s central stakeholder: the patient.

In 2003, the completion of the Human Genome Project (HGP) marked one of the greatest accomplishments of modern science and technology. A multi-billion dollar investment at the time of its undertaking, the project involved sequencing and mapping the complete set of genetic information (the genome) that serves as the blueprint for human development.1 The project’s eventual success proved to be a catalyst for widespread interest in the concept of genomic medicine.1 Suddenly, the idea of sequencing personal genetic information to aid diagnostics and tailor therapeutic strategies did not seem so improbable, notwithstanding the enormous cost and time barriers.

A little over a decade later, unforeseen advances in DNA sequencing and associated high-throughput technologies have virtually eliminated both the cost and time barriers to feasibility. Today, the entire human genome can be sequenced within a short time span at a cost of roughly $1,000, with a trend towards decreasing costs.2,3 The resulting potential for clinical benefit is vast. Physicians can now examine an array of associated genetic variations underlying a particular disorder rather than being restricted to analyzing specific single nucleotide variations mapped out by the HGP.4,5 However, this exponential growth in the ability to generate whole genomic data presents its own share of issues and challenges. At the moment, the rate of genomic data generation greatly exceeds the rate at which we are able to organize and meaningfully interpret that data, thereby limiting the desired end product: personalized, precise clinical care.6

Fortunately, the advent of Electronic Health Records (EHRs) holds exceptional promise in overcoming this fundamental limitation.7 An integral component of a recent paradigm shift in health care towards greater use of information technology, EHRs consolidate real-time data on a range of patient characteristics and health metrics. From a theoretical standpoint, the integration of whole genomic data into EHRs should grant physicians an efficient and powerful tool for synthesizing genomic profiles and tailoring their clinical care accordingly. Presently, a multitude of challenges, ranging from practical to ethical, surround this integration. This article discusses some of the key challenges on both fronts and evaluates potential solutions.

The Need For Genomic Data Processing

Sequencing of the entire genome can yield as many as 50,000 gene variants per individual.7 If all of the raw information were uploaded to EHRs, the sheer volume would actually detract from a physician’s ability to effectively diagnose and treat patients.7 Thankfully, the vast majority of these gene variations are benign based on our current understanding and, for the sake of practicality, should be excluded from EHRs.7 Among the clinically relevant, potentially harmful gene variants, researchers recognize two major categories: clinically actionable variants and clinically non-actionable variants.5

Clinically Actionable Gene Variants

Clinically actionable gene variants include both variants that place an individual at high risk for preventable disease (e.g., BRCA1/2 for breast cancer) and those associated with certain Mendelian disorders.5 While rare, this set of variants theoretically offers the greatest clinical utility upon detection. In the context of EHRs, incorporation of actionable variants is imperative because they inform and guide well-established medical care. Presently, there is a large practical barrier to realizing that utility after incorporation: the existing clinical infrastructure does not promote patient engagement in deciding care pathways. At this juncture, there is great potential for both shared decision-making (SDM) between the physician and patient and the incorporation of patient decision aids. Decision aids engage the patient and communicate a readily understandable summary of the pros and cons associated with various clinical options (e.g., mastectomy versus no surgical treatment for women with BRCA1/2 gene variants). Results from randomized controlled trials assessing the efficacy of these decision aids in such clinical settings have been highly encouraging.8 Presently, the Department of Health and Human Services is tasked with the creation of an array of decision aids under Section 3506 of the Affordable Care Act.9 The initiative, however, has not been implemented owing to a lack of funds.9 Consequently, one pressing task is to incentivize both the public and private sectors to bear the burden of creating and disseminating such support tools, with added focus on Accountable Care Organizations.

Clinically Non-Actionable Gene Variants

Clinically non-actionable variants represent much more of a gray area when it comes to integration into EHRs. Variants in this category have been shown to be associated with increased risk for a range of incurable medical conditions such as Alzheimer’s disease and Huntington’s disease.5 This gray area affects both the physician and the patient.

From the physician’s perspective, a practical concern is counterproductivity. The medical literature and genome-wide studies underlying the risk associations of non-actionable variants are often less definitive than the literature concerning actionable variants.5 Occasionally, the evidence conflicts, further complicating the physician’s task of translating positive findings into an effective clinical care strategy. Yet another concern is that some physicians across the nation are not sufficiently versed in genomics to analyze complex genomic associations. Thus, the medical school curriculum and the post-graduate training process need to be modified to better equip future physicians in this regard and to ensure a more equitable standard across the United States. A recent study demonstrated that medical students are generally poor at applying genetics concepts in the clinical context.10 Such findings support a practical, longitudinal, and case-based approach to genetics education through medical school and beyond.

On the patient’s side, the argument boils down to an ethically charged question: Is it acceptable to present findings to patients regarding gene variants for which there are no established care protocols, particularly if the detection of such variants was not the primary motivation behind the decision to undergo whole genome sequencing? As is true for many patient-centered questions in health care, the answer is nuanced and cannot be reduced to a simple “yes” or “no.” One side of the argument is that individuals may experience both acute and chronic distress from learning that they possess gene variants that could lead to incurable disease.5 It may be that the harms arising out of distress outweigh the harms associated with development of incurable disease, at least in the short run. The other side is that it would be discriminatory to withhold such life-altering information if uncovered.5

Importantly, we cannot establish a uniform policy on inclusion based on observed patient preferences from research studies. Instead, a case-by-case approach is necessary. It is recommended that non-actionable variants be incorporated into the EHR and become available to physicians for initial viewing. The onus then falls on physicians to explore the individual preferences of their patients. Depending on the response, a physician may either withhold the information or recruit supplementary health services such as genetic counseling centers to construct a longitudinal plan with the patient and to ensure preparedness in the event that the variants do lead to disease. It follows that efforts need to be made towards crafting health care policies that promote horizontal integration among primary care and counseling services, and thereby promote patient satisfaction. Ultimately, the hope is that some of the current non-actionable variants will transition to the actionable category owing to the ever-expanding body of scientific and clinical research.

Downstream Ethical Challenges

Decisions regarding the integration of genomic information into EHRs and appropriate dissemination to patients constitute only a part of the challenge. For years, there was great concern that insurance providers would eventually gain access to personal genomic data and structure their services towards individuals with low genetic risk profiles to minimize costs. Presently, the Genetic Information Nondiscrimination Act (GINA) assuages these concerns by requiring insurers to provide services independent of an individual’s genetic information.11 While GINA protects against the sharing of family medical history with third parties, it fails to adequately address the question of whether family members should be warned if an individual tests positive for potentially harmful gene variants.12

Prevailing viewpoints on this ethical conundrum can be broadly classified as either libertarian or utilitarian.13,14 The libertarian viewpoint values autonomy over moral obligations to others.14 It recommends that individuals be the ultimate decision makers regarding dissemination of their genetic information. Given that the bedrock of the physician-patient relationship is trust, the physician in this instance has little option but to respect the choice of the patient. By contrast, the utilitarian viewpoint places the interests of society, including immediate family members, over those of the individual.14

An individual may hold a personal conviction for either of these two viewpoints. However, context is equally important. For instance, an individual who is hesitant to share their genetic information with family members may not be acting out of self-interest but instead out of fear of stigmatization and social segregation. Similarly, individuals with genetic risk for a potentially fatal condition may choose to share this information with their family; however, the family might not want to learn such life-altering information. These two scenarios represent only a fraction of the complications raised by applying the libertarian and utilitarian viewpoints in practice. Instead of favoring one over the other, it is more important for physicians to recognize the respective merits of each and, barring exceptional circumstances, to trust the decisions of their patients.

Data Transfer and Patient Security

Protection of sensitive genomic data is a final practical hurdle concerning the integration of whole genomic data into EHRs. The key strength of EHRs is that they facilitate timely sharing of a patient’s genomic data across the full continuum of care.15 From a practical standpoint, this requires that EHRs be standardized to some degree or at the very least share compatible elements that make data transfer seamless. At the same time, it is important to realize that frequent data sharing is a double-edged sword. Internally, there are greater opportunities for improper or malicious use of data that may compromise privacy.7 Furthermore, we live in an age in which cyber attacks are becoming more and more commonplace. EHRs may represent a valuable target for such attackers. Thus, we must sharpen our existing security measures accordingly.

Concluding Remarks

Just one decade after the completion of the HGP, the prospect of personalized medicine is close to being realized. The simultaneous rise of EHRs holds tremendous promise for future patients: the integration of whole genome data with EHRs will grant physicians an efficient tool to customize care. However, before this grand vision is fully realized, we must overcome key challenges in the areas of logistics, data privacy, patient autonomy, and medical ethics. Nearly all of these challenges lack a definitive answer. Yet the proposed solutions to them share one essential element: patient engagement. Patients are the key stakeholders in the business of health care, and any proposed improvements to health care, including those rooted in bioinformatics, must align with patient preferences.

Achal Patel is a graduate of Wake Forest University with a B.S. in Biology. Currently, he is an M.P.H. candidate at The Dartmouth Institute for Health Policy & Clinical Practice. His academic interests include medicine, health care policy and economics, as well as global health.

1. Murray TH, Livny E. The Human Genome Project: ethical and social implications. Bulletin of the Medical Library Association [Internet]. 1995 [cited 18 January 2015];83(1):14. Available from:
2. Vance A. Human Gene Mapping Price to Drop to $1,000, Illumina Says [Internet]. Bloomberg.com. 2014 [cited 16 January 2015]. Available from:
3. DNA Sequencing Costs [Internet]. 2015 [cited 1 January 2015]. Available from:
4. Li R, Li Y, Fang X, Yang H, Wang J, Kristiansen K et al. SNP detection for massively parallel whole-genome resequencing. Genome Research. 2009;19(6):1124-1132.
5. Berg J, Khoury M, Evans J. Deploying whole genome sequencing in clinical practice and public health: Meeting the challenge one bin at a time. Genet Med. 2011;13(6):499-504.
6. Fuller J, Khoueiry P, Dinkel H, Forslund K, Stamatakis A, Barry J et al. Biggest challenges in bioinformatics. EMBO Rep. 2013;14(4):302-304.
7. Hazin R, Brothers K, Malin B, Koenig B, Sanderson S, Rothstein M et al. Ethical, legal, and social implications of incorporating genomic information into electronic health records. Genet Med. 2013;15(10):810-816.
8. Wakefield C, Meiser B, Homewood J, Taylor A, Gleeson M, Williams R et al. A randomized trial of a breast/ovarian cancer genetic testing decision aid used as a communication aid during genetic counseling. Psycho-Oncology. 2008;17(8):844-854.
9. Affordable Care Act | Informed Medical Decisions Foundation [Internet]. 2015 [cited 11 January 2015]. Available from: federal-legislation/affordable-care-act/.
10. Greb A, Brennan S, McParlane L, Page R, Bridge P. Retention of medical genetics knowledge and skills by medical students. Genet Med. 2009;11(5):365-370.
11. Sethi P, Theodos K. Translational Bioinformatics and Healthcare Informatics: Computational and Ethical Challenges. Perspectives in Health Information Management / AHIMA, American Health Information Management Association [Internet]. 2009 [cited 11 January 2015];6(Fall). Available from: http://www.ncbi.
12. Genetic Discrimination [Internet]. 2015 [cited 11 January 2015]. Available from:
13. Fulda K. Ethical issues in predictive genetic testing: a public health perspective. Journal of Medical Ethics. 2006;32(3):143-147.
14. Brannigan M, Boss J. Healthcare ethics in a diverse society. Mountain View, Calif.: Mayfield Pub. Co.; 2001.
15. Bitton A, Flier L, Jha A. Health Information Technology in the Era of Care Delivery Reform. JAMA. 2012;307(24).


NAMCShiny: An Interactive Web Application to Explore Health Trends in 2003-2010 National Ambulatory Medical Care Survey data

Jean Fan1,2†, Kamil Slowikowski1

1 Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA 02138, USA
2 Department for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA

Big medical datasets are rapidly being generated and made publicly available, providing numerous opportunities for bioinformatics in health policy. Visual inspection and exploration of the data are useful for hypothesis generation and trend identification, but generating such visualizations is often time-consuming and cumbersome. Here, we present NAMCShiny, an interactive web application that uses data from the National Ambulatory Medical Care Survey (NAMCS) to visualize trends in the reasons for physician visits from 2003 to 2010, while stratifying on patients’ demographic information such as sex, race, and age. We envision a future where big medical datasets, such as the NAMCS, and bioinformatics tools, such as NAMCShiny, will guide the development of data-driven health policies. However, additional statistical methods and analytical techniques must be developed in order to form accurate interpretations and extract meaningful insights.

Rise of Big Data

The rise of big data, coupled with the development of sophisticated informatics, has transformed industries by enabling data-driven decision-making. Businesses now use big data and informatics to make intelligent decisions about where to focus resources and to implement informed strategic and operational decisions. Insights about particular consumer demographics are used to develop targeted marketing strategies. For example, Target used big data to identify the shopping trends of pregnant women, and subsequently devised targeted marketing strategies to engage customers it predicted to be pregnant.1

A major goal for bioinformatics is to use big data to transform healthcare and health policy via data-driven decision-making. As in business, the ability to identify trends and predict patterns in particular patient demographics can help us to develop and achieve specific healthcare goals. If we use big data to identify a trend in a particular disease for a specific patient demographic, we can make predictions and develop health policies to target this demographic.

Big medical datasets are being rapidly generated owing to the declining cost and the increasing ease of collecting and storing patient information.2 Many datasets are already publicly available [Supplementary Table 1]. Here, we present a tool to explore the National Ambulatory Medical Care Survey (NAMCS) dataset. The NAMCS comprises annual survey results from samples of physician visits. Data include patient demographics, reasons for the visit, any medications or therapies prescribed, and more.3 The NAMCS data files for public use are available through the Centers for Disease Control and Prevention website: Health_Statistics/NCHS/Datasets/NAMCS/. Researchers have already used the NAMCS data to study trends in pelvic inflammatory disease4, fungal infections5, psoriasis6, HIV testing7, EMR adoption8, and even the frequency of physician visits on Fridays9. Bioinformatics tools are needed to harness the potential of these big medical datasets and extract additional insights in an efficient and high-throughput manner.

Opportunities for Health Policy

The rise of big data presents numerous potential opportunities for healthcare and health policy.10,11,12,13 Traditional analytical tools are not suited to handle such large-scale and multi-dimensional datasets.11,12,13 Therefore, bioinformatics is needed to elucidate the insights big data may provide. One such approach is interactive visualization, which is useful for hypothesis generation and data exploration prior to downstream analysis.14 Interactive visualization also creates a compelling visual story that helps a general audience gain understanding, and increases the appeal of datasets like the NAMCS.

The NAMCS lends itself readily to trend identification. The ability to easily visualize trends for different patient demographics will enable health policymakers, physicians, and researchers to ask questions, test new hypotheses, and develop ideas for possible interventions at the individual, group, and population levels. At the individual level, each patient has a combination of demographic characteristics that can identify him or her within the larger population. By leveraging information from other patients with similar characteristics, big data and trend identification can help researchers and physicians predict future illness and find interventions that are more likely to benefit this patient.13 At the group and population levels, health policymakers can identify at-risk populations by their shared demographic characteristics and develop more targeted strategies to prevent or curb potential outbreaks.13

Although trend identification has the potential to be powerful and informative, generating visualizations can be difficult. Given the thousands of possible combinations of patient demographic stratifications, hundreds of different diseases to analyze, and other parameter combinations, identifying relevant stratifications for different trends can be challenging. Finding such trends and stratifications requires generating and inspecting a large number of visualizations, which is time-consuming and cumbersome.14 Therefore, data visualization tools are needed to improve the efficiency of trend identification.

Enabling Big Data Visualization: NAMCShiny

To address this challenge, we developed NAMCShiny, an interactive web application to visualize trends in the reasons for physician visits for different patient demographic stratifications based on sex, race, ethnicity, and age [Figure 1]. The visualization displays a count of the user-selected reasons for physician visit as it changes over time. Users can define a sliding window of days to emphasize seasonal or yearly trends, and stratify on specific patient demographic information to easily visualize and identify trends in specific reasons for physician visits for different patient demographics of interest.

Figure 1. Screenshot of NAMCShiny, an interactive web application to visualize trends in the reasons for physician visits.

Using NAMCShiny, we are able to quickly discover several interesting trends. We observe “Cough” and other related terms as strongly seasonal reasons for medical visits, with peaks in the winter months and troughs in the summer months [Figure 1]. These trends recapitulate known seasonal flu trends,15 supporting the validity of trends identified using our application. We observe that “Depression” is becoming a less common reason for visits in females 18 and younger on a yearly basis. We also find that “Occupational problems” and “Marital problems” are highly correlated on a yearly basis, though the correlation is stronger for males than for females. “Back pain” and “Diabetes mellitus” are both increasing as reasons for visit on a yearly basis and are highly correlated in males.

From a health policy perspective, such trends may encourage increased allocation of resources for combating obesity in males due to its association with increasing rates of “Back pain” and “Diabetes mellitus,” while allocation of resources for addressing “Depression” in young girls may be more flexible. Similarly, physicians seeing new male patients visiting for “Occupational problems” may need to look for signs of “Marital problems.”

We encourage readers to recreate these plots to confirm our findings and identify additional potentially interesting trends at https://jefworks. NAMCShiny can also be launched from within R with the command: shiny::runGitHub("NAMCShiny", "JEFworks"). Source code and additional NAMCS datasets converted into ready-to-use RData files are available on GitHub: JEFworks/NAMCShiny. We strongly encourage readers to explore the NAMCS datasets within the R statistical environment to witness the full potential of big data and recognize the need for bioinformatics to harness this potential.
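To make the sliding-window count described above concrete, the following is a minimal sketch and not NAMCShiny's actual implementation (the application itself is written in R). It assumes a hypothetical list of visit dates for one selected reason for visit and a trailing window ending on each day; the function name `sliding_window_counts` and the sample dates are illustrative only.

```python
from datetime import date, timedelta


def sliding_window_counts(visit_dates, window_days):
    """Count visits in the trailing window of `window_days` days ending on each day.

    Assumption: the window is trailing (it includes the current day and the
    window_days - 1 days before it). Other conventions, such as a centered
    window, are equally plausible.
    """
    if not visit_dates:
        return {}

    # Tally how many visits fall on each calendar day.
    per_day = {}
    for d in visit_dates:
        per_day[d] = per_day.get(d, 0) + 1

    counts = {}
    day, last_day = min(per_day), max(per_day)
    while day <= last_day:
        counts[day] = sum(
            per_day.get(day - timedelta(days=offset), 0)
            for offset in range(window_days)
        )
        day += timedelta(days=1)
    return counts


# Hypothetical visit dates for a single reason for visit (e.g. "Cough").
cough_visits = [
    date(2010, 1, 1),
    date(2010, 1, 1),
    date(2010, 1, 3),
    date(2010, 1, 10),
]
weekly = sliding_window_counts(cough_visits, window_days=7)
```

A wider window smooths out day-to-day noise and makes seasonal or yearly patterns easier to see, at the cost of blurring short-lived spikes.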
Challenges and Limitations

To form accurate interpretations and extract meaningful insights from big medical datasets, we must understand and account for inherent challenges and limitations in the data. Here, we present key challenges relevant to many big datasets, including the NAMCS.

Small samples do not represent the population

Big datasets may include information on thousands of patients, but these are small samples of the entire population. In NAMCShiny, selecting rare reasons for visit or rare patient demographic stratifications can result in sample sizes of 30 or fewer patient visits. Trends observed in such small samples may be misleading and cannot be generalized to the larger population.

Population inference from non-uniform non-random sampling requires weighting

Each patient visit in the NAMCS represents approximately one out of every 30,000 visits per year. Due to the non-random, multi-level sampling design employed by the NAMCS, each visit is assigned an inflation factor called the “patient visit weight.” These weights are necessary to make population estimates for the approximately 1,000,000,000 office visits per year in the United States. Unweighted analysis may lead to biased representations of demographics or geographical regions.
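The effect of weighting can be illustrated with a small sketch. The records and weight values below are invented for illustration and are not NAMCS data; the only idea taken from the text is that each sampled visit carries a "patient visit weight" that must be summed, rather than counting visits directly, to estimate population totals.

```python
from collections import Counter, defaultdict

# Hypothetical sampled visits: (reason for visit, patient visit weight).
# The weights are made up; real NAMCS weights come from the survey files.
visits = [
    ("Cough", 20000.0),
    ("Cough", 25000.0),
    ("Back pain", 60000.0),
    ("Diabetes mellitus", 45000.0),
]


def proportions(totals):
    """Convert per-reason totals into shares of the overall total."""
    grand_total = sum(totals.values())
    return {reason: value / grand_total for reason, value in totals.items()}


# Unweighted: every sampled visit counts once.
raw_counts = Counter(reason for reason, _ in visits)

# Weighted: every sampled visit counts as many times as its weight says.
weighted_totals = defaultdict(float)
for reason, weight in visits:
    weighted_totals[reason] += weight

unweighted_share = proportions(raw_counts)
weighted_share = proportions(weighted_totals)
```

In this toy sample "Cough" makes up half of the sampled visits but only 30% of the estimated population visits, because the sampled cough visits carry smaller weights. An unweighted analysis would overstate its share.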



Correlation does not imply causation

Establishing causative relationships requires controlled studies. Without appropriate controls, observed correlations are influenced by confounders and mediators, with directional relationships largely derived from speculation. Using NAMCShiny, we observed a correlation between “Cough” and “Throat Soreness.” However, we cannot conclude that “Cough” causes “Throat Soreness” or vice versa. An unobserved confounding variable (whether the patient has a cold) may be directly associated with both “Cough” and “Throat Soreness.” In this case, having a cold causes both “Cough” and “Throat Soreness,” resulting in the observed correlation. Similarly, the correlation may be explained by the unobserved mediator variable of the patient’s inflammatory response. Independent of having a cold, the cough may result in a strong inflammatory response that causes “Throat Soreness.” The degree of inflammatory response may differ between demographic stratifications, resulting in different correlations.

Data may be biased

Study design, survey biases, missing data, or other factors may lead to biased representation of the population. Survey-based datasets such as the NAMCS are vulnerable to response bias, non-response bias, and other survey-related biases. In particular, the phrasing of questions in the survey may bias the response of the patient or physician away from a truthful response. Non-response also introduces bias if non-responders are characteristically different from responders. For the NAMCS, physicians with larger visit volumes were more likely to refuse to participate.3 Patients who visit physicians with larger visit volumes might be characteristically different from those who visit physicians with lower volumes, biasing patient trends by omission.

Missing data can arise through various mechanisms, including (1) Missing Completely at Random (MCAR), (2) Missing at Random (MAR), and (3) Missing Not at Random (MNAR).16 Determining the mechanism of missingness is difficult, requiring assumptions based on reason, judgment, and prior knowledge. Imputing missing data is problematic for MNAR because the data to be imputed depend on unobserved data.15 Researchers and healthcare policymakers must be aware of these challenges and limitations to avoid drawing spurious and inaccurate conclusions.

Conclusion

We created NAMCShiny, an interactive web application for exploring trends in the reasons for physician visits based on the NAMCS data from 2003 to 2010. We believe that easy access to trend visualizations and ready-to-use datasets will aid the development of effective data-driven health policies. However, we caution that meaningful interpretation depends on understanding and controlling for challenges and limitations such as small sample sizes, non-random sampling, confounders, and biases. We envision a future where other datasets are released to the public with interactive browsers designed specifically for policymakers to inform their decisions. We believe that further development of bioinformatics tools will unlock the potential of big medical datasets to drastically improve the efficiency and efficacy of health policies with data-driven decision-making.

Supplementary Table 1 can be found in the full online version of this article.

Jean Fan is a graduate student at Harvard University in the Bioinformatics and Integrative Genomics program. She is interested in software development for big data analysis and visualization.

Kamil Slowikowski is a graduate student at Harvard University in the Bioinformatics and Integrative Genomics program. He develops computational and statistical methods for functional genomics.

1. Duhigg, C. (2012). How Companies Learn Your Secrets. New York Times, Sunday Magazine (MM3), shopping-habits.html
2. Gibbs, W. W. (2014). Medicine gets up close and personal. Nature, 506(7487), 144–5.
CDC (2010). 2010 NAMCS Micro-Data File Documentation, NCHS/dataset_documentation/namcs/doc2010.pdf
3. Sutton, M. Y., Sternberg, M., Zaidi, A., St Louis, M. E., & Markowitz, L. E. (2005). Trends in pelvic inflammatory disease hospital discharges and ambulatory visits, United States, 1985-2001. Sexually Transmitted Diseases, 32(12), 778–784.
4. Panackal, A. A., Halpern, E. F., & Watson, A. J. (2009). Cutaneous fungal infections in the United States: Analysis of the National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS), 1995-2004. Int J Dermatol, 48(7), 704–712. doi:10.1111/j.1365-4632.2009.04025.x
5. Trends in Systemic Psoriasis Treatment Therapies from 1993 Through 2010
6. Tai, M., & Merchant, R. C. (2013). HIV testing in US emergency departments, outpatient ambulatory medical departments, and physician offices, 1992-2010. AIDS Care, 1–4.
7. Hsiao, C. J., Hing, E., & Ashman, J. (2014). Trends in electronic health record system use among office-based physicians: United States, 2007-2012. National Health Statistics Reports, (75), 1–18.
8. Shaw, M., Davis, S. A., Feldman, S. R., & Fleischer, A. B. (2013). Decreasing frequency of office visits on Fridays. The Journal of Dermatological Treatment, 24(6), 405–7.
9. Weber, G. M., Mandl, K. D., & Kohane, I. S. (2014). Finding the missing link for big biomedical data. JAMA, 311, 2479–80.
10. Kayyali, B., Knott, D., & Kuiken, S. V. (2013). The Big-Data Revolution in US Health Care: Accelerating Value and Innovation. Chicago, IL: McKinsey & Co.
11. Roski, J., Bo-Linn, G. W., & Andrews, T. A. (2014). Creating value in health care through big data: opportunities and policy implications. Health Affairs (Project Hope), 33, 1115–22. doi:10.1377/hlthaff.2014.0147
12. Cottle, M. (n.d.). Transforming Health Care Through Big Data.
13. Streit, M., Lex, A., Gratzl, S., Partl, C., Schmalstieg, D., Pfister, H., et al. (2014). Guided visual exploration of genomic stratifications in cancer. Nat Methods [Internet]. Nature Publishing Group, 11(9), 884–5.
14.
15. Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A., & Rubin, D. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Texts in Statistical Science (Book 106).

Spring 2015 Volume 14, Issue 2



HHPR Spring 2015 Issue  

Big Data's Big Challenge
