
SMI 57

Swiss Medical Informatics

SGMI Schweizerische Gesellschaft für Medizinische Informatik

SSIM Société suisse d'informatique médicale Società svizzera d'informatica medicale

SSMI Swiss Society for Medical Informatics

Main topic (Schwerpunktthema / Thème principal): Medical Informatics 2005 in Geneva

SGMI/SSIM-News: Improving communication to members / Scientific Meeting in St. Gallen, May 2/3, 2006

Schwabe Verlag Basel


SMI 2005: Nº 57


Table of contents (Inhaltsverzeichnis / Table des matières)

Medical Informatics 2005 in Switzerland (R. Baud, C. Lovis)

Health On the Net’s initiative to improve online health information quality (C. Boyer, V. Baujard, A. Geissbuhler)

An interactive visualization system for large-scale telemedical disease management (D. Brodbeck, R. Gasser, M. Degen, J. Luthiger, S. Reichlin)

The digital pen and paper technology: implementation and use in an existing clinical information system (C. Despont-Gros, C. Bœuf, A. Geissbuhler, C. Lovis)

Integrated traceability in the hospital and home care settings (C. Hay, J. Bracken, J.-Y. Gerbet)

Information retrieval: proactive semantic search strategies (S. Hoelzer, R. K. Schweiger, J. Dudeck)

Automatic abnormal region detection in lung CT images for visual retrieval (H. Müller, S. Marquis, G. Cohen, P.-A. Poletti, C. Lovis, A. Geissbuhler)

Simplified representation of concepts and relations on screen (H. R. Straub, N. Frei, H. Mosimann, C. Perger, A. Ulrich)

Events

Imprint

Publisher
SGMI, Schweizerische Gesellschaft für Medizinische Informatik
Dählhölzliweg 3, Postfach 229, CH-3000 Bern 6
Tel. 031 350 44 99 / Fax 031 350 44 98
E-mail: admin@sgmi-ssim.ch
Internet: http://www.sgmi-ssim.ch

Board of the SGMI / SSIM
Martin Denz, Antoine Geissbühler, Felix Heer, Christian Lovis, Eusebio Passaretti, Benno Sauter, Judith Wagner, Ulrich Woermann

Editorial address
Christian Lovis, Service of Medical Informatics, University Hospitals of Geneva, CH-1211 Geneva 14
E-mail: christian.lovis@hcuge.ch

Advertising
Schwabe AG, Chantal Schneeberger, Frankfurtstrasse 14, Postfach 340, CH-4008 Basel
Tel. 061 333 11 07 / Fax 061 333 11 06
E-mail: c.schneeberger@schwabe.ch

Author guidelines
http://www.sgmi-ssim.ch

Publishing house
Schwabe AG, Steinentorstrasse 13, CH-4010 Basel
Contact at the publishing house: Natalie Marty
Tel. 061 467 85 55 / Fax 061 467 85 56
E-mail: n.marty@schwabe.ch

Editor-in-chief
Christian Lovis

Printing and distribution
Druckerei Schwabe AG, Farnsburgerstrasse 8, CH-4132 Muttenz
Tel. 061 467 85 85 / Fax 061 467 85 86
E-mail: druckerei@schwabe.ch

Editorial team
Rolf Grütter, Ulrich Woermann

Subscriptions
Schwabe AG, Verlagsauslieferung, Farnsburgerstrasse 8, CH-4132 Muttenz
Tel. 061 467 85 75 / Fax 061 467 85 76
E-mail: auslieferung@schwabe.ch
Subscription price: CHF 40.– (plus postage)
Single issue: CHF 15.– (plus postage)

ISSN 1660-0436
Published three times a year


MIE 2005 in Geneva


Medical Informatics 2005 in Switzerland

Robert Baud, EFMI President
Christian Lovis, SSIM-SGMI representative at EFMI

In August 2005, the major European scientific conference on medical informatics was held in Geneva for the first time. The SSIM-SGMI hosted the event, and the University Hospitals of Geneva were in charge of the local organization. This was the 19th edition of MIE, which had previously been held in numerous European cities. The event was a success for several reasons. First, attendance was high, with more than 550 participants. The number of participants from Switzerland was particularly appreciated: some 80 persons, the highest rate of all countries, which was not really expected from a small European country. Second, the Uni-Mail site of the University of Geneva was especially convenient, with large spaces and excellent facilities. Third, the scientific contributions were of excellent quality, including the tutorials, the workshops and the keynote speakers. Last but not least, the organizing team was extremely competent, mastering the numerous technical challenges of such a congress and preparing the accompanying events, such as a cruise on Lake Geneva.

Correspondence: Pr Christian Lovis, MD Service of Clinical Informatics Unit of Clinical Informatics University Hospitals of Geneva 24, rue Micheli-du-Crest CH-1211 Geneva 14 christian.lovis@hcuge.ch

Swiss participation was also impressive in terms of the scientific papers submitted to the conference. The acceptance rate for papers was fixed at 60%, and several authors in this country met this threshold. This means that the SSIM-SGMI was both the host of the congress and an active scientific participant. For this reason, the present issue of SMI is entirely devoted to papers submitted to MIE 2005. Seven papers have been selected as representative of ongoing work in medical informatics in Switzerland. It is worth noting that all seven papers went through a double review process: first, they were among the Swiss papers accepted at the MIE conference; second, they were selected from among those by the editorial board of SMI for publication in this issue. It is important for our country to be present at future scientific events, such as MIE 2006 in Maastricht, which will concentrate on medical informatics for aging societies, or the special topic spring conference STC 2006 in Timisoara, Romania, devoted to the integration of medical information.

The search for new solutions involving informatics applied to health care is a universal problem. Its dimension and complexity put it out of reach of local or individual initiatives. Only the combined efforts of several teams, exchanging information across borders in Europe, can bring substantial benefits in the long term. We have to remember that, as SSIM-SGMI members, we are automatically members of EFMI.

Dear organizers of MIE 2005, dear members of the SSMI,

I want to thank Professor Antoine Geissbühler and all members of the organizing team in Geneva for the splendid work they achieved! This major event should remind us that scientific activities in medical informatics are crucial as a basis for resolving real problems in healthcare. It is also the mission of the SSMI to transfer this knowledge into practice. Switzerland, at the heart of Europe, should realize that we are able to contribute at an international level to the development of healthcare. It would thus be only a small step to also apply the results of MIE 2005 to Swiss healthcare. Shouldn’t we?

Many thanks again and best regards,
Martin Denz, President of the SSMI



Health On the Net’s initiative to improve online health information quality

Celia Boyer (a), Vincent Baujard (a), Antoine Geissbuhler (a, b)

(a) Health On the Net Foundation, Switzerland
(b) Geneva University Hospital, Switzerland

Summary

Created in 1995 in response to consumer enthusiasm for the World Wide Web, Health On the Net Foundation (HON) has developed solutions to address the problem of potentially dangerous online health and medical information. Then as now, no international legal framework regulated online content, and consumers needed to be given the means to check the reliability and relevance of health information. HON was the first to introduce a code of conduct for online health and medical publishers, the HONcode, which was readily adopted by webmasters aware of the need for credibility in the new competitive online space. HON went on to develop Web applications to enhance access to reliable information, making use of innovative NLP-based and semantic search technologies. This paper describes the implementation of a quality standard for online information (HONcode) based on the information production process, its evolution over nearly 10 years of operation, its challenges and its results. Today, over 4700 HON-accredited websites, respecting minimum standards for disclosure and responsibility in online medical publishing, constitute the largest voluntary accreditation network on the Web.

Background

Correspondence: Health On the Net Foundation 24 rue Micheli-du-crest CH-1211 Geneva 14 Celia.Boyer@healthonnet.org

Health On the Net Foundation was created in 1995 by Prof. Jean-Raoul Scherrer, former director of the Geneva University Hospital; Donald Lindberg, Director of the U.S. National Library of Medicine; Michel Carpentier, former director of the European Commission DGXIII; and Guy-Olivier Segond, former Director-General of the Geneva Ministry of Health. They foresaw that consumers, newly empowered to research their own medical conditions, could easily fall prey to misleading advice [1, 5]. New information technologies have given online publishers the means to replicate and re-use digital content, and the public can now access previously hard-to-find material. With no international legal framework to regulate online content, consumers urgently needed to be given the means to assess the reliability and relevance of health information, and to be provided with enhanced access to information of the highest quality. Persistent medical controversies meant that multiple information sources needed to be considered, weighed and analysed. Health On the Net Foundation was created to fill this role. Its stated mission is to “guide the growing online community of healthcare consumers and information providers to sound, reliable medical information and expertise”.

Over a ten-year period, HON has developed strategies, addressed to content creators, to improve information quality, and technologies, intended for the general public, to enhance access to reliable information. The HONcode, discussed in the following section, is a strategic, human, pedagogic and consumer-oriented effort aimed at two target audiences: web publishers and the general public. HON’s MedHunt search engine, HONselect directory and WRAPIN next-generation search tool complement the human-based ethical approach with technological means [6–9].

Objectives and solutions set up by Health On the Net Foundation

Establishing trust is crucial in any relationship between a patient and a provider of health services. This applies not only to doctors and other caregivers, but also to information providers. Requiring transparency at the operational level of providing online health information implies an awareness of responsibility on the part of the information provider. To achieve this, HON first needed to reach a consensus among information providers, medical experts and consumers in order to define criteria for trustworthy health information websites. The HONcode, developed in 1996, was the first code of conduct for online medical and health information providers; its voluntary accreditation programme sets out eight criteria that any well-intentioned webmaster can follow. The accreditation process was designed not only to protect the public, but also to educate the webmaster and improve online information quality by requiring the addition of meta-information, including complete citations and revision dates, the labelling of advertising or commercially oriented content, and the declaration of the web site’s funding sources.

Text Box 1. The HONcode principles

The HONcode is now translated into 29 languages. It obliges web sites to respect and disclose the following information:
– Authority: Is medical advice given by a medical professional?
– Complementarity: Information should support, not replace, the doctor–patient relationship
– Privacy: What use is made of personal data collected through the web site?
– Attribution: Refer accurately to source information
– Justifiability: Site must back up claims about benefits and performance
– Transparency: Accessible presentation, identities of editor and webmaster, email contact
– Financial disclosure: Identify sources of funding
– Sponsorship: Clearly distinguish advertising from editorial content

HONcode

The HON Code of Conduct consists of elementary rules that any conscientious webmaster can easily adopt [Text Box 1]. It is a voluntary accreditation system based on an “active seal” concept. Site administrators must take the initiative to apply for HONcode accreditation. HON found that some 90% of sites did not fully or correctly apply the Code of Conduct. To promote accountability and combat fraudulent use of the HONcode logo, a unique certificate was assigned to each accredited site; it appears when a user clicks on the accreditation seal.

Review process

Each request for accreditation is examined by a member of the HONcode review team. HON accredits a site only when it complies with all eight HONcode ethical principles, as set out in the HONcode accreditation guidelines [10], and also demonstrates how each principle is implemented. A site found to respect the eight HONcode principles is given a unique ‘active seal’ to place on its pages. The presence of the distinctive blue and red HONcode seal on subscribing sites helps users identify sources of reliable information. The term ‘active seal’ refers to the link that webmasters are required to maintain from the HONcode seal to a certificate, residing on the HON site, that attests to their accreditation status. Tampering with the seal is prohibited by the HONcode agreement. Visitors are encouraged to ‘Click to verify’ the site’s HONcode status, which can be revoked at any time.

The HONcode accreditation process is strengthened by regular monitoring. An accredited site receives a periodic check-up visit, beginning one year after initial accreditation, or following a complaint or a technical malfunction detected by our monitoring services. The HONcode has undergone one revision since its introduction in 1996 [11], and an elaborate system, dubbed HONuser, has been built to support the reviewing process. This system provides a workflow for administrators and manages the many interactions among teams of scattered reviewers and webmasters.
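The accreditation rule and the ‘Click to verify’ check described above can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions: the `Certificate` record, the in-memory `registry`, the identifier `HON-0001` and both function names are hypothetical and are not taken from HON’s actual systems.

```python
from dataclasses import dataclass

# The eight HONcode principles as named in Text Box 1.
PRINCIPLES = (
    "Authority", "Complementarity", "Privacy", "Attribution",
    "Justifiability", "Transparency", "Financial disclosure", "Sponsorship",
)


@dataclass
class Certificate:
    """Hypothetical certificate record kept on the accrediting body's server."""
    cert_id: str
    site: str
    revoked: bool = False


def is_accreditable(review: dict) -> bool:
    """A site qualifies only if every principle is respected AND demonstrated.

    `review` maps a principle name to a (complies, demonstrated) pair.
    A principle missing from the review counts as not satisfied.
    """
    return all(all(review.get(p, (False, False))) for p in PRINCIPLES)


def verify_seal(registry: dict, cert_id: str, site: str) -> bool:
    """Simulate 'Click to verify': the seal is valid only if the certificate
    exists, was issued to this site, and has not been revoked."""
    cert = registry.get(cert_id)
    return cert is not None and cert.site == site and not cert.revoked


# Illustrative data (invented): a site whose review passes on all principles.
review = {p: (True, True) for p in PRINCIPLES}
registry = {}
if is_accreditable(review):
    registry["HON-0001"] = Certificate("HON-0001", "example.org")
```

Keeping the certificate on the accrediting body’s server, rather than on the accredited site, is what allows the status to be revoked without the site’s cooperation and makes a copied logo on another site fail the check.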

Regionalisation

To extend HONcode accreditation more effectively to websites targeting regional and, in particular, non-English-language audiences, HON is extending its reviewing activities in European countries, North America and Africa. In addition to linguistic support, this initiative aims to preserve cultural diversity. HON has already formed a reviewing team based in Valencia, Spain [12], and new teams are being set up in Mali and Romania to carry out reviewing and to promote online health information quality to content creators and the public. HON’s regional initiatives are a necessary response to the growth of Internet use in the target regions, allowing HON to remain sensitive to questions of health and culture and to conform more easily to differences in legal requirements. The question of financial support is not yet resolved. We believe that the new regional activities should be supported by local and regional authorities.

Search engine and other technologies

Beginning in 1996, HON opted for a restricted-domain approach, searching only medical resources [13]. The next-generation search tool, WRAPIN (Worldwide online Reliable Advice to Patients and Individuals), is a set of technologies developed within the framework of a two-year EU project (IST-2001-33260) [8, 9]. WRAPIN was developed to identify relevant documents and make assertions as to their trustworthiness. The system, powered by Natural Language Processing, is able to identify the main scientific concepts contained in a text, and performs ‘background checks’ on a document to ascertain the origins and relations of the ideas it contains. The result is a semi-automatic editorial policy engine which aims to enhance access to the knowledge contained within a collection of scientific documents.

Results

When initially applying for HONcode accreditation, not all web sites comply with all eight HONcode principles. Currently, 78% of web sites applying for HONcode accreditation respect Principle 1 (Authority), which requires that the author or editor of the information be named. In the study by Hernández Borges [14], which evaluated a sample of 159 paediatrics web sites, this figure was 65%. Some 76% of web sites with health information clearly stated the goal of the site and its target audiences (Principle 2: Complementarity). The sources of medical information are clearly established in 81% of the sites. Often, the date shown on a web site is not the date of last modification but is set to display the current date automatically; a credible date of last modification is stated in only 70% of sites (Principle 4: Attribution). Very few web sites (less than 6%) made claims about a specific treatment (Principle 5: Justifiability). Over the years, HON has noted a significant change in the provision of a confidentiality policy: 65% of the web sites have one (Principle 3: Confidentiality), as opposed to 24% in the previous study. Some 99% of sites provided a means of contacting their webmasters (Principle 6: Transparency of Authorship). Finally, 52% of webmasters clearly stated their sources of financial support (Principle 7: Transparency of sponsorship). Only 41% declared that they host advertising on their site (Principle 8: Honesty in Advertising and Editorial Policy). This corresponds to a mean rate of compliance with the HON principles of 73.6%. Of the applicants, 7.5% were rejected because of inappropriate content or an inability to comply with the HONcode. At the initial application, despite the voluntary nature of the accreditation process, only 23% of sites were fully compliant with the HONcode. However, following communications between the webmaster and the HONcode staff, 99% of sites that decided to continue the accreditation process were brought into compliance with the eight HONcode principles.

According to the ‘Web Impact Factor’ of Alta Vista, over 800,000 quality web pages in 72 countries link to the HON web site. HONcode-accredited web sites are categorized under more than 33,000 subject headings, covering both the most common and rare diseases. To reach out to the public, HON released a multifunctional browser toolbar (a plug-in named the HON Toolbar) [15] which, when activated, automatically displays a site’s HONcode status. The toolbar was implemented to help Web surfers access trustworthy web sites. It is used by more than 20,000 persons each day.

Discussion and conclusion

In support of the HONcode initiative, the European Commission released in 2001 its Quality Criteria for Health related Websites, recommending that member states respect elementary rules [16]. This recommendation was developed in collaboration with HON. As a neutral, standards-setting body, HON retains the confidence of all concerned parties. HON is free of commercial influence and is perceived as such by the public and the industry. HONcode accreditation is free of charge. The introduction of the HONcode in 1996 was a milestone for online health information, as evidenced by numerous references to the HONcode in the health informatics literature. Scientific studies have often shown the HONcode to be a major indicator of content accuracy [17–21].

The need for online information quality standards arose concurrently with the popularity of the World Wide Web. Ten years on, substantive discussion of Internet governance is just beginning. The approach pioneered by HON has received wide support from government, industry, and citizen and patient groups. HON won the Best Content in e-Health category of the World Summit Awards, awarded in December 2003 for a comprehensive overview of best practices in e-content and creativity, within the framework of the World Summit on the Information Society (WSIS). In May 2004, HON was named winner of the eEurope Award for eHealth in the category “eHealth Information tools and services for citizens” at the EU High-Level Conference on eHealth held in Cork, Ireland [22].

Acknowledgments

Health On the Net Foundation is a Non-Governmental Organization under the aegis of the Direction générale de la santé, Département de l’Action Sociale et de Santé (DASS – République et canton de Genève, Switzerland). The WRAPIN (Worldwide online Reliable Advice to Patients and Individuals) project, IST-2001-33260, is supported by the European Commission and the “Office Fédéral de l’Education et la Science” (OFES, Switzerland). The authors wish to thank all HONcode members, collaborators and friends for their valuable contributions to the development and achievements of HON. HON servers are powered by Sun Microsystems.

References

1 Impicciatore P, Pandolfini C, Casella N, Bonati M. Reliability of health information for the public on the world wide web: systematic survey of advice on managing fever in children at home. BMJ 1997;314:1875–9.
2 HONcode principles. In: Health On the Net Foundation. Retrieved Jan 21 2004, from http://www.hon.ch/HONcode/
3 Selby M, Boyer C, Jenefski DA, Appel RD. Health On the Net Foundation Code of Conduct for Medical and Health Websites. MEDNET96 – European Congress on the Internet in Medicine, Brighton, U.K., Oct. 14 to 17, 1996.
4 Boyer C, Selby M, Appel RD. The Health On the Net code of conduct for medical and health web sites. MEDINFO 98, Seoul, 9th World Congress on Medical Informatics, vol 2, 1163–1166, August 98.
5 Pandolfini C, Bonati M. Follow up of quality of public oriented health information on the world wide web: systematic re-evaluation. BMJ 2002;324:582–3.
6 Baujard O, Baujard V, Aurel S, Boyer C, Appel RD. A multi-agent softbot to retrieve medical information on Internet. MEDINFO 98, Seoul, 9th World Congress on Medical Informatics, vol 1, 150–154, August 98.
7 Boyer C, Baujard V, Scherrer JR. HONselect: Multilingual Assistant Search Engine Operated by a Concept-based Interface System to Decentralized Heterogeneous Sources. Medinfo, 2001.
8 Gaudinat A, Joubert M, Aymard S, Falco L, Boyer C, Fieschi M. WRAPIN: New Generation Health Search Engine Using UMLS Knowledge Sources for MeSH Term Extraction from Health Documentation. Medinfo 2004;2004:356–60.
9 WRAPIN – Worldwide online Reliable Advice to Patients and Individuals. European Project IST-2001-33260. Retrieved Jan 21 2004, from http://www.wrapin.org
10 Accreditation Guidelines for the HON Code of Conduct (HONcode). Accessed Jan. 21 2005, from http://www.hon.ch/HONcode/Guidelines/guidelines.html
11 HONcode first version in 1996. Retrieved Jan. 21 2005, from http://www.hon.ch/HONcode/ConductVs1_5.html
12 García S, Montesinos E, Baujard V, Boyer C (Health On the Net Foundation, Geneva, Switzerland). A descriptive analysis of strategies in Spain for internet biomedical content evaluation. MedNet 2003, 3–7 December 2003.
13 Ahmann E. Supporting families’ savvy use of the Internet for health research. Pediatr Nurs 2000 Jul–Aug;26(4):419–23.
14 Hernández Borges AA, Macías Cervi P, Torres Álvarez de Arcaya ML, Gaspar Guardado MA, Ruíz Rabaza A, Jiménez Sosa A. Rate of compliance with the HON code of conduct versus number of inbound links as quality markers of pediatric web sites. Proceedings of the 6th world congress on the internet in medicine, Udine, Italy, Nov 29–2 Dec 2001. http://mednet2001.drmm.uniud.it/proceedings/paper.php?id=75 (accessed 2005 Jan 21).
15 HONcode tool bar: Health Information you can trust (Jan 2004). In: Health On the Net Foundation. Accessed Jan 21 2004, from http://www.hon.ch/HONcode/Plugin/Plugins.html
16 European Union recommendation: Quality Criteria for Health related Websites. http://www.hon.ch/HONcode/HON_CCE_intro.htm
17 Wilson P. How to find the good and avoid the bad or ugly: a short guide to tools for rating quality of health information on the internet. BMJ 2002;324:598–602.
18 Kusec S, Brborovic O, Schillinger D. Diabetes websites accredited by the Health On the Net Foundation Code of Conduct: readable or not? Stud Health Technol Inform 2003;95:655–60.
19 Luo W, Najdaqi M. Trust-building Measures: A Review of Consumer Health Web sites. Web sites employ a medley of trust-building approaches. But does a definitive formula exist for winning consumer trust? Comm ACM 2004;47:109–3.
20 Galloway L, Chase-Iverson N, Wiley G, Vander Sluis G. Search Engine Retrieval Effectiveness For Medical Information Queries. IST 637 Research Project, Group Elmo (March 2003, http://web.syr.edu/~gevander/). “In order to determine authoritativeness of the medical information returned by each search engine, adherence to the HON (Health on the Net) Code was used as a measure.”
21 Fallis D, Fricke M. Indicators of accuracy of consumer health information on the Internet: a study of indicators relating to information for managing fever in children in the home. JAMIA 2002;9:73–9.
22 2004 eEurope Award for eHealth. http://www.hon.ch/Global/eHealth2004/winner.html




An interactive visualization system for large-scale telemedical disease management

Dominique Brodbeck (a), Roland Gasser (b), Markus Degen (a), Jürg Luthiger (a), Serge Reichlin (b, c)

(a) University of Applied Sciences Northwestern Switzerland, Olten, Switzerland
(b) Medgate Telemedical Center, Basel, Switzerland
(c) University Hospital Basel, Department of Internal Medicine, Basel, Switzerland

Summary

Automated collection and storage of medical data leads to large amounts of heterogeneous and time-dependent information. This raises the problem of how to access and interpret the data in order to support therapeutic decision making. Telemedical disease management holds great potential for the efficient and effective treatment of chronic diseases. Realizing this potential, however, depends on finding a solution to the information overload problem. This paper describes a highly interactive visualization system that gives caregivers an overview of trends and critical patterns, and provides easy access to details without losing the big picture. We report on the results from two case studies that confirm the validity of this approach and suggest that it is well suited to enable large-scale telemedical disease management programs.

Key words: disease management; telemedicine; decision support systems, clinical; user-computer interface

Introduction

Correspondence: Dr. Dominique Brodbeck University of Applied Sciences Northwestern Switzerland Riggenbachstrasse 16 CH-4600 Olten dominique.brodbeck@fhso.ch

Advances in information technology have greatly improved our ability to collect and manage large amounts of data. Medical data can be brought together from many sources and stored in central databases. The fact that data is available, however, does not automatically imply that it is also usable. Standard database interfaces, for instance, employ a “drill-down” style of navigation and data access that typically shows only isolated aspects of the data and makes it difficult to get the big picture. While there are many initiatives concerned with the standardisation, security and reliability of medical data, we find few efforts that support the decision- and sense-making part of the process.

In this paper we look at telemedical disease management as an example of this situation. Disease management is an approach to the treatment of chronically ill patients, consisting of a disease-specific system of decision-making algorithms, coordinated healthcare interventions and communications for populations with conditions in which patient self-care efforts are significantly involved. Applying telemedical techniques to disease management has the potential to help transfer the knowledge and practice of positive health behaviour to patients, and to make the management of their condition more effective and more economical. Through telemedical monitoring, medical data measured in an everyday context is easily available and can be used to support therapeutic decision making.

With this goal in mind, we built a telemedical disease management system that allows not only monitoring of a patient’s condition, i.e. data transfer from the patient to a centre of competence, but also supports therapeutic decision making and intervention. In order to scale up such programs to large numbers of participants, our system replaces the resource-intensive 1-to-1 relationships with 1-to-n relationships between caregivers and dynamically changing groups of patients. This approach makes it possible to reduce the number of relationships while still providing the necessary personalized care.

The caregivers who use this system need support with the following tasks:
– track system use and patients’ task compliance;
– discover trends, critical incidents, or cause-effect relationships;
– define target groups for new therapeutic interventions;
– easily access detailed information.

In order to make it possible for caregivers to oversee and control this large flow of data, and eventually turn it into actionable information, they need tools that go beyond traditional “query and list”-based database interfaces. In the following sections, we describe an interactive visualization application that we built in order to enable large-scale telemedical disease management programs.
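As an illustration of the 1-to-n arrangement mentioned in this introduction, the sketch below models peer groups as time-bounded sets of patients, so that one caregiver action can address a whole group while each patient’s membership history remains queryable. The paper does not describe its data model; the class, field and function names, as well as the example groups, are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class PeerGroup:
    """Hypothetical peer group formed for a common intervention."""
    name: str
    start: date
    end: Optional[date] = None            # None means the group is still active
    patient_ids: set = field(default_factory=set)

    def active_on(self, day: date) -> bool:
        """True if the group's intervention period covers the given day."""
        return self.start <= day and (self.end is None or day <= self.end)


def groups_of(patient_id: str, groups: list) -> list:
    """All groups a patient is or was part of, in the order given."""
    return [g for g in groups if patient_id in g.patient_ids]


def patients_reached(groups: list, day: date) -> set:
    """Patients addressed by at least one group active on the given day."""
    reached = set()
    for g in groups:
        if g.active_on(day):
            reached |= g.patient_ids
    return reached


# Invented example data: two overlapping groups sharing patient "p2".
groups = [
    PeerGroup("inhaler training", date(2005, 1, 10), date(2005, 2, 10), {"p1", "p2"}),
    PeerGroup("exercise coaching", date(2005, 2, 1), None, {"p2", "p3"}),
]
```

With this shape, a caregiver manages two group relationships rather than three individual ones, and group membership can change over time without losing the record of past interventions.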

Methods Our application consists of three main views (figure 1) that show the different aspects of this complex data set. The size of each view can be adjusted continuously to match the task. On the left side there are two views that show the data as time lines. The goal of these views is to present a single overview of the whole program in a graphical way, while allowing quick access to details of interest. Time lines are an intuitive way to show the data for the current state of the patients in the program in relationship to past developments and the interventions that were administered. The upper view shows the duration and size (number of patients mapped to thickness of the line) of the patient peer groups formed for the purpose of common interventions. Clicking on one of the groups highlights the participating patients in the lower view, and vice versa. Selecting the patient in the centre of the lower view in figure 1, for example, highlights in the upper view the two groups that he is or was part of. The lower view shows one time line for each patient in the program. One of the attributes can be chosen to be shown in this overview. In figure 1, for example, the symptom of coughing (3 states: no, medium, maximum) was chosen as the overview attribute. An automatic level-of-detail mechanism adapts the graphical representation for this overview to the available screen space. Even in a highly condensed case (e.g. number of patients >100), individual patients can still be selected, and the frequency of the observations is still readily visible and can reveal important patterns and clues to the caregiver to guide further analysis.

Figure 1. Overview of the caregiver application interface.

In order to support therapeutic decision making, indicators can be defined and combined with the patient time lines. Indicators are algorithms that operate on any combination of the available attributes. They are shown as red sections in the background of the individual time lines (red shows as dark areas in figure 1). In figure 1, for example, a simple threshold indicator is used to highlight areas where the value of the Fev1% attribute becomes critical.

Horizontally, selected areas can be enlarged in time by sliding a “magnifying lens” along the time axis. The region inside the lens is enlarged, while the two regions on the outside are compressed accordingly (figure 2). This provides a very efficient and intuitive mechanism to navigate and find details on a large time-scale, without losing the orientation or overview. Vertically, patient details can be enlarged by double-clicking on a patient’s time line in the overview. The chosen time line dynamically expands to reveal full details about selected attributes. The other time lines are compressed accordingly to make space for the expanded panel, but remain visible at all times (figure 3).

Figure 2. Details are embedded in the overview through distortion-oriented techniques.

Figure 3. Filters can be used to create new peer groups for interventions.
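The “magnifying lens” on the time axis can be illustrated with a piecewise-linear mapping from time to screen position. The following is a minimal sketch, not the authors' implementation: the function name `lens_transform`, its parameters and the rule that the two outer regions share the remaining space uniformly are all assumptions made for illustration.

```python
def lens_transform(t, t_start, t_end, lens_start, lens_end, zoom, width):
    """Map a time t to a horizontal screen position in [0, width].

    The interval [lens_start, lens_end] is enlarged by the factor `zoom`;
    the regions outside the lens are compressed uniformly so that the whole
    time axis stays visible. Assumes t_start <= lens_start < lens_end <= t_end.
    """
    total = t_end - t_start
    lens_len = lens_end - lens_start
    outside_len = total - lens_len
    # Screen space given to the lens, capped so it never exceeds the view.
    lens_px = width * min(zoom * lens_len / total, 1.0)
    lens_scale = lens_px / lens_len
    outside_scale = (width - lens_px) / outside_len if outside_len > 0 else 0.0

    if t <= lens_start:
        # Compressed region to the left of the lens.
        return (t - t_start) * outside_scale
    left_px = (lens_start - t_start) * outside_scale
    if t <= lens_end:
        # Magnified region inside the lens.
        return left_px + (t - lens_start) * lens_scale
    # Compressed region to the right of the lens.
    return left_px + lens_px + (t - lens_end) * outside_scale
```

With a 100-day axis drawn on 1000 pixels and a lens over days 40 to 60 at zoom factor 2, the lens occupies 400 pixels instead of 200, and the remaining 80 days share the other 600 pixels. Because the mapping is continuous and monotonic, the order of events is preserved and nothing scrolls out of view.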

Finally, along the right-hand side of the interface there is space for various additional views. The filter view, for instance, allows the definition of filtering criteria for any attribute or combination of attributes through the use of dynamic queries, a graphical way to formulate complex queries intuitively. Patients whose attributes fall outside the criteria are either removed from the display or greyed out (figure 3). This mechanism can be used by the caregivers to identify target groups for common interventions.
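The filtering step behind such a dynamic query can be sketched as follows. This is an illustrative sketch only: the rule format (a range tuple or a set of allowed values), the function names and all patient values are hypothetical and do not come from the system described here.

```python
def matches(value, rule):
    """A rule is either a (low, high) range or a set of allowed values."""
    if value is None:
        return False
    if isinstance(rule, tuple):
        low, high = rule
        return low <= value <= high
    return value in rule


def apply_filters(patients, criteria):
    """Split patient ids into those meeting every criterion and the rest."""
    selected, greyed_out = [], []
    for patient in patients:
        ok = all(matches(patient.get(attr), rule)
                 for attr, rule in criteria.items())
        (selected if ok else greyed_out).append(patient["id"])
    return selected, greyed_out


# Hypothetical patients and criteria, invented for this example.
patients = [
    {"id": "057", "age": 14, "bmi": 31.5, "sport": "swimming"},
    {"id": "012", "age": 17, "bmi": 27.0, "sport": "swimming"},
    {"id": "033", "age": 13, "bmi": 33.2, "sport": "football"},
]
criteria = {"age": (12, 15), "bmi": (30.0, 45.0), "sport": {"swimming"}}
selected, greyed_out = apply_filters(patients, criteria)
```

Returning both lists, rather than only the matches, mirrors the choice between removing non-matching patients and merely greying them out in the overview.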

Results We tested the application with two different case studies. The first case is data from a telemedical disease management program for patients with obstructive pulmonary disease (i.e. asthma and chronic obstructive pulmonary disease). In this program, chronically ill patients receive specific interventions based on daily home monitoring of symptoms and self-measurement of pulmonary function. The aim is early detection of deterioration in the status of the disease. Here, the visualisation concept is used in a retrospective way to analyze the course of these patients in order to help improve diagnostic and therapeutic algorithms in the future. In the second case we used simulated data for a future program for the treatment and counselling of a large group of obese teenagers. We created data sets for 100 fictitious patients over a period of 18 months. This artificial data includes repeated measurements of activity, body-mass-index (BMI) and well-being. The purpose of this simulation is to test the scalability of our system, and to get a feeling for the kind of data that we expect in future large-scale programs.

Disease management program for obstructive pulmonary disease

In figure 1 we focused on patient 0168 in the centre, who shows an interesting pattern in symptoms and three conspicuous periods where the indicator signals that the Fev1% attribute (a measure of pulmonary obstruction) fell below a threshold of 60. We looked at this patient in more detail (figure 3) and found consecutive rises in subjective symptoms in parallel with a decrease in pulmonary function, as measured by the Fev1% attribute. Patient 0168 received outpatient treatment (no hospitalisation); the symptoms subsided and the lung function improved within a few days, and then stabilised after the third episode.

Simulated disease management program for obese teenagers

In the fictitious scenario shown in figure 3, the caregiver uses the filters on the right to look for young patients with high BMI and a preference for swimming. All other patients are therefore greyed out in the overview. The caregiver then goes through all of the remaining patients, looking at their BMI trends, as indicated by the background colouring: green (light grey) for falling BMI, orange for stable, red (dark) for rising. Patient 057 shows signs of a typical ‘yo-yo effect’. The caregiver decides to start a new intervention group including this particular patient. The goal of the intervention is to motivate the patients to swim more often. In addition, the caregiver plans to call patient 057 to find out more about his condition.
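The two kinds of indicator used in these case studies, a threshold on Fev1% and a BMI trend, can be sketched as small functions over a series of measurements. This is a hedged illustration: the function names, the `tolerance` value and the first-to-last comparison used for the trend are assumptions, since the text does not specify how the indicators are computed.

```python
def threshold_sections(series, threshold=60.0):
    """Return inclusive (start, end) index pairs where values are below threshold."""
    sections, start = [], None
    for i, value in enumerate(series):
        if value < threshold and start is None:
            start = i                      # a critical section begins
        elif value >= threshold and start is not None:
            sections.append((start, i - 1))  # the section ended at the previous sample
            start = None
    if start is not None:                  # series ends inside a critical section
        sections.append((start, len(series) - 1))
    return sections


def bmi_trend(series, tolerance=0.5):
    """Classify a BMI series as 'falling', 'stable' or 'rising'.

    Compares the last value with the first one; `tolerance` is a hypothetical
    dead band, not a value taken from the article.
    """
    delta = series[-1] - series[0]
    if delta < -tolerance:
        return "falling"
    if delta > tolerance:
        return "rising"
    return "stable"
```

Each section returned by `threshold_sections` corresponds to one highlighted background area on a patient's time line. A yo-yo pattern would show up as alternating "falling" and "rising" results when `bmi_trend` is applied to consecutive time windows.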

Discussion The domain experts who worked with the data of the pulmonary patients had been using Excel to sift through the large tables of data. When they used our system, they were quickly able to gain further insights and discover anomalies that they had not been aware of before, either because the data could not be represented as well with their current tools, or simply because they had not looked further for lack of convenient data access. The main benefits that were reported were the overview and the ease of access to details without losing the big picture. The simulated data served as a proof of concept to investigate the scalability of our system. The data that was used contained 100 patients with over 60,000 observations. It convinced us that the approach will scale to an even larger number of patients with many different attributes. The next step is to use our system in a prospective way in a live telemedical disease management setting. Further user studies are needed to tune the system to these tasks. Finally, we would like to point out that the interface makes extensive use of colour, animation and interactivity, all aspects that are impossible to show in a static set of black and white pictures, yet are essential to our approach. We nevertheless trust that the esteemed readers have enough imagination to understand the main points.

Conclusion We developed a highly interactive visualization system to solve the information overload and data access problems inherent in telemedical disease management programs and other forms of large-scale patient data collections. The system consists of multiple coordinated views that show an overview of the data in the form of parallel time lines, and uses direct manipulation and distortion-oriented visualization techniques to provide quick access to details. We applied this system to two case studies. The first consisted of real field data from a telemedical program and showed the potential of the system for decision support and sense making. The second study used simulated data to confirm the scalability of our approach.

Acknowledgments We thank the Swiss Innovation Promotion Agency CTI for providing the funding.



The digital pen and paper technology: implementation and use in an existing clinical information system

Christelle Despont-Gros, Christophe Bœuf, Antoine Geissbuhler, Christian Lovis
University Hospitals of Geneva, Service of Medical Informatics, Switzerland

Summary
Objective: Evaluation of the technical feasibility of tight integration of the digital pen and paper technology in an existing computerized patient record.
Technology: The digital pen is a normal pen able to record all actions of the user and to analyse a micro pattern printed on the paper. The digital paper is normal paper printed with an almost invisible micro pattern of small dots encoding information such as position and identifiers. We report our experience in the implementation and use of this technology in an existing large clinical information system for acquiring clinical information.
Discussion: It is possible to print uniquely identified forms using the digital paper technology. These forms can be pre-filled with human-readable clinical information about the patient. When healthcare providers complete these forms using the digital pen, it is possible to acquire the data in a structured computerized patient record. The technology is easy to integrate in a component-based architecture based on Web Services.
Conclusion: The digital pen and paper is a cost-effective technology that can be integrated in an existing clinical information system and allows fast and easy bedside clinical information acquisition without the need for an expensive infrastructure based on traditional portable devices or wireless devices.
Key words: digital pen and paper; bedside clinical information acquisition; computerized patient record; human-machine interfaces

Correspondence: Christelle Despont-Gros, Hôpitaux Universitaires de Genève, Service d’Informatique Médicale (SIM), 24, rue Micheli-du-Crest, CH-1211 Genève 4, christelle.despont@sim.hcuge.ch

Introduction Access to clinical reference information at the point of care is a goal that is difficult to achieve due to the lack of truly portable devices. Many problems must be addressed when trying to tackle bedside data acquisition, such as global costs, wireless connections, robustness of devices, size of the screen, usability of the acquisition methods (touch pad, keyboard, touch screen, …) and cultural acceptance [1], amongst others. By far, the handwriting data acquisition paradigm remains the best suited to several clinical contexts, mostly because of the mobility of care providers [2]. The transfer of handwritten data into the computerized patient record (CPR) requires digitizing the paper. This operation can rarely be achieved in real time, and does not provide access to structured data. Currently, several mobile devices allowing bedside data acquisition are used in clinical settings [3]. They are usually based on PDAs or notebook technologies, including tablet PCs. However, these devices suffer from several shortcomings. The smallest devices are truly portable but have very small screens [4], and the larger devices are often heavy. Most of them have a short battery life, especially if connected through a wireless network. In addition, these devices are expensive, especially if used in large settings, and often come with serious maintenance problems, for both hardware and software. In the fall of 2003, the University Hospitals of Geneva (HUG) had the opportunity to evaluate, in real clinical situations, a beta pre-commercial release of a package including a digital pen developed by Logitech®, digital paper using a micro pattern of dots developed by Anoto®, and a form and pen management system, the Forms Automation System (FAS), developed by Hewlett Packard®. This technology was tested in two clinical settings with the objective of evaluating technical integrability, data acquisition reliability and user acceptance with regard to both technical aspects and human factors. The assessment of data acquisition reliability and user acceptance is outside the scope of this paper and is available separately [5]. The objective of this paper is to present our experience in implementing and integrating this new technology in our CPR.

Background The HUG is a consortium of primary, secondary and tertiary care facilities employing 5,000 care providers, with approximately 2,000 beds, and managing over 45,000 admissions and 450,000 outpatient encounters each year. The clinical information system (CIS) is a Java-based 3-tier architecture using event-driven processes and interoperability with Web Services. More than 20,000 patient records are opened every day in the CIS.

Clinical context: Post-natal care in Obstetric Anaesthesia (PNC)

Since July 2001, the anaesthetists evaluated anaesthetic complications and maternal satisfaction after labour analgesia in the labour room using a paper form. Data collection was performed in two parts corresponding respectively to one of the two columns of the form: a) data about the labour and the delivery, that is pre-printed on the form and comes from the CPR (fig. 1, Section B); b) data relative to the “post partum”, which is filled within the next 72 hours of follow up care using the form (fig. 1, Section A).

Figure 1. The PNC form.

Since July 2003, a web application has allowed acquisition of clinical information pertaining to labour and delivery (fig. 2, PFAnesthesio). These data were usually collected before and during labour. Normal PCs and wireless laptops were available. Within the next 72 hours, the form was printed with these data, and the second part about post-natal care was filled in during visits to mothers performed by an anaesthetist, sometimes scattered over several wards. The filled forms were scanned after discharge of the patient to allow data to be transferred to the CPR. When enough forms were entirely filled, an operator collected them and processed them by scanning with human-assisted optical character recognition. Only single-character fields, such as check-boxes, were reliably recognized. For ambiguous situations, the operator decided which value was correct.

The DPP technology

The DPP combines three main components: a) an HP colour LaserJet printer with specific drivers; b) an HP software package; c) a digital pen with specific firmware. At the time of the study, all components were in alpha or beta release and not commercially available.

Figure 2. Workflow of data acquisition before the DPP trial. “*” means that the request or action can be performed several times.

The form: printer drivers, document and the “digital” pattern. – When the form was printed using the dedicated driver, a layer with a faint pattern of black dots was also printed. This layer, using a technology developed by Anoto®, identified the function of the paper and encoded information such as a unique ID and the 2D position. It allowed the pen to record the cursive information and provided an unambiguous association between the pattern, that is the document identification, and the patient whose information had been printed on the document. In addition to the standard driver of the printer, a digital driver established a link with a Paper Lookup Server (PLS). This server, which has to be installed first, stored the clinical context and the distributed patterns, and ensured the link between clinical contexts and patterns. When a user requested a print of a digital form, the driver sent a request to get an instance of a pattern associated with the corresponding clinical form. The PLS stored a) the context received from the CIS, in our case a unique ID identifying the encounter; b) a unique form identifier; and c) the unique ID created for the new pattern to be printed with the form. This pattern was printed with the document and was recognized by the digital pen. The pattern was made of very small black dots, resulting in a slightly off-white colour. To help the pen’s camera discriminate between the pattern and the layout of the form, the colour black was reserved for the pattern. Therefore, the layout and any information intended for human reading had to be printed in another colour, generally blue, although a complete colour palette is provided by HP.

The HP software package. – In addition to the PLS, several components were required to allow a fine-tuned integration between the existing CPR and the DPP technology. The most important components were a) a plug-in added to Adobe Acrobat® to design forms; b) a toolbox that allowed the development of services and the transfer of structured data on the form to web services; and c) several management tools for users and administrators. For healthcare providers, the package included a tool for validating transferred data and for identifying users. The tools for administrators allowed linking a service with a form, registering and managing users and pens, and linking specific pens with users.
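The bookkeeping role of the Paper Lookup Server described above (context, form identifier and pattern ID stored together at print time, then resolved when pen data arrives) can be sketched as a small in-memory registry. This is a hypothetical illustration, not the HP product's API: the class and method names and the use of a simple counter for pattern IDs are assumptions.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class PatternRecord:
    pattern_id: int      # unique ID of the printed pattern instance
    form_id: str         # identifier of the form design
    encounter_id: str    # clinical context received from the CIS


class PaperLookupSketch:
    """Hypothetical stand-in for the lookup role of the PLS."""

    def __init__(self):
        self._next_id = count(1)   # illustrative ID generator
        self._records = {}

    def register_print(self, form_id, encounter_id):
        """Allocate a new pattern for one printed copy of a form."""
        record = PatternRecord(next(self._next_id), form_id, encounter_id)
        self._records[record.pattern_id] = record
        return record.pattern_id

    def resolve(self, pattern_id):
        """Return the form and clinical context behind a pattern read by a pen."""
        return self._records[pattern_id]
```

Allocating one pattern per printed copy, rather than per form design, is what makes every sheet uniquely attributable to an encounter, even when the same form is printed twice for the same patient.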
The plug-in added to Acrobat is used to design forms. For the form designer, the operation consists of drawing an area above each structured field of the form and defining its type, such as Boolean, free text, etc. A unique ID must be assigned to each area, which will be associated with the information recorded by the pens. The toolbox provided by HP allowed access to all information transmitted by pens, but neither processed the data nor established the link with the existing CPR. In order to get the correct data into its corresponding field in our CPR’s database, we had to develop an application service handler (ASH). This was done in Java, but that is not mandatory. The granularity of data recorded by pens allowed access to every single element corresponding to one sample (see next section), including the unique ID of the form, the coordinates of the pattern area defined before, timing in milliseconds, information from the pressure sensor and the ID of the pen used to fill the area. It was also possible to access consolidated data, where all single points are grouped into strokes, defined as a cursive path performed without pressure interruption. Strokes had a start time (pen down), an end time (pen up), and belonged to a field of the form. For all simple types, such as lists and checkboxes, the toolbox gave direct access to the value of the field. For text fields, using the SDK, pen data could easily be transformed into two picture formats: BMP and SVG (vector). We used both and stored them. The system could be linked to an Intelligent Character Recognition (ICR) system to recognize handwriting.

The Digital Pen. – The digital pen contained a standard ink cartridge, a camera, a communication unit, a pressure sensor, an image processing unit, a storage unit and a battery. The camera, placed under the ink cartridge, was able to record 50 frames per second. When the pressure sensor detected that the ink cartridge was in contact with the paper, the camera sampled the position of the pen on the paper using the pattern. Less than 2 square millimetres are needed for the pen to localize its position, whatever the entry place, direction or angle. The pen stored up to 40 handwritten pages between transfers, and one full battery charge allowed the writing of up to 25 full pages. An LED located on its side indicated battery charge and status. The pen was activated by its cap, which acts as a power switch. The pen was able to emit vibrations to provide feedback to users, for example when it was unable to recognize the pattern. Once the pen was docked in its USB cradle, the PLS was called with information on every pattern for which data had to be transmitted. The server retrieved the context, and a pointer to the ASH allowed the data to be correctly processed. Several pens and users can contribute to a given form, simultaneously or at different times. Data was merged when transferred, and at each transfer forms could be consolidated if needed.
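The consolidation of single samples into strokes, and the conversion of a stroke into vector form, can be sketched as follows. This is an illustrative sketch and not the HP SDK: the `Sample` fields, the convention that a pressure of 0 marks a lifted pen, and both function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t_ms: int        # timestamp in milliseconds
    x: float         # position on the pattern area
    y: float
    pressure: int    # 0 is assumed to mean that the pen tip is lifted


def group_into_strokes(samples):
    """Group consecutive samples with pressure into strokes (pen down to pen up)."""
    strokes, current = [], []
    for sample in samples:
        if sample.pressure > 0:
            current.append(sample)
        elif current:              # pressure interrupted: the stroke ends
            strokes.append(current)
            current = []
    if current:                    # last stroke was still open at end of data
        strokes.append(current)
    return strokes


def stroke_to_svg_path(stroke):
    """Build the `d` attribute of an SVG path from one stroke."""
    first, rest = stroke[0], stroke[1:]
    parts = [f"M {first.x:g} {first.y:g}"]
    parts += [f"L {s.x:g} {s.y:g}" for s in rest]
    return " ".join(parts)
```

The start and end times of a stroke are simply the timestamps of its first and last samples. Keeping the vector form alongside a bitmap, as the authors did, preserves the option of running handwriting recognition later.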


Details of the implementation of the DPP in the CIS

There were two important steps to implement the system: a) installation of all components, required only once, and b) development of the ASH for each form to be linked with existing databases. Before the DPP study, PDF forms including existing data were generated using XSL-FO (fig. 2). During the DPP study, the technology required the registration with the PLS of the PDF file generated with the plug-in. This file was stored on the server and linked with the corresponding ASH. Patient data then had to be merged with this file, using XFDF. The printer driver directly managed XFDF files and included data in the corresponding PDF descriptor file when printing.

Figure 3. Workflow of data acquisition during the DPP study. “*” means that the request or action can be performed several times.

Acquisition quality and satisfaction of users

The scan system was maintained during the study to compare the reliability of data acquisition (path not represented in fig. 3). The DPP technology proved to be as reliable as OCR using a professional scanner without human intervention. Acquisition errors only occurred for specific fields when the design of the form was badly adapted to the technology. Quality surveys as well as a complete user satisfaction study have been conducted [5]. The DPP appears to be a well-accepted technology.

Conclusion The DPP is a promising technology that proved to be easy to integrate with an existing CIS, using new technologies such as Java and Web Services. One major drawback of the technology is the need to print using colour printers, so that the camera of the pen can better discriminate between human-readable information and the pattern devoted to the DPP technology. Structured data originating from single-state fields, such as checkboxes and radio buttons, or from scales, can be stored directly in a relational database. Handwriting, for letters and numbers, must be processed with a third-party OCR or ICR system. The data acquisition reliability proved to be similar to that of a professional scanning system, with the great advantage of mobility and direct acquisition at the bedside. Healthcare providers have been enthusiastic about using this technology. Criticisms of the ergonomics of the pen are being addressed in new versions of the system.

Acknowledgments This work has been funded by the Swiss National Science Foundation 632-066041 and the Geneva University Hospital, PRD 03-I-05.

References
1 Kaplan B. Evaluating informatics applications – some alternative approaches: theory, social interactionism, and call for methodological pluralism. Int J Med Inf 2001;64:39–56.
2 Sicotte C, Denis JL, Lehoux P. The computer based patient record: a strategic issue in process innovation. J Med Syst 1998;22:431–43.
3 Wilcox RA, La Tella RR. The personal digital assistant, a new medical instrument for the exchange of clinical information at the point of care. Med J Aust 2001;175:659–62.
4 Lapinsky SE, Wax R, Showalter R, Martinez-Motta JC, Hallett D, Mehta S, et al. Prospective evaluation of an internet-linked handheld computer critical care knowledge access system. Crit Care 2004;8:R414–21.
5 Despont-Gros C, Landau R, Rutschmann O, Simon J, Lovis C. The digital pen and paper: evaluation and acceptance of a new data acquisition device in clinical settings. Methods of Information in Medicine 2005;44:359–68.



Integrated traceability in the hospital and home care settings

Christian Hay, Jim Bracken, Jean-Yves Gerbet

Summary The purpose of this article is to illustrate the need for interoperability in traceability tasks in the complex environment of healthcare. Traceability usually focuses on objects or items, and their spatial and sequential movement. Here we want to treat traceability as ultimately concerning the patient himself (1). Therefore, the traceability requirements include, in addition to the items and their sequential locations, the patient identified by the “hospital stay” or “patient episode”, that is the relationship between the patient and the healthcare provider. Patient episodes are already well identified, each healthcare provider using a proprietary key or reference, which may (or may not) be carried in a barcode or an RFID tag. As healthcare evolves towards integration, collaboration and networks, the multiplication of proprietary references requires an appropriate answer to avoid a single hospital stay being identified by several keys, each possibly bridged to one or more others. Adopting a common standard to identify patient episodes enables different IT systems to collect and group clinical information about a patient episode, because the uniqueness of the key and its standardised structure reduce links and administrative work, and enhance efficiency and patient safety (2).

Glossary
GTIN: Global Trade Item Number, a 14-digit numeric number identifying any object that can be traded, ordered, invoiced, etc.
GLN: Global Location Number, a 13-digit numeric number identifying any location or function.
GSRN: Global Service Relationship Number, an 18-digit numeric number identifying any relation between the user and a third party; this can be the patient in his relation with the healthcare system, the episode, the hospital staff, etc.
SSCC: Serial Shipping Container Code, an 18-digit numeric number identifying a logistic unit during its journey.

Contacts
Christian Hay, Medinorma LLC, 34-36 Grand Rue, CH-1180 Rolle, hay@medinorma.ch
Jim Bracken, GS1 Ireland, Dublin (Ireland), jim.bracken@gs1ie.org
Jean-Yves Gerbet, Head of the Patient Transport Service, CHU Dijon (France), jean-yves.gerbet@chu-dijon.fr

Purpose The purpose of this article is to illustrate the need for interoperability in traceability tasks in the complex environment of healthcare. Traceability usually focuses on objects or items, and their spatial and sequential movement. Here we want to treat traceability as ultimately concerning the patient himself [1]. Therefore the traceability requirements include, in addition to the items and their sequential locations, the patient identified by the “hospital stay” or “patient episode”, that is the relationship between the patient and the healthcare provider. Patient episodes are already well identified, each healthcare provider using a proprietary key or reference, which may (or may not) be carried in a barcode or an RFID tag. As healthcare evolves towards integration, collaboration and networks, the multiplication of proprietary references requires an appropriate answer to avoid a single hospital stay being identified by several keys, each possibly bridged to one or more others. Adopting a common standard to identify patient episodes enables different IT systems to collect and group clinical information about a patient episode, because the uniqueness of the key and its standardised structure reduce links and administrative work, and enhance efficiency and patient safety.¹
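The keys listed in the glossary all end in a check digit computed with the same GS1 modulo-10 scheme, which is one reason a standardised key structure can be validated by any IT system that receives it. The following is a minimal sketch of that validation; the function names and the `KEY_LENGTHS` table are illustrative choices, and the 13-digit number in the usage note is only an example value.

```python
KEY_LENGTHS = {"GTIN": 14, "GLN": 13, "GSRN": 18, "SSCC": 18}


def gs1_check_digit(digits_without_check):
    """Compute the GS1 modulo-10 check digit for a string of digits.

    Starting from the rightmost digit, digits are weighted alternately
    by 3 and 1; the check digit brings the weighted sum to a multiple of 10.
    """
    total = 0
    for position, char in enumerate(reversed(digits_without_check)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_key(kind, key):
    """Check length, digit-only content and check digit of a GS1 key."""
    if not (key.isascii() and key.isdigit()):
        return False
    if len(key) != KEY_LENGTHS[kind]:
        return False
    return gs1_check_digit(key[:-1]) == int(key[-1])
```

For example, `is_valid_key("GLN", "4006381333931")` accepts the number, while changing its last digit makes the check fail. A check digit only catches keying and scanning errors; whether a key has actually been allocated still has to be verified against the issuing system.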

Following the patient in his care journey The need to identify the patient episode in an unambiguous way emerged in a new approach at the Emergency Hospital Utrecht in the early 1990s. A multidisciplinary group worked on the definition of the needs to enhance processes in the case of massive patient admissions; the working group expected the solution not only to identify the patient uniquely, but also to enable real-time location of the patient and a link to his administrative and clinical data, which are completed along his journey by the healthcare provider [2]. The use of the newly adopted international identification standard (the Global Service Relationship Number (GSRN) was adopted by EAN International in 1995) as a key to track and trace the patient episode enhanced the triage capabilities of the Emergency Hospital Utrecht, as the uniqueness of the patient episode ID was secured regardless of the possible destination of the patient for his treatment [3].

¹ We refer to the EAN.UCC System, developed by its 1,200,000 users throughout the world and managed by GS1, formerly EAN International. In this article, when we use the acronym “EAN”, we refer to the standardised system, and when we use “GS1”, we refer to the organisation and one of its 103 subsidiaries. GS1 publishes regularly updated Global Specifications; here we refer to version 6.0, January 2005.

The Pilot at the CHU Dijon

Background

The patient transport department at the CHU Dijon, a 1600-bed teaching hospital in 3 main locations, has run a pilot to measure the movement of patients in the Emergency Department over an eight-week period. Currently, the transport department manages patient transport between the 3 main sites with a web application (“Ptah”, conceived by the CHU Dijon and GéoSoft Aquitaine; this software is now used by a growing number of French hospitals). A detailed study of the consequences, in terms of human resources, of the transformation of the CHU into a single-site hospital was made by the Head of the transport department in 2004 [4].

Objective

The objective of this pilot is to collect additional information about the workload and work-peaks, to analyse and project the workforce needs in the Emergency Department (there are only two departments with an obviously permanent need for staff dedicated to patient transport; the second is the Radiology Department).

Tools and methodology

The tools used for the pilot involved 2 portable laser scanners (Barman Laser, Axiome Alpha SA), programmed using proprietary machine software², a MySQL database to host the collected information and an MS-Access application for easy data management and search functions. During the working day, the transport staff delegated to the Emergency Department captured each movement of a patient with one scan at the collecting point (location ID, patient ID, date and time), followed by a second data capture at the delivery point (location ID, patient ID, date and time). The IDs used are structured according to EAN; we used the Global Service Relationship Number (GSRN) to identify the hospital staff, and the Global Location Number (GLN) to identify the locations. They were carried on an EAN-128 barcode produced by simple standard software [5]. The patient episode ID was allocated by the patient management system in a proprietary way and carried in a Code 128 symbology. The migration to the GSRN is in the process of validation.

² The same equipment is widely used at the University Hospital of Geneva.

Results

The study lasted 42 working days, or a total of 600.35 hours. It involved 25 staff members who spent between 1 and 77.25 hours in patient transport at the Emergency Department between 11 May and 17 July 2005. On average, each patient was moved 1.84 times during his stay (the average of patient movements per day varied between 1.5 and 8.5). The workload intensity demonstrated that the average movements were stable through the working hours (10.00 to 24.00). As there was no dedicated staff for the transport of patients between 00.00 and 10.00, we can state that this was only possible because, as was known, there were only limited movements during this timeframe.

Figure 1. CHU Dijon: hourly cumulated patient movement during the pilot.
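The way the two scans per movement (collecting point and delivery point) can be turned into workload figures such as movements per hour and movements per patient is sketched below. This is an illustration under assumptions, not the pilot's MySQL or MS-Access logic: the record layout, the `kind` field and all identifiers in the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def pair_movements(scans):
    """Pair each pickup scan with the next delivery scan of the same patient."""
    open_pickups, movements = {}, []
    for scan in sorted(scans, key=lambda s: s["time"]):
        pid = scan["patient_id"]
        if scan["kind"] == "pickup":
            open_pickups[pid] = scan
        elif pid in open_pickups:          # delivery closing an open pickup
            start = open_pickups.pop(pid)
            movements.append({
                "patient_id": pid,
                "from": start["location_id"],
                "to": scan["location_id"],
                "start": start["time"],
                "minutes": (scan["time"] - start["time"]).total_seconds() / 60,
            })
    return movements


def movements_per_hour(movements):
    """Count movements by the hour of day in which they started."""
    return Counter(m["start"].hour for m in movements)


def average_movements_per_patient(movements):
    """Average number of movements per patient who was moved at least once."""
    per_patient = Counter(m["patient_id"] for m in movements)
    return sum(per_patient.values()) / len(per_patient)


# Hypothetical scans, invented for this example.
scans = [
    {"patient_id": "P1", "location_id": "ER", "kind": "pickup", "time": datetime(2005, 5, 11, 10, 5)},
    {"patient_id": "P1", "location_id": "RADIO", "kind": "delivery", "time": datetime(2005, 5, 11, 10, 17)},
    {"patient_id": "P2", "location_id": "ER", "kind": "pickup", "time": datetime(2005, 5, 11, 10, 40)},
    {"patient_id": "P2", "location_id": "WARD", "kind": "delivery", "time": datetime(2005, 5, 11, 10, 52)},
    {"patient_id": "P1", "location_id": "RADIO", "kind": "pickup", "time": datetime(2005, 5, 11, 11, 30)},
    {"patient_id": "P1", "location_id": "ER", "kind": "delivery", "time": datetime(2005, 5, 11, 11, 41)},
]
movements = pair_movements(scans)
```

Pairing on the patient ID rather than on the staff member means a movement is still counted correctly if a different person performs the delivery scan. A delivery scan without a preceding pickup is ignored in this sketch; a real data set would need a rule for such incomplete pairs.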



Conclusion

The study demonstrated that the use of a barcode reader was well accepted by staff members, but presented some difficulties in the handling and the reading of the patient episode code on the wristband. The study also had to be interrupted once when the barcode readers had to be repaired. No particular difference in the data management process was noted between the wristband carrying the patient episode (Code 128) and the GSRN or the GLN (UCC/EAN-128). The workload can now be evaluated objectively in the specific existing environment. The first evaluation demonstrated that the relatively stable hourly workload and the short reaction times explain why dedicated staff are necessary in the Emergency Department. Further, a centralised work allocation would be difficult because of the short reaction times; the extension of administrative work for the care givers may slow down the whole rhythm of the Emergency Department. On the other hand, the first evaluation confirmed that staff members can be allocated to the Emergency Department in rotation without negative impact on the workflow in the Department (provided the staff know the Department and its working methods).

The Irish Implementation

Background

A pioneering project between GS1 Ireland and local health bodies is using the latest GS1 technology to trace expensive and time-sensitive Clotting Factor Concentrate (CFC), the product used by haemophilic patients. GS1 Ireland is working with the National Centre for Hereditary Coagulant Disorders (NCHCD, opened at St James Hospital, Dublin in 2002) on the project, which started with EAN-128 barcodes and later looked at Electronic Product Code (EPC) technology. It is believed that the project, which was launched by the Health Minister in April 2004, could be destined for global application, for example the traceability of vaccines. To bring this about, a consultative group has been organised that includes representatives from the US Food & Drug Administration, the EU Commission and EMEA (European Agency for the Evaluation of Medicinal Products) as well as clinicians, medical informaticians and patient representative bodies, both local and global. Their role is to validate the solution during its implementation and to specify the eventual system for application in further countries and for the extension to full vaccine traceability (footnote 3).

There are about 2000 patients suffering from haemophilia in Ireland; 300 of them are treated at home with Clotting Factor Concentrate and followed closely by their clinicians, located in 13 treatment centres. Some years ago, following the treatment of haemophilic patients with infected blood products, which caused some fatalities, and a subsequent inquiry, the Department of Health and Children addressed haemophilia treatment as a matter of priority.

Objective

The objective of the health authorities in Ireland is to implement:
– real-time identification of the CFCs (to make immediate product recall possible and to manage the distributed CFCs as a virtual stock);
– a patient treatment history updated in real time;
– permanent validation of the cold chain storage and delivery process;
– an accurate solution to ensure that the correct product has been prescribed and will be administered to the right patient;
– a permanently updated information tool to analyse patient treatment data.

Tools and methodology

The IDs used are structured according to GS1 standards: the patient ID is the GSRN, the patient's home and alternate delivery points are each identified with a GLN, and the CFCs are identified with a GTIN together with a lot number, an expiry date and a serial number. The data is carried on an EAN-128 symbol, which will progressively be applied directly by the manufacturer at the end of the manufacturing process. Although the tender for the data capture is currently still open, it is envisaged that the barcodes will be read with a mobile phone (or a similar device) and sent to the NCHCD to populate the electronic patient file (EPF) and other management tools. The EPF is based on OpTx, software developed by the US solution provider Varian; it has been adapted by Clintech Healthcare Systems, an Irish company, to correspond to the clinical needs and to host GS1 formatted information. Data is captured at each product movement within the cold chain company (TCP, Temperature Controlled Pharmaceuticals Ltd), when the products are delivered to the patient's home, and finally by the patient when he administers the CFC to himself. This provides real-time information about the product, step by step, from its entry into the Irish supply chain through its journey to the patient. Data are managed automatically for logistical and clinical purposes.

3 The Ministry of Health of Canada has decided to implement full traceability of vaccines by using EAN Datamatrix.

Results

The first phase of the project started in the summer of 2005 with home deliveries, and the country-wide roll-out is planned for the first half of 2006. The GSRN was assigned to the patient and was used as the key identifier for both the Supply Chain Database and the Clinical Database, thereby establishing a globally unique reference linking the two databases. Later phases of the project will use this reference to link with the prescription, patient self-administration and treatment within the hospital.

Conclusion

At a mid-term stage of the implementation, we can state that the use of the EAN.UCC System had very little influence on the investment for this project; it offers the security of the uniqueness of the identifications and is fully aligned with industrial practices within the pharma supply chain as well as with the requirements of the US FDA on product (and single dose) identification. Further, the approach taken by NCHCD and GS1 Ireland is aligned with the implementation of the Electronic Product Code, which may be carried in an RFID tag, in a barcode, or in both together. The patient can be treated at home or at any treatment centre in Ireland, with the assurance of the highest quality and security.

Discussion

We consider the two examples of use of the GSRN an interesting opportunity to compare proprietary solutions, which are still the most common case, with a standardised data format providing worldwide uniqueness. First, the patient episode has to be identified in any case; this can be achieved with any solution. What, then, is the benefit of the GSRN?

In an isolated hospital (having no interaction with another hospital), the patient episode will be generated by one of the IT components, i.e. the patient administration tool. All the surrounding IT components, for laboratory samples and other clinical processes, should then adopt this identification as a key. The use of one single patient identification across the hospital significantly reduces the number of data capture devices, as these should be able to capture the same data format.

When items have to be linked to a patient episode, the data capture process will read sequentially the hospital episode and the item identity, which is a GTIN in at least 54% of the cases [6]. The risk of confusion cannot be excluded if we consider that a significant part (7%) of the hospital supplies arrive with proprietary identification. The data capture system then has to be able to “decode” the EAN data structure (for example to distinguish a lot number from an item number). As the hospital also produces product identifications (for example at the central sterilisation), these are usually identified with another proprietary solution. Special attention will be necessary to avoid any data collision with the patient episode ID. For data capture, one will either use a dedicated scanner or have to implement an additional data structure in the existing one, beside episode ID and supply ID. If another task in the hospital produces item IDs, the complexity of the process, or the number of dedicated scanners, grows dramatically.

By adopting the EAN system with the aim of enhancing some elements of interoperability, the hospital adopts an existing standard describing the data formats and securing the uniqueness of the IDs. According to the EAN standard, the episode ID is a GSRN, whilst an item (coming from the suppliers or produced at the central sterilisation) is identified with a GTIN (with an appropriate format for the lot number, the expiry date, the serial number, etc.).
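To illustrate what “decoding” the EAN data structure involves, the following minimal Python sketch splits a GS1-128 (UCC/EAN-128) element string into its Application Identifier (AI) fields and verifies the GS1 mod-10 check digit. It is an illustration only, not the software used in Dijon or Ireland: the AI table is deliberately limited to a handful of identifiers, the function names are hypothetical, and the sample lot, serial and GSRN values are invented.

```python
# Subset of GS1 Application Identifiers: AI -> (fixed length or None, field name).
# Variable-length fields (None) end at the FNC1 separator (GS, 0x1D) or at end of data.
AI_TABLE = {
    "00": (18, "sscc"),
    "01": (14, "gtin"),
    "10": (None, "lot"),
    "17": (6, "expiry_yymmdd"),
    "21": (None, "serial"),
    "414": (13, "gln"),
    "8018": (18, "gsrn"),
}
GS = "\x1d"


def gs1_check_digit(body: str) -> int:
    """GS1 mod-10 check digit; weights 3,1,3,... start at the rightmost body digit."""
    total = sum(int(d) * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10


def has_valid_check_digit(key: str) -> bool:
    """True if the last digit of a numeric GS1 key matches its computed check digit."""
    return key.isdigit() and gs1_check_digit(key[:-1]) == int(key[-1])


def parse_element_string(data: str) -> dict:
    """Split a concatenated element string into {field name: value}."""
    fields = {}
    pos = 0
    while pos < len(data):
        if data[pos] == GS:          # skip separators between fields
            pos += 1
            continue
        for ai_len in (2, 3, 4):     # AIs in this table are 2 to 4 digits long
            ai = data[pos:pos + ai_len]
            if ai in AI_TABLE:
                break
        else:
            raise ValueError(f"unknown Application Identifier at position {pos}")
        length, name = AI_TABLE[ai]
        pos += len(ai)
        if length is None:
            end = data.find(GS, pos)
            end = len(data) if end == -1 else end
        else:
            end = pos + length
        fields[name] = data[pos:end]
        pos = end
    return fields


# Invented sample: GTIN + expiry date + lot (variable length) + serial number.
sample = "0104006381333931" + "17060630" + "10LOT42" + GS + "21SN0007"
fields = parse_element_string(sample)

# Invented 17-digit GSRN body; the check digit is computed, not looked up.
gsrn_body = "76123450000000012"
gsrn = gsrn_body + str(gs1_check_digit(gsrn_body))
```

Because every field is announced by its AI, the same decoding routine can tell an item number (AI 01) from a lot number (AI 10) or a patient GSRN (AI 8018), which is the single decoding process the discussion refers to.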
The marginal additional cost compared with any proprietary solution is the membership fee to the local GS1 organisation; the marginal benefit is the security of uniqueness of the IDs and the capacity to integrate one single data decoding process for various IDs, regardless of their origin (external or internal supply, staff or episode identification, assets, etc.). Unfortunately we have no indication of the savings made in Ireland by adopting the EAN system compared to a proprietary solution. We can state that the additional costs have been extremely low, and when the CFC manufacturers extend the standard they already use to the CFCs, a cost reduction will become evident. The use of the EAN system in the Dijon pilot likewise does not demonstrate a cost reduction, but it did not cost more than a proprietary solution (as the hospital in Dijon is already a member of GS1 France). It has demonstrated that the current patient ID could be integrated in a GSRN, fully distinguishable from the staff GSRN. The hospitals of Geneva and Lausanne are considering the adoption of the GSRN for episode ID and staff ID, and the Dijon study has been part of the input for the decision and implementation process.

Conclusion

Whilst the EAN system is widely used around the world in more than 20 market segments, we have very little cost/benefit evidence of its advantage over proprietary solutions in the healthcare environment. The two cases in Ireland and Dijon helped to show that interoperability becomes cost-effective with this standard, as the hospital deals with a large number of external parties and intends to outsource or collaborate: each of these considerations calls for a standardised approach.

We express our thanks to Jim and Jean-Yves for their close collaboration in the preparation of this summary.

Conflict of interest: As an independent consultant, Christian Hay and Medinorma LLC provide services and participate in the implementation of identification and traceability projects in healthcare. One of the important customers of Medinorma LLC is the GS1 organisation in Switzerland, France and Europe.

References
1 “Patient died after drugs went to man with the same name”, Daily Mail, 2 November 2001.
2 Noordergraaf GJ, et al. Development of computer-assisted patient control for use in the hospital setting during mass casualty incidents. Am J Emerg Med 1996;14:257.
3 Boumn HH, et al. Computerization of patient tracking and tracing during mass casualty incidents. Eur J Emerg Med 2000;7:211.
4 Gerbet J-Y. L’informatisation de la fonction transport patients au CHU de Dijon, Conséquences et perspectives d’évolution. Mémoire de licence de logistique hospitalière, IUT Chalon sur Saône, 2004.
5 Seagull Scientific Inc, BarTender version 7.5.1.
6 Hay C. La belle histoire des CHU et du code à barres. Décodez No 89, Paris 2005.



Information retrieval: proactive semantic search strategies
Simon Hoelzer (a, b), Ralf K. Schweiger (b), Joachim Dudeck (b)

(a) H+ The Swiss Hospitals, Berne, Switzerland
(b) Institute of Medical Informatics, Justus-Liebig-University, Giessen, Germany

Summary

Information access and retrieval are essential to the delivery and application of evidence-based medicine. The eXtensible Markup Language (XML) provides a standard means to explicitly describe a document’s structure and to identify meaningful elements inside textual narrations. We have developed an information model to represent medical knowledge contained in clinical practice guidelines, textbooks, patient information, articles, etc. that provides a transparent, granular, scalable representation of text-based medical information. Access to these processed and “enriched” electronic resources is achieved through a new concept using an XML search engine, which exploits XML to improve search quality. Search mechanisms that analyse the questions posed using quantitative and semantic parameters are becoming increasingly important. When, and to what extent, they will be deployed for this or similar uses will depend on the efforts made towards the production, upkeep and updating of structured (XML) documents.

Key words: clinical practice guidelines; quality of care; eXtensible markup language; information retrieval

Introduction

The growing number of sources of medical knowledge need to be made efficiently available as and when required for individual care situations in order to press ahead with the application of evidence-based approaches in medicine. Several elements are critical for success in this: the quality of the source and its availability at the point of care should be seen as central among these.

Correspondence:
Simon Hoelzer, MD
H+ The Swiss Hospitals
Lorrainestrasse 4A
CH-3000 Berne
simon.hoelzer@hplus.ch
www.hplus.ch/main/Show$Id=481a.html

Active decision support systems reach their limits if the clinical decision-making situation or the required knowledge base and framework are too complex [1, 2]. In such situations, however, it is possible to draw on selected knowledge items directly from their sources (textbooks, journal articles, guidelines, etc.). With regard to the representation of those items, efforts are being directed towards a single, uniform and comprehensive data model (XML schema) for structuring and semantic labeling [3].

Materials and methods

Data and information in medicine are mainly represented in slightly structured or even unstructured narrative text documents. It is nearly impossible to detect and handle relationships between data elements within narrative documents or to retrieve the parts of documents that contain specific information. Yet information access and retrieval are essential to the delivery and application of evidence-based medicine. Without an explicit structure, the exploitation of medical information resources by electronic means therefore remains limited [4, 5]. One possible exploitation, for example, is quick access for the healthcare professional to the information of interest. A physician may not want to read a complete clinical practice guideline (CPG) if he is only interested in a specific part of the guideline. Irrelevant search results can be reduced to a minimum as soon as we insert meaningful structures such as diagnoses and therapies into clinical documents.

The eXtensible Markup Language (XML) provides a standard means to explicitly describe a document’s structure and to identify meaningful elements inside textual narrations. XML provides powerful concepts to represent the structure of narrative documents, which is otherwise hidden within the text. It will play an important role in improving the management of healthcare documents of any kind [6]. At the same time, additional information contained in the text either implicitly or explicitly can be placed in the XML document, for instance through metatags or an attribution process. This includes enhancement with standardised, encoded information (for example MeSH or ICD coding), the assignment of clinically relevant text properties (e.g. levels of evidence of individual recommendations) and links to external information sources (figure 1). Once this document structure has been defined, the inherent information can subsequently be drawn upon to look up case-relevant knowledge.
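As a minimal sketch of such enrichment, the following Python fragment builds a small guideline document with a coded diagnosis, recommendations carrying a level of evidence, and a link to an external source, and then retrieves only the parts of interest. All element and attribute names (`guideline`, `diagnosis`, `codeSystem`, `levelOfEvidence`, `externalSource`) are assumptions made for illustration; they are not taken from the authors' XML schema, and the recommendation texts and URL are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure: none of these names come from the article's schema.
guideline = ET.Element("guideline", {"title": "Example guideline"})

# Standardised, encoded information attached to a narrative element.
diagnosis = ET.SubElement(
    guideline, "diagnosis", {"codeSystem": "ICD-10", "code": "I64"}
)
diagnosis.text = "Stroke, not specified as haemorrhage or infarction"

# Clinically relevant text properties: a level of evidence per recommendation.
therapy = ET.SubElement(guideline, "therapy")
for level, text in [("A", "Placeholder recommendation one."),
                    ("C", "Placeholder recommendation two.")]:
    rec = ET.SubElement(therapy, "recommendation", {"levelOfEvidence": level})
    rec.text = text

# Link to an external information source (placeholder URL).
ET.SubElement(guideline, "externalSource", {"href": "https://example.org/source"})


def recommendations_with_level(root, level):
    """Return the text of therapy recommendations with the given evidence level."""
    path = f"./therapy/recommendation[@levelOfEvidence='{level}']"
    return [r.text for r in root.findall(path)]


def coded_diagnoses(root, code_system):
    """Return (code, label) pairs for diagnoses coded in the given system."""
    return [(d.get("code"), d.text)
            for d in root.iter("diagnosis")
            if d.get("codeSystem") == code_system]


level_a = recommendations_with_level(guideline, "A")
icd_codes = coded_diagnoses(guideline, "ICD-10")
```

The point of the sketch is that a reader interested only in well-supported therapy recommendations can be served those elements alone, rather than the whole guideline.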



Figure 1.

Results

If we want to communicate the meaning of XML documents we must standardize the underlying XML models, i.e. XML modeling and standardization will play a significant role in the future. This task will remain a challenge to standardization bodies. However, XML can support the implementation of a flexible and composite bottom-up approach to this process. XML Schema, for example, suggests the construction of content models that can be shared and used as building blocks for creating new schemas. We have developed an information model to represent medical knowledge contained in clinical practice guidelines, textbooks, patient information, articles, etc. (figure 2). This core model provides a transparent, granular, scalable representation of text-based medical information. With this generic document-based approach it is possible to convert and map an XML file to the original text resource as well as to other computable formats. Concerning the electronic representation of clinical practice guidelines, the model allows for:
– Queries on content and structure by means of typical search items: diagnosis, codes, patient characteristics … (proactively structured queries)
– Queries using data types: date, codes ...
– Linking data items of an electronic patient record to corresponding information resources
– Provision of metainformation about the text resource as well as about the quality of the recommendations within the text resource
– Customized presentation of the most relevant parts of a set of matching guidelines

Figure 2.

Discussion

In this way, one can expect improved availability of clinically relevant knowledge on a specific medical problem to meet the user’s needs. Access to these processed and “enriched” electronic resources is achieved through a new concept using an XML search engine [7, 8]. This search engine exploits XML to improve search quality (relevance and completeness of URIs). It can handle Internet resources of any type (text/xml, image/jpeg, etc.) and uses XML Topic Maps to establish relationships between search items. Thus, documents in different formats (HTML, PDF, XHTML and XML) and with varying degrees of refinement with regard to their structure and semantics can be processed. In the absence of any structure or semantic markup, the search is carried out, as a last resort, like a free-text search using the algorithms of normal search engines, which weigh the referring links and the proximity, position and emphasis (bold print, headers, metatags) of search terms. However, such purely textual searches severely restrict the automatic identification and extraction of relevant knowledge from highly structured sources. Improvements can be achieved through a combination of the following approaches:

1) To increase the precision of search results, it is possible to define so-called proactively structured queries for individual clinical care situations (scenarios) in advance. In this way, the occurrence of target search terms can be inspected context-sensitively in pre-defined document structures (for example, the occurrence of a particular stage of a disease in the “Therapy” paragraph, the meaning of which has been defined in the XML data model). Thus, when selecting qualifying sources, it is ensured that only the really relevant sections of the complete document are inspected, and that the search term has the correct meaning assigned to it (examples: a) stage of illness “III” will not be confused with paragraph number III; b) a document will only be included in the list of results if its contents refer to the primary diagnosis “Apoplexy”, and not to apoplexy as a complication of medical intervention or of the underlying illness). Marking a section of text with a description of its function within the document results in a machine-readable semantic plan, which should ensure that the document also appears in the correct lists of results. In the next step, the scenarios and the corresponding proactively structured queries already defined determine which sections of text are to be displayed, and in what layout. To do this, standard XML transformation and rendering tools (XSL, CSS) are deployed.

2) Furthermore, the search engine is not only able to associate different search terms by their frequency intervals in the text, but can also make use of both the structure (terms are found in the same document paragraph or inside a hierarchy of documents) and the standardised semantics described in the model above (the meaning of a tag’s contents). In this way, the closeness of a relationship is interpreted using the document’s structure or semantics, and weighted appropriately.

In general, search terms can be input actively. However, the above approaches can be combined with the automated production of sets of search terms, or with filling in a proactively structured query with search terms extracted from the electronic medical record. This process is currently supported for diagnostic terms by using synonymous terms taken from the ICD-10 diagnostic thesaurus which, in turn, can be compiled and updated in XML using the concept of topic maps [8]. It is possible here to call up an alternative term or the appropriate ICD-10 code for the search.
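The difference between a free-text search and a proactively structured query can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the LuMriX engine: the two sample documents, their element names (`diagnosis`, `therapy`, `stage`, `section`), the `role` attribute and the tiny synonym table standing in for the ICD-10 thesaurus are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Two invented documents. A has primary diagnosis "Apoplexy" and therapy stage III.
# B only mentions apoplexy as a complication and has a paragraph numbered III.
DOC_A = """<guideline id="A">
  <diagnosis role="primary" code="I64">Apoplexy</diagnosis>
  <therapy><stage>III</stage><text>Placeholder therapy text.</text></therapy>
</guideline>"""

DOC_B = """<guideline id="B">
  <diagnosis role="primary" code="I21">Acute myocardial infarction</diagnosis>
  <section number="III"><text>Apoplexy may occur as a complication.</text></section>
  <therapy><stage>I</stage><text>Placeholder therapy text.</text></therapy>
</guideline>"""

# Stand-in for a thesaurus lookup; a real system would query ICD-10 synonyms.
SYNONYMS = {"stroke": {"stroke", "apoplexy"}}


def expand(term):
    """Return the term together with its known synonyms (lower case)."""
    term = term.lower()
    return SYNONYMS.get(term, {term})


def free_text_hit(xml_text, terms):
    """Naive free-text search: every term occurs somewhere in the raw document."""
    lowered = xml_text.lower()
    return all(t.lower() in lowered for t in terms)


def structured_hit(xml_text, diagnosis_term, stage):
    """Structured query: primary diagnosis matches AND stage occurs inside therapy."""
    root = ET.fromstring(xml_text)
    primary = root.find("./diagnosis[@role='primary']")
    if primary is None:
        return False
    if (primary.text or "").strip().lower() not in expand(diagnosis_term):
        return False
    return any((s.text or "").strip() == stage
               for s in root.findall("./therapy/stage"))


docs = {"A": DOC_A, "B": DOC_B}
free_text_results = [k for k, d in docs.items() if free_text_hit(d, ["apoplexy", "III"])]
structured_results = [k for k, d in docs.items() if structured_hit(d, "stroke", "III")]
```

In this toy collection the free-text search returns both documents, because "III" also appears as a paragraph number and "apoplexy" as a complication, whereas the structured query returns only document A.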

Conclusion

Search mechanisms that analyse the questions posed using quantitative and semantic parameters are becoming increasingly important. When, and to what extent, they will be deployed for this or similar uses will depend on the efforts being made towards standardisation, as well as on future work on the production, upkeep and updating of documents. The efforts being made by the W3C, such as adopting the XHTML 2.0 standard with its clear separation of structure and presentation, along with setting hierarchies, chapters and sub-chapters, could acquire a mediating role in the medium term and help drive the move towards highly structured sources and their use. In the future, experience from practical use at various levels must be collated and drawn upon. This will show to what degree structure and semantics are essential and can be usefully put to work.

Acknowledgments

This project is supported by the German Federal Ministry of Health.

References
1 Bakken S. An Informatics Infrastructure is essential for evidence-based practice. J Am Med Inform Assoc 2001;8:199–201.
2 Shiffman RN, Liaw Y, Brandt CA, Corb GJ. Computer-based guideline implementation systems: a systematic review of functionality and effectiveness. J Am Med Inform Assoc 1999;6:104–4.
3 Hoelzer S, Schweiger R, Dudeck J. Representation of practice guidelines with XML-modeling with XML schema. Methods Inf Med 2002;41:305–12.
4 Roberts A. Analysing XML health records. XML Europe Conference Proceedings 2000;377–80.
5 Schweiger R, Tafazzoli A, Dudeck J. Using XML for flexible data entry in healthcare, example use for pathology. XML Europe Conference Proceedings 2000;357–62.
6 Schweiger R, Hölzer S, Altmann U, Rieger J, Dudeck J. Plug and Play XML? A Healthcare Perspective. J Am Med Inform Assoc 2002;9:37–48.
7 Intelligenter suchen als bisher mit LuMriX-Forschergruppe der JLU entwickelt neuartige Suchmaschine, Giessener Anzeiger vom 31.07.2002.
8 Hoelzer S, Schweiger RK, Liu R, Rudolf D, Rieger J, Dudeck J. XML Representation of Hierarchical Classification Systems: From Conceptual Models to Real Applications. Proc AMIA Symp 2002.



Automatic abnormal region detection in lung CT images for visual retrieval
Henning Müller (a), Samuel Marquis (a), Gilles Cohen (a), Pierre-Alexandre Poletti (b), Christian Lovis (a), Antoine Geissbuhler (a)

(a) Service of medical informatics, University and hospitals of Geneva
(b) Emergency radiology, University hospitals of Geneva

Summary

Image management, analysis, and retrieval are currently very active research fields, mainly because of the large amount of visual data being produced in modern hospitals and the lack of applications dealing with these data. Most often, the goal is to aid the diagnostic process. Unfortunately, very few medical image retrieval systems are currently used in clinical routine. One application domain with a high potential for automatic image retrieval is the analysis and retrieval of lung CTs. A first user study in the United States (Purdue University) showed that these systems can improve the diagnostic quality significantly. This article describes an approach to a diagnostic aid for lung CTs. The analysis incorporates several steps, and the goal is to automate the process as much as possible for easy integration into clinical processes. Thus, several automatic steps are proposed, from a selection of the most characteristic slices, to an automatic segmentation of the lung tissue and a classification of the segmented area into visual observation classes. Feedback to the physician is given in the form of marked regions in the images that appear to differ from the norm of healthy tissue. We currently work on a small set of training images with marked and annotated regions, but a larger set of images for the evaluation of our algorithm is in preparation. The article currently only contains a short quantitative evaluation.

Correspondence:
Henning Müller
University and Hospitals of Geneva
Service of Medical Informatics
24, rue Micheli-du-Crest
CH-1211 Geneva 14
henning.mueller@sim.hcuge.ch
http://www.sim.hcuge.ch/medgift/

For most tasks we use existing open source software such as Weka, GIFT, and itk. This allows an easy reproduction of the search results and limits the need for costly redevelopments, although response times are slower than would be possible with optimized software.

Key words: content-based image retrieval; high-resolution lung CT; diagnostic aid; classification

Introduction

Content-based image retrieval (CBIR) has been an extremely active domain in the fields of computer vision and image processing for more than 20 years [1]. In the medical field, this domain is also becoming more active, as an increasing amount of visual data is being produced in hospitals and made available in digital form [2, 3]. General medical image retrieval in PACS-like databases is in this context very different from specialized retrieval in a very focused domain. In the medical field, the main goal is its use as a diagnostic aid and for accessing medical teaching files. Current medical use focuses on the retrieval of tumors by shape [4], on histological images [5], and on other specific fields (dermatology, pathology). A domain where textures play a very important role in the diagnostic process is the analysis of high-resolution lung CTs [6]. A user test among radiologists [7] showed that an image retrieval system can improve the diagnostic quality significantly, especially for less experienced radiologists. Still, most of these systems either rely on fairly complicated interaction with the user, which makes them hard to introduce into a clinical context, or are too broad to be used as a diagnostic aid in a specialized domain.

This article details a solution for helping with the interpretation of high-resolution lung CTs, a domain where diagnostics are fairly hard, especially for non-chest specialists. A large number of unrelated diseases exist, often with unspecific symptoms, as well as many classes of visual observations and other data to be integrated (lab results, age, environmental exposures, …). The diagnostic result strongly depends on the overall texture of the lung tissue, so automatic analysis seems possible. Our project limits the direct interaction with the user and performs as many tasks as possible automatically, so that a minimum of time is needed to operate the system and get responses for feedback. Several steps of the process have already been integrated, whereas a few are still in their first tests.

Diagnostic aid on lung CT interpretation

This section describes the steps that are necessary for a complete diagnostic aid system for lung diagnostics and their degree of automation.


Generation of a test database and acquisition of representative samples

The first and most important part is the creation of a database of thin-section lung CTs. This database needs to include healthy cases as well as pathologic cases with the same slice thickness and slice distance. Characteristic regions need to be marked by a radiologist to allow learning the characteristics of a disease (or visual observation) with respect to healthy tissue. Currently, we only have a fairly small database of 12 CT series containing a total of 326 images. 112 regions of varying size are marked in 69 of the images (if possible non-bordering slices) by a radiologist to represent the following classes of visual observations for the classification step:
– healthy tissue (52 sample regions);
– emphysema (21);
– micro nodules (19);
– macro nodules (3);
– interstitial syndrome (5);
– ground glass attenuation (1);
– fibrosis (6).

It is important to note that prototypically healthy regions have to be annotated by the radiologist as well, so that a classifier can get a good idea of healthy tissue. Other systems in the literature often use only pathologic classes for the classification, but the first step in the diagnostic process is to find out whether the tissue is abnormal or not. We are currently creating a larger database, projected to contain at least 100–200 series of 50–60 images, that will allow us a better representation of the classes.

Figure 1 shows a screenshot of our tool for image annotation. It generates a simple XML file containing regions of interest as a set of points (outline) and a label for each outline. The files are then fed into the system along with the images at the training step.

Figure 1. A screenshot of our utility for the annotation of image regions.

Figure 2. Partitioning of lung tissue into small blocks for feature extraction and classification.
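The annotation file produced by the tool (regions of interest stored as labelled outlines) could look roughly like the sample below. The article does not show the actual file format, so the element and attribute names (`annotation`, `region`, `point`, `label`, `x`, `y`), the file name and the coordinates are assumptions made purely for illustration. The sketch reads the outlines, counts regions per class and computes each region's area with the shoelace formula.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical annotation file; the real format used by the tool is not published here.
ANNOTATION = """<annotation image="series01_slice12.png">
  <region label="emphysema">
    <point x="120" y="200"/><point x="150" y="200"/>
    <point x="150" y="240"/><point x="120" y="240"/>
  </region>
  <region label="healthy tissue">
    <point x="300" y="100"/><point x="320" y="100"/><point x="310" y="130"/>
  </region>
</annotation>"""


def load_regions(xml_text):
    """Return a list of (label, outline) pairs; an outline is a list of (x, y) points."""
    root = ET.fromstring(xml_text)
    regions = []
    for region in root.findall("region"):
        outline = [(int(p.get("x")), int(p.get("y"))) for p in region.findall("point")]
        regions.append((region.get("label"), outline))
    return regions


def polygon_area(outline):
    """Area enclosed by a closed outline (shoelace formula), in square pixels."""
    twice_area = 0
    for (x1, y1), (x2, y2) in zip(outline, outline[1:] + outline[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


regions = load_regions(ANNOTATION)
class_counts = Counter(label for label, _ in regions)
areas = [polygon_area(outline) for _, outline in regions]
```

Counting regions per class in this way gives the kind of class distribution listed above, and the area helps to spot regions too small to yield many analysis blocks.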

Analysis of blocks of lung tissue

Figure 2 shows the partitioning of the lungs into smaller blocks (16x16 pixels in the image) for further detailed texture analysis. A block is taken into account if three edges are inside the area marked by the expert or automatically segmented by the system. These lung blocks are stored as references together with the original image. This avoids filter artifacts that can occur due to missing border pixels, as we can take into account the entire block environment. The framework is designed to facilitate finding the optimal block size for analysis and classification. From each block we extract and store the following visual features:
– average grey level, standard deviation, skewness, kurtosis, and min-max of the grey levels;
– grey level histogram using 32 grey levels;
– features derived from co-occurrence matrices (four directions, two distances);
– responses of Gabor filters in four directions and at three scales;
– run length after thresholding, and other features based on mathematical morphology.

A small number of grey levels in the histogram is sufficient for this kind of classification, as has been shown in several image retrieval applications.

Training

The features together with the region label become a sample in a classification problem [8]. Based on the acquired training data, the weights of the features for classification are calculated by Weka automatically. To develop an optimal classification strategy, several classifiers are tested and their performance is evaluated on the currently available data by cross-validation. Weka is an open source utility that allows us to compare several classifiers and has the additional advantage of being able to connect directly to the feature database (MySQL). We performed cross-validation using various classifiers to get an idea of how discriminative our features are. Besides Weka, we also included libsvm [9], an easy-to-use Support Vector Machine (SVM) classifier. The SVMs had the best overall results, whereas the other tested classifiers (k-nearest neighbor, naïve Bayes, and C4.5) performed slightly worse.

Figure 3. A lung CT and the tissue of the two lung halves segmented.

Figure 4. Regions of interest that are not well marked can result in classification errors.

Lung segmentation as data preparation for classification

When submitting a new image for analysis and as a diagnostic aid, we concentrate on the part of the image that we are interested in, the lung tissue. While manual region selection is still possible (using the annotation tool described above, only without a label), automatic segmentation is desired to minimize user interaction in the final diagnostic aid step. To this aim we use an algorithm described in [10] to find an “optimal” threshold for lung tissue segmentation, which works on DICOM images having a full 12-bit resolution as well as on the JPEG images from our radiology teaching file. As the basis for the segmentation we use the Insight Toolkit (itk). Figure 3 shows a lung CT, its segmented version and a view of the outline discovered by the software. For the final texture classification we do not plan to take into account the entire lung tissue but rather the diagnostically interesting part, which is the outer part of the lung with fewer vessels; vessels can change the texture strongly and introduce noise for the classification. The inner part with the vessels is automatically removed from further analysis.

Classification of lung blocks
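A minimal, dependency-free Python sketch of the block-wise chain used in these subsections (partitioning into blocks, extracting grey-level features, attaching a label) is given here. It makes several assumptions that are not in the article: the criterion "three edges inside the area" is interpreted as at least three of the four block corners lying inside the mask, grey levels are assumed to be 8-bit, only the simplest features (statistics and a 32-bin histogram) are computed, and a plain nearest-neighbour rule stands in for the Weka and libsvm classifiers. The labels "A" and "B" are placeholders for classes annotated by a radiologist.

```python
from math import sqrt


def block_origins(mask, size):
    """Yield the top-left (row, col) of each block with >= 3 corners inside the mask."""
    rows, cols = len(mask), len(mask[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            corners = (mask[r][c], mask[r][c + size - 1],
                       mask[r + size - 1][c], mask[r + size - 1][c + size - 1])
            if sum(corners) >= 3:
                yield r, c


def block_features(image, r, c, size, bins=32, levels=256):
    """Mean, standard deviation, skewness, min, max and a normalised histogram."""
    pixels = [image[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    n = len(pixels)
    mu = sum(pixels) / n
    sd = sqrt(sum((p - mu) ** 2 for p in pixels) / n)
    skew = 0.0 if sd == 0 else sum((p - mu) ** 3 for p in pixels) / (n * sd ** 3)
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // levels] += 1      # map grey level to one of `bins` bins
    return [mu, sd, skew, float(min(pixels)), float(max(pixels))] + [h / n for h in hist]


def nearest_label(sample, training):
    """Label of the training sample closest to `sample` (squared Euclidean distance)."""
    def distance(item):
        return sum((a - b) ** 2 for a, b in zip(sample, item[0]))
    return min(training, key=distance)[1]


# Tiny synthetic 4x4 "image": dark left half, bright right half; 2x2 blocks.
image = [[10, 10, 200, 200] for _ in range(4)]
mask = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]

origins = list(block_origins(mask, 2))
training = [(block_features(image, 0, 0, 2), "A"), (block_features(image, 0, 2, 2), "B")]
predicted = nearest_label(block_features(image, 2, 0, 2), training)
```

In practice the feature vector would also include the co-occurrence, Gabor and run-length features listed earlier, and the block size would be a tunable parameter, as the text notes for sizes between 16 and 32 pixels.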

For the classification step of a new lung image, a partitioning of the image into blocks is performed. Then, the features of each block in the



(manually or automatically) marked regions corresponding to lung tissue are extracted. The samples are created from the blocks’ features and do not yet have a label attached. The integrated classifier performs the classification and attaches a label to each block based on previous learning data. The best results are obtained with a block size between 16 and 32 pixels, with 32 (resulting in a total of 833 blocks) being slightly better. Larger blocks left too few regions. We also performed tests with several overlaps, but with such a small number of regions this could cause the training data to overlap too much with the test data, resulting in misleadingly good results. One problem discovered at this stage was that some manually marked regions extend well beyond the pathologic tissue, so part of the healthy tissue also ends up in the pathologic class and later creates classification errors (see figure 4).

Slice selection from a CT volume

This step is currently not implemented and will likely be the last part to be done, as its development is not crucial. The goal is basically to perform the task of the medical doctor to find the slice that best characterizes a disease. Once a large database of labeled tissue samples is ready, it will be fairly easy to process the entire volume slice by slice and select those slices with the largest part of the tissue being marked as nonhealthy for further inspection. To select several slices, we can give the system a combination of maximum number of slices and a threshold of unhealthy tissue. Selected slices can be marked in the volume data directly by highlighting the part containing most pathologic blocks.
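Since this step is described as not yet implemented, the following is only a minimal sketch of how the combination of a maximum number of slices and a threshold of unhealthy tissue might work. The function name, the parameter names and the class labels "fibrosis" and "emphysema" are hypothetical and are not taken from the framework.

```python
def select_slices(block_labels_per_slice, max_slices=3, min_fraction=0.2):
    """Return indices of slices ranked by their share of non-healthy blocks."""
    scored = []
    for index, labels in enumerate(block_labels_per_slice):
        if not labels:
            continue  # no lung tissue blocks in this slice
        fraction = sum(label != "healthy" for label in labels) / len(labels)
        if fraction >= min_fraction:
            scored.append((fraction, index))
    # Largest unhealthy share first; ties are broken by slice position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [index for _, index in scored[:max_slices]]


# Hypothetical per-slice block labels (illustrative values only).
volume = [
    ["healthy"] * 10,
    ["healthy"] * 7 + ["fibrosis"] * 3,
    ["healthy"] * 4 + ["fibrosis"] * 6,
    ["healthy"] * 9 + ["emphysema"] * 1,
]
selected = select_slices(volume, max_slices=2, min_fraction=0.2)
```

In this toy volume, slices 2 and 1 pass the threshold and are returned in that order, while slice 3 falls below it.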

Results presentation

The goal of the results presentation is not to make the decision for the radiologist but rather to highlight parts of the lung tissue that the system classifies as pathologic. Highlighting of the background zone in colored shades is planned instead of grey scales. Each color represents one of the classes detected in the classification step. Currently, the results are only presented in 2D and one image at a time. Slices with the largest pathologic parts are selected and displayed. The results could also be presented in 3D, where the entire pathologic area over several slices can be highlighted within the volume. Retrieval of similar cases from the reference database will enable the MD to verify a diagnosis. This is easily done through an image retrieval application using visual features from the current case and comparing them to those of past cases in the database.

Results

We currently have a framework in place allowing the acquisition of knowledge from the radiologists in the form of marked regions and annotations within images. The database is still small but a larger number of cases is planned. The acquired data with labels is used to train the classifiers. This means that with a growing number of cases judged by the radiologist, the system is expected to perform better. Lung segmentation works reliably and stably, as do the partitioning of the lung tissue into small blocks and the feature extraction. All these steps work in a completely automated fashion. The MD can feed a volume of lung CT images into the system. The images are segmented and partitioned into smaller blocks automatically. These blocks are classified and unhealthy tissue is marked in a different color in the images, so the MD receives feedback on regions to inspect further.

Using the 112 regions and a block size of 32 x 32 pixels (833 blocks), the current accuracy is 84% when classifying healthy versus nonhealthy tissue with SVM classifiers under cross validation. When classifying into the 8 available classes, SVMs reach 83% accuracy, nearest neighbors 75.4%, Bayes 69% and C4.5 73%. Problems occur especially for classes with a very small number of representative blocks or where the blocks are too small to be evaluated. Another problem is that the radiologist marked regions with a margin, leading to several blocks that are actually healthy tissue but lie close to a pathologic region and are labeled as pathologic (figure 4).
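The evaluation itself was carried out with Weka. Purely as an illustration of what cross validation measures, here is a small sketch in Python using a 1-nearest-neighbour classifier. The data are synthetic stand-ins generated on the spot, not the block features of the study, and all names are hypothetical.

```python
import math
import random


def k_fold_accuracy(samples, labels, k=5, seed=0):
    """Estimate 1-nearest-neighbour accuracy with k-fold cross validation."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # k disjoint test folds
    correct = 0
    for test_fold in folds:
        held_out = set(test_fold)
        train = [i for i in order if i not in held_out]
        for t in test_fold:
            # Label of the closest training sample (Euclidean distance).
            nearest = min(train, key=lambda i: math.dist(samples[t], samples[i]))
            correct += labels[nearest] == labels[t]
    return correct / len(samples)


# Synthetic data: two well-separated clusters standing in for block features.
rng = random.Random(1)
healthy = [[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(40)]
pathologic = [[rng.gauss(5.0, 0.1) for _ in range(3)] for _ in range(40)]
samples = healthy + pathologic
labels = ["healthy"] * 40 + ["pathologic"] * 40
accuracy = k_fold_accuracy(samples, labels)
```

Because every sample is tested only against samples from other folds, the estimate is not inflated by training on the test data. Overlapping blocks would weaken this separation.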

We currently run the framework on a desktop computer with a 2.8 GHz Pentium IV processor and 1 GB of RAM. On this computer, the segmentation takes around 5 seconds per slice and the subsequent cutting into blocks, feature extraction and classification another 2 seconds. Thus the analysis of a single slice is almost interactive whereas an entire volume takes one to two minutes before results can be displayed. We still need to experiment with the classification part and also with the features that we extract from the images to obtain an optimal feature set for classification. The current framework is for now a research tool, designed to ease experimentation with features, classifiers, parameters, etc. The final system will probably discard a lot of these options,



be much simpler and focus more on the user interface and the results display to the user.
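The partitioning of segmented lung tissue into blocks, which the results above rely on, can be sketched as follows. This assumes non-overlapping square blocks and keeps only blocks that lie entirely inside the lung mask. Both choices are assumptions made for this example, and the function name is hypothetical.

```python
def lung_blocks(mask, block_size):
    """Return the top-left corners of all blocks lying fully inside the mask."""
    rows, cols = len(mask), len(mask[0])
    corners = []
    for r in range(0, rows - block_size + 1, block_size):
        for c in range(0, cols - block_size + 1, block_size):
            inside = all(
                mask[r + dr][c + dc]
                for dr in range(block_size)
                for dc in range(block_size)
            )
            if inside:
                corners.append((r, c))
    return corners


# Tiny binary mask (1 = lung tissue) standing in for a segmented slice.
mask = [
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
]
corners = lung_blocks(mask, block_size=2)
```

With a block size of 2, four blocks fit completely inside this mask. Blocks that touch background pixels are skipped, which also illustrates why larger blocks leave fewer usable regions.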

Conclusions

This article presented a framework to aid the diagnostic process for lung diseases using lung CT images. The domain has shown its potential in studies and our current validations lead to good results despite known problems of the dataset. The steps of the diagnostic process are performed automatically. Abnormalities are highlighted in the images by a change of color. Once a larger database is accessible, more quantitative evaluation is needed to assess the algorithm quality and show the usefulness of the application in a clinical environment. Many parameters need to be optimized, from the feature extraction phase to the training step and the classifiers employed. We also need to think about the optimal block size and whether we should use overlapping blocks to avoid misclassifying small parts of the texture and to reduce false positives.

Now it is particularly important to create a large reference database with a high-quality annotation and evaluate the many visual descriptors and techniques available to create a robust framework for routine use. Several questions still need to be solved, for example the handling of other available data on the patients. The age can play an important role for the texture of the lung tissue. For the classification, we need to integrate all these data into the framework. The possibility to compare images with annotated cases from the reference dataset is expected to further increase acceptance of the technology because the system does not make a decision itself but rather points out interesting areas of the lung tissue and gives evidence on these areas by supplying similar past cases.

Acknowledgements

This work has been supported by the Swiss National Science Foundation (FNS) through grants 632-066041 and 205321-109304/1.

References

1 Smeulders AWM, Worring M, Santini S, Gupta A, Jain R. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence 2000;22:1349–80.
2 Müller H, Michoux N, Bandon D, Geissbuhler A. A review of content-based image retrieval systems in medicine – clinical benefits and future directions. Int J Med Inform 2004;73:1–23.
3 Lehmann TM, Güld MO, Thies C, Fischer B, Spitzer K, Keysers D, et al. Content-based image retrieval in medical applications. Methods of Information in Medicine 2004;43:354–61.
4 Korn P, Sidiropoulos N, Faloutsos C, Siegel E, Protopapas Z. Fast and effective retrieval of medical tumor shapes. IEEE Transactions on Knowledge and Data Engineering 1998;10:889–904.
5 Tang LHY, Hanka R, Ip HHS. A review of intelligent content-based indexing and browsing of medical images. Health Informatics Journal 1998;5:40–9.
6 Shyu CR, Brodley CE, Kak AC, Kosaka A, Aisen AM, Broderick LS. ASSERT: A physician-in-the-loop content-based retrieval system for HRCT image databases. Computer Vision and Image Understanding 1999;75:111–32.
7 Aisen AM, Broderick LS, Winer-Muram H, Brodley CE, Kak AC, Pavlopoulou C, et al. Automated storage and retrieval of thin-section CT images to assist diagnosis: System description and preliminary assessment. Radiology 2003;228:265–70.
8 Jain AK, Duin RPW, Mao J. Statistical pattern recognition: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence 2000;22:4–37.
9 Chang CC, Lin CJ. Libsvm, a library for support vector machines. Technical report, available with software at http://www.csie.ntu.edu.tw/~cjlin/libsvm/
10 Hu S, Hoffman EA, Reinhardt JM. Automatic lung segmentation for accurate quantification of volumetric X-ray CT images. IEEE Transactions on Medical Imaging 2001;20:490–8.



Simplified representation of concepts and relations on screen

Hans Rudolf Straub (a), Norbert Frei (b), Hugo Mosimann (a), Csaba Perger (a), Annette Ulrich (a)

(a) Semfinder AG, Kreuzlingen, Switzerland
(b) University of Applied Sciences St. Gallen (FHS), St. Gallen, Switzerland

Summary

The fully automated generation of diagnostic codes requires a knowledge-based system which is capable of interpreting noun phrases. The sense content of the words must be analysed and represented for this purpose. The codes are then generated based on this representation. In comparison with other knowledge-based systems, a system of this kind places the emphasis on the data structures and not on the calculus; coding itself is a simple matter compared to the much more difficult task of incorporating the complex information contained in the words of natural language into a systematic data model. Initial attempts were based on the assumption that each word was linked to one conceptual meaning; such a naive viewpoint certainly no longer applies today. The notation of concepts and their relations is the task at hand. Existing notation methods include predicate logic, conceptual graphs (CGs) as proposed by J. F. Sowa [2], GRAIL as used by the GALEN Project [1] and methods developed by the World Wide Web Consortium, e.g. RDF (Resource Description Framework). For the purpose of coding, we developed a notation system using “concept particles” back in 1989 [3]. In 1996, the resulting experience led us to represent “concept molecules” (CM), with which both complex data structures and multi-branched rules can be denoted in a simple manner [4]. In this paper we explain the principles behind this notation and compare it with another modern concept representation system, conceptual graphs.

Correspondence:
Hans Rudolf Straub
Semfinder AG
Hauptstrasse 23
CH-8280 Kreuzlingen
straub@semfinder.com
http://www.semfinder.com/methode

The dual demands made on concept notation

For concept notation in our text interpreter, we drew up the following requirements:
1. The notation system must be able to reproduce all the necessary structures.
2. The number of formal elements should be kept as low as possible.
3. The notation system must be unambiguous (one representation for one meaning).
4. Concept representation must be expandable (no “closed world”).
5. It must be easy and intuitive to read on the screen.
6. It should be as compact as possible, i.e. show as much content per screen as possible.

These requirements are the result of the dual demands imposed on concept notation, i.e. that it should be easy to read, both for the machine and for humans. For machines, the notation system must be mathematically clear. Humans also impose the additional demand that the rules should be quick and easy to read; this is even more important with larger and more complex rule bases for expert systems. A system which is mathematically perfect, but which does not satisfy conditions 4 to 6, will inevitably fail, as it can no longer be maintained once it exceeds a certain size. This is why the last three points are so important. The semantic interpreter which we have developed for coding purposes uses a notation system which fulfils the specified conditions. This was possible thanks to the introduction of concept molecules which use the two-dimensional nature of the screen to depict complex concept structures.

The role of relations: atomic concepts and concept molecules (CM)

Concepts are the meanings which we combine with words. However, it is not just the concepts themselves which play a role: how the concepts are linked together is the crucial element in formulating knowledge. The relations – i.e. the links between the concepts – contain the actual knowledge and must be capable of being represented explicitly in each concept representation. In our representation, we see concepts as indivisible units, i.e. as atoms. Everything that is said about a concept is said in the form of relations to other concepts. If the knowledge is extended, the



concept does not change, merely the bundle of relations connected with the concept.

When concepts are linked together, they form clusters. These clusters are represented on the screen in a strictly regulated fashion. Just as atoms in chemistry have binding sites via which they can bind with other atoms, our atomic concepts also have precisely defined binding sites via which they can form bindings with other precisely defined atoms. We call the resulting concept clusters concept molecules (CM).

Atomic concepts: both type and value (bifaciality)

In order to deal systematically with concepts, they are classified. This leads to two kinds of expressions, namely types (classes) and values. These two kinds of expressions are clearly differentiated in databases. In concept molecules (CM’s), however, an atomic concept can be type and value at the same time (see figure 1).

Disease   Carcinoma   Bronchogenic Carcinoma

Figure 1. Bifaciality: an atomic concept can be type as well as value.

In Section 1 we impose the requirement that the semantic net (overall concept representation) should represent an “open world” and must therefore be expandable at any point. The bifaciality of atomic concepts fits in well with this, in that the concept chains can be opened at any point and intermediate concepts can be inserted (fig. 2).

Disease   Neoplasia   Carcinoma   Bronchogenic Carcinoma

Figure 2. The CM from figure 1 – extended by adding an intermediate concept.
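The step from figure 1 to figure 2 can be illustrated with a very small sketch. It models a concept chain as a plain Python list read from left to right, which is an assumption made for this example and not the data model of the actual system. The function name is hypothetical.

```python
def insert_intermediate(chain, new_concept, before):
    """Return a new chain with new_concept placed directly left of `before`."""
    position = chain.index(before)
    return chain[:position] + [new_concept] + chain[position:]


figure1 = ["Disease", "Carcinoma", "Bronchogenic Carcinoma"]
figure2 = insert_intermediate(figure1, "Neoplasia", before="Carcinoma")

# Bifaciality: each inner concept is the value of its left neighbour
# and at the same time the type of its right neighbour.
type_value_pairs = list(zip(figure2, figure2[1:]))
```

The original chain is left unchanged, and "Carcinoma" appears in the pairs both as a value (of "Neoplasia") and as a type (of "Bronchogenic Carcinoma").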

Implicit representation of relators

In a CM, type and value are connected by an “is a” relation that is always represented implicitly. Whenever two concepts are side by side on the same line, there is an “is a” relation between them. The left-hand concept is the superordinate concept and the right-hand concept is the subordinate concept. This hierarchical (“is a”) relationship can be extended over any number of stages. The resulting representation is easy to read and saves space. We aim to show below that implicit representation is also possible for additional relators.

The two basic relations: hierarchy and attribution

In CM’s the links can be traced back to two basic relations:
1. The hierarchical relation (= “is-a”) is represented horizontally.
2. The attributive relation (= “has-a”) is represented vertically.

Both relationships are asymmetrical, i.e. the two linked concepts cannot be exchanged for one another. The relation includes a direction which cannot be reversed. This direction is defined on the screen by the left-right axis (table 1).

Table 1. The two basic relations are asymmetrical.

              Left                     Right
Hierarchy     Superordinate concept    Subordinate concept
Attribution   Attributed concept       Attribute

With the aid of attributes, branched CM’s can be drawn (fig. 3).

Diagnosis   Haematoma
  Localisation   Pretibial

Figure 3. A branched CM with a “has-a” relation.

In figure 3, the concept “diagnosis” has one attribute, i.e. “localisation”. It is linked to the attribute via an attributive relation which is shown by the little hook beneath “diagnosis”.

A conventional concept representation might show the situation as follows (fig. 4).

[Figure 4: the concepts Diagnosis, Haematoma, Localisation and Pretibial joined by explicitly labelled “is-a” and “has-a” relators]

Figure 4. The content of figure 3 in conventional notation (with explicit relators).

Figure 4 not only contains more elements than figure 3, it also lacks a standardised spatial configuration. Several spatial arrangements of the concepts are possible.
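A minimal sketch of how position alone can carry the two basic relations is given below. It assumes a simplified model in which attributes hang from a whole line rather than from an individual binding site. The class and function names are hypothetical and not taken from the system described here.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    """One CM line: an is-a chain (left to right) plus has-a lines below it."""
    chain: list
    attributes: list = field(default_factory=list)


def render(line, depth=0):
    """Return the text rows of a CM; indentation stands in for the hook."""
    rows = ["  " * depth + "  ".join(line.chain)]
    for attribute in line.attributes:
        rows.extend(render(attribute, depth + 1))
    return rows


figure3 = Line(["Diagnosis", "Haematoma"], [Line(["Localisation", "Pretibial"])])
rows = render(figure3)
```

No relator names are stored anywhere: "is-a" is implied by neighbours on one line and "has-a" by a line placed below another.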



Benefits of the strict spatial configuration within the CM

The standardised spatial configuration in CM’s has more to recommend it to knowledge base engineers than the convenience of customary practice. The strict systematic nature of CM’s improves readability and processing quality:
1. Each line only contains concepts from one hierarchy.
2. As soon as the line is changed, the semantic dimension (i.e. semantic type) is also changed.
3. The number of lines thus also always displays the number of semantic dimensions (types).
4. The most general concept of the corresponding semantic dimension is always at the left of each line.
5. The root of the overall CM, i.e. the concept to which all the other concepts in the molecule are subordinate, be they hierarchical or attributive, is always at the top left-hand side.
6. Since a molecule, including all its branches, always has a tree structure (see figure 8), it can be simply and systematically processed by a computer program.
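As an illustration of point 6, the sketch below walks a small molecule recursively and reads points 3 to 5 directly off the traversal. The nested-tuple model of a CM line is an assumption made for this example only.

```python
def walk(line, depth=0):
    """Yield (depth, chain) for a CM line and every line attached below it."""
    chain, attributes = line
    yield depth, chain
    for attribute in attributes:
        yield from walk(attribute, depth + 1)


# A CM line is modelled here as (is-a chain, [attached has-a lines]).
molecule = (["Diagnosis", "Haematoma"], [(["Localisation", "Pretibial"], [])])

lines = list(walk(molecule))
dimensions = len(lines)                          # point 3: one dimension per line
most_general = [chain[0] for _, chain in lines]  # point 4: leftmost concept
root = lines[0][1][0]                            # point 5: top left-hand concept
```

Because the structure is a tree, a single recursive function visits every line exactly once, without any cycle detection.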

Point 6 shows how a representation which facilitates readability for humans can also improve readability for machines.

Converting the named relators into unnamed relators

The two basic relators for CM’s, the hierarchical relator and the attributive relator, can be recognised by their position. Other relators are converted into a combination of attributive and hierarchical relations (fig. 5). The named relator is converted into a combination of 1 concept and 2 basic relators (fig. 6). This information is then written as follows using a CM (fig. 7).

[Figure 5: “Fracture” and “Open” joined by the named relator “is condition of the skin barrier in”]
Figure 5. Conceptual graph with a named relator.

[Figure 6: “Fracture” has-a “Condition of the skin barrier” is-a “Open”]
Figure 6. Information from figure 5 using solely the two basic relators.

[Figure 7: CM containing only “Fracture” and “Open”]
Figure 7. Information from figure 5, represented in a CM.

Figure 7 is shorter and easier to read than figure 5 or figure 6. The concept “condition of the skin barrier” is omitted, however. This is admissible because, in CM’s, the binding sites, even if they are not named, are clearly defined in the semantic net. In our current knowledge base, the concept “fracture” has ten different attributive binding sites, for example. The two concepts “open” and “closed” are linked exclusively to one of these binding sites, whilst the two concepts “intraarticular” and “extraarticular” are linked to another. Although the binding sites are unnamed, the knowledge engineer can immediately see the content-related meaning of the binding site from the linked concepts (see figure 8). This has the following benefits:
1. The engineer does not need to worry about relator names, which are often lengthy and arbitrary.
2. Representation with CM’s (figure 7) takes up less space on the screen.
3. It is thus quicker and easier to read.
4. More information can be taken in at a glance.
5. Despite this, the information is always clear.

6. The conclusions which the computer draws from the CM’s are always unambiguous and relate only to a specific binding site.

An additional benefit is associated with fundamental semantic considerations. The two basic relators – the hierarchical and the attributive – do in fact correspond to the two fundamental relationships which two values within the semantic framework can have (see Section 3.3 in [4]). In addition, concepts in CM’s and OOP object types display surprising affinities (see Section 8



in [4]). Furthermore, practical research shows that CM’s without named relators work, provide accurate results and make it possible to compile and maintain large and complex knowledge bases.

Multi-branched CM’s

Figure 8 shows the interaction between hierarchical and attributive relations in a concrete diagnosis. The example contains 9 lines and thus has 9 hierarchies (or dimensions or axes). The diagram could have been obtained from the following noun phrase: “Suspected simple, extraarticular and not dislocated left distal radial fracture”. The phrase can also be formulated in quite a different manner. In addition to the clarity of its form, the representation shown in figure 8 also has the following benefits over the noun phrase:
1. The implicit meanings are reconstructed (“forearm”, “bone”). Implicit meanings can be crucial for coding or querying in a data warehouse.
2. The links are much clearer. Thus, “distal” belongs to the “bone radius” group and not to the fracture group. The inference machine for coding purposes is based on this clear, multi-dimensional and multifocal [5] structuring of the underlying data structure and could not function without it.
3. This structuring is a considerable help to the knowledge engineer in reviewing the knowledge base.

[Figure 8: branched CM with the concepts Diagnosis, Trauma, Fracture, Extraarticular, Closed, Not Dislocated, Bone, Radius, Distal, Localisation, Forearm, Left and Suspicion]
Figure 8. An average branched CM.

Representation of processing rules

Figure 8 shows concepts in a specific configuration and represents an information status at a given moment in time. Such a status is amended by a processing rule. The rule causes a specific previous status (an “if”) to be converted to a subsequent status (a “then”). Rules are written as CM’s with operators which are allocated to the individual atoms in the CM’s. In figure 9 the “then-add” operator (underlined in green on the screen) causes the concept “malignant” to be added to the two other atoms.

[Figure 9: CM with the atoms Neoplasia, Carcinoma and malignant]
Figure 9. A simple rule.
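The effect of the rule in figure 9 can be imitated with a deliberately simplified sketch. It treats an information status as a flat list of atoms, which ignores the line structure of a real CM. The function and parameter names are hypothetical.

```python
def apply_then_add(status, if_atoms, add_atom):
    """Return a new status; add_atom is appended only if all if_atoms are present."""
    if all(atom in status for atom in if_atoms) and add_atom not in status:
        return status + [add_atom]
    return list(status)


# The "if" part matches, so the "then-add" atom is added.
after = apply_then_add(
    ["Neoplasia", "Carcinoma"],
    if_atoms=["Neoplasia", "Carcinoma"],
    add_atom="malignant",
)

# The "if" part does not match, so the status stays as it was.
unchanged = apply_then_add(["Haematoma"], ["Neoplasia", "Carcinoma"], "malignant")
```

Returning a new list rather than modifying the input keeps the previous status available, mirroring the idea of a rule converting one status into a subsequent one.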

Results

On the basis of the CM’s described above we have created a rule editor, an inference machine and an extensive knowledge base for coding freely formulated diagnosis texts. This system (Semfinder®) permits fully automated coding (one-step coding) and was in everyday use in over 100 hospitals in Germany by the end of 2004.

References

1 Rector A, et al. A Terminology Server for Medical Language and Medical Information Systems. Methods of Information in Medicine 1995;34:147–57.
2 Sowa JF. Knowledge Representation: Logical, Philosophical and Computational Foundations. Pacific Grove: Brooks/Cole; 2000.
3 Straub HR. Wissensbasierte Interpretation, Kontrolle und Auswertung elektronischer Patientendossiers. In: Kongressband der IX. Jahrestagung der SGMI. Schweizerische Gesellschaft für Medizininformatik, Nottwil, SGMI, 1994, pp. 81–87.
4 Straub HR. Das interpretierende System – Wortverständnis und Begriffsrepräsentation in Mensch und Maschine, mit einem Beispiel zur Diagnose-Codierung. Wolfertswil: Z/I/M-Verlag; 2001.
5 Straub HR. Four Different Types of Classification Models. In: Grütter R, ed. Knowledge Media in Health Care: Opportunities and Challenges. Hershey / London: Idea Group Publishing, 2002; pp. 58–82.



Next issue: April 2006

The next issue of Swiss Medical Informatics will be published in April 2006 and will cover the following topic: Patient safety

SGMI News

The committee is working on improving communication to members, based on a recent survey. The results of the survey will be presented at the assembly in December. Due to the cessation of secretariat activity previously handled by Mrs Schär (VSAO), a new secretariat will be organized beginning in 2006. The committee would like to warmly thank Mrs Evelyne Schär for her great work and support during all these years.

SGMI-SSIM hotspots:
– Scientific Meeting in St. Gallen on May 2/3, 2006: Interoperability, transborder care. Call for papers deadline: January 31st, 2006
– Scientific day of the society on Patient Safety in Fall 2006

Events

Switzerland

SGMI-SSIM 2006 – 2006 Annual Scientific Meeting of the Swiss Medical Informatics Association
May 2–3, 2006, St. Gallen, Switzerland
http://www.sgmi-ssim.ch

Europe

EFMI Special Topic Conference “Integrating Biomedical Information: From e-Cell to e-Patient”
April 4–8, 2006 in Timisoara, Romania
http://medinfo.umft.ro/stc2006/

2nd International Symposium on Semantic Mining in Biomedicine
April 9–12, 2006 in Jena, Germany
http://supreme.coling.uni-jena.de/content/blogcategory/32/103/

Training Course in Biomedical Ontology
May 21–24, 2006 in Schloss Dagstuhl, Wadern, Germany
http://ontology.buffalo.edu/06/os2/index.html

Tromsø Telemedicine and eHealth Conference 2006
June 12–14, 2006 in Tromsø, Norway
http://www.telemed.no/ttec2006

MIE 2006 – The 20th International Congress of the European Federation for Medical Informatics
August 27–30, 2006 in Maastricht, Netherlands
http://mie2006.org/

World

Medicine Meets Virtual Reality 14
January 24–27, 2006 in Long Beach, California, USA
http://www.nextmed.com/mmvr_virtual_reality.html

