Knowledge Online: Current Implications and Future Trends


KNOWLEDGE ONLINE: CURRENT IMPLICATIONS AND FUTURE TRENDS

Presented at the First East-West Online Information Meeting, Moscow, USSR, October 11-13, 1989

by Roger K. Summit, Ph.D., DIALOG Information Services, Inc.


OUTLOOK FOR ELECTRONIC INFORMATION SERVICES

Roger K. Summit, Ph.D.
President
DIALOG Information Services, Inc.
Palo Alto, California

Abstract

This paper briefly reviews the history of online information retrieval, with a focus on the manner in which technological advances have stimulated the development of the online industry. Current technology is examined, as are applications and implications of the use of that technology in industry, government, and academic sectors worldwide. Emerging trends are reviewed, prospects for future development are analyzed, and the impact of these developments on the online industry is considered.

Perspective

Any valid look to the future must be based on an understanding of the past. The history of online retrieval of information is a small portion of a long continuum, beginning around 3,000 BC when libraries first began to appear. One of the earliest European libraries was established at the Sorbonne in France in 1321. Although the library contained only 500 books, there were complaints that users could not locate what they needed. As a result, a librarian known only as John organized the books within 52 subject categories and chained them to the benches according to category. Even in that day books had a way of disappearing.

The introduction of the movable type printing press in the 15th century revolutionized production of information in the Western world and made possible the information explosion we experience today. It also made the work of librarians such as John infinitely more complex.

Current online databases were developed from bibliographic files that had their origins in the 17th century, when scientific journals began. Modern-day subject-oriented library catalogs and related indexes followed shortly thereafter. As the number of publications multiplied, however, subject categories became overly broad, and the professional was often forced into narrower and narrower fields of specialization in order to maintain cognizance over the growing body of knowledge and literature.

During the past 25 years these earlier efforts at organizing and providing access to information have become obsolete. Unprecedented advances in computer technology have allowed information access to be changed from a manual, unsophisticated process to an efficient, highly versatile procedure which goes well beyond the simple look-up capabilities of earlier processes.

Online information retrieval technology is, however, far more than a faster way to do manual retrieval. Because queries can be formulated with combinations of natural language words and phrases, the user can construct a description of needed information irrespective of how the information may have been classified. Furthermore, through iteration and query reformulation, the online search allows the user to converge on the desired result.

The ability to use information retrieval systems in today's society is tantamount to literacy; for, if you cannot locate what you need, you cannot read it and, from your perspective, it might as well not exist. John, the librarian mentioned above, recognized the problem in the 14th century when he stated, "Unseen treasure and hidden wisdom, of what use is either?" Modern-day, computer-based information retrieval systems were developed in large degree to enhance the usefulness of recorded knowledge by providing a tool which allows the user to become aware of the existence of needed information.
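The idea of combining words freely and then reformulating the query until the result converges can be illustrated with a small sketch. It is not a description of how DIALOG itself is implemented; it is a minimal, generic inverted-index search written in Python, and the four sample records, the function names, and the query terms are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical mini-collection; these records are invented for illustration.
DOCS = {
    1: "thermal protection for spacecraft reentry",
    2: "spacecraft guidance computer design",
    3: "thermal stress in turbine blades",
    4: "reentry heating of ablative thermal shields",
}


def build_index(docs):
    """Map each word to the set of record ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_any(index, *terms):
    """Boolean OR: records containing at least one of the terms."""
    result = set()
    for term in terms:
        result |= index.get(term.lower(), set())
    return result


def search_all(index, *terms):
    """Boolean AND: records containing every one of the terms."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


INDEX = build_index(DOCS)

# First attempt: too broad, every record comes back.
broad = search_any(INDEX, "thermal", "spacecraft")    # {1, 2, 3, 4}

# Second attempt: too narrow, only one record survives.
narrow = search_all(INDEX, "thermal", "spacecraft")   # {1}

# Reformulated query: the searcher converges on the reentry-heating records.
refined = search_all(INDEX, "thermal", "reentry")     # {1, 4}
```

Because the index is consulted directly, each reformulation returns immediately, which is what makes the iterate-and-converge style of searching practical; the words chosen do not have to match any predefined subject category.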
The Birth of Online

In the early 1960's, there were several efforts in information retrieval being undertaken in the U.S., including ventures at System Development Corporation, Massachusetts Institute of Technology, Lockheed, NASA, and the National Library of Medicine. Rather than describe each of these activities, I will trace the development of online using DIALOG and NASA/RECON as examples.

A Space-Age Beginning

Information searching as we know it today is to some degree a by-product of America's space program. In the early 1960's, the Soviet Union and the United States were making headlines with their race to conquer space. Concurrently, a new race was occurring behind the scenes: a race to assimilate and manage the massive amounts of data being created by space exploration. Research was



being conducted at a rate unprecedented in history, producing a veritable explosion of paper. In an attempt to order the chaos, NASA developed a database of 200,000 citations describing aerospace research reports. It also developed a magnetic tape-based retrieval system which used an IBM 1401 computer to sequentially search this database. Because it was a batch search system, search time was a direct function of file size, and it was not possible to modify one's search request once the process began. Processing searches through that NASA system took approximately 22 hours. Much of the difficulty stemmed from the fact that the then-available second-generation computer equipment allowed only for batch processing. The resulting system was cumbersome at best, and it was plagued by so many problems that it was generally easier, cheaper, and faster to redo scientific research than it was to determine if that research had been done before.

Searching with second-generation computers was not only time-consuming; it produced uncertain results. If a search was formulated a bit too broadly, the requestor could walk away with an armful of paper; a bit too narrowly and there would be zero hits. There was not much distance between the two. The only remedy was to formulate a new search and wait, which could easily take a week or more.

DIALOG Development

A small group of us from the information processing staff at Lockheed set out to solve some of the problems inherent in batch processing technology. Fortunately, the mid-1960's saw the development of third-generation computer technology typified by the IBM 360, with mass random access storage, CRT terminal inquiry, telecommunications, and interactive, multi-user programming. Compared with second-generation machines, third-generation computer technology was a major leap forward, and it provided the hardware necessary for the development of online information retrieval as we know it today.

In 1964, we persuaded Lockheed to form the Information Sciences Laboratory and to purchase an IBM 360/30 computer. The computer included a datacell capable of storing 400 megabytes, two small disk drives, a telecommunications interface, and 30 kilobytes of central storage. This was the hardware on which the DIALOG system was developed.



We submitted an unsolicited proposal to NASA for a demonstration contract using DIALOG on the then-existing NASA database. The system we implemented could do in four or five minutes what it took the NASA second-generation system 22 hours to accomplish. As a result of this demonstration, NASA issued a major competitive request for proposal (RFP) in 1968 for the development of the so-called NASA/RECON retrieval system, for which we proposed an enhanced version of DIALOG. We prevailed over the competition and were awarded our first major development contract.

In 1971 we developed a plan to offer a commercial information retrieval service based on DIALOG. By early 1972, the DIALOG Information Retrieval Service was underway, providing interactive online access to the ERIC and National Technical Information Service databases from the subscriber's office. Service was first offered in Europe in 1974.

Today DIALOG is not only the oldest but also the largest and most complete online information service in the world, with more than 330 databases containing over 190 million references and documents, with particular emphasis on business, science, and technology. Over 100,000 customers in 97 countries throughout the world connect to DIALOG more than 6 million times each year to review:

- Tens of thousands of scientific and technical journals
- Up-to-date information on 1 million international and 10 million domestic companies
- 14 million international patents
- 8.75 million chemical substances
- Real-time news from Reuters, McGraw-Hill, Knight-Ridder, and PR Newswire.

Add to that the complete texts of over 800 journals and newsletters, books, reports, newspapers, conference proceedings, and more. Only then does the vast scope of information available online through DIALOG begin to become apparent. Furthermore, the amount of information offered by DIALOG doubles every 3.5 years.
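Two of the figures quoted above lend themselves to a quick arithmetic check: the improvement from a 22-hour batch run to a four- or five-minute interactive search, and the annual growth rate implied by a 3.5-year doubling time. The short Python sketch below works both out; the function and variable names are illustrative only, and the growth calculation assumes steady compound growth, which the paper does not state explicitly.

```python
# Figures quoted in the text.
BATCH_HOURS = 22                 # NASA second-generation batch search
INTERACTIVE_MINUTES = (4, 5)     # DIALOG demonstration, "four or five minutes"
DOUBLING_YEARS = 3.5             # time for DIALOG's holdings to double


def speedup(batch_hours, interactive_minutes):
    """How many times faster the interactive search is than the batch run."""
    return batch_hours * 60 / interactive_minutes


def annual_growth_rate(doubling_years):
    """Compound annual growth rate implied by a doubling time (assumes steady growth)."""
    return 2 ** (1 / doubling_years) - 1


speedups = [speedup(BATCH_HOURS, m) for m in INTERACTIVE_MINUTES]
# speedups == [330.0, 264.0]: roughly a 260- to 330-fold improvement

rate = annual_growth_rate(DOUBLING_YEARS)
# rate is about 0.219, i.e. close to 22% growth per year
```

Under these assumptions the interactive system was on the order of 300 times faster than the batch process, and a 3.5-year doubling time corresponds to growth of roughly 22% a year.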
Virtually every major company doing research and development, or involved in mergers, acquisitions or international trade, nearly every major academic institution, and most of the governments in the world use DIALOG. Why is this important? It is important because of the substantial impact it has had on society.



Impact of Online

The impact of online can be viewed from four perspectives: the industry, the database suppliers, the profession of librarianship, and the users.

Industry

During the 24 years since the beginning of DIALOG, many other services and hundreds of databases have been developed, most in the last several years. Carlos Cuadra's report, Directory of Online Databases, listed 555 organizations in early 1989 providing retrieval services on 2,699 databases. A majority of these services serve only a limited number of customers with a small number of specialized databases. Martha Williams, in her publication Information Market Indicators, which tracks online usage by U.S. libraries, estimates that there are 537 active databases with total U.S. usage of 3.1 million hours. I estimate that there are some 80 active online service organizations worldwide, and that the average annual growth in numbers of services, numbers of databases, and online search hours approximates 18-20%.

Based on the above figures I estimate that 30 million searches per year are conducted worldwide with some 300 million items of output. If even one in ten of these items contributes to the solution of a problem, the impact on society is profound. It is hard to imagine doing business without the telephone or the Xerox machine; it is similarly becoming hard to imagine doing business or R&D without information retrieval.

Impact On Database Suppliers

The success of online retrieval has stimulated the development of additional databases. DIALOG, for example, adds 20-30 databases per year to its service and pays several million dollars per year to its U.S. and international database suppliers in royalties. Online use has grown to a point that several database producers report that they are deriving more revenue from online than from their printed publications. It must be recognized, however, that the language of the databases is English and the medium of exchange is dollars (or other hard currency).



There can even be international balance of payments implications, as the following example illustrates. DIALOG offers several databases originating in Great Britain which it markets on a worldwide basis and for which it pays royalties back to the producers of those databases. In actuality DIALOG pays more in royalties to Great Britain than it receives from its British customers who search DIALOG because of the global popularity of these British-created databases. Thus there is a considerable net positive flow of dollars to Britain even though it is one of the heaviest users of DIALOG. The same thing can occur with the Soviet Union when its prestigious databases become more widely available.

Impact On Libraries and Information Centers

Library professionals have witnessed an enormous change in their work with the advent of online retrieval. Whereas before, library work was largely custodial with some research, professional librarianship today consists largely of research and relatively less of custodial work. Librarians and information specialists in business have become members of product planning and marketing teams, working with other members of management. Surveys report that librarians today feel more important and make greater contributions to their companies through the use of online retrieval services. Furthermore, the greater use of information stimulates the need for more information, and the demand for library services continues to grow.

Impact On Users

Let us examine a few very specific, actual examples of ways in which online information retrieval results have been used, to give a flavor of the types of problems being solved. Searchers use online information for everything from improving their own company's bottom line to saving humanity.

Tom Brazier, a partner in a large San Jose, California law firm, had just finished an online search for one of his clients, when he decided, out of curiosity, to learn a bit more about a new client, an insurance company, that had just retained his firm for several hundred thousand dollars of legal work. To his chagrin, he found his new client had recently been acquired by another company, and was near bankruptcy! His DIALOG search saved his firm from incurring large uncollectible fees.



The library at the University of Puerto Rico used DIALOG to help the children. Their research into child care centers in colleges and universities helped their administration to plan, organize, and establish a child care facility at the university.

Online services have even been used to save lives. Glenn Tisman, a southern California doctor, received a call from his hospital late one night; a post-operative patient in the Intensive Care Unit had developed a bleeding problem that was not responding to conventional treatment. A search on the MEDLINE® database on DIALOG produced an abstract of a German article describing treatment of an identical problem with a blood replacement product not previously tried. He relayed this information to the hospital, and by his morning rounds was delighted to find out that his patient had stopped bleeding.

A MEDLINE search saved the life of a patient in Williamsport, Pennsylvania, as well. A woman had gone into anaphylactic shock. Her heart was not functioning and she did not respond to treatment. MEDLINE revealed a similar case in England a few months earlier; the same treatment was tried, and the woman came through. Thus two people who may not even be aware of online searching owe their lives to the technology.

DIALOG, with its directories containing data on literally millions of companies, is also critical in finding joint venture partners.

An Eye On the Future: Emerging Technologies

Online developments are made possible in large measure by technological advances in the computer hardware on which the industry is critically dependent. Progress in this area over the past 25 years has been so phenomenal that for people using online retrieval services, reality has consistently outpaced expectations. I would like to review a number of trends that are currently influencing developmental opportunities for information systems. These trends most likely portend future developments in online retrieval.

Personal Computers

A most interesting current trend in personal computers is that they are being used as integrated workstations. It takes only a little imagination to visualize the personal computer workstation that will incorporate voice-mail, facsimile, connection to local area networks as well as remote facilities, with optical discs for



mass storage of frequently used, relatively static information, and automatic connection to remote information services for material which is not locally available.

Electronic Mail

One of the most rapidly expanding PC applications is electronic mail. International Data Corporation of Framingham, MA, reports that the number of electronic mailboxes is growing at the rate of 68% a year, and Walter Ulrich of Walter E. Ulrich Consulting believes that "the electronic mail market is doubling every year and will become a nine-billion message market by the end of the decade." As a consequence, we at DIALOG have developed electronic mail as a vehicle for delivering the results of searches and current awareness services, and to allow DIALOG customers to communicate with each other.

Optical Storage

Although CDROM has captured a great deal of attention lately, it is not the only optical medium. The storage of the full text of source documents on centrally-located optical disks for facsimile transmission to personal computers or FAX machines shows great promise in the near future for full text, source document delivery.

Software Developments

There is currently a flurry of effort developing within commercial software houses in an area known as similarity searching and relevance feedback. Although the concepts are not new, they have until recently been pursued largely within the academic domain by people such as Gerald Salton and Karen Sparck Jones. A company called Thinking Machines Corporation has developed a configuration of parallel processors (called the Connection Machine) which allows very rapid sequential searching of documents. Dow Jones recently paid $5 million for two such machines. Another company called PLS has produced a software program that assigns relevance weightings to records uncovered during an online search. Other companies working in this area include Third Eye, with its Elixir system, and Computer Power Ltd., with its Status/IQ system.
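To make the terms "similarity searching" and "relevance feedback" concrete, the following Python sketch shows one textbook form of each: records and queries are represented as weighted term vectors and ranked by cosine similarity, and the query is then adjusted toward a record the searcher has judged relevant (Rocchio-style feedback, a technique associated with the academic work mentioned above). It is an illustration under stated assumptions, not a description of how the commercial products named here work; the sample records, weights, and function names are all invented.

```python
import math
from collections import Counter

# Hypothetical records, invented for illustration.
DOCS = {
    "d1": "online retrieval of aerospace research reports",
    "d2": "parallel processors for rapid document searching",
    "d3": "relevance weighting of retrieved records",
    "d4": "aerospace research on reentry heating",
}


def tokenize(text):
    return text.lower().split()


def idf_table(docs):
    """Inverse document frequency: rarer terms receive higher weight."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def vectorize(text, idf):
    """Term-frequency times IDF vector, stored as a dict of term -> weight."""
    tf = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, doc_vecs):
    """Record ids ordered from most to least similar to the query."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)


def rocchio(query_vec, relevant_vecs, alpha=1.0, beta=0.75):
    """Shift the query toward the centroid of records judged relevant."""
    new_query = {t: alpha * w for t, w in query_vec.items()}
    for vec in relevant_vecs:
        for t, w in vec.items():
            new_query[t] = new_query.get(t, 0.0) + beta * w / len(relevant_vecs)
    return new_query


IDF = idf_table(DOCS)
DOC_VECS = {doc_id: vectorize(text, IDF) for doc_id, text in DOCS.items()}

query = vectorize("aerospace research", IDF)
first_pass = rank(query, DOC_VECS)        # d4 ranks first, d1 second

# The searcher marks d1 as the relevant record; the query is reweighted.
refined = rocchio(query, [DOC_VECS["d1"]])
second_pass = rank(refined, DOC_VECS)     # d1 now ranks first
```

The values of `alpha` and `beta` are conventional illustrative choices rather than figures from the paper. The point of the sketch is that the searcher never rewrites the query by hand: a relevance judgment on one record is enough to reorder the rest.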
Integrated Services Digital Network (ISDN)

According to AT&T, the next three to five years will herald a new communications protocol known as ISDN. Ralph Miller of the



Pacific Telesis Group notes that ISDN is poised to become to Plain Old Telephone Service (POTS) what the remote time-sharing system is to the abacus. Whereas today we generally communicate at various speeds from 300-2400 baud (bits per second) analog, ISDN defines a standard unit of 64,000 baud, error-free digital, about 50 times the speed most of us use. General acceptance of this new standard will have exciting implications for information services. High-resolution facsimile can be transmitted at a few seconds per page. Gone will be modems and noisy lines (computers like to talk digital). Gone will be multiple telephone lines into an office for different services (such as voice, facsimile, data, etc.). One line will carry all of these services simultaneously. It will not be necessary to connect to remote services through a telephone; your workstation/PC will seem to be continuously connected, just as teletypes and facsimile machines are today. In short, ISDN is an application designer's dream come true.

Simplified Search Techniques

Online users today find searching easier, faster, and more flexible than ever before. In the mid-1980's DIALOG rewrote its search language to provide improved internal processing, to incorporate new search features requested by customers over the years, and to implement additional enhancements recommended by DIALOG staff. The new software has enabled DIALOG to introduce such popular features as multifile searching. It also allows easy changes and upgrades to the DIALOG system, and these upgrades are being implemented on a continuing basis.

The development of menu-based searching is making online more easily usable by end-users. Early menus were tedious, cumbersome, and restrictive. New software developments, however, provide searchers more flexibility in search formulation.
DIALOG now offers menu options on several services: DIALOG Business Connection, for business professionals; DIALOG Medical Connection, for physicians and biomedical researchers; and KNOWLEDGE INDEX, an evening and weekend service for the home computer user, all of which are available worldwide. We continue to explore other such products as well. One very recently introduced is DIALOG Corporate Connection, which provides menu access to over 200 DIALOG databases, thus enabling corporate information users to perform their own quick information searches and freeing search specialists for in-depth research.



Full Text (Source Documents)

Whereas early information services stored and retrieved brief citations, the decreasing cost of storage has enabled abstracts, and, more recently, the full text of documents to be included for searching and delivery. This trend will continue. DIALOG currently maintains the complete text of over 800 journals, newspapers, and newsletters online, with a continuing high priority for adding more. It is safe to assume that within the next five to ten years nearly all interesting journals and newspapers should be fully available online.

Graphics

Graphics and charts, often an integral part of a journal article, are not normally available through online services. We are working to solve this challenge, and have already taken a large step in this direction. In January 1988, DIALOG and Thomson & Thomson, producer of the TRADEMARKSCAN® databases, announced the first availability of graphic images online. Graphic representations of U.S.-registered trademarks that have designs are now part of the TRADEMARKSCAN-Federal database. In January 1989, DIALOG and Chapman & Hall introduced chemical images to the HEILBRON database. In the near future, integrated text and graphics should be available for additional databases, as well.

CDROM

First introduced in 1985, CDROM technology is dramatically expanding the possibilities for libraries, businesses, and academic institutions worldwide. Organizations can access the enormous amount of information available via CDROM. A single disk contains the equivalent of 1,800 standard IBM floppy disks, 360,000 pages of information, 120 million words, or 240 pounds of paper. The lease of a CDROM product, even though often very costly, provides the opportunity for unlimited searching at one fixed price. One disadvantage of CDROM, however, is that one must anticipate one's information needs in advance and order the appropriate CDROMs.
Also, CDROMs are frequently out of date due to the production and distribution process, and thus are not really suitable for data requiring very frequent updating.
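The capacity equivalences quoted above for a single CDROM can be cross-checked with a little arithmetic. The Python sketch below assumes that a "standard IBM floppy disk" means the 360-kilobyte 5.25-inch diskette common on IBM PCs of the period; that figure is an assumption introduced here, not something the paper states, and the variable names are illustrative.

```python
# Equivalences quoted in the text for one CDROM.
FLOPPIES_PER_DISC = 1_800
PAGES_PER_DISC = 360_000
WORDS_PER_DISC = 120_000_000
POUNDS_OF_PAPER = 240

# Assumption (not from the paper): one floppy holds 360 kilobytes.
FLOPPY_KB = 360

disc_kb = FLOPPIES_PER_DISC * FLOPPY_KB          # 648,000 KB per disc
disc_mb = disc_kb / 1024                         # about 633 MB

words_per_page = WORDS_PER_DISC / PAGES_PER_DISC     # about 333 words per page
pages_per_pound = PAGES_PER_DISC / POUNDS_OF_PAPER   # 1,500 pages per pound
bytes_per_word = disc_kb * 1024 / WORDS_PER_DISC     # about 5.5 bytes per word
```

Under that assumption the figures hang together: roughly 630 megabytes per disc, a third of a thousand words per page, and between five and six bytes per word, which is about what plain English text requires once spaces are counted.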


DIALOG OnDisc™ CDROM products allow for either command-driven or menu searching, and offer automatic connection to DIALOG's online databases for additional or archival information. The DIALOG product is the first PC software which equals the full power of a major online retrieval system, and permits the transfer of search results to personal computer software for further manipulation and analysis.

DIALOG announced its first CDROM product in late 1986, and today offers nine database products, with others being developed under the direction of our recently created and rapidly expanding CDROM division.

Public Access Technologies and Gateways

The expansion of the Regional Bell Operating Companies (RBOCs) in the U.S. into the information delivery arena provides a higher profile for gateway services. Gateways and Public Access Systems such as EasyNet, which offers a common control language for all database systems accessed, are putting a simplified form of online searching within reach of the general public.

The Challenge We Face

Knowledge is power, it is said, but specific knowledge obtainable when needed is a lubricant of the management process. Knowledge contributes to the effectiveness of the professional in product planning, manufacturing, pricing, and distribution. In health care, information retrieval saves lives. In research and development, knowledge is the stepping stone that leads to new knowledge and avoids the inefficiencies of duplication and reinvention.

Given the enormous wealth of information available online, can a business reasonably elect not to perform online research? Overwhelmingly, the answer is no. Although most of the U.S. Fortune 1000 companies and many other companies throughout the world use online services, many small businesses and a large number of academics worldwide are not yet aware of online information research. These communities offer us a major challenge. We frequently hear explanations of this underutilization, such as:

- "It is too expensive"
- "It is too complex"
- "My present information resources are adequate"
- "Why should I pay for information I can get free from the library?"

In my opinion these are all fallacies, and making information retrieval cheaper and simpler would not by itself dramatically



popularize online searching. What is needed is education, and word-of-mouth testimony by those who use online services to those who could benefit from use of the services.

About a year ago I received a letter from Professor Peter d'Errico of the University of Massachusetts at Amherst, Department of Legal Studies. Although his comments refer specifically to the academic community, they apply equally, perhaps with even greater force, to industry. Professor d'Errico notes:

"Having used DIALOG in a teaching framework for about two years, I am struck by how few academics are aware of or understand online searching.... I have come to the conclusion that the deepest significance [of online searching] for scholarship has been missed. In particular, there is virtually no awareness that computer database searching is not simply a matter of doing on a machine what one could do in books....

"... I would emphasize that 'information use stimulates information need,' and I would suggest that online searching facilitates crossing of academic disciplinary boundaries and the transcending of division and categories in the arrangement of material within disciplines. Online searching enables a scholar to find information in and across fields, even without familiarity of traditional arrangements of knowledge."

Our task is to eliminate the misperceptions and educate business and academic professionals in the benefits to be derived from online research. Timely access to pertinent information plays a key role in generating profits for businesses of all sizes and varieties; teaching students where to find appropriate information is more important in many cases than teaching them the information itself. Our task is the embodiment of the old Japanese proverb: "Give a man a fish and he will eat for a day. Teach him how to fish and he will eat for a lifetime."

Roger K. Summit, Ph.D.
President and Chief Executive Officer
DIALOG Information Services, Inc.
3460 Hillview Avenue
Palo Alto, California 94304
Telephone: (415) 858-3777
FAX: (415) 858-3847



References

Cuadra, Carlos. Directory of Online Databases. Volume 9, Number 3. July 1988.

Williams, Martha E. Information Market Indicators. Information Center/Library Market. Issue 22. Fall 1988.

