On-Line Medical Records and Privacy

Computer Science and Telecommunications Board

NII 2000

White paper on

Effective Information Transfer for Health Care, Quality versus Quantity.

Gio Wiederhold, Stanford University, 30 April 1995, revised 9 May 95

in Lewis Branscomb et al.: The Unpredictable Certainty, Information Infrastructure through 2000; Volume 2: National Academy Press, 1997, pages 553-559.

Abstract and Problem Statement

In this note we address two related problems, created by the rapid growth of information technology. The problems are the loss of productivity due to overload on health care providers and the loss of privacy, both occurring when excessive data are transmitted and made available. These issues are jointly related to a tradeoff of quality versus quantity of medical information. If quality is lacking, then the introduction of modern communication technology will increase healthcare costs, rather than constrain them.

B. Background and Problem Description

The on-line medical record is rapidly becoming a reality [ref.1]. The technology is available and social barriers of acceptance are disappearing. After a long delay, access to on-line patient data during a treatment episode will become routinely accepted and expected by the patient as well as by the provider. The image of an expert in the popular view is now associated with a computer-screen in the foreground, and medical experts are increasingly being included in that view. Eventually on-line validation of health care information may become mandatory. This year the first court case, where a physician had failed to use available information technology to gather candidate diagnoses, was decided in favor of the plaintiff [ref.2], presaging new criteria for commonly accepted standards of care.

The rapid growth of the Internet, the improved accessibility of on-line libraries, and the on-line medical record all provide huge increases in potential information for health care providers and medical researchers [ref.3]. However, most beneficial information is hidden in a huge volume of data, and not easily extracted. Although the medical literature, largely through the efforts of the National Library of Medicine, is better indexed than literature in any other scientific field [ref.4], the volume of publications, the difficulty of assessing the significance of reports, inconsistent use of terms, and barriers erected to protect the privacy of patients place a new obstacle in the way of effective use, sometimes characterized as `information overload'. This overload means that diligent research for any case can require an open-ended effort, likely consuming many hours. We consider that the current and imminent presentation of information is of inadequate quality to serve the practice of health care.

Unfortunately, the pace of development of software to provide services that deal effectively with excessive, convoluted, heterogeneous, and complex data is slow. Since the problem in the preceding years has always been access, there is a lack of paradigms to deal with the issues that arise now. Where voluminous data had to be processed, intermediate staff was employed, so that the health care provider was isolated both in terms of load and responsibility. Staff at the research sites filter and digest experimental results. Staff at pharmaceutical companies filter for benefits and effectiveness. Government agencies monitor lengthy trials. Publishers require refereeing and editing. Students and interns discuss new reports in journal clubs. Collegial interactions provide hints and validations [ref.5]. But our networks encourage disintermediation, and while most of the intermediate filtering tasks are aided by computer-based tools, there is no common paradigm that assures quality of information products.

C. Analysis

In order to assess the demands placed on the National Information Infrastructure (NII) by healthcare services we consider the needs of the health care providers and their intermediaries. This analysis is hence founded on customer-pull rather than on technology-push. This is likely to lead to lower estimates than a model focusing on technological capabilities. We will assume, however, a progressive environment, where much paperwork has been displaced by the technologies that are on our horizon.

In our model, information needs are initially generated for the delivery of healthcare by the providers and their intermediaries. Pharmacies and laboratories are important nodes in the healthcare delivery system. Education for providers and patients is crucial as well, and will be affected by the new technologies. Managers of health care facilities have their needs as well, paralleled at a broader level by the needs of public health agencies. Functions such as publishing the medical literature and the production of therapeutics will not be covered here, since we expect that topics such as Digital Libraries and Manufacturing in this report will do justice to those areas.

C.1 Services for the Healthcare Provider:

The initial point in our model is the interaction of the provider with the patient. Such an interaction may be the initial encounter, where tradition demands a thorough workup and recording of physical findings; it may be a visit motivated by a problem, where diagnostic expertise is at a premium; it may be an emergency, perhaps due to trauma, where the problem may be obvious but the treatment less so; or it may be a more routine follow-up visit. In practice, the majority of visits fall into this routine category [ref.6].

Adequate follow-up is crucial to healthcare effectiveness, and an area where information technology has much to offer [ref.7]. Having the right data at hand permits the charting of progress and the therapeutic adjustments needed to improve or maintain the patient’s healthcare status. Follow-up care is mainly provided locally. The majority of the consumers of such care is the older, less mobile population. It is this population that has the more complex, more long-term illnesses that require more information.

The needs for information in all these cases differ. Initial workups mainly produce data. The diagnostic encounter has the greatest access demands. Emergency trauma care may require some crucial information, but it is rarely available, so that reliance is placed on tests and on asking the patient or relatives for information. Note that many visits to emergency facilities, especially in urban settings, are made to obtain routine care because of the absence of accessible clinical services. From the point of view of our analysis these are recategorized. A goal for healthcare modernization should be a better allocation of resources to points of need, but here we can only discuss the information needs. For follow-up visits, the information should summarize the patient’s history; unexpected findings will trigger a diagnostic routine.

To assess the need for data transmission we need to look both at the distance and the likely media that carry the needed information. Media differ greatly, and all must be supported. Many physical findings are described compactly, using text. Laboratory findings are compactly represented in numeric form. Sensor-based tests, such as EKGs and EEGs, are time series, requiring some, but still modest, data volumes. Sonograms can be voluminous. The results of ultrasound scans are often presented as images. Other diagnostic procedures often produce images directly, such as X-ray or CT and similar scans that are digitally represented. High-quality X-rays require much storage and transmission capacity; most digital devices have larger pixels or voxels and require more modest storage volumes [ref.8]. The practitioner typically relies on intermediate specialists to interpret the data obtained from sensors and images, although, for validation, access to the source material is also desired.

The distance that this information has to travel depends both on the care setting and the data source. In the table below we place estimates on the source of patient care information at the types of clinical encounters described. A major factor can be quantified indirectly: the majority of healthcare (more than 80%) is being delivered to the aged, but this is a population not likely to travel much [ref.9]. This means that both the new data and the retrieval requests are likely to be colocated. Furthermore, if the data are remote, and remote access is consistently costly, then modern systems will move data to a more economical location. Algorithms for distributed storage allocation are well understood, and perform well when a single access pattern dominates [ref.10].

Encounter type	Text	Numbers	Sensor-based	Images	Local/remote ratio	Encounter frequency
Work up	local collection aided by staff	area clinical laboratories	area diagnostic services	area, hospital-based services	very high	low
Diagnostic	local & remote reports, experts	area clinical laboratories	area diagnostic services	area, hospital-based services	high	modest
Emergency	local & remote histories	on-site laboratories	on-site devices	on-site services	modest	low
Follow-up	extended local histories	area clinical laboratories	area diagnostic services	area, hospital-based services	very high	high

We conclude that usage of local patient care information dominates. The requirement for remote transmission of data for individual patient care is modest. Individual instances will still be important, as when a traumatic accident requires emergency care in a remote locale and consultation with an expert specialist. Here again it is quality considerations that are crucial: getting just the right data rapidly, rather than getting massive printouts rapidly and searching through them on site. At times images may be required as well. Most X-rays will be done locally, although one can construct a scenario where an archived image is of value. Any actual medical intervention will be controlled by local insight and information.

In addition to the requirements for access and display of individual data, as shown in the table above, there is a need for on-line access to the literature and reference material. Here issues of locality are best driven by economic considerations. If the volume-frequency product is high, the best site for access will be relatively local; if it is small, the site can be remote as long as access is easy and latency is small. These parameters are under technological control, and no prior assumptions need be made except that reasonable alternatives will survive and unreasonable ones will not.
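The locality rule in the preceding paragraph can be illustrated with a small sketch. It is not taken from the paper: the units (megabytes, accesses per day), the threshold value, and the names Resource, daily_traffic_mb, and choose_site are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A reference collection that providers consult (hypothetical model)."""
    name: str
    volume_mb: float         # data moved per access, in megabytes (assumed unit)
    accesses_per_day: float  # how often the collection is consulted


def daily_traffic_mb(resource: Resource) -> float:
    """Volume-frequency product: data moved per day if served remotely."""
    return resource.volume_mb * resource.accesses_per_day


def choose_site(resource: Resource, threshold_mb_per_day: float = 500.0) -> str:
    """Place high-traffic collections locally, low-traffic ones remotely.

    The threshold is an arbitrary illustrative value; in practice it would
    follow from the relative costs of storage and transmission.
    """
    if daily_traffic_mb(resource) >= threshold_mb_per_day:
        return "local"
    return "remote"


formulary = Resource("drug formulary", volume_mb=2.0, accesses_per_day=400)
atlas = Resource("rare-disease atlas", volume_mb=50.0, accesses_per_day=0.1)
print(formulary.name, "->", choose_site(formulary))
print(atlas.name, "->", choose_site(atlas))
```

A single threshold is the simplest possible policy; the allocation algorithms cited earlier [ref.10] weigh storage and transmission costs across many sites at once.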

C.2 Management and Public Health Needs.

The requirements for broad health care information for planning and research are huge. Local institutions must improve the utilization of the data they already have in house for better planning. New treatments must be monitored to rapidly recognize unexpected side effects. Public health officials must understand where problems exist, what problems can be addressed within their means, and what recommendations for public investment are sound.

Today effective use of available health care data is difficult. Standards are few and superficial. For instance, the HL-7 standard [ref.11] does not mandate any consistency of content among institutions; only the format of the access is specified. The data collections themselves are also suspect. Private physicians have few reporting requirements except for some listed infectious diseases. If the disease is embarrassing, then their concern for the patient’s privacy is likely to cause under-reporting. They have little reason to trust that privacy will be maintained in the data systems. In a group practice the medical record will be shared, and potentially accessible, but the group’s motivations differ little from those of an individual physician. Physicians working in a larger enterprise, such as a Health Maintenance Organization (HMO), will have more requirements placed on them and have administrators who are anxious to have adequate records. Still, little guarantee exists today that data are complete and unbiased. The local users are able to deal with the uncertainty of mixed quality of data since they understand the environment. Remote and integrated analysis is less likely to be able to use the local data resources, even when access is granted.

However, clinical data collection and use is an area where change is occurring. The increasing penetration of HMOs, the acceptance of on-line access, and the increasing entry of local data provide the foundation. When the local information feedback loops are closed, and providers see at the next encounter what information they collected, then quality can improve [ref.12]. Sharing one’s record with colleagues also provides an inducement to record the patient’s state completely and accurately. As the record becomes more complete, questions of access rights will gain in importance. We address those questions in Section C.5.

Clinical data collection is broad but rarely sufficiently deep to answer research questions. Where clinicians collect data for their own research, the quality of the variables they consider crucial will be high, but scopes of most studies are narrow and not comparable among studies and institutions. Funded, multi-institutional research studies make valiant efforts to maintain consistency but rarely succeed on a broad scale [ref.13]. While such data will be adequate to answer focused research questions, little management or public-health information can be reliably extracted.

Many funded health-care and service programs mandate reporting and data-collection. But again, there is likely to be a narrow bias in collection, recording, and quality control, and except for administrative purposes, the value is minimal. Biases accrue due to desire to justify the operation of the clinics and services and if the data lead to funding, such biases are strengthened. Public health agencies are well aware of these problems and tend to fund research studies or surveys, rather than rely on existing data collections.

Quality again seems to be the main constraining factor. How can quality be improved? Mandating the collection and submission of more data to remote sites won’t work. The only option seems to be to share data that are used locally, and to abstract management and public health information from such local data. Feedback at all levels is crucial. Feedback relates encounter to encounter and compares treatment intervals in a patient’s history (especially for the aged), among similar patients, and among physicians using different approaches to practice. Again, it is the actual consumers of the information that need to be empowered first.

The desire to have all possible information will be moderated by the effort and time that healthcare providers must spend to obtain and record it. In time, intelligent software will emerge that can help select, extract, summarize, and abstract the relevant and properly authorized information out of the voluminous medical record and bibliographic resources. Such software will be accepted by the providers to the extent that its results match their human interaction bandwidth and aid their productivity.

We see that the demands for NII services lie more in making software available and providing interoperation standards than in high-performance and remote communication. Today access is constrained by problems of interoperation, concern for privacy, and the poor quality of many collections. The efforts needed to overcome these barriers are major, and will take time.

C.3 Education

The need for continuing education in the health care field has been more formally recognized than in most other areas. While an unmotivated engineer can spend many years doing routine corporate work until he finds himself without marketable skills, the health care professional is faced with medical re-certification, hospital admit privileges, and credibility. In urban areas the patient’s choices are many. Often choices are based on contacts leading to referrals. All these factors motivate continuing education. The quality of such education is decidedly mixed. Boondoggles are common, and testing for proficiency of what has been learned is minimal or absent. Few standards exist.

Here access to remote instructors and experts can be a boon. For the rural practitioner, who finds it difficult to leave the practice area, such services are especially beneficial. In urban areas, most educational services will be local. The demand on the NII is again difficult to gauge but may again be modest in the aggregate. Less than 10% of our health-care providers practice in remote areas. Spending a few hours a week on remotely accessed educational services seems to be an outer limit. The services should be fast and of high quality. Since time is not of the essence, the service has to compete with printed material. It will be hard, if not impossible, to match image quality; on the other hand, dynamic interaction has a role and can provide excitement.

There are natural limits to the capabilities of a human to take in information. The rate provided on a television screen is one indication of the limit; few people gain from watching multiple screens simultaneously. The actual information contained in a video sequence is much less. The script for an hour’s TV episode is perhaps 200 sparsely typewritten pages [ref.14]. The play is more exciting to watch, and that has added value. The story line provides guidance to the viewer. Only a few paths are reasonable at any point in time, and that choice represents the essential information. Excessive randomness of events, MTV style, is unlikely to be informative. Eventually, the information value to be retained after an hour of educational video is likely to be much less than even the script provided.

Technology will allow interaction in the educational process, just as now some choices can be made when reading a book or browsing in a library. Again, the number of choices at each point is limited. A likely limit is the magical number 7±2, the capacity of the interactor’s short-term memory [ref.15]. Effective educational systems must keep the learner’s capabilities in mind. To what extent intermediate representations expand the material and place higher demands on network bandwidth is unclear.

C.4 Decision support.

The essence of providing information is decision support. All the tasks we discussed, whether for the physician treating a patient, for the manager making investment decisions, for the public health official recommending strategies, or even for the billing clerk collecting an overdue payment, can only be performed effectively if the choices are clear. The choices will differ depending on the setting. The manager must give more weight to the financial health of the enterprise than the physician does when recommending a treatment.

Customer-based capacity limits can be imposed on all the service types provided by the information enterprise, just as we sketched in the section on education (C.3). Making choices is best supported by systems which provide a limited number of relevant choices. The same magical number (7±2) raises its head again. To reduce the volume of data to such simple presentations means that processing modules, taking on the roles of intermediaries in the health care enterprise, must be able to locate likely sources and select the relevant data. Even the relevant data will be excessive. Long patient histories must be summarized [ref.16]. Similar patient courses can be compared after synchronizing them on events in their records, such as critical symptoms, treatments applied, and outcomes obtained [ref.17]. It is rare that local patient populations are sufficient, so that matching has to be performed with integrated information from multiple sites, taking their environment into account.
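As a minimal sketch of what synchronizing patient courses on a record event could look like, the fragment below shifts each patient's timeline so that a chosen index event falls on day 0. The function name align_course, the event labels, and the dates are invented for illustration; they are not drawn from the systems cited above.

```python
from datetime import date


def align_course(events, index_event):
    """Re-express a patient course as days relative to an index event.

    events: list of (date, label) pairs from one patient's record.
    index_event: the label used as the synchronization point, for
    example the first critical symptom (a hypothetical choice).
    Returns a list of (days_from_index, label) sorted by time, or None
    when the patient has no such event and cannot be aligned.
    """
    anchors = [when for when, label in events if label == index_event]
    if not anchors:
        return None
    day_zero = min(anchors)  # earliest occurrence of the index event
    return sorted(((when - day_zero).days, label) for when, label in events)


# Two made-up courses with different calendar dates.
patient_a = [
    (date(1994, 2, 20), "visit"),
    (date(1994, 3, 1), "onset"),
    (date(1994, 3, 11), "treatment"),
]
patient_b = [
    (date(1995, 1, 5), "onset"),
    (date(1995, 1, 12), "treatment"),
]

for course in (patient_a, patient_b):
    print(align_course(course, "onset"))
```

Once the courses share a common day 0, the interval from onset to treatment can be compared directly across patients and sites.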

The presentation to the customer has to be clear, and allow for explanations and clarifications. Sources of data have to be identifiable, so that their suitability and reliability can be assessed. Once such services are provided, it will be easier to close the feedback loops that in turn encourage quality data. Having quality data enables sharing and the use of the technological infrastructure being assembled.

Education and entertainment benefit from a mass market, so that the expansion of information into exciting sequences has a payoff in acceptance and markets. That payoff is much less in medical records review and analyses of disease and treatment patterns. The volume of health care information transmitted for decision support will be constrained by the filtering imposed by quality control.

C.5 Protection of Privacy:

The public is distrustful of the protection of privacy provided for healthcare data, and rightly so [ref.18]. In many cases, in order to receive health care services and insurance reimbursement, patients have to sign broad releases. The information flows, with nary a filter, to the organization’s billing clerks, to the insurance companies, and, in case of conflict, to the legal professionals. While all these people have ethical constraints on the release of information, little formal guidance and even fewer formal restrictions are in place. In the paper world loss of privacy was mainly an individual concern, such as the potential embarrassment to a politician when the existence of a psychiatric record is revealed. In the electronic world the potential for mischief is multiplied, since broad-based searches become feasible.

The insurance companies share medical information through their Medical Information Bureau [ref.19]. This port is assumed to be a major leak of private information. Unless it can be convincingly plugged, it is likely that health care enterprises will have to limit the access to their data if they want to (or are forced to) protect the patient’s rights to privacy. Many health care institutions, after having appointed a Chief Information Officer (CIO) in the past decade, are now also appointing a Security Officer. Without guidelines and tools, such an officer will have the tendency to further restrict access, perhaps disabling legitimate needs of public health officials. It is unclear how such an official will deal with the leak to insurance companies and to the institution’s own billing staff.

Legitimate concern for the protection of privacy is likely to hinder use of the information infrastructure. We do believe that there are technological tools that can be provided to a security officer and the CIO to make their task feasible [ref.20]. To enable the use of information management tools, the information flow within major sectors of the health care enterprise has to be understood. The value and cost of information to the institution, its major components, and its correspondents has to be assessed. Without control of quality the benefits are hard to determine, and it will be difficult to make the proper investments.

The quality and privacy concerns are likely to differ among areas, which means that those areas must be properly defined. Once an area is defined, access rules can be provided to the security officer. A barrier must be placed in the information flow if access is to be restricted. Such a barrier is best implemented as a system module or workstation owned by the security officer. That node, a security mediator consisting of software and its owner, is then the focus of access requests, their legitimacy, and their correct responses. The security mediator must be trusted by the health care staff not to release private information, and must be trusted by the customers, be they public health officials, insurance providers, or billing staff, to provide complete information within the bounds of the rules that are provided.
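The software half of such a security mediator might, in its simplest form, be a rule table that filters every outgoing record by the role of the requester. The sketch below is an illustration under that assumption only: the role names, field names, and the function mediate are hypothetical, and the human owner who reviews doubtful requests is not modeled.

```python
# Access rules as the security officer might define them: for each
# customer role, the set of record fields that may be released.
# Roles and fields are invented for this example.
RULES = {
    "public_health": {"diagnosis", "age_group", "region"},
    "billing": {"patient_id", "procedure_code", "charge"},
}


def mediate(role, record, rules=RULES):
    """Release only the fields the rules permit for this role.

    Roles with no rule are refused outright rather than given an
    empty answer, so that a missing rule is noticed.
    """
    allowed = rules.get(role)
    if allowed is None:
        raise PermissionError(f"no access rule for role {role!r}")
    return {field: value for field, value in record.items() if field in allowed}


record = {
    "patient_id": "P-001",
    "diagnosis": "hepatitis B",
    "age_group": "60-69",
    "region": "county 12",
    "procedure_code": "X123",
    "charge": 85.0,
}

print(mediate("public_health", record))
print(mediate("billing", record))
```

Because every response passes through one table, the health care staff can inspect exactly what leaves the institution, and each customer can see which fields it is entitled to receive.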

The volume of data being transmitted out of the health care institution may be less, but the resulting information should be more valuable and trustworthy.

D. Recommendations

The sources and uses for health care information are varied. The technological capacities and capabilities are rapidly increasing. The informational needs of the health care enterprise can be defined and categorized. If quality information can be provided, where quality encompasses relevance, completeness, and legitimacy, then the demands on the NII can be estimated, and it appears that the overall capabilities are likely to be adequate. Distribution, such as access in rural areas, is still an open question. I have recommended in another paper [ref.3] that the Rural Electrification Administration (REA) should repeat its success of the 1930s by focusing on the provision of information access to the same customers.

The major point to be made is that, in order to provide health care professionals with the best means for decision making, a reasoned balance of software and hardware investments is appropriate. Software provides the means to abstract voluminous information into decision sequences where, at every instant, the customer is not overloaded. There is an optimal trajectory in balancing investments in the system’s infrastructure versus software application support, but we have not spent much effort in understanding it. For health care and health-care providers the benefits are due to quality information: good enough, sufficiently complete, and relevant enough to make decisions. If the quality is absent, then the effort will be poorly rewarded and the risks of failure will be high. The recommendation from this point of view is hence to move support to the information processing infrastructure, so that relevant applications can be built easily and the customers be satisfied. A happy customer will in turn support the goals of the NII.

An area where government support can be crucial is in helping to define and validate standards. Standards setting is best performed by customers and providers, but the validation and dissemination of standards are precompetitive efforts that take much time and have few academic rewards. Academic insights can help assure coherence and scalability. Tests performed outside of vendor locations are more likely to be trusted and are easier to demonstrate. Infrastructure software, once validated, is easy to disseminate, but hard to market until the application suites that build on the infrastructure are available. The need to support the communications hardware infrastructure has been recognized. The support of software in that role may well be simpler, since its replication is nearly free.

The desired balance for health information infrastructure support can be replicated in all fields of information technology. We expect the parameters to differ for commerce, defense, education, entertainment, and manufacturing. The common principle we advocate is that, as we move from a supply-limited to a demand-constrained information world, our analysis and actual service methods must change.

References: <Not all these references were published>

1. M.D. Computing Survey

2. Harbeson v. Parke Davis, 746 F.2d 517 (9th Cir. 1984).

3. Silva

4. Gio Wiederhold: "Digital Libraries, Value, and Productivity"; Com.ACM , Vol.38 No.4, April 1995, pages 85-96.

5. M.D. Computing article?

6. Health statistics, DHCP

7. MacDonald

8. IEEE Spectrum

9. Discussion on healthcare economics 1994

10. Özsu and Valduriez

11. HL-7 Sujanski

12. [McShaneHKF:79] D.J. McShane, A. Harlow, R.G. Kraines, and J.F. Fries: "TOD: A Software System for the ARAMIS Data Bank"; IEEE Computer, Vol.12 No.11, Nov. 1979, pages 34-40.

13. Cancer studies

14. Limelight

15. George Miller: "The Magical Number Seven, Plus or Minus Two"; Psych.Review, Vol.63, 1956, pp.81-97.

16. Isabelle deZegher-Geets et al.: "Summarization and Display of On-line Medical Records"; M.D. Computing, Vol.5 No.3, March 1988, pages 38-46.

17. Jim Fries

18. Willis Ware: "The New Faces of Privacy"; Rand Corporation, P-7831, 1994

19. J. Robert Beck et al: Policy Forum; JAMIA. Vol.1 No.4, July/August 1994, pp. 313-324

20. Trusted Interoperation of Healthcare Information; software to protect patient's privacy in healthcare settings