HealthSecurity
Gio Wiederhold
Prepared for
Oct 2000
Abstract
Today, issues of privacy and confidentiality in healthcare are dealt with largely informally. Little legislation exists, and the awkwardness of accessing paper records makes violations of patients’ privacy sporadic. As healthcare institutions move towards a future where all information is kept in an Electronic Medical Record (EMR), the casual attitudes that are prevalent will be in conflict with the desires and expectations of the patients. Legislation has been passed to make the holders of medical data responsible for securely protecting the patients’ privacy. Specific implementation guidelines are still lacking. There is much institutional resistance to the adoption of rigorous rules, but we expect that in the near future reliable procedures will have to be implemented to comply both with legal guidelines and with patients’ expectations.
After introducing the issue more precisely we provide an overview of the concepts needed to understand the roles of technology in privacy and security and of the people who must manage the technology. We then discuss the components of secure EMR systems and point out where adequate technology exists and where future improvements are essential. We conclude with some advice to healthcare management facing the demands for security and privacy that the future will bring.
This
chapter considers that in the near future most patient information will be
stored in an Electronic Medical Record (EMR) [DickS:91]. We expect that
required patient information will be rapidly and completely available to
persons who should receive that information, and not be made available to
anyone else. To achieve that simple objective many pieces of technology have to
work correctly and reliably. We will identify most of these technological
components because they are all interrelated, but not discuss many of them in
depth because they are common to all computer-based information systems. We will focus on issues that are particular
to the medical record domain. Unfortunately there are problems with medical
records that are not handled adequately by the methods that are supplied in
broad-purpose software [ClaytonEa:97]. Issues of security and privacy are not
unique to medical information, but we will see that they become more complex in
health care.
Protection of privacy depends greatly on having a secure system. Security first of all requires that persons accessing the system are properly identified, or authenticated. Once they are authenticated, they can be authorized to read or manipulate specific parts of the EMR. Healthcare records contain a wide variety of information, from relatively public demographic data to data that could be misused to embarrass a person or deny them employment, insurance, social, or residence opportunities. In between those categories is information of value to various organizations. Of primary concern is the delivery of information to a variety of caregivers, the physicians, consultants, nurses, pharmacists, etc. who depend on having complete and correct information to carry out their duties. External laboratories exchange crucial test information with the EMR. Hospital management has the duty to monitor the spread of infections within the hospital [EvansEa:96]. There are legitimate demands for certain information from public health organizations, insurance companies, and other third-party payors. There may be legal injunctions to obtain data, for instance in accident settlements. There is a need for medical treatment data and the effects of such treatments for medical research and pharmaceutical development. An important application is drug surveillance, checking if new drugs have side effects not found during the clinical trials that led to their approval. Last, but not least, are the patients themselves, who have the right to know what is happening to them. However, sometimes the patient’s rights are abrogated by medical practitioners in the interest of the general well-being of a particular patient. Sometimes those rights are assigned to family members, who have assumed responsibility for the care of minors or senile patients [DonaldsonL:94].
We
see that the EMR must serve a very wide variety of purposes. At the same time
medical information cannot be as well structured as, say, banking or
merchandising records. While the IRS has legitimate rights to survey some of
our bank records, the variety of
information seen in them is much simpler. The privacy of merchandising records
is also a concern, and although the information in them rarely has the
potential to be damaging, we expect that it will not be released with any
personal identification. Unfortunately, personal identification is essential
for many of the purposes in a medical record.
For continuing treatment, for tracing the origin of an epidemic, for
understanding a delayed effect of a drug treatment, etc., personal
identification is crucial to information linkage. Even when, say for medical research, anonymous data is adequate,
investigators, legitimate or not, still have means to identify
individuals. For instance, in an
anonymized research record the dates of
visits to a clinic will be stored to understand the temporal course of a
disease. A visit pattern is likely to be unique for any individual. Matching
this pattern against the relatively public records of the clinic’s operation is
certain to identify particular patients.
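The linkage risk described above can be illustrated with a minimal sketch. All names, dates, and record layouts below are hypothetical, and the exact-match criterion is an assumption chosen for simplicity; a real attacker could use looser matching.

```python
from collections import defaultdict

# Hypothetical "anonymized" research records: pseudonym -> set of visit dates.
research = {
    "R1": {"2000-01-03", "2000-02-14", "2000-03-20"},
    "R2": {"2000-01-03", "2000-02-01"},
}

# Hypothetical, relatively public clinic schedule: (name, visit date).
schedule = [
    ("Alice", "2000-01-03"), ("Alice", "2000-02-14"), ("Alice", "2000-03-20"),
    ("Bob", "2000-01-03"), ("Bob", "2000-02-01"),
    ("Carol", "2000-02-14"),
]

def reidentify(research, schedule):
    """Map each pseudonym to the names whose visit dates match it exactly."""
    visits_by_name = defaultdict(set)
    for name, day in schedule:
        visits_by_name[name].add(day)
    return {
        pseudonym: sorted(n for n, days in visits_by_name.items() if days == dates)
        for pseudonym, dates in research.items()
    }

matches = reidentify(research, schedule)
```

In this toy data every pseudonym matches exactly one name, even though no name appears in the research records.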
Security, and hence protection of privacy, cannot be obtained unless the underlying
computer systems are reliable. When failures occur, not only is availability lost, but
during repairs and downtimes the privacy of the records is easily compromised.
If a large number of computer specialists have access to the computers for
maintenance and repair, security is easily compromised, since these wizards
rarely go through the full authentication and authorization process demanded
during normal operations. These issues are
common to all computer systems, so we will not discuss them further. We do observe
that today, perhaps because of poor funding and high availability demands on
computer systems in healthcare, many systems are not run as carefully as
they must be if security is to be assured.
System reliability is not the focus of this chapter, but a reasonable
level of reliability must be attained if security of information is to be
achieved.
Whenever data are transmitted outside of an institution there is a chance that they may be misdirected or overheard. To protect against this type of loss, the contents of any transmission to a remote site should be encrypted, and decrypted upon receipt. Encryption technology for communication is routinely available, and has only modest effects on system performance and costs [Beth:95]. On the other hand, it is not effective to store medical records for the long term in encrypted form. The variety of accessors is such that means for decryption will have to be provided at many points, adding little protection and high costs to assure that access is provided when and where needed.
Another aspect of reliability pertains to the stored
data themselves. Errors in data collection, data entry, filing, and data
manipulation will occur even in very well managed systems. There are some
differences between data kept on paper or handled in the EMR, but the final
error rate is not drastically affected. Few EMR system designers have included
convenient provisions to mark possible errors and eventually correct them,
whereas marking questionable entries on paper is easy. A good EMR can
distribute any corrections made automatically to all destinations that have
received erroneous data. Since there is less transcription in an EMR, fewer eyes
have a chance of finding errors, so that errors are less likely to be caught in
an EMR. Computer systems can automatically identify simple inconsistencies in
patients’ histories, laboratory results, and the like. Historically, physicians
are well aware of the limitations of data, and will rarely commit to procedures
based on a single indication. Now, and even more so in the future, economic
pressures are reducing the redundancy of laboratory testing and status
recording that provided a safety margin in earlier systems.
Security and privacy are indirectly affected by the
presence of errors in data records. Reporting misfiled data about a patient to
an external destination can be embarrassing and even costly. Data as well as
processing errors will be seen as failures to properly protect data.
We
will now summarize the system concepts that underlie the protection of
healthcare information. Basic to any approach is the need to define what
information is to be protected, authenticate the people that have access to the
records, and manage their authorization with respect to the data. In a
subsequent section we will introduce the technical means for achieving the
security objectives, but first we must address how the system must deal
with people. Figure 1 illustrates the
relationship among relevant concepts.
Figure 1: Components in Security and Protection of Privacy
Authentication.
Authentication
of individuals requires that some personal identification be submitted. Today, the entering of passwords into a
computer is the most common way. There is
tension between having secure passwords, which are lengthy and uncommon, and
passwords that are easily remembered.
Uncommon passwords are often written down, perhaps kept in the drawer of
the terminal desk. Common passwords, such as names of family members, pets, or
birthdays, are easily guessed. Even if only one user of a computer system
chooses a poor password, the contents of an entire system may be compromised
[CastanoFMS:95].
In the near future identification cards will gain
acceptance. They are combined with simple identification numbers that require
little recall but still protect access in case of loss. Identification cards
are less likely to be misplaced or left near the computer terminal if they are
also needed to gain access to the buildings, the parking lots, etc. However, we
will also have to deal with remote accessors where issuing individual cards is
not feasible.
More stringent means of authentication employ
biometric technologies that depend on unalterable physical characteristics of
an individual. Methods that are being proposed to control authentication
include automated checking of voice prints, fingerprints, facial features,
retinal patterns, or hand dimensions [HolmesWM:91]. Devices for these methods
are becoming routinely available, although they will not be found at every site
where access to medical information is needed. For acceptance in critical
settings these devices must demonstrate a very high reliability in practical
situations, for instance the voice analysis must not deny access when
practitioners express stress in their speech. Since identification is only one
link in having secure systems, a disproportionate investment in high-tech
authentication is not warranted.
It
is crucial to properly define the boundary of protection. In networked
computing, as is common now, the boundary of protection is not simply the
physical perimeter of a computer system; it extends to all the computer systems
that share a common protection system. Such a virtual perimeter is best defined
by a firewall [Cheswick:94], software that is intended to prevent both
inappropriate access and inappropriate release of information. It is at the
firewall that authentication is validated. A simple firewall may wrap one
specific record system [Venema:92], several interoperating systems, or all
computers within a healthcare enterprise. Sites outside of the perimeter may be
accessed via the Internet, increasing the need for firewalls [ChapmanZ:95]. The
complexity of providing protection depends on the scope of the system.
There may be multiple domains, each protected by
its own firewall, in a major institution. For instance, financial
information may be segregated from patient care data. Some applications must
span the domains, for instance, to justify billings to a third party
information from the medical record is required, but the release of such
information should be mediated, since the insurance company has no right to
obtain information not related to the current case.
Within the health care system are many types of
data, but central to our concern is the medical portion of the EMR. This
portion is the most problematic, since much of it is relatively unstructured
text. The text will contain both highly private information and
information that must be made available for billing and external
reporting. It can refer to multiple
diseases, although the rules for release of information may differ among diseases.
For instance information about HIV infections must be dealt with more carefully
than cardiac problems. Pregnancies, diabetes, trauma, etc. all have differing
sensitivities to release of data. Keeping the information in the medical record
disjoint is not practical. For proper healthcare the total picture is
essential, so that a rigorous partitioning is inappropriate. We do want a
nurse who takes care of a patient to be aware of any infections that can be
transmitted, even if the current task is to deal with another problem.
Within
a domain an authenticated accessor will have certain rights. Rights may pertain
to certain files, or certain records. For instance, nurses on a ward should
have access to all medical information for patients in the ward, while the
physicians will need access to the patients under their care in various
localities. The right to append
information, such as entering orders, is more restricted. Authorization to actually
change stored information is rare, since it inhibits the audit trail for decisions
that have been made.
For convenience, a specific type of authorization
may be assigned to groups having specific roles, say the billing clerks in an
institution, so that the number of entries in an authorization table remains
manageable. Then there must be a mapping table from individuals to such a group.
It is inappropriate to assign an identification to multiple individuals. If
that is done, and one member departs, new identification cards or passwords
have to be given to all. The attendant
costs and risky delays are worse than the cost of authenticating every
individual.
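A minimal sketch of the two tables described above follows: a mapping table from individuals to role groups, and an authorization table from role and data category to rights. The user names, roles, categories, and rights are all hypothetical; note that no role is given a right to change stored data.

```python
# Hypothetical mapping table: individual identification -> role group.
ROLE_OF = {
    "nurse_kim": "ward_nurse",
    "dr_lee": "physician",
    "clerk_ray": "billing_clerk",
}

# Hypothetical authorization table: (role, data category) -> rights.
RIGHTS = {
    ("ward_nurse", "medical"): {"read"},
    ("physician", "medical"): {"read", "append"},
    ("billing_clerk", "billing"): {"read", "append"},
}

def is_authorized(user, category, action):
    """Check an individually authenticated user against the role-based table."""
    role = ROLE_OF.get(user)  # unknown users have no role and hence no rights
    return action in RIGHTS.get((role, category), set())
```

Because each individual keeps a personal identification, removing a departing staff member means deleting one row of the mapping table; the authorization table itself is unchanged.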
Where access is performed by a remote organization, say a clinic, the issuing of EMR identifications can be delegated to that site. Again, individuals should be properly identified, even when they share an identical authorization. Such a policy encourages responsible protection of information and is essential to provide an audit trail. Remote access does have additional risks. To mitigate them, additional constraints may be imposed. For instance, insurance companies may be restricted to have access to otherwise authorized information only during their working hours, to prevent unsupervised access. Emergency overrides will not be needed.
Authorizations follow professional conventions. Physicians and nurses that are bound by a code of ethics will receive broader rights than clerical personnel [AMA:94]. Staff working within an institution and receiving guidance will have fewer restrictions imposed on them than external staff [CPRI:96]. Where data access is not urgent, say for medical research, delays due to a more careful validation are acceptable. Data for research may also be transformed to reduce the risk of inadvertent disclosure [Sweeney:96].
The information defining assigned authorizations
should be kept so that it can be easily inspected and updated when needed. That
information also must be represented in a way that computer programs which
enforce access can interpret the rights and assign them to authenticated
individuals or roles as needed [GriffithsW:76]. The table containing the rights assigned to individuals and
groups with respect to the types of data represents a major part of the
institutional policy for security and protection.
Authorized
transactions within an EMR can easily be recorded or logged. The storage
capacity of computer systems is such that transaction logs can be quite
comprehensive, recording who accessed what, when, and how. When data leave the
institution, the actual contents should also be recorded. For periodic audits,
tools can be used to spot atypical activities, and in case of problems definite
conclusions can be drawn.
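As an illustration only, the following sketch records who accessed what, when, and how, and applies one very simple audit heuristic: flagging accessors with an unusually large number of events. The record layout, the names, and the fixed threshold are assumptions; real audit tools would use richer criteria.

```python
from collections import Counter
from datetime import datetime

log = []  # append-only list of access events

def record(who, what, how, when):
    """Append one access event: who accessed what, how, and when."""
    log.append({"who": who, "what": what, "how": how, "when": when})

def atypical_accessors(log, threshold):
    """Return accessors with more logged events than `threshold`."""
    counts = Counter(entry["who"] for entry in log)
    return sorted(who for who, n in counts.items() if n > threshold)

# Hypothetical activity: one routine access and a burst of late-night reads.
record("dr_lee", "patient/17", "read", datetime(2000, 10, 2, 9, 15))
for n in range(5):
    record("clerk_ray", f"patient/{n}", "read", datetime(2000, 10, 2, 23, n))

flagged = atypical_accessors(log, threshold=3)
```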
Here computer systems can perform much better than
humans, who have fallible and often opinionated recall. To ensure that all
data accesses are mediated by the security services, any transaction must be
allowable when essential, including exceptional requests. If systems do not
have the capability to allow exceptions, users will use improper means, such as
copying files and removing them physically, when a legitimate need exists. The
log and resulting audit trail then become incomplete. If problems arise,
legitimate and wrongful uses of improper methods are hard to distinguish.
The
decisions that define what types of information to protect from what classes of
individuals, and to what extent to invest in protection, must be made at a high
institutional level. It is the management of an institution who ultimately have
responsibility when access fails or when a patient’s expectation of privacy is
violated [Regan:95]. Once the policies are set, their execution is delegated to
specialists. In our description we will assume that the execution of protection
policies is delegated to an institutional security officer. Such a person
maintains the communication between management and computer and communication
technologists who manage the actual software. The translation of policies to
enforceable rules is always problematic.
Not all desirable policies can be fully implemented. For instance,
automating the policy that in an emergency case all data must be available
requires that the computer can unambiguously recognize an emergency.
The policy may be partially implemented by making all information
available to emergency room personnel; but not all emergencies occur in the
emergency room.
Having an individual on duty who is authorized to
override restrictions is wise. Such overrides can also be logged, so that a
complete audit trail is maintained. The security officer can establish such
rules in order to best implement the institutional policies. Today, the
responsibility for implementation is often assigned to technical personnel as a
secondary responsibility. When, for instance, the database manager is
responsible for security, very liberal rules are likely to be established,
since the primary function of this person is to make data available, not to
protect them from inappropriate access. Similar concerns arise when a
networking manager is assigned the responsibility for security and privacy,
since for that person the primary objective is to keep the system accessible,
not to protect data from inappropriate access.
In
order to provide security a number of technologies are in common use. We listed various means for authentication
above, but also have to ensure that transmissions are safe from intruders, that
authorizations are obeyed, and that only appropriate information is released.
We assume now that management policies are in place, and that operational
responsibilities have been assigned to a security officer.
Transmission
of information, including the passwords or identifications needed for authentication,
can be protected through encryption. Encryption causes a message to be
transformed according to an encryption key. The encryption key can direct
shuffles, boolean transforms, reversible multiplications, and the like.
Cryptography can provide an arbitrarily high level of protection by lengthening
the key. The difficulty of breaking encrypted information increases
exponentially with the size of the encryption key. Software using
<48?60>-bit keys has been in common use for a long time [KonheimEa:80].
With current high-performance computers data encrypted with such a key are
decodable within a few days. Still, the information to be gained by breaking
into an EMR rarely warrants even that effort. Cryptographic procedures with
much longer keys are now becoming available. Existing and developing
capabilities seem to be adequate for healthcare.
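The effect of key length on an exhaustive search can be made concrete with a small calculation. This is a sketch under stated assumptions: it considers only brute-force search of the whole key space, and the trial rate and the key lengths shown are illustrative values, not figures from this chapter.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def keyspace(bits):
    """Number of distinct keys of the given length."""
    return 2 ** bits

def years_to_exhaust(bits, keys_per_second):
    """Worst-case years to try every key at an assumed trial rate."""
    return keyspace(bits) / keys_per_second / SECONDS_PER_YEAR

RATE = 1e9  # assumed trials per second; purely illustrative
for bits in (40, 64, 128):
    print(bits, "bits:", f"{years_to_exhaust(bits, RATE):.3g}", "years")
```

Each additional key bit doubles the search effort, which is why lengthening the key is an inexpensive way to outpace faster hardware.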
Managing the keys is still a problem. The key used for encryption has to be made available to the destination, so that decryption can be performed. Loss of the key makes all of the information inaccessible, and stealing a copy of the key makes encryption meaningless. Key losses can be dealt with by depositing copies of the keys with a responsible party, an escrow agent. Law enforcement agencies have been favoring schemes where keys would always be deposited with an escrow agent, so that encrypted files could be decoded when a legal search warrant is issued. It appears unlikely that they will get their wish, since such restrictions can easily be ignored by criminals and people suspicious of the government.
Public-key encryption uses two keys to overcome the
problem of key management. Data to be transmitted are encrypted with two keys,
one supplied by the sender and one by the receiver. A private version of the key
is retained locally and derived keys are made publicly available. Encryption
uses the local private and the remote public keys. Decryption requires the
remote public and the local private keys. Public-key encryption is effective
for modest data volumes, for sharing keys used to encrypt larger quantities of
data, and to authenticate remote accessors [Diffie:88].
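One well-known scheme in which each side combines its own private key with the other side's public key is Diffie-Hellman key agreement. The toy sketch below is offered as an illustration of that idea, not as the specific method intended in the text; the parameters are assumptions and are far too small for real use.

```python
# Toy public parameters, known to everyone (hypothetical, insecurely small).
P = 23  # prime modulus
G = 5   # generator

def public_key(private):
    """Derive the publishable key from a private key."""
    return pow(G, private, P)

def shared_secret(own_private, other_public):
    """Combine the local private key with the remote public key."""
    return pow(other_public, own_private, P)

sender_private, receiver_private = 6, 15
sender_public = public_key(sender_private)
receiver_public = public_key(receiver_private)

# Both sides arrive at the same secret without ever transmitting it.
secret_at_sender = shared_secret(sender_private, receiver_public)
secret_at_receiver = shared_secret(receiver_private, sender_public)
```

The agreed secret can then serve as the key for encrypting larger quantities of data, which is the key-sharing use mentioned above.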
Although cryptography is an essential tool in
protecting information from intruders, it only provides protection for
well-defined tasks, and cannot distinguish among the many types of accessors
that need to get to an EMR. All legitimate accessors to a record would need the
same encryption key, and could not be distinguished. All others are viewed as
potential enemies.
Firewall
software is now widely available, and is effective in defining the perimeter of
an enterprise. Firewalls analyze the headers
of incoming, and sometimes outgoing, information packets and can
limit access to sites that have known Internet IP addresses. It has been hard
to protect computer systems from intruders who masquerade as coming
from legitimate Internet sites.
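The address-based filtering described above amounts to a membership test on the source address of each packet. A minimal sketch using Python's standard ipaddress module follows; the permitted network is a reserved documentation range chosen for illustration.

```python
import ipaddress

# Hypothetical list of networks whose requests are admitted.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

def admit(source_ip):
    """Admit a packet only if its claimed source address is in a known network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

The check trusts the address claimed in the packet header, which is exactly why masquerading intruders are hard to exclude by this means alone.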
Many products can also validate submitted authentication information. Mobile accessors, say physicians on travel, typically do not have fixed IP addresses, and for those individual authentication is essential. Since these identifications are submitted over public pathways, it is important that the transmissions are protected, so that potential intruders cannot copy legitimate name-and-password combinations.
Firewalls do not check the specific authorizations
or the contents of requests submitted or results retrieved. For those aspects internal
software, perhaps database systems, must be responsible. If a legitimate user,
either inadvertently or through subterfuge, obtains inappropriate information,
the filtering provided by a firewall is of no help.
The authorization table relates accessors, be they individuals or groups, to categories of the stored data. Implicit in this approach is that the data to be presented or retrieved are partitioned into disjoint cells, so that for every authorization type the cells with the appropriate rights can be identified. The process of assigning categories to information involves every person who creates, enters, or maintains information. When there are few cells, originators of data can understand what is at stake, and can perform the categorization function adequately, although errors in filing will still occur. When there are many cells, the categorization task becomes onerous and error prone. When new applications are created, surveillance for more diseases is needed, or new collaborators must share the existing information system, the categorization task becomes impossible.
We have seen that we deal in the medical domain with many types of collaborators, all sharing access to information in the EMR. These collaborators are important in our complex enterprise, and cannot be viewed as enemies. The medical record cannot be partitioned into sections that are distinct for each group of authorized users. Such sections will overlap, and the number of possible combinations will be unmanageable [LuniewskiEa:93].
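A rough count shows why the combinations become unmanageable. If every data item is filed in a cell determined by the exact set of groups allowed to see it, then with n groups there can be up to 2^n - 1 non-empty cells. The sketch below makes that assumption explicit; the group and item names are hypothetical.

```python
def max_cells(n_groups):
    """Upper bound on disjoint cells: one per non-empty combination of groups."""
    return 2 ** n_groups - 1

def cells_in_use(items):
    """Distinct cells needed: one per distinct set of authorized groups."""
    return {frozenset(groups) for groups in items.values()}

# Hypothetical items, each with the groups that must be able to see it.
items = {
    "lab_result": {"nurse", "physician"},
    "progress_note": {"physician", "nurse"},
    "invoice": {"billing"},
}
needed = cells_in_use(items)
```

Adding one new collaborating group can, in the worst case, double the number of cells that data originators must choose among.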
Today, security provisions for computing focus on controlling access.
Relying on access control makes the assumption that five conditions are fulfilled:
1. Authentication of all accessors
2. Perimeter control by use of a firewall or its equivalent
3. Authorizations that are complete and well-maintained
4. Secure transmission wherever physical access is not controlled
5. Partitioning of the information to match the authorization pattern
Unfortunately, in health care the last condition,
namely perfect partitioning of the information into cells for disjoint access,
is not realistic. We have many
accessors whose needs overlap. We cannot expect that medical staff can foresee
all the uses that medical information will serve, so that partitioning at the
time of data collection is impossible. Delays to partition data later are not
acceptable, since patient care demands that the record be accessible in a
comprehensive form and up-to-date [Rindfleisch:97]. Furthermore, performing
data partitioning to obtain security would greatly increase the cost of
healthcare.
Changing patterns of outsourcing of services
exacerbate the problem. Reorganizing healthcare databases to deal with
developing needs for external access is costly and disruptive, since it will
affect existing users and their applications. The problem has been recognized,
but not yet addressed in industry; for instance, security concerns were
cited as the prime reason for lack of progress in establishing virtual enterprises [HardwickS:96].
The
solution we provide to this dilemma is result
checking [WiederholdBSQ:96]. In addition to the conventional tasks of
access control the results of any information requests are filtered before
releasing them to the requestor. We also check a large number of parameters
about the release. This task mimics the manual function of a security officer
when checking the briefcases of collaborating participants leaving a secure
meeting, on exiting the secure facility.
Note that checking of result contents is not performed in standard security
processing. Multi-level secure systems may check for unwanted inferences when
results are composed from data at distinct levels, but rely on level
designations and record keys. Note that
result checking need not depend on the sources of the result, so that it
remains robust with respect to information categorization, software errors, and
misfiling of data.
We
incorporate result checking in a security mediator workstation, to be
managed by a security officer. The security
mediator system interposes security checking between external accessors and
the data resources to be protected, as shown in Fig.1. It carries out functions
of authentication and access control, to the extent that such services are not,
or not reliably, provided by network and database services. Physically a
security mediator is designed to operate on a distinct workstation, owned and
operated by the enterprise security officer (S.O.). It is positioned as a pass gate within the enterprise firewall,
if there is such a firewall. In our initial commercial installation the
security mediator also provided traditional firewall functions, by limiting the
IP addresses of requestors [WiederholdBD:98].
Fig.1. Functions provided by a TIHI/SAW Security
Mediator
The
mediator system and the source databases are expected to reside on different
machines. Thus, since all queries that
arrive from the external world, and their results, are processed by the
security mediator, the databases behind a firewall need not be secure unless
there are further internal requirements. When combined with an integrating
mediator, a security mediator can also serve multiple data resources behind a
firewall [Ullman:96]. Combining the
results of a query requiring multiple sources prior to result checking improves
the scope of result validation.
The
supporting database systems can still implement their view-based protection
facilities [GriffithsW:76]. These need not be fully trusted, but their
mechanisms add efficiency.
Within
the workstation is a rule-based system which investigates queries coming in and
results to be transmitted to the external world. Any request and any result which cannot be vetted by the rule
system is displayed to the security officer, for manual handling. The security officer decides to approve,
edit, or reject the information. An
associated logging subsystem provides an audit trail for all information that
enters or leaves the domain. The log
provides input to the security officer to aid in evolving the rule set, and
increasing the effectiveness of the system.
The software of our security mediator is composed of modules that perform the
following tasks:
1. Optionally (if there is no firewall): Authentication of the requestor
2. Determination of authorization type (clique) for the requestor
3. Processing of a request for information (pre-processing) using the policy rules
4. If the request is dubious: interaction with the security officer
5. Communication to internal databases (submission of certified request)
6. Communication from internal databases (retrieval of unfiltered results)
7. Processing of results (post-processing) using the policy rules
8. If the result is dubious: interaction with the security officer
9. Writing query, origin, actions, and results into a log file
10. Transmission of vetted information to the requestor
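The flow of tasks 2 through 10 can be sketched as a single function. This is an illustrative outline under stated assumptions, not the TIHI implementation: rules are modeled as hypothetical predicates that either vet an item or not, the security officer is a callback that approves or rejects, editing by the officer is omitted, and all names and data are invented.

```python
def mediate(requestor, query, cliques, query_rules, result_rules,
            database, officer, log):
    """Return the vetted result, or None if the query or result is rejected."""
    clique = cliques.get(requestor)                         # task 2
    # Tasks 3-4: a query that no rule vets is referred to the officer.
    if not any(rule(clique, query) for rule in query_rules):
        if not officer("query", requestor, query):
            log.append((requestor, query, "query rejected", None))
            return None
    result = database(query)                                # tasks 5-6
    # Tasks 7-8: the same treatment for the result before release.
    if not any(rule(clique, result) for rule in result_rules):
        if not officer("result", requestor, result):
            log.append((requestor, query, "result rejected", None))
            return None
    log.append((requestor, query, "released", result))      # task 9
    return result                                           # task 10


# Hypothetical set-up: one clique, one rule per direction, a stub database,
# and an officer who rejects everything referred for manual handling.
cliques = {"insurer_a": "payor"}
query_rules = [lambda clique, q: clique == "payor" and q.startswith("billing")]
result_rules = [lambda clique, rows: all("HIV" not in row for row in rows)]
database = lambda q: ["billing: visit 2000-10-02"]
strict_officer = lambda kind, who, item: False
log = []

released = mediate("insurer_a", "billing for case 12", cliques,
                   query_rules, result_rules, database, strict_officer, log)
refused = mediate("stranger", "billing for case 12", cliques,
                  query_rules, result_rules, database, strict_officer, log)
```

With empty rule lists nothing is vetted automatically, so every query and every result is referred to the officer.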
Item
7, the post-processing of the results obtained from the databases, possibly
integrated, is the critical additional function. Such processing is potentially
quite costly, since it has to deal thoroughly with a wide variety of data.
Applying such filters selectively, specifically for the problems raised in
collaborations, as well as the capabilities of modern computers and
text-processing algorithms, makes use of the technology feasible.
<<Historical information is important for disease management, but not for many billing tasks. It is obviously impossible to split the record into access categories that match every dimension of access. Even if that would be possible, the cost and risks to the internal operations in a hospital or clinic would be prohibitive. >>
A
rule-based system is used in TIHI to control the filtering, allowing the
security policies to be set so that a reasonable balance of cost to benefit is
achieved. It will be described in the
next section.
Having
rules, however, is optional. Without rules the mediator system will operate in
fully paranoid mode. Each query and each result will be submitted to the
security officer. The security officer will view the contents on-line, and
approve, edit, or reject the material. Adding rules enables automation. The
extent of automation depends on the coverage of the rule-set. A reasonable goal is the automatic
processing of, say, 90% of queries and 95% of responses.
Unusual
requests, perhaps issued because of a new coalition, assigned to a new clique,
will initially not have applicable rules, but can be immediately processed by
the security officer. In time, simple rules can be entered to reduce the load
on the officer.
Traditional
systems, based on access control to precisely defined cells, require a long
time before the data are set up, and when the effort is great, may never be
automated. In many situations we are
aware of, security mechanisms are ignored when requests for information are
deemed to be important, but cannot be served by existing methods. Keeping the
security officer in control allows any needed bypassing to be handled formally.
This capability recognizes that in a dynamic, interactive world there will always
be cases that are not foreseen or situations where the rules are too stringent. Keeping the management of exceptions within
the system greatly reduces confusion, errors, and liabilities.
Even
when operating automatically, the security mediator remains under the control
of the enterprise since the rules are modifiable by the security officer at all
times. In addition, logs are accessible
to the officer, who can keep track of the transactions. If some rules are found
to be too liberal, policy can be tightened. If rules are too stringent, as
evidenced by an excessive load on the security officer, they can be relaxed or
elaborated.
The
rules system is composed of the rules themselves, an interpreter for the rules,
and primitives which are invoked by the rules.
The rules embody the security policy of the enterprise. They are hence not preset into the software
of the security mediator.
In
order to automate the process of controlling access and ensuring the security
of information, the security officer enters rules into the system. These rules
trigger analyses of requests, their results, and a number of associated
parameters. The interpreting software
uses these rules to determine the validity of every request and make the
decisions pertaining to the disposition of the results. Auxiliary functions help the security officer enter appropriate rules and
update them as the security needs of the organization change.
The
rules are simple, short and comprehensive. They are stored in a database local
to the security mediator system with all edit rights restricted to the security
officer. Some rules may overlap, in
which case the most restrictive rule
automatically applies. The rules may pertain to requestors, cliques of
requestors having certain roles, sessions, database tables, or any combination
of these.
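How overlapping rules might be resolved can be illustrated with a small sketch. It assumes, purely for illustration, that each rule carries a numeric row limit and that "most restrictive" means the smallest limit; the class `LimitRule`, the function `effective_limit`, and the example names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class LimitRule:
    scope: str      # what the rule pertains to: "requestor", "clique", "session", "table"
    target: str     # the name the rule is attached to
    max_rows: int   # illustrative limit; a smaller value is more restrictive

def effective_limit(rules: List[LimitRule], context: dict) -> Optional[int]:
    """Among the rules matching the context, the smallest limit applies."""
    matching = [r.max_rows for r in rules if context.get(r.scope) == r.target]
    return min(matching) if matching else None

rules = [
    LimitRule("clique", "research", 500),
    LimitRule("table", "lab_results", 100),
    LimitRule("requestor", "dr_jones", 1000),
]
# All three rules overlap for this request; the table rule is the tightest.
context = {"requestor": "dr_jones", "clique": "research", "table": "lab_results"}
```

A return value of `None` means that no rule matched, in which case the item would be referred to the security officer rather than released.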
Rules
are selected based on the authorization clique determined for the
requestor. All the applicable rules
will be checked for every request issued by the requestor in every session. All
rules will be enforced for every requestor and the request will be forwarded to
the source databases only if it passes all tests. Any request not fully vetted is posted immediately to the log and sent to the security officer. The failure message is directed to the security officer and not to the requestor, so that the requestors in such cases will not see the failure and its cause. This prevents the requestor from interpreting failure patterns to make meaningful inferences, or rephrasing the request to try to bypass the filter [KeefeTT:89].
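The request check can be sketched as below. The code is an illustration under stated assumptions: the lists `security_log` and `officer_queue` stand in for the log and for notification of the security officer, and the neutral message returned to the requestor is an invention of this sketch, since the text specifies only that the failure and its cause are not shown.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A check returns None on success or a reason string on failure (hypothetical).
Check = Callable[[dict], Optional[str]]

security_log: List[dict] = []    # stands in for the mediator's security log
officer_queue: List[dict] = []   # stands in for notifying the security officer

def vet_request(request: dict, rules_by_clique: Dict[str, List[Check]]) -> Tuple[bool, str]:
    """Run every rule selected for the requestor's clique; forward only if all pass."""
    checks = rules_by_clique.get(request["clique"], [])
    reasons = []
    for check in checks:
        reason = check(request)
        if reason is not None:
            reasons.append(reason)
    if reasons or not checks:
        entry = {"request": request, "reasons": reasons or ["no applicable rules"]}
        security_log.append(entry)
        officer_queue.append(entry)
        # The requestor gets the same neutral message whatever the cause.
        return False, "Your request is being processed."
    return True, "forwarded"

def no_identifiers(request: dict) -> Optional[str]:
    return "asks for patient_name" if "patient_name" in request["columns"] else None

rules_by_clique = {"research": [no_identifiers]}
ok = vet_request({"clique": "research", "columns": ["age", "diagnosis"]}, rules_by_clique)
blocked = vet_request({"clique": "research", "columns": ["patient_name"]}, rules_by_clique)
```

The reason string appears only in the log entry, so repeated probing by a requestor yields no pattern of failure messages to analyze.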
The novel aspect of our approach is that the security mediator checks outgoing results as well. This is crucial since, from the security point of view, requests are inclusive, not exclusive, selectors of content and may retrieve unexpected information. In helpful, user-friendly information systems getting more than asked for is considered beneficial, but from a security point of view being generous is risky. Thus, even when the
request has been validated, the results are also subject to screening by a set
of rules. As before, all rules are
enforced for every requestor and the results are accessible only if they pass
all tests. Again, if the results
violate a rule, a failure message is logged and sent to the security officer
but not to the requestor.
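A small sketch shows why a validated request can still yield results that must be withheld. It assumes a hypothetical policy table `ALLOWED_COLUMNS` and treats each result row as a dictionary; neither is taken from TIHI.

```python
from typing import Dict, List, Set

# Hypothetical policy: the columns each clique may receive.
ALLOWED_COLUMNS: Dict[str, Set[str]] = {
    "research": {"age", "diagnosis", "visit_date"},
}

def screen_results(clique: str, rows: List[dict]) -> List[str]:
    """Return a list of violations; an empty list means the rows may be released."""
    allowed = ALLOWED_COLUMNS.get(clique, set())
    violations = []
    for i, row in enumerate(rows):
        extra = sorted(set(row) - allowed)
        if extra:
            violations.append(f"row {i}: unexpected columns {extra}")
    return violations

# The second row carries more than was asked for, as a helpful system might supply.
rows = [
    {"age": 61, "diagnosis": "glaucoma"},
    {"age": 47, "diagnosis": "cataract", "employer": "Acme Corp"},
]
violations = screen_results("research", rows)
```

Any non-empty list of violations would be logged and sent to the security officer, and the results would not be delivered.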
The
rules invoke executable primitive functions which operate on requests, data,
the log, and other information sources.
As new security functions and technologies appear, or if specialized
needs arise, new primitives can be inserted in the security mediator for
subsequent rule invocation. In fact, we do not expect to be the source of all
primitives. We do hope that all primitives will be sufficiently simple that
their correct function can be verified.
Primitives which have been used include:
· Assignment of a requestor to a clique
· Limit access for clique to certain database table segments or columns
· Limit request to statistical (average, median, ..) information
· Provide number of data instances (database rows) used in a statistical result
· Provide number of tables used (joins) for result for further checking
· Limit number of requests per session
· Limit number of sessions per period
· Limit requests by requestor per period
· Block requests from all but listed sites
· Block delivery of results to all but listed sites
· Block receipt of requests by local time at request site
· Block delivery of results by local time at delivery site
· Constrain request to data which is keyed to requestor name
· Constrain request to data which is keyed to request site name
· Filter all result terms through a clique-specific good-word dictionary
· Disallow results containing terms in a clique-specific bad-word dictionary
· Convert text by replacing identifiers with non-identifying surrogates [Sweeney:96]
· Convert text by replacing objectionable terms with surrogates
· Randomize responses for legal protection [Leiss:82]
· Extract text out of x-ray images (for further filtering) [WangWL:98]
· Notify the security officer immediately of failure reports
· Place failure reports only in the log
Not all primitives will have a role in all applications.
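The separation between rules and the primitives they invoke might look like the sketch below. The registry `PRIMITIVES`, the decorator `primitive`, and the two example primitives are hypothetical; they illustrate only the idea that rules are data naming small, individually verifiable functions, and that new primitives can be added without changing the interpreter.

```python
from typing import Callable, Dict, List, Tuple

# Registry of primitives; names and signatures are assumptions of this sketch.
PRIMITIVES: Dict[str, Callable[..., bool]] = {}

def primitive(name: str):
    """Decorator that registers a function so that rules can invoke it by name."""
    def register(fn):
        PRIMITIVES[name] = fn
        return fn
    return register

@primitive("site_listed")
def site_listed(request: dict, sites: List[str]) -> bool:
    # Block requests from all but listed sites.
    return request["site"] in sites

@primitive("max_requests_per_session")
def max_requests_per_session(request: dict, limit: int) -> bool:
    # Limit number of requests per session.
    return request["session_count"] <= limit

# A rule is data: a primitive name plus its arguments.
Rule = Tuple[str, tuple]

def run_rules(request: dict, rules: List[Rule]) -> List[str]:
    """Return the names of the primitives that the request failed."""
    return [name for name, args in rules if not PRIMITIVES[name](request, *args)]

rules = [("site_listed", (["clinic.example.org"],)), ("max_requests_per_session", (20,))]
request = {"site": "clinic.example.org", "session_count": 25}
failed = run_rules(request, rules)
```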
Primitives
can vary greatly in cost of application, although modern technology helps.
Checking for terms in results is costly in principle, but modern spell-checkers
show that it can be done fairly fast. For this task we create clique-specific
dictionaries, by initially processing a substantial amount of approved
results. In initial use the security
officer will still get false failure reports, due to innocent terms that are
not yet in the dictionary. Those will be incrementally added, so that in time
the incidence of such failures will be minimal.
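The dictionary procedure can be sketched in a few lines. The tokenizer and the function names `build_dictionary` and `unknown_terms` are assumptions of this sketch, and the sample texts are invented; a production filter would need a more careful treatment of numbers, abbreviations, and word forms.

```python
import re
from typing import Iterable, List, Set

def terms(text: str) -> List[str]:
    """Lower-cased alphabetic tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z]+", text.lower())

def build_dictionary(approved_results: Iterable[str]) -> Set[str]:
    """Seed a clique-specific good-word dictionary from approved results."""
    words: Set[str] = set()
    for text in approved_results:
        words.update(terms(text))
    return words

def unknown_terms(text: str, dictionary: Set[str]) -> List[str]:
    """Terms not in the dictionary; a non-empty list triggers a failure report."""
    return sorted(set(terms(text)) - dictionary)

good_words = build_dictionary([
    "intraocular pressure elevated in left eye",
    "retina normal, no sign of glaucoma",
])
result = "left eye pressure normal, patient is pregnant"
report = unknown_terms(result, good_words)

# The officer judges "patient" and "is" innocent and adds them; "pregnant" stays out.
good_words.update({"patient", "is"})
later = unknown_terms(result, good_words)
```

Set membership makes each term lookup a constant-time operation on average, which is one reason term checking can be fast despite the volume of text.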
For example, we have in use a dictionary for ophthalmology, to allow authenticated researchers in that field to have access to patient data. That dictionary does not include terms that would signal, say, HIV infection or pregnancies, information which the patients would not like to see released to unknown research groups. Also, all proper names, places of employment, etc. are effectively filtered.
Figure 2. Extract from a report to the Security Officer
Several of these primitives are designed to help control inference problems in statistical database queries [AdamW:89]. While neither we nor any feasible system can prevent leaks due to inference, we believe that careful management can reduce the probability [Hinke:88]. Furthermore, providing the tools for analysis, such as logging all accesses, will reduce the practical threat [Hinke:88], [Sweeney:97]. The primitive to enforce dynamic limits on access frequencies will often have to refer to the log, so that efficient access to the log, for instance by maintaining a large write-through cache for the log, will be important. Here again the functions of traditional database support and security mediation diverge, since database transactions are best isolated, whereas inference control requires history maintenance.
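A frequency limit that depends on history can be sketched as follows. The class `RequestHistory` is hypothetical and keeps recent request times in memory only; it stands in for the cached portion of the log, and writing each entry through to the persistent log is omitted.

```python
from collections import deque
from typing import Deque, Dict

class RequestHistory:
    """In-memory view of recent log entries per requestor (hypothetical helper)."""

    def __init__(self, window_seconds: float, limit: int):
        self.window = window_seconds
        self.limit = limit
        self._times: Dict[str, Deque[float]] = {}

    def allow(self, requestor: str, now: float) -> bool:
        """Record the request and report whether it is within the limit."""
        times = self._times.setdefault(requestor, deque())
        # Drop entries that have fallen out of the period.
        while times and now - times[0] >= self.window:
            times.popleft()
        times.append(now)  # attempts are recorded even when refused
        return len(times) <= self.limit

# At most three requests per hour; times are in seconds.
history = RequestHistory(window_seconds=3600, limit=3)
decisions = [history.allow("dr_jones", t) for t in (0, 10, 20, 30, 4000)]
```

Unlike an isolated database transaction, each decision here depends on what the same requestor did earlier, which is the divergence the text describes.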
Throughout,
the failures, as well as the request text and source, and actions taken by the
security officer, are logged by the system for audit purposes. Having a security log which is distinct from
the database log is important since:
· A database system logs all transactions, not just external requests, and is hence confusingly voluminous
· Most database systems do not log attempted and failed requests fully, because they appear not to have affected the databases
· Reasons for failure of requests in database logs are implicit, and do not give the rules that caused them.
We
provide user-friendly utilities to scan the security log by time, by requestor,
by clique, and by data source.
Offending terms in results are marked.
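A scan of such a log might be sketched as below. The record layout `LogEntry`, the functions `scan` and `mark`, and the bracket notation for marked terms are assumptions made for illustration; the actual utilities are not specified in this chapter.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class LogEntry:
    time: float      # seconds since some epoch
    requestor: str
    clique: str
    source: str      # data source the request was aimed at
    rule: str        # the rule that caused the failure, stated explicitly
    text: str        # request or result text

def scan(log: List[LogEntry], *, since: Optional[float] = None,
         requestor: Optional[str] = None, clique: Optional[str] = None,
         source: Optional[str] = None) -> List[LogEntry]:
    """Return the entries matching every criterion that was supplied."""
    def keep(e: LogEntry) -> bool:
        return ((since is None or e.time >= since)
                and (requestor is None or e.requestor == requestor)
                and (clique is None or e.clique == clique)
                and (source is None or e.source == source))
    return [e for e in log if keep(e)]

def mark(text: str, offending: Set[str]) -> str:
    """Mark offending terms by enclosing them in square brackets."""
    return " ".join(f"[{w}]" if w.lower() in offending else w for w in text.split())

log = [
    LogEntry(100, "dr_jones", "research", "records_db", "bad-word dictionary",
             "patient is pregnant"),
    LogEntry(200, "clerk_a", "billing", "accounts_db", "site not listed",
             "select charges"),
]
hits = scan(log, clique="research")
marked = mark(hits[0].text, {"pregnant"})
```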
No
system, except one that provides complete isolation, can be 100% foolproof. The
provision of security is, unfortunately, a cat-and-mouse game, where new
threats and new technologies keep arising.
Logging provides the feedback which converts a static approach to a
dynamic and stable system, which can maintain an adequate level of protection.
Logs will have to be inspected regularly to achieve stability.
Bypassing
of the entire system and hence the log remains a threat. Removal of information
on portable media is easy. Only a few
enterprises can afford to place controls on all personnel leaving daily for
home, lunch, or competitive employment.
However, having an effective and adaptable security filter removes the
excuse that information had to be downloaded and shipped out because the system
was too stringent for legitimate purposes.
Some enterprises are considering limiting internal workstations to be
diskless. It is unclear how effective this approach will be outside of small,
highly secure domains in an enterprise.
Such a domain will then have to be protected with its own firewall and a
security mediator as well, because collaboration between the general and highly
secure internal domains must be enabled.
Our
initial demonstrations have been in the healthcare domain, and a commercial
version of TIHI is now in use to protect records of genomic analyses in a
pharmaceutical company. As the
expectations for protection of the privacy of patient data are being solidified
into governmental regulations we expect that our approach will gain popularity
[Braithwaite:96]. Today the healthcare establishment
still hopes that commercial encryption tools will be adequate for the
protection of medical records, since the complexity of managing access
requirements has not yet been faced [RindKSSCB:97]. Expenditures for security in medical enterprises are minimal
[NRC:97]. Funding of adequate provisions in an industry under heavy economic
pressures, populated with many individuals who do not attach much value to the
privacy of others, will remain a source of stress.
Identifying
information is routinely deleted from medical records that are
disseminated for research and
education. However, here a gap existed as well: X-ray, MRI, and similar images
accompany many records, and these also include information identifying the
patient. We have developed software which recognizes such text using
wavelet-based decomposition and analysis, extracts it, and can submit it to the
filtering system developed in TIHI.
Information which is determined to be benign can be retained, and other
text is effectively removed by omitting high-frequency components in the
affected areas [WangWL:98].
We
have also investigated our original motivating application area, namely
manufacturing information. Here the simple web-based interfaces which are
effective for the customer and the security officer interfaces in health care
are not adequate. We have demonstrated
interfaces for the general viewing and editing of design drawings and any
attached textual information. In
drawings significant text may be incorporated in the drawings themselves. When
delivering an edited drawing electronically, we also have to assure that there
is no hidden information. Many design
formats allow undo operations, which
would allow apparently deleted information to reappear.
Before
moving to substantial automation for collaboration in manufacturing, we will
have to understand the parameters for reliable filtering of such information
better. However, as pointed out initially, even a fully manual security
mediator will provide a substantial benefit to enterprises that are trying to
institute shared efforts rapidly.
However,
break-ins still occur. Most of them are initiated via legitimate access paths,
since the information in our systems must be shared with customers and
collaborators. In that case the first three technologies provide no protection,
and the burden falls on the mappings and the categorization of the information.
Once users are permitted into the system, protection becomes more difficult.
In
the near future the requirements for security of the medical record, be it on
paper or in electronic form, will be increasing. Protection of what patients perceive to be their private
information is becoming important.
Legal obligations will arise as well, but limiting protection to what
appears to be the legal minimum may well be unattractive. In any case, it will take
some time before case law catches up with legal guidelines. The management of
healthcare institutions must be prepared to define policies and supervise their
implementation. Assigning responsibilities for security to database or network
personnel, who have primary responsibilities of making data and communication
available, will conflict with security concerns and is unwise. These people are
promoted to their positions because they have a helpful attitude and know how
to overcome problems of system failures and inadequacies. This attitude is inherently in conflict with
corporate responsibilities for the protection of data. Outside vendors of
products will not advertise the weaknesses of their approaches to security,
especially in respect to the complexity of the requirements imposed on a
medical record.
We have presented security mediation as an
architectural function as well as a specific service. Architecturally,
expanding the role of a gateway in the firewall from a passive filter to an
active pass gate service allows concentration of the responsibility for
security to a single node, owned by the security officer. Existing
technologies, such as constraining authorization views over databases, encryption
for transmission in networks, password management in operating systems, etc.,
can be managed via the security mediator node.
The specific, novel service presented here, result
checking, complements traditional access control. We have received a patent to
cover the concept. Checking results is especially relevant in systems with many
types of users, including external collaborators, and complex information
structures. In such settings the requirement imposed by systems that are limited to access control, namely that all data are correctly partitioned and filed, is not achievable in practice. Result checking does not address all issues of security, of course, such as protection from erroneous or malicious updates, although it is likely that such attacks will be preceded by processes that extract information. A side-effect of result checking is that it provides a level of intrusion detection.
The rule-based approach allows balancing of the need
for preserving data security and privacy and for making data available. Data which is too tightly controlled reduces
the benefits of sharable information in
collaborative settings. Rules which are
too liberal can violate security and expectations of privacy. Having a balanced policy will require
directions from management. Having a
single focus for execution of the policy in electronic transmission will
improve the consistency of the application of the policy.
Result filters do not solve all security problems, of course. They still rely on a minimum level of reliability in the supporting systems. They cannot
compensate when information is missing or not found because of
misidentification. In general, a
security mediator cannot protect from inadvertent or intentional denial of information
by a mismanaged database system.
Research leading to security mediators was supported by an NSF HPCC challenge grant and by DARPA ITO via Arpa order E017, as a subcontract via SRI International. Steve Dawson was the PI at SRI. The commercial transition was performed by Maggie Johnson, Chris Donahue, and Jerry Cain under contracts with SST (www.2ST.com). Work on editing and filtering graphics is due to Jahnavi Akalla and James Z. Wang. Some of this material has appeared in earlier publications [Wiederhold:00].
[AMA:94] American Medical Association: “Confidentiality: Computers”; Code of Medical Ethics, American Medical Association, 1994.
[Beth:95] Thomas Beth: “Confidential Communication on the Internet”; Scientific American, December 1995, pp.88-91.
[CastanoFMS:95] S.Castano, M.G. Fugini, G.Martella, and P. Samarati: Database Security; Addison Wesley Publishing Company - ACM Press, 1995.
[ChapmanZ:95] D. Brent Chapman and Elizabeth D. Zwicky: Building Internet Firewalls; O’Reilly and Associates, 1995.
[CheswickB:94] William R. Cheswick and Steven M. Bellovin: Firewalls and Internet Security; Addison-Wesley, 1994.
[ClaytonEa:97] Paul Clayton (chair): For the Record; Protecting Electronic Health Information; National Academy Press, 1997.
[CPRI:96] Computer-based Patient Record Institute: Guidelines for managing Information Security Programs; Work Group on Confidentiality, Privacy, and Security, CPRI, 1996.
[DickS:91] Richard S. Dick and Elaine B. Steen (eds.): The Computer-based Medical Record: An Essential Technology for Health Care; Institute of Medicine, National Academy Press, 1991.
[Didriksen:97] Tor Didriksen: “Rule-based Database Access control – A Practical Approach”; Proc. 2nd ACM workshop on Rule-based Access Control, 1997, pp.143-151.
[Diffie:88] Whitfield Diffie: “The First Ten Years of Public-Key Cryptography”; Proc. IEEE, Vol.76 No.5, May 1988, pp.560-577.
[DonaldsonL:94] Molla S. Donaldson and Kathleen L. Lohr (eds): Health Data in the Information Age: Use, Disclosure, and Privacy; Institute of Medicine, National Academy Press, 1994.
[EvansEa:86] R. Scott Evans et al.: “Computer Surveillance of Hospital-acquired Infections and Antibiotic Use”; J. of the AMA, Vol.256 No.8, 1986, pp.1007-1011.
[GriffithsW:76] Patricia P. Griffiths and Bradford W. Wade: “An Authorization Mechanism for a Relational Database System”; ACM Trans. on Database Systems, Vol.1 No.3, Sept.1976, pp.242-255.
[HardwickS:96] M. Hardwick, D.L. Spooner, T. Rando, and KC Morris: "Sharing Manufacturing Information In Virtual Enterprises"; Comm. ACM, Vol.39 no.2, pp.46-54, February 1996.
[HolmesWM:91] J.P. Holmes, L.J. Wright, and R.L.Maxwe: A Performance Evaluation of Biometric Identification Devices; Sandia Report SAND91-0276, Sandia National Laboratories, June 1991.
[JohnsonSV:95?] Johnson DR, Sayjdari FF, Van Tassel JP.: Missi security policy: A formal approach. Technical Report R2SPO-TR001, National Security Agency Central Service, July 1995.
[KonheimEa:80] A.G. Konheim, M.H. Mack, R.K. McNeil, B. Tuckerman: “The IPS Cryptographic Programs”; IBM Sys. J., Vol.19 No.2, 1980, pp.302-307.
[LandwehrHM:84] Carl E. Landwehr, C.L. Heitmyer, and J.McLean: “A Security Model for Military Message Systems”; ACM Trans. on Computer Systems, Vol.2 No.3, Aug. 1984, pp. 198-222.
[LuniewskiEa:93] Luniewski, A. et al. "Information organization using Rufus" SIGMOD '93, ACM SIGMOD Record, June 1993, vol.22, no.2 p. 560-1
[QianW:97] Qian, XiaoLei and Gio Wiederhold: "Protecting Collaboration"; abstract for IEEE Information Survivability Workshop, ISW'97, Feb.1997, San Diego.
[Regan:95] Priscilla M. Regan: Legislating Privacy, Technology, Social Values, and Public Policy; University of North Carolina Press, 1995.
[RindKSSCB:97] David M. Rind, Isaac S. Kohane, Peter Szolovits, Charles Safran, Henry C. Chueh, and G. Octo Barnett: "Maintaining the Confidentiality of Medical Records Shared over the Internet and the World Wide Web"; Annals of Internal Medicine 15 July 1997. 127:138-141.
[Rindfleisch:97] Thomas C. Rindfleisch: Privacy, Information Technology, and Health Care; Comm. ACM; Vol.40 No. 8 , Aug.1997, pp.92-100.
[SchaeferS:95] M. Schaefer, G. Smith: “Assured discretionary access control for trusted RDBMS”; in Proceedings of the Ninth IFIP WG 11.3 Working Conference on Database Security, 1995:275-289.
[Seligman:99] Len Seligman, Paul Lehner, Ken Smith, Chris Elsaesser, and David Mattox: "Decision-Centric Information Monitoring"; Jour. of Intelligent Information Systems (JIIS), Vol.14, No.1.; also at http://www.mitre.org/pubs/edge/june_99/dcim.doc
[Sweeney:96] Latanya Sweeney: "Replacing personally-identifying information in medical records, the SCRUB system"; Cimino, JJ, ed. Proceedings, Journal of the American Medical Informatics Association, Washington, DC: Hanley & Belfus, 1996, Pp.333-337.
[Sweeney:97] Latanya Sweeney: "Guaranteeing anonymity when sharing medical data, the DATAFLY system"; Proceedings, Journal of the American Medical Informatics Association, Washington DC, Hanley & Belfus, 1997.
[Ullman:97?] Jeffrey Ullman: Information Integration Using Logical Views; International Conference on Database Theory (ICDT '97) Delphi, Greece, ACM and IEEE Computer Society, 1997.
[Venema:92] Wietse Venema: “TCP wrapper: Network Monitoring, Access Control, and Booby Traps”; Proc.3rd Usenix Security Symp., Baltimore MD, 1992.
[WangWL:98] James Z. Wang, Gio Wiederhold and Jia Li: "Wavelet-based Progressive Transmission and Security Filtering for Medical Image Distribution"; in Stephen Wong (ed.): Medical Image Databases; Kluwer publishers, 1998, pp.303-324.
[WiederholdBC:98] Gio Wiederhold, Michel Bilello, and Chris Donahue: "Web Implementation of a Security Mediator for Medical Databases"; in T.Y. Lin and Shelly Qian:Database Security XI, Status and Prospects, IFIP / Chapman & Hall, 1998, pp.60-72.
[WiederholdBSQ:96] Gio Wiederhold, Michel Bilello, Vatsala Sarathy, and XiaoLei Qian: "A Security Mediator for Health Care Information"; Journal of the AMIA, issue containing the Proceedings of the 1996 AMIA Conference, Oct. 1996, pp.120-124.
[Wiederhold:00] Gio Wiederhold: “Protecting Information when Access is Granted for Collaboration”; to appear in Springer Verlag Volume<>
------------------------------ o ---------------------------------------------------------------- o ----------------------------