TID - Trusted Image Dissemination

Image Filtering for Secure Distribution of Medical Information

Support

NSF Award 9817511, per September 18, 1998
Program: Digital Library II (DLI-II) program.
Program Manager: Stephen Griffin, Division of Information & Intelligent Systems (IIS), Directorate for Computer & Information Science & Engineering (CSE)
Start Date: January 1, 1999. Expires: December 31, 2001 (Estimated). Expected Total Amount: $519K (Estimated)

Investigator: Gio Wiederhold (Computer Science, Electrical Engineering, and Medicine) <gio @ cs. stanford. edu>

Michel Bilello, PhD EE, MD <michel @ cs. stanford. edu>
Jia Li, PhD EE (now Associate Professor at Penn State University) <jiali @ db. stanford. edu>
James Z. Wang, PhD Med. Info. Sci. (now Associate Professor at Penn State University) <wangz @ cs. stanford. edu>

David Liu (CSD and Civil Engineering) <Davidliu @ stanford. edu>
Neal Sample (CSD) <NSample @ cs. stanford. edu>

Sponsor: Stanford University
651 Serra St.
Stanford, CA 94305 415/723-2300

Our objective is to achieve secure sharing of multimedia medical information on the Internet while avoiding violations of security or privacy.

Abstract

We are investigating, developing, and testing filtering capabilities to complement other means of controlling release of medical documents to individuals and organizations outside of the direct health-care delivery setting. Our specific focus in the TID project is the information contained in images that are part of an electronic medical record. Earlier work (TIHI) focused on protecting textual material.

An increasing amount of information being transmitted over the Internet is in image form. This trend will certainly affect medical images (used in diagnosis or research) in the near future. Such information has not been processed in the past with concern for security or privacy. While privacy and security control of textual materials has long been a focus of research activities, images present new and more challenging problems. Our approach will provide an innovative capability based on experience with image databases and with protecting the privacy of information in medical databases.

Architecture

Our architectural model is based on the concept of a software tool, called a security mediator, that enables legitimate external customers to gain remote electronic access to medical information residing in a medical institution, while inhibiting the release of content that must not be released, even when the accessors appear to be authorized. Such a security mediator is typically placed in the firewall surrounding the institution's internal database activities.

Nearly all approaches to security focus on controlling access. Unfortunately, relying on access control alone requires a perfect organization of the internal data in an enterprise. In many practical cases this requirement cannot be fulfilled, since it implies a radical restructuring of all internal information services. Aligning all internal data with external access privileges is costly not only from the systems point of view, but also for all internal users of information systems, who must then file all data according to external requirements that are normally none of their concern. Storing and securely labeling data in duplicate can solve the access problem, but does not reduce the load on the participants. For instance, in a hospital, if some X-rays are to be released for research purposes, then the X-ray images must be duplicated without identifying information.

Approach

Since we cannot simply prohibit or disable access to relevant information for legitimate collaborators, we approach the issue through content-based filtering. Such filtering is based on an open set of parameters, typically including the identification of the requestor, the role of the requestor, the rights assigned to that role, the time and place where the request was issued and where the results are to be delivered, and factors that can imply improper use, such as unusual access frequencies.
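The parameters listed above can be pictured as inputs to a release decision. The following is a minimal sketch, not the TIHI/TID rule system: the role names, the rights table, the frequency threshold, and the function may_release are all hypothetical, and time-of-request rules are omitted for brevity.

```python
from dataclasses import dataclass

# Hypothetical role-to-rights table; roles and rights are illustrative only.
ROLE_RIGHTS = {
    "researcher": {"deidentified_image"},
    "referring_physician": {"deidentified_image", "identified_image"},
}

# Assumed threshold for what counts as an "unusual access frequency".
MAX_REQUESTS_PER_HOUR = 20


@dataclass
class Request:
    requestor: str             # identification of the requestor
    role: str                  # role the requestor acts in
    item_kind: str             # kind of item being requested
    delivery_site: str         # where the results are to be delivered
    recent_request_count: int  # requests by this requestor in the last hour


def may_release(req, trusted_sites):
    """Return (decision, reason) for one request; every rule here is an assumption."""
    rights = ROLE_RIGHTS.get(req.role)
    if rights is None:
        return False, "unknown role"
    if req.item_kind not in rights:
        return False, "role lacks right to this kind of item"
    if req.delivery_site not in trusted_sites:
        return False, "untrusted delivery site"
    if req.recent_request_count > MAX_REQUESTS_PER_HOUR:
        return False, "unusual access frequency"
    return True, "ok"


request = Request("alice", "researcher", "deidentified_image", "partner.example.org", 3)
decision, reason = may_release(request, trusted_sites={"partner.example.org"})
```

The checks are ordered so that a request is refused as soon as any single parameter fails, and an unknown role is refused by default. Because the parameter set is open, further checks can be appended without changing the existing ones.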

Filtering of images in addition to text is becoming essential, since modern computing has greatly facilitated the use of information in image form. For instance, we have encountered problems specific to the dissemination of electronic medical information for web-based education.

Technology

The image filtering that we propose will rely primarily on wavelet technology. Work has been completed at Stanford that demonstrates the capability of indexing and retrieving images by wavelet transform analysis. The wavelet approach has been demonstrated to be fast and highly reliable. Its formal basis provides more stability in development than ad-hoc approaches, and the approach has also been easy to transfer among programming languages.

We have developed novel means of recognizing features in images, specifically in linking perceptual factors to parameters in wavelet-based analysis of images. We will further develop the existing algorithms focusing on the properties evidenced by text placed within images. We anticipate that secure transmission of electronic medical information over the Internet will be a major area of application.
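To illustrate why wavelet analysis is a plausible tool for finding text placed within images, the sketch below uses the simplest wavelet, the Haar wavelet, as a stand-in; it is not the parameterized wavelet method developed in the project. It sums the squared row-wise and column-wise Haar detail coefficients of a small grayscale block. Sharp strokes, such as burned-in lettering, yield large detail coefficients, while smooth areas yield values near zero. The function names and sample blocks are hypothetical.

```python
def haar_1d(values):
    """One level of the unnormalized Haar transform: pairwise averages and differences.

    `values` must have even length.
    """
    averages = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return averages, details


def detail_energy(block):
    """Sum of squared row-wise and column-wise Haar detail coefficients.

    `block` is a list of equal-length rows with even width and even height.
    """
    energy = 0.0
    for row in block:                 # horizontal detail along each row
        _, details = haar_1d(list(row))
        energy += sum(c * c for c in details)
    for column in zip(*block):        # vertical detail along each column
        _, details = haar_1d(list(column))
        energy += sum(c * c for c in details)
    return energy


smooth_block = [[100, 100, 100, 100] for _ in range(4)]   # uniform, tissue-like area
stroke_block = [[0, 255, 0, 255] for _ in range(4)]       # sharp vertical strokes

smooth_energy = detail_energy(smooth_block)
stroke_energy = detail_energy(stroke_block)
```

In this toy case the uniform block has zero detail energy and the striped block has a large one, so a threshold on such a measure could flag candidate text regions. A real detector would need multiple scales and tuning against actual medical images.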

We developed a system, based on parameterized wavelets, that can recognize text in images and extract it from the image for submission to the content-checking rules that our base TIHI system provides. We introduced Optical Character Recognition (OCR) technology to positively identify releasable contents, so that as much information as is consistent with the restrictions imposed by privacy concerns can be released. Text not vetted to be releasable is blanked out of the result.
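The final step, blanking out text that was not positively vetted, can be sketched as follows. This is an illustrative outline under stated assumptions, not the project's code: the image is a plain list of pixel rows, the text regions and their OCR readings are assumed to come from earlier detection and OCR stages that are not shown, and is_releasable is a hypothetical stand-in for the TIHI content-checking rules.

```python
from dataclasses import dataclass


@dataclass
class TextRegion:
    top: int      # first row of the region
    left: int     # first column of the region
    bottom: int   # one past the last row
    right: int    # one past the last column
    text: str     # what the (assumed) OCR stage read; "" if unreadable


def is_releasable(text, allowed_terms):
    """Stand-in rule check: release only text that exactly matches an allowed term."""
    return text.strip().upper() in allowed_terms


def filter_image(pixels, regions, allowed_terms, blank_value=0):
    """Return a copy of `pixels` with every non-vetted text region blanked out."""
    result = [row[:] for row in pixels]
    for region in regions:
        if is_releasable(region.text, allowed_terms):
            continue  # positively identified as releasable: leave untouched
        for y in range(region.top, region.bottom):
            for x in range(region.left, region.right):
                result[y][x] = blank_value
    return result


image = [[9] * 6 for _ in range(4)]
regions = [
    TextRegion(0, 0, 1, 3, "LEFT"),        # orientation marker, allowed
    TextRegion(3, 2, 4, 6, "DOE, JOHN"),   # patient name, not allowed
]
cleaned = filter_image(image, regions, allowed_terms={"LEFT", "RIGHT"})
```

The sketch defaults to denial: a region whose text is unreadable or unknown is blanked, which mirrors the requirement that only positively identified content is released.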

Prior Work

Prior research work was based on the development of IMAGE Database Search technology performed for the Stanford University Libraries and extended as part of project class work (CS54I). Outcomes of this work include (i) recognition of objectionable images (WIPE) and (ii) identification of objectionable (pornographic) web sites (IBCOW), currently being investigated for adoption by some web-service providers.

We are cooperating with the medical image processing research (ICBM) at UCLA and LRI at UCSF, as well as with our local Stanford Medical School CT facilities.

Summary Points

Specific projects related to security and privacy of images include:

Identification of the existence and location of textual markings on X-rays and other medical images.
Recognition of text fragments and matching them to privacy protection rules.
The removal of textual markings that are not recognized as being releasable from medical images.
Providing the image filtering services within the broader context of a security mediator architecture.

The image processing technology is based on feature extraction using wavelet-based decomposition, shape moments, and the linking of semantically relevant perception parameters with the features.

The research effort will focus on developing further wavelet-based algorithms for searching medical image databases and retrieving relevant information from multimedia medical databases; extracting textual information from images; advancing practices for the protection of privacy and implementing a security mediator; and exploring WWW interfaces for security mediators.

Applicability

An increasing amount of information being transmitted over the Internet is in image form. Filtering of images in addition to text becomes more essential as modern computing and communications facilitate the use of information in image form. This project proposes to provide image filtering capabilities to complement other means of checking the contents of documents.

This trend includes medical images used in diagnosis and research, and other materials for which it is desirable to avoid violations of security and privacy. While the current domain of interest is electronic medical records, the research products are expected to be generalizable to other domains. For instance, in manufacturing, drawings containing proprietary data must be edited if the decision is made to have the parts produced by an external subcontractor.

Being able to recognize text within images will also enable analysis of text now hidden in logos and headlines of web pages. We see applications both in improving information retrieval and in filtering access to objectionable web sites.

Some Papers:

James Ze Wang, Michel Bilello, Gio Wiederhold: "Textual Information Detection and Elimination System for Secure Medical Image Distribution"; Proceedings of the 1997 American Medical Informatics Association (AMIA'97) Annual Fall Symposium (formerly SCAMC), Nashville, Tennessee, October 1997 (source report).
James Ze Wang, Gio Wiederhold: "System for Efficient and Secure Distribution of Medical Images on the Internet"; Proceedings of the 1998 American Medical Informatics Association (AMIA'98) Annual Fall Symposium, Orlando, Florida, November 1998 (source report).
Wang, James Z., Gio Wiederhold, and Jia Li: "Wavelet-based Progressive Transmission and Security Filtering for Medical Image Distribution"; in Stephen Wong (ed.): Medical Image Databases; Kluwer publishers, 1998, pp.303-324.
Wiederhold, Gio, Michel Bilello, Vatsala Sarathy, and XiaoLei Qian: "A Security Mediator for Health Care Information"; presented and published in the Journal of the AMIA issue containing the Proceedings of the 1996 AMIA Conference, Washington DC, October 27, 1996, pp.120-124 (originally in postscript; updated version in Word format, amia2.doc). An editorial in D-LIB magazine (in html; local copy) was subsequently published.
Wiederhold, Gio, Michel Bilello, and Chris Donahue: "Web Implementation of a Security Mediator for Medical Databases"; in T.Y. Lin and Shelly Qian: Database Security XI, Status and Prospects, IFIP / Chapman & Hall, 1998, pp.60-72.
Wiederhold, Gio and Michel Bilello: "Protecting Inappropriate Release of Data from Realistic Databases" (html); Proc. of Data and Expert Systems (DEXA) Security Workshop, IEEE, August 1998, pp.330-339.

For more information on follow-up research in image databases and relevant publications, please visit this link.

The successor project, SAW, focuses on protecting shared manufacturing data. Image filtering is becoming relevant in manufacturing domains as well, since more and more computerized information in manufacturing and business involves images, but security of contents is not supported within the scope of most research efforts.