Database Seminar, May 29, 1997

Knowledge-Base Management Tools for Constructing a Knowledge Base of Biochemical Pathways

Peter D. Karp
AI Center
SRI International


EcoCyc is a biological knowledge base (KB) that describes the genome and the biochemical pathways of the bacterium E. coli. EcoCyc can be viewed both as an online scientific review article and as a qualitative model of the cellular biochemical factory within E. coli. The KB is available via the WWW as an electronic reference source for scientists, and is the foundation of a program that predicts the biochemical pathways of other bacteria from their DNA sequences. This talk will describe:

o The goals of the EcoCyc project and the contents of the EcoCyc KB.

o A biology-specific graphical interface developed for EcoCyc.

o A general-purpose tool for retrofitting the EcoCyc GUI, which was written for the X Window System, to run through the WWW.

o A frame (object-oriented) knowledge representation system (FRS) called Ocelot, which is used to manage the EcoCyc KB. Ocelot uses a relational DBMS for persistent storage of KBs; it includes a logging system that is used for versioning, for optimistic concurrency control, and for capturing the history of schema evolution.

o A reusable KB browser and editor for FRSs called the GKB Editor. The GKB Editor is used for interactive editing of the EcoCyc KB. It provides four editing tools: a class-hierarchy browser, a semantic-network-style browser, a frame editor, and a spreadsheet editor. Each tool is suited to different browsing and editing tasks. The GKB Editor is reusable across multiple FRSs because it performs all FRS operations through a generic API for FRSs.
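
The idea of an editor that is reusable because it talks only to a generic FRS API can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the FrameStore interface, its three methods, the DictFrameStore backend, and the frame_as_row helper are hypothetical and are not the API used by Ocelot or the GKB Editor.

```python
from typing import Any, Protocol


class FrameStore(Protocol):
    """Hypothetical generic FRS interface; method names are illustrative only."""

    def get_slot_values(self, frame: str, slot: str) -> list[Any]: ...
    def put_slot_values(self, frame: str, slot: str, values: list[Any]) -> None: ...
    def get_frame_slots(self, frame: str) -> list[str]: ...


class DictFrameStore:
    """Toy in-memory backend: frame name -> slot name -> list of values."""

    def __init__(self) -> None:
        self._frames: dict[str, dict[str, list[Any]]] = {}

    def get_slot_values(self, frame: str, slot: str) -> list[Any]:
        # Unknown frames or slots yield an empty list rather than an error.
        return list(self._frames.get(frame, {}).get(slot, []))

    def put_slot_values(self, frame: str, slot: str, values: list[Any]) -> None:
        # Create the frame on first write and replace the slot's values.
        self._frames.setdefault(frame, {})[slot] = list(values)

    def get_frame_slots(self, frame: str) -> list[str]:
        return sorted(self._frames.get(frame, {}))


def frame_as_row(store: FrameStore, frame: str, slots: list[str]) -> list[str]:
    """Render one frame as a spreadsheet-style row using only the generic API."""
    cells = [", ".join(map(str, store.get_slot_values(frame, s))) for s in slots]
    return [frame] + cells


if __name__ == "__main__":
    store = DictFrameStore()
    store.put_slot_values("example-gene", "product", ["example-protein"])
    print(frame_as_row(store, "example-gene", ["product"]))
```

Because frame_as_row depends only on the FrameStore interface, the same function would work unchanged with any other backend that supplies those three methods, for example one that keeps frames in a relational DBMS.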