DB Seminar (CS 545) talk

Graphs: Large scale storage and rapid retrieval.

Robert Levinson
Department of Computer Science, University of California, Santa Cruz, CA 95064

levinson@cse.ucsc.edu

Abstract

Graph structures have great promise in many applications such natural language processing and understanding, knowledge acquisition, systems modeling, and data mining to name a few . Development and research in this domain, however, has suffered from the belief that this promising technology will not scale, even on multi processor computers. The first attempt (early 1996) at converting 100,000 English sentences to conceptual graphs (CGs) and matching the CG database against a small set of queries took about 6.5 hours on a Sun Sparc Ultra enterprise 4000. However, using a variety of techniques, including the progressive application of our associative database methods III through VI, we managed to reduce this to 19 seconds (of which 16 seconds is the overhead ontology loading and 3 seconds is the actual processing time) on the same Sparc station (early 1997). Since then we have further accelerated the process another order of magnitude and are showing that this technology is useful on real world size problems.

In order to achieve the current level of efficiency, symmetry has been exploited in multiple ways. In this talk, we first outline the ways symmetry arises in Conceptual Structures and their application and then we explain how such symmetry can be exploited in graph implementations. This talk is meant to increase deeper understanding of our previous published methods and to introduce our latest software. Prior knowledge is not assumed.