
Set-Oriented Mining for Association Rules in Relational Databases

Dr. Arun Swami

Silicon Graphics, Inc.
Mountain View, CA 94120-6099
email: arun@cs.stanford.edu

ABSTRACT

Data mining is an important real-life application for businesses. It is critical to find efficient ways of mining large data sets. In order to benefit from the experience with relational databases, a set-oriented approach to mining data is needed. In such an approach, the data mining operations are expressed in terms of relational or set-oriented operations. Query optimization technology can then be used for efficient processing.

In this talk, we address the problem of developing set-oriented algorithms for mining association rules in large databases. We develop new algorithms that can be expressed as SQL queries, and discuss optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. Algorithm SETM uses only simple database primitives, viz., sorting and merge-scan join. The set-oriented nature of Algorithm SETM makes it possible to develop extensions easily and its performance makes it feasible to build interactive data mining tools for large databases.

(Joint work with M. Houtsma, Telematics Research Center, the Netherlands.)
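
As a rough illustration of the set-oriented style described above, the following Python sketch counts itemsets using only sorting, grouping, and a join on the transaction identifier. It is a simplified stand-in written for this page, not the SETM algorithm as presented in the talk: the function name `set_oriented_itemsets`, the `(trans_id, item)` row layout, and the in-memory lists used in place of relational tables are all assumptions made for the example.

```python
from collections import Counter


def set_oriented_itemsets(sales, min_support):
    """Return {itemset_tuple: count} for itemsets meeting min_support.

    sales is an iterable of (trans_id, item) rows (a hypothetical layout).
    """
    sales = sorted(set(sales))  # sorted, duplicate-free base relation

    # Items per transaction, in sorted order; stands in for the join partner.
    items_by_tid = {}
    for tid, item in sales:
        items_by_tid.setdefault(tid, []).append(item)

    # Start with one row per (transaction, 1-item pattern).
    rows = [(tid, (item,)) for tid, item in sales]
    frequent = {}

    while rows:
        # Roughly: GROUP BY pattern HAVING COUNT(*) >= min_support.
        counts = Counter(pattern for _, pattern in rows)
        supported = {p: c for p, c in counts.items() if c >= min_support}
        if not supported:
            break
        frequent.update(supported)

        # Keep only rows whose pattern met the support threshold.
        rows = [(tid, p) for tid, p in rows if p in supported]

        # Join on trans_id: extend each pattern with a larger item from the
        # same transaction, so every itemset is generated exactly once.
        rows = sorted(
            (tid, p + (item,))
            for tid, p in rows
            for item in items_by_tid[tid]
            if item > p[-1]
        )

    return frequent


if __name__ == "__main__":
    example = [
        (1, "A"), (1, "B"), (1, "C"),
        (2, "A"), (2, "B"),
        (3, "A"), (3, "C"),
        (4, "B"),
    ]
    for itemset, count in sorted(set_oriented_itemsets(example, 2).items()):
        print(itemset, count)
```

Each pass of the loop corresponds loosely to one grouping query followed by one join, which is why an approach of this kind can be handed to a relational query optimizer rather than coded as a custom scan.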

Papers related to talk

Data Mining with Silicon Graphics Technology

Using a CHALLENGE Server to Build a 200 GB Data Warehouse
