Abstract
The classical association rule mining algorithms assume a horizontal data layout, wherein each row in the database records a transcation, and the items present in the transaction. Of late there has been considerable interest in alternative vertical data representations, wherein each item is associated with a column of values representing the transactions in which it is present. The vertical mining algorithms that have been proposed show performance improvements over their horizontal counterparts, but suffer from some limitations -- they are either efficient only for certain database sizes, or assume specific characteristics of the database contents, or are applicable only to special kinds of database schemas.
To address the above limitations, we present a new vertical mining algorithm called VIPER (Vertical Itemset Partitioning for Efficient Rule-extraction). VIPER is a "general-purpose" algorithm, which makes no assumptions about the underlying database, and integrates a number of novel optimizations. We analyze the performance of VIPER for a range of synthetic database workloads. Our experimental results indicate significant performance gains, especially for large databases, over previously proposed vertical and horizontal mining algorithms.
bawa@cs.stanford.edu
11-17-1999