EROC: A Toolkit for Building Query Optimizers

Bill McKenna

Red Brick Systems

bmckenna@redbrick.com


ABSTRACT

In this talk I will present EROC (Extensible, Reusable Optimization Components), a toolkit for building query optimizers. EROC's components are C++ classes based on abstractions we have identified as central to query optimization, not only in relational DBMSs, but in extended relational and object-oriented DBMSs as well. EROC's use of C++ classes clarifies the mapping from application domain (optimization) abstractions to solution domain (EROC) abstractions, and these classes provide:

(1) complex predicate definition and manipulation;
(2) representations for common operators, such as join and group by, and associated property derivation functions, including key derivation;
(3) management of catalog and type information;
(4) implementations of common algebraic equivalence rules; and
(5) System R- and Volcano-style search strategies.

The classes are designed to give optimizer implementors reusability and extensibility through layering and inheritance. EROC provides much more functionality than previous optimization tools because all of its optimization classes are extensible and reusable, not just the search components. I will also discuss some of the reusable object-oriented design patterns (Gamma et al.) that occur in EROC's architecture. In addition to describing EROC's architecture and software engineering, I will show how its classes were extended to build NEATO (New EROC-based Advanced Teradata Optimizer), a join optimizer for Teradata's massively parallel environment. EROC is currently being evaluated as a possible basis for a new optimizer for Red Brick's data warehousing DBMS.
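To make the idea of extension through inheritance concrete, here is a minimal C++ sketch of an operator hierarchy with key derivation. It is not EROC's actual interface: every name in it (Operator, TableScan, EquiJoin, LeftSemiJoin, deriveKeys, exampleJoinKeys) is hypothetical, and the key derivation rule is a simplified textbook rule assumed for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; this is not EROC's API.
using Column  = std::string;
using Key     = std::set<Column>;   // columns whose values are unique per row
using KeySet  = std::vector<Key>;   // all known keys of an operator's output
using EqPairs = std::vector<std::pair<Column, Column>>;  // left col = right col

// Base abstraction: every logical operator can derive the keys of its output.
class Operator {
public:
    virtual ~Operator() = default;
    virtual KeySet deriveKeys() const = 0;
};

// Leaf operator: a stored table whose keys would come from the catalog.
class TableScan : public Operator {
public:
    explicit TableScan(KeySet keys) : keys_(std::move(keys)) {}
    KeySet deriveKeys() const override { return keys_; }
private:
    KeySet keys_;
};

// Inner equi-join with a simplified key derivation rule (an assumption).
class EquiJoin : public Operator {
public:
    EquiJoin(std::shared_ptr<Operator> l, std::shared_ptr<Operator> r, EqPairs eq)
        : left_(std::move(l)), right_(std::move(r)), eq_(std::move(eq)) {
        assert(left_ && right_);
    }

    KeySet deriveKeys() const override {
        const KeySet lk = left_->deriveKeys();
        const KeySet rk = right_->deriveKeys();
        Key lcols, rcols;
        for (const auto& p : eq_) { lcols.insert(p.first); rcols.insert(p.second); }

        KeySet out;
        // Join columns cover a key of the right input: each left row matches
        // at most one right row, so the left keys remain keys of the result.
        if (coversSomeKey(rcols, rk)) out.insert(out.end(), lk.begin(), lk.end());
        // Symmetric case for the right keys.
        if (coversSomeKey(lcols, lk)) out.insert(out.end(), rk.begin(), rk.end());
        // Otherwise fall back to combining one key from each side.
        if (out.empty()) {
            for (const Key& a : lk) {
                for (const Key& b : rk) {
                    Key u = a;
                    u.insert(b.begin(), b.end());
                    out.push_back(u);
                }
            }
        }
        return out;
    }

protected:
    const Operator& left() const { return *left_; }

private:
    // True if `cols` contains every column of at least one key in `keys`.
    static bool coversSomeKey(const Key& cols, const KeySet& keys) {
        return std::any_of(keys.begin(), keys.end(), [&](const Key& k) {
            return std::includes(cols.begin(), cols.end(), k.begin(), k.end());
        });
    }

    std::shared_ptr<Operator> left_, right_;
    EqPairs eq_;
};

// Extension by inheritance: a left semi-join emits each left row at most
// once, so it reuses EquiJoin's structure and overrides only key derivation.
class LeftSemiJoin : public EquiJoin {
public:
    using EquiJoin::EquiJoin;
    KeySet deriveKeys() const override { return left().deriveKeys(); }
};

// Usage sketch: orders(order_id) joined to customers(id) on cust_id = id.
inline KeySet exampleJoinKeys() {
    auto orders    = std::make_shared<TableScan>(KeySet{Key{"order_id"}});
    auto customers = std::make_shared<TableScan>(KeySet{Key{"id"}});
    EquiJoin join(orders, customers, EqPairs{{"cust_id", "id"}});
    return join.deriveKeys();
}
```

In the usage sketch the join columns cover the key of customers, so the only derived key of the join result is {order_id}. The LeftSemiJoin subclass shows the style of extension the abstract describes: a new operator inherits the existing representation and overrides a single property derivation function.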