FAQ for Assignment 5

What is the difference between "table-scan" and "index-scan"? What is the difference between "table-scan" and "one-pass-selection?"

Use these definitions:

Table-scan: read all of the blocks of a relation, one block at a time.
Index-scan: read the blocks of a relation, assisted by an index. This can be used to read all of the blocks of a relation. It can also be used to read only the blocks of a relation containing tuples that match an attribute value that is looked up in the index.
One-pass-selection: read the tuples produced by another relational operator and apply a selection predicate to weed out certain tuples.

Note: Although the problem says "intermediate results produced by one operator are written to disk" it doesn't make sense to write the results of a table-scan back to disk. You can assume that the tuples produced by a table-scan are consumed by another operator without an intermediate write-back to disk.

How can there possibly be two query pans for problem 1b?

Be creative. There are multiple query plans.

What is the difference between "heuristic optimization" and "cost-based optimization?"

"Cost-based optimization" as used in class or the book is still "heuristic" in the sense that heuristics are used to estimate the likely cost of a plan, and to choose between alternatives. Thus, pushing down selections in a logical query plan is a heuristic that tends to reduce the cost by reducing the intermediate result sizes. Similarly, estimating the result size of a join requires heuristics. In problem 4, by "heuristic query optimization" we mean the whole process of query optimization described in the book and in class; this process produces "good plans" but not guaranteed optimal plans.