CS145 - Spring 2004
Introduction to Databases
Challenge Problems #2
Due Tuesday April 20

The Problems

  1. Consider a SQL table T(K,V) where K is a key and NULL values are not permitted in either column. Consider the following three queries:
      Q1: select V from T
          where V >= all (select V from T)
    
      Q2: select V from T as T1
          where V > all (select V from T as T2 where T2.K <> T1.K)
    
      Q3: select max(V) from T
    

    (a) Are Q1 and Q2 equivalent? That is, are they guaranteed to produce the same result on every possible instance of table T? If not, show the smallest instance of T you can find for which Q1 and Q2 produce different results. (Note: All three queries produce an empty result on an empty table.)

    (b) Same as part (a), except consider equivalence of queries Q2 and Q3.

    (c) Same as part (a), except consider equivalence of queries Q1 and Q3.


  2. Provide a general algorithm for eliminating HAVING clauses in SQL queries. That is, provide an algorithm that takes as input a SQL query Q with a HAVING clause. The algorithm should produce as output a SQL query Q' without a HAVING clause such that Q and Q' are equivalent, i.e., Q and Q' produce exactly the same answer on all databases. For simplicity you may assume: Do not make any other assumptions about the input query Q. Your algorithm should be based only on the query itself, not on the schema, the state of the database, or any other external information. (Hint: The solution to this problem is not all that complex. If you are developing a very complicated algorithm then you're probably headed in the wrong direction.)


  3. Make sure you've read about outerjoins in the textbook (pages 228-230 and 272-274).
    Most join operators are associative, meaning that:
      (R1 JOIN R2) JOIN R3
      and
      R1 JOIN (R2 JOIN R3)
    
    always produces the same final result.

    (a) Is the natural-left-outerjoin operator associative? If so, briefly argue why. If not, show the simplest example you can find where:

      (R1 NATURAL-LEFT-OUTERJOIN R2) NATURAL-LEFT-OUTERJOIN R3
      and
      R1 NATURAL-LEFT-OUTERJOIN (R2 NATURAL-LEFT-OUTERJOIN R3)
    
    produce a different final result.

    (b) Is the natural-full-outerjoin operator associative? If so, briefly argue why. If not, show the simplest example you can find where:

      (R1 NATURAL-FULL-OUTERJOIN R2) NATURAL-FULL-OUTERJOIN R3
      and
      R1 NATURAL-FULL-OUTERJOIN (R2 NATURAL-FULL-OUTERJOIN R3)
    
    produce a different final result.