CS145 Lecture Notes (13) -- Recursion in SQL



Section 10.2 in the textbook covers SQL-99 recursion but assumes knowledge of "Datalog", which we have not covered. Everything you need to know about SQL-99 recursion is covered in this lecture.


Example: Ancestors

Question: Can we write the query in SQL2?


Example: Company hierarchy Question: Can we write the query in SQL2?


Example: Airline flights Question: Can we write the query in SQL2?


SQL-99 WITH statement

  WITH R1 AS (query),
       R2 AS (query),
       ...,
       Rn AS (query)
  <query involving R1, R2,...,Rn and other relations>

Idea:
  1. Compute R1, R2,...,Rn into temporary relations
  2. Evaluate the query involving R1,...,Rn and other relations
  3. Destroy R1, R2,...,Rn
Can also specify schema for Ri's:
  WITH R1(A1, A2, ..., Am) AS (query), ...
Example:
  WITH Berk AS (SELECT ID, date FROM Apply WHERE location = "Berkeley"),
       SC AS (SELECT ID, date FROM Apply WHERE location = "SC")
  SELECT Berk.ID, Berk.date AS Bdate, SC.date AS SCdate
  FROM   Berk, SC
  WHERE  Berk.ID = SC.ID
WITH statement looks like "temporary view definitions" for syntactic convenience.
But: The Ri's can be recursive or mutually recursive. Example: Find Mary's ancestors from ParentOf(parent,child)
  WITH RECURSIVE Ancestor(anc,desc) AS
         ( (SELECT parent as anc, child as desc FROM ParentOf)
           UNION
           (SELECT Ancestor.anc, ParentOf.child as desc
            FROM Ancestor, ParentOf
            WHERE Ancestor.desc = ParentOf.parent) )
  SELECT anc FROM Ancestor WHERE desc = "Mary"
(show procedural evaluation with fixpoint)




















Example: Total salary cost of Project "X"
  WITH RECURSIVE Superior AS
         ( (SELECT * from Manager)
           UNION
           (SELECT S.mgrID, M.empID
            FROM Superior S, Manager M
            WHERE S.empID = M.mgrID) )
  SELECT sum(salary)
  FROM Employee
  WHERE ID IN
    ((SELECT mgrID FROM Project WHERE name = 'X')
      UNION
     (SELECT empID FROM Project, Superior
      WHERE Project.name = 'X' AND Project.mgrID = Superior.mgrID))
(show example data and computation)
















Alternative formulation, more specific to this query:
  WITH RECURSIVE Xemps(ID) AS
         ( (SELECT mgrID AS ID FROM Project WHERE name = "X")
           UNION
           (SELECT empID AS ID
            FROM Manager
            WHERE mgrID IN (Select ID FROM Xemps)) )
  SELECT sum(salary)
  FROM   Employee
  WHERE  ID IN (SELECT ID FROM Xemps)
(show computation)







Flights example left as exercise

Nonlinear recursion

SQL-99 only requires support of "linear" recursion: each FROM has at most one reference to a recursively-defined relation.

Example: Nonlinear version of Ancestor:

  WITH RECURSIVE Ancestor(anc,desc) AS
         ( (SELECT parent as anc, child as desc FROM ParentOf)
           UNION
           (SELECT A1.anc, A2.desc
            FROM Ancestor A1, Ancestor A2
            WHERE A1.desc = A2.anc) )
  SELECT anc FROM Ancestor WHERE desc = "Mary"

Mutual recursion

  WITH RECURSIVE Hub(node)
         ( (SELECT node FROM HubStart)
           UNION
           (SELECT src AS node FROM Link L
            WHERE 3 >= (SELECT count(*) FROM Auth A
                        WHERE L.dest = A.node) ),
       RECURSIVE Auth(node)
         ( (SELECT node FROM AuthStart)
           UNION
           (SELECT dest AS node FROM Link L
            WHERE 3 >= (SELECT count(*) FROM Hub H
                        WHERE L.src = H.node) )
  <any query>

Now suppose a node can't be both hub and an authority (and assume it's already enforced in HubStart and AuthStart).

  WITH RECURSIVE Hub(node)
         ( (SELECT node FROM HubStart)
           UNION
           (SELECT src AS node FROM Link L
            WHERE 3 >= (SELECT count(*) FROM Auth A
                        WHERE L.dest = A.node) 
            AND src NOT IN (SELECT node FROM Auth) ),   ==== added
       RECURSIVE Auth(node)
         ( (SELECT node FROM AuthStart)
           UNION
           (SELECT dest AS node FROM Link L
            WHERE 3 >= (SELECT count(*) FROM Hub H
                        WHERE L.src = H.node)  
            AND dest NOT IN (SELECT node FROM Hub) )    ==== added
  <any query>
Question: What's wrong?







Final Example

Given relation P(x)
  WITH RECURSIVE R(x)
         ( (SELECT x FROM P)
           UNION
           (SELECT sum(x) FROM R) )
  <any query>
Question: What happens?