CS145 - Spring 2004
Introduction to Databases
Challenge Problems #1
Due Tuesday April 13

Submission Instructions

Guidelines for Typing Assignments

Honor Code reminder

For more detailed discussion of the Stanford Honor Code as it pertains to CS145, please see the Assigned Work page under Honor Code. In summary: You must indicate on your written and programming assignments any assistance (human or otherwise) that you received. Any assistance received that is not given proper citation will be considered a violation of the Honor Code. In any event, you are responsible for understanding and being able to explain on your own all material that you submit.

Partners are not permitted. Problems must be completed individually.

The Problems

  1. This problem explores how to systematically store relational data in XML. We will use as examples two relations:
      Prof(name, phone, email, dCode)
      Dept(code, name, building, numFac)
    
    We will use the following sample data for these relations.
    In relation Prof:
      (Widom, 123-4567, widom@cs, CS)
      (Widom, 123-4567, widom@ee, EE)
      (Ullman, 987-6543, ullman@cs, CS)
    
    In relation Dept:
      (CS, Computer Science, Gates, 40)
      (EE, Electrical Engineering, Packard, 55)
      (ME, Mechanical Engineering, MERL, 45)
    
    Do not make any assumptions about the data except as stated in the questions.

    (a) Specify a single DTD that allows data from any relational schema (one or more relations) to be stored as an XML document valid with respect to that DTD. (That is, the element structure of the XML will not be derived from the relational schema, since it needs to work for every schema.) Show how the sample data above would be encoded in XML corresponding to your DTD.

    (b) Specify an algorithm for translating a relational schema (one or more relations) into a single XML DTD, where in this case the element structure (i.e., the tags) in the XML can reflect the relational schema. Show how the sample data above would be encoded in XML corresponding to your DTD. You may specify your algorithm using any kind of high-level pseudocode that you like.

    (c) Suppose you have the additional information that every value appearing in attribute Prof.dCode appears exactly once in Dept.code. You don't have to modify your translation algorithm, but illustrate what might be considered a "better" XML structure for the sample data than the structure produced by your algorithm in (2).

  2. Consider the same relational schema as in Problem 1:
      Prof(name, phone, email, dCode)
      Dept(code, name, building, numFac)
    
    For this problem you may assume that (name,dCode) is a key for relation Prof, and code is a key for relation Dept.

    For each of the following queries, either write the query as a relational algebra expression or explain why it cannot be written in relational algebra.

    (a) Find the names of all departments where the number of professors listed for that department does not match the value in attribute numFac for that department.

    (b) Assuming the numFac attribute value is correct (i.e., ignoring the Prof relation altogether), find the name of the department with the smallest number of faculty. In the case of ties, return names for all of the smallest departments.