CS145 - Spring 2002
Introduction to Databases
Assignment #1   --   Due Wednesday April 17

Assignment Components and Grading

Exercises

  1. Do parts (a)-(i) of Exercise 5.2.4 in the textbook (pages 210-212). Write the relational algebra expressions only - you do not need to show the results on the sample data although you are free to do so if you wish. Note that the ships in the problem are all "capital" ships (as stated in the first sentence), and when a problem asks for a ship your expression need only return the ship name.

  2. Design an XML document format for encoding data about items and users in an online auction. You need not be as comprehensive as, say, eBay, but you should assume the data includes (at least): items for sale including itemID, description, minimum bid, and ending time; categories that items belong to; sellers and buyers (bidders) including userID, name, and contact information; and bidding histories including time, bidder, and price. Specify a DTD and give one small example XML document conforming to that DTD. Your encoding should use at least two levels of nesting and should include at least one attribute of type ID and at least one attribute of type IDREF or IDREFS.

Challenge Problems

  1. Consider the following relational schema:
       Name(ID, name)  // ID is a key
       GPA(ID, gpa)    // ID is a key, gpa is a key
    
    Let's keep things simple by assuming gpa's are unique, i.e., no two students have the same gpa.

    (a) Can you write a relational algebra expression to find the name of the student with the highest gpa in the database? If so, write it. If not, explain why not. You may only use the relational algebra operators covered in lecture.

    (b) Can you write a relational algebra expression to find the name of the student with the median gpa in the database? (Again, you may only use the relational algebra operators covered in lecture.) If so, write it. If not, explain why not. (Recall that the median of a set of numbers is the number in the middle, i.e., there are an equal number of higher and lower numbers. To keep things simple you may assume that there is an odd number of gpa's.)

  2. Your task is to create a schema for storing XML in a relational database system. You may consider "basic" XML only: elements, attributes, and text. You don't need to consider DTDs or ID and IDREF(S) attributes. You may assume that all data to be loaded is in a single XML document, and you may assume that the relational DBMS can generate unique values for you to use as identifiers if needed.

    (a) Describe a generic relational schema that can be used to store any XML data. Specify the schema of your relations (including keys), and briefly explain how data is translated from an XML document into the relations.

    (b) Describe how you might specify a set of relations for an XML document when you can examine the document itself in order to create the relational schema. Of course your answer to part (a) would still work, but the idea is to specify a set of relations whose schema attempts to capture any regularity that may be present in the XML document. Warning: Depending on how seriously you take it, this one can be an extremely open-ended problem. Many research papers have been published on the topic.

Project Part 1

Partners

The CS145 project may be completed individually or in teams of two, the choice is up to each student. Details of partnership rules can be found in the Project page. Summarizing: Please make sure to submit your work jointly - do not turn your project in twice, once for each partner.

Scope

If you find this first part of the project a snap, you're not overlooking something, and don't worry - it will get much more interesting as time goes by. The primary purpose of this first "warm-up" part is for us to provide you with a whole bunch of basic information, and to get everyone up-to-speed on our computing systems and the languages and tools we will be using. Those of you who have done some Web programming before, especially anyone who has used a DBMS behind a Web site, will find this project part nearly trivial. Those of you who haven't will find it merely straightforward. With all that said, please don't start at the last minute - as with all programming and systems work, everything takes a bit of time, and unforeseen snafus do crop up.

Part A

In this portion of the project you will experiment with Oracle's sqlplus command-line interface and its bulk loader. You will create a database table to contain information about Stanford courses, load the table with a small amount of (real) data we are providing, and run a few commands to get familiar with Oracle.

Part B

In this portion of the project you will build a small Web interface that connects to Oracle, queries the course information database table you created in Part A, and displays the result in HTML.

First you must decide which programming language you will use. You may choose either C (with some optional C++) or Java. You should make the choice carefully, as you will want to stick with it for the entire project. If you choose C, you will use Pro*C to interact with Oracle and CGI to implement the Web interface. If you choose Java, you will use JDBC to interact with Oracle and Servlets to implement the Web interface. You will experiment with various details of Pro*C and JDBC later in the project; in this part of the project you will focus primarily on learning CGI (for C users) or Servlets (for Java users). If you are interested in using other languages or tools for building your Web interface and interacting with Oracle, please see the important discussion and ground-rules on this topic in the Project page under Programming and System Issues.

Browser compatibility

Before submitting your project Part B please ensure that it operates correctly using the Netscape browser on the Sweet Hall Solaris workstations (e.g., saga's, elaine's, myth's). If there is a compelling reason you cannot make your Web interface work in the Solaris Netscape browser environment (e.g., you really want to exploit certain features in Internet Explorer), you must get "preapproval" from the course staff to use a different browser environment. Send an email message to cs145-staff@cs telling us precisely what browser environment you wish to use. The message must be sent by Sunday April 14 so that we have time to work things out if your browser environment poses a problem for us. You will receive a reply within 24 hours of your message, and you do need to receive a positive confirmation message before assuming that your alternate environment is okay.

When the preapproval process is not followed, projects that have problems on the Sweet Hall Solaris Netscape browsers may lose points, possibly all points if we cannot run your Web interface at all.

What to submit

Prepare a "submission directory" that contains a text file called README and contains your C and HTML files (for C users) or your entire servletdir directory (for Java users). Specifically: In the README file, copy-and-paste the transcript that you generated for Part A of this assignment. C users must also specify, in their README file, a URL for the .html file where a grader can test the project.

You will submit your entire submission directory. The instructions for doing so are the same for all project parts (although directory contents will differ) and are detailed here. While we strongly prefer that you submit only once, if you do submit multiple times then all submissions except your last will be ignored.

Please note: You must follow the directory and file naming schemes and submission instructions exactly. Because we've had significant problems in past years with students not following the submission procedures (causing immeasurable TA trauma), points may be deducted if you do not follow the submission procedures exactly as specified.