For both on-campus and SITN students: Programming work must be submitted electronically by midnight on the Wednesday that it is due. Programming work submitted after the deadline but less than 24 hours late (i.e., by Thursday midnight) will be accepted but penalized 10%, and programming work submitted more than 24 hours but less than 48 hours late (i.e., by Friday midnight) will be penalized 30%. No programming work will be accepted more than 48 hours late. One emergency chit is will be deducted automatically for programming work turned in up to 24 hours late, and two chits for programming work turned in up to 48 hours late. These chits are in addition to any chits deducted for late written work. For further details please see the official course late policy.
Student(ID, name, dept, status) // status = "grad" or "undergrad" // ID is a key RA(ID, advisor, dept) // (ID,advisor) together are a key TA(ID, course, dept) // (ID,course) together are a keyand the following query: Find the names of all graduate students who are neither an RA nor a TA.
(a) Write this query in relational algebra.
(b) Write this query in SQL.
Now consider the following query: Find the names of all graduate students who are an RA or a TA in a department other than their own.
(c) Write this query in relational algebra.
(d) Write this query in SQL.
Emp(ID, salary) // ID is a key, salary is a keyand the following two queries:
(a) Express Query 1 in relational algebra.
Hint:
Join Emp with itself to find salaries that aren't the
highest, then use the difference operator.
(b) Express Query 1 in SQL using the max aggregation operator.
(c) Express Query 1 in SQL without using the max aggregation operator.
(d) Express Query 2 in relational algebra. If needed, you may use the following programming constructs in addition to relational algebra:
(e) (optional) Express Query 2 in SQL. Do not use the programming constructs from part (d).
city1 | city2 | cost | airline |
---|---|---|---|
SF | Denver | 300 | Frontier |
SF | Denver | 350 | United |
Denver | SF | 250 | United |
Denver | SF | 250 | Frontier |
Denver | Chicago | 250 | American |
Chicago | NY | 250 | Delta |
Denver | NY | 500 | American |
Denver | NY | 400 | TWA |
SF | NY | 750 | United |
(a) Give a single SQL query that returns the cost of the cheapest nonstop flight between each pair of cities. For example, the result over the above relation instance should be:
city1 | city2 | cost |
---|---|---|
SF | Denver | 300 |
Denver | SF | 250 |
Denver | Chicago | 250 |
Chicago | NY | 250 |
Denver | NY | 400 |
SF | NY | 750 |
(b) Give a single SQL query that returns the cheapest cost of flying between each pair of cities assuming we are willing to stop up to two times en-route. For example, by stopping once (in Denver), we can get from SF to NY for $700 instead of $750. With this example data, we could stop twice (in Denver and Chicago), but that would be more expensive ($300+$250+$250 = $800).
(c) Is it possible to write a single SQL query that returns the cheapest cost of flying between each pair of cities regardless of the number of stops? If so, give the query. If not, briefly explain why.
Advised(Advisor, Student, Year)A tuple (A,S,Y) in Advised specifies that advisor A advised student S who graduated in year Y. Assume that Student is a key for this relation.
(a) Consider the following SQL query, which finds all advisors who advised a student who graduated in the same year that Hector Garcia-Molina (HGM) or Jennifer Widom (JW) graduated.
SELECT Advisor FROM Advised WHERE Year IN (SELECT Year FROM Advised WHERE Student = "HGM" OR Student = "JW")Try to write an equivalent SQL query that does not use any subqueries. If there are any circumstances in which your "equivalent" query can produce a different answer from the query above, please explain.
(b) (optional) Write a SQL query that finds the advisor(s) with the longest advising span, i.e., with the longest period from their earliest advisee to their latest advisee.
A | B |
---|---|
1 | 2 |
1 | 2 |
3 | 4 |
5 | 6 |
5 | 6 |
5 | 6 |
(a) Suppose we are interested in selecting all but one copy of each tuple appearing in R. For example, on the above data the query result would be:
A | B |
---|---|
1 | 2 |
5 | 6 |
5 | 6 |
Can you write this query in SQL? If so, give the query. If not, briefly explain why.
(b) Suppose instead we are interested in actually deleting one copy of each tuple in R. That is, we want to write a SQL DELETE statement such that after the statement executes, table R has the contents:
A | B |
---|---|
1 | 2 |
5 | 6 |
5 | 6 |
Can you write such a DELETE statement in SQL? If so, give the statement. If not, briefly explain why.
(a) Familiarize yourself with the Oracle relational DBMS by reading the document Getting Started With Oracle (make sure to read the most recent version), logging into Oracle, changing your password, trying some of the examples in the document, and experimenting with the help command. You don't need to turn anything in for this part.
(b) Create relations for your PDA based on your final relational schema from PDA Part 2. In addition to creating the appropriate attributes and types, please declare keys for your relations; see Creating a Table With a Primary Key from Getting Started With Oracle. Many of the attribute types supported by Oracle are listed on page 286 of the textbook. If you have an attribute that represents a date and/or time, you may want to look at our help page on Oracle Dates and Times.
Turn in a script showing an Oracle session in which your relations are created successfully. Also show, for each relation, the result of the sqlplus DESCRIBE command once the relation has been created: for a relation T, type "DESCRIBE T;". Please see Creating scripts and what to turn in below for details.
(c) For each relation in your PDA, create a file containing a few (approximately 5-10) records of "realistic" data. Then use the Oracle bulk-loading facilities to insert those records as tuples into your relations. Refer to the document Using the Oracle Bulk Loader for file format and how to load records into Oracle. (Here too make sure to look at the most recent version.)
Turn in a listing showing the contents of the files you created, the successful loading of the data into Oracle, and the execution of "SELECT *" commands to show the contents of each relation. (Again, information is available in Getting Started With Oracle.)
(d) Write a program in any programming language you like that creates large files of records for each of your PDA relations. If you have available real data for your PDA, then your program will need to transform the data into files of records conforming to your PDA schema and to Oracle's load format. The rest of you will need to write a program to fabricate data: your program will generate either random or nonrandom (e.g., sequential) records conforming to your schema. Note that it is both fine and expected for your data values - strings especially - to be meaningless gibberish. The point of generating large amounts of data is so that you can experiment with a database of realistic size, rather than the small "toy" databases often used in classes. The data you generate and load should be on the order of:
If your application naturally includes relations that are expected to be relatively small (e.g., schools within a university), then it is fine to use some small relations, but please ensure that you have relations of the sizes prescribed above as well. When writing a program to fabricate data, there are two important points to keep in mind:
Turn in your program code for generating or transforming data, a small sample of the records generated for each relation (5 or so records per relation), and a script showing the successful loading of your data into Oracle.
Components (b), (c), and (d) of this project part each tell you what should be recorded in the script that you turn in. In this and all subsequent project parts, the material you turn in should be clearly formatted and delineated, and should include comments for any aspects that are not crystal clear. Poorly assembled or documented material will not receive full credit, even if it is correct. You also will not receive full credit if you turn in your entire large data files (or large query results in later assignments) when we ask for small samples. Other than comments, truncation, and simple formatting, it is an Honor Code violation to edit scripts before turning them in. Please see http://www.stanford.edu/class/cs145/submit-README.txt for details on electronic submission.
For the duration of the project, we suggest that you establish some kind of routine that includes reloading your database from the files created in this project part each time you want to get a "fresh" start with Oracle. Remember to delete the contents of each relation (or destroy and recreate the relations) before reloading. Otherwise, unless there is a declared key (or you take APPEND out of your control file), Oracle will happily append the new data to your old relation, causing your relation size to double, triple, quadruple, etc. To get rid of a table called T, issue the command:
drop table T;If you want to get rid of all tuples in T without deleting the table itself, issue the command:
delete from T;