CS145 Assignment #3
Due Wednesday, October 21, 1998
Step 3 of Your PDA (Personal Database Application)
Note1: see Recording
Your Session in the on-line Getting Started With Oracle
document for a guide to preparing output to hand in with your assignment.
It will be useful for this and subsequent PDA parts.
Note2: Oracle is not being backed up. You need to save anything you need
long-term in the leland file system.
- (a)
-
(10 pts.)
Reconsider your relation schema in the light of the theory of normalization
and BCNF. Remind us of your chosen database schema. For each of your current
relations, tell whether its relation schema is in BCNF. If not, then either
redesign your schema so the relation is in BCNF, or give a rationale for
leaving in in non-BCNF form (e.g., the amount of redundancy introduced is
minimal, and splitting the relation would cause some reasonable queries to
become multirelational). Indicate your final choice of design, whether or not
you choose to decompose one or more relations.
- (b)
-
(15 pts.)
Write an SQL database schema for your PDA, using the CREATE
TABLE commands described in the handout
Getting Started With Oracle.
Pick suitable datatypes for each attribute.
Page 286 of the text gives you the principal options regarding types.
Hand in a printout of the commands you use to create your database
schema (it is a good idea to keep this file for the balance of the
course).
Show the response of sqlplus to a request to describe each of your
relation schemas.
For example, to see the schema for relation Foo type
DESCRIBE Foo;
- (c)
-
(10 pts.)
Execute about five INSERT commands to insert tuples into one of
your relations.
Show the response of sqlplus and the relation that results when
you issue a SELECT * command.
Again, the information on how to do this step is in
Getting Started With Oracle.
- (d)
-
(25 pts.
Develop a substantial amount of data for your database and load it into
your relations using the SQL load command. See
The Oracle Bulk Loader for
information on how to bulk-load data.
To create the data,
write a program in any programming language you like that
creates large files of records in a format acceptable to the Oracle bulk
loader,
then load the data into your PDA relations. If you are using real
data for your PDA, your program will need to transform the data into
files of records conforming to your PDA schema. The rest of you will
write a program to fabricate data: your program will generate
either random or nonrandom (e.g., sequential) records conforming to
your schema. Note that it is both fine and expected for your data
values--strings especially--to be meaningless gibberish. The point
of generating large amounts of data is so that you can experiment with
a database of realistic size, rather than the small ``toy'' databases
often used in classes. The data you generate and load should be on
the order of:
- At least two relations with thousands of tuples
- At least one relation with hundreds of tuples
If the semantics of your application includes relations that are
expected to be relatively small (e.g., schools within a university),
it is fine to use some small relations, but please ensure that you
have relations of the sizes prescribed above as well. When writing a
program to fabricate data, there are two important points to keep in
mind:
- (1)
- Although you have not (yet) declared keys in your
relations, in many cases you probably know that an attribute or set of
attributes in a relation will serve as a key. If so, be sure not to
generate duplicate values for these attributes.
- (2)
- Your PDA almost certainly includes relations that are
expected to join with each other. For example, you may have a
Student relation with attribute courseNo that's expected to
join with attribute number in relation Course. In
generating data, be sure to generate values that actually do
join--otherwise all of your interesting queries will have empty
results! One way to guarantee joinability
is to generate the values in one
relation, then use the generated values in one relation to select
joining values for the other relation.
For example, you could generate course
numbers first (either sequentially or randomly), then use these
numbers to fill in the courseNo values in the student
relation.
Turn in your program code for generating or transforming data, a small
sample of the records generated for each relation (5 or so records per
relation), and a script showing the loading of your data into Oracle.
Problem Set
You will be working with the following schema for the rest of this
assignment.
Employee(SSN, name, salary, DNo)
Department(DNo, DeptName, MgrSSN)
Project(PNo, location, ProjName)
HourLog(SSN, PNo, hours)
The Employee relation provides a list of employees with their SSN, name,
salary, and department number (DNo). The SSN is unique for each employee.
Each employee belongs to only one department. The Department relation
contains a list of the departments for the company. Its schema includes a
unique department number called DNo. It also includes the name of the
department (DeptName) and the social security number of the department's
manager (MgrSSN). Each department has a only one manager. The Project
relation includes a unique project number (PNo), location and the project
name (ProjName). An employee can be assigned to any number (including
zero) projects. Each project has at least one person assigned to it.
Finally, the HourLog relation lists for each project the number of hours
of work for each employee who is assigned to that project. The key of
this relation is SSN and PNo.
All 10 queries are 4 points each.
PART A: Write the following queries in Relational Algebra. You may use assignment
of intermediate results for long queries.
-
Find the name and the SSN of everyone who works more than 100 hours on
one particular project.
-
Find the name and SSN of everyone who works for department number 1
and also work on project number 2.
-
Find the name and the SSN of everyone who works on at least two
projects.
-
Find the SSN of everyone who is not working on any project.
-
Find the name and the SSN of everyone who works on every project.
-
Find employees and their managers who work the fewest hours
for each project. A tuple of (EmployeeSSN, EmployeeName, ManagerName, PNo)
should be listed for every project.
PART B:
Write queries 1-4 from above in SQL: