CS145 Introductory Information

Index

Finding This Document
Course Goals
Time and Place
Course Personnel and Office Hours
Prerequisites
Course Texts
Email Help
Course Mailing List
Course Project
Homework
Exams
Grading Policy
The Honor Code

Finding This Document On-Line

The class Web page is http://www.stanford.edu/class/cs145. You can find a link to this document and all other course material there. This introductory material can be obtained directly as http://www.stanford.edu/class/cs145/intro.html. Note that this document and the outline page are "living documents"; they can be expected to change during the quarter, and the "official" contents are whatever the on-line version has, not the hardcopy handed out at the beginning of the course.

This and other documents may be available in hardcopy form at the "handout hangout" between the A and B wings on the 4th floor of Gates. Whatever is not picked up in class will appear there, but we shall not restock the supply once it is gone.

Course Goals

CS145 is an introduction to the design and use of database systems, that is, systems that manage very large amounts of data. There are two important approaches to organizing and querying (asking questions about) data: the "relational model," which uses a two-dimensional table (relation) as its primary structure, and the "semistructured model," which uses trees as its fundamental structure. The relational model underlies the major commercial database systems. We cover relational design using the entity-relationship model, followed by an overview of the relational model, how to convert E/R models to relations, and how one uses a relational database system to create a database. We shall learn, and gain hands-on experience with, SQL (Structured Query Language), the standard query language for relational databases.
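As a minimal sketch of the relational idea described above, the following Python snippet builds one small relation and asks a question about it in SQL. It uses Python's built-in sqlite3 module purely for illustration, not the Oracle system used in the course; the table name Students, its columns, and its rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database: an illustrative stand-in, not the course's Oracle system.
conn = sqlite3.connect(":memory:")

# A relation is a two-dimensional table: named columns, one row per tuple.
# The table name, columns, and rows below are hypothetical.
conn.execute("CREATE TABLE Students (name TEXT, major TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("Ann", "CS", 15), ("Bob", "EE", 12), ("Cho", "CS", 18)],
)

# A query asks a question about the data:
# which CS majors carry more than 14 units?
rows = conn.execute(
    "SELECT name FROM Students WHERE major = 'CS' AND units > 14 ORDER BY name"
).fetchall()
names = [name for (name,) in rows]
print(names)
```

The query states what is wanted (CS majors with more than 14 units) rather than how to scan the table; deciding how to execute it is left to the database system.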

We shall also learn some other database languages, both concrete and abstract, including relational algebra, Datalog, ODL/OQL, PSM (really Oracle's procedural PL/SQL), and JDBC (the Java interface to SQL databases). In addition, we study recent object-oriented influences on the relational model, including the object-oriented database standard ODL/OQL. The semistructured model is newer, but is beginning to have significant influence, especially as people try to integrate data and share data over the Web. We shall learn XML, the standard for structuring data as trees. We also shall meet XPath, a rudimentary query language for XML data, and XQuery, a new, more SQL-like query language for XML. It is not our goal to study database system implementation (e.g., how to build a system that processes SQL queries efficiently). Study of that very important subject begins in CS245.
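To make the tree-structured view of data concrete, here is a small sketch that parses an XML document and selects nodes with an XPath-style path. It relies on Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath; the document contents (bars, beers, prices) are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: elements nest, so the data forms a tree.
doc = """
<bars>
  <bar name="JoesBar">
    <beer name="Bud" price="2.50"/>
    <beer name="Miller" price="3.00"/>
  </bar>
  <bar name="SuesBar">
    <beer name="Bud" price="2.75"/>
  </bar>
</bars>
"""

root = ET.fromstring(doc)

# ElementTree implements only part of XPath.
# This path selects every beer element named "Bud" directly under any bar.
bud_prices = [
    float(beer.get("price"))
    for beer in root.findall("./bar/beer[@name='Bud']")
]
print(bud_prices)
```

The path expression walks the tree from the root through bar elements to beer elements and filters on an attribute, which is the kind of navigation a path language for XML is designed for.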

A Course Outline is available.

Time and Place

The class meets Tuesdays and Thursdays, 2:45 - 4PM, Skilling Aud.

Course Personnel

| Person | Role | Office | Phone | Office Hours | Email |
| --- | --- | --- | --- | --- | --- |
| Brian Babcock | TA | 492 Gates | (650) 723-2048 | Tuesday 12:30-2:30, Wednesday 1:00-3:00 | babcock@cs.stanford.edu |
| Alan Beck | OTC Staff | TBD | (650) TBD | N/A | alanlbeck @ yahoo.com |
| Rohit Varma | TA | B26B Gates | (650) 736-1817 | Monday 2:30-4:30, Thursday 12:15-2:15 | rvarma @ cs.stanford.edu |
| Anand Rajaraman | Instructor | 433 Gates | (650) 725-4802 | After classes taught, until 5:15PM | anand @ db.stanford.edu |
| Jeffrey D. Ullman | Instructor | 433 Gates | (650) 725-4802 | After classes taught, until 5:15PM | ullman @ cs.stanford.edu |
| Sarah Weden | Course Administrator | 419 Gates | (650) 725-3358 | N/A | sweden @ db.stanford.edu |

Prerequisites

CS107 (programming languages) and CS103 (introductory CS theory) are expected. Please discuss the matter with the instructor if you do not have something like these courses.

Programming assignments will use the Oracle relational database management system and the C or C++ programming language. Java is an alternative. The Oracle system can be accessed via any of the Unix workstations on the second floor of Sweet Hall, e.g., the "elaines" or "epics." To open an account on these machines, type open at the login: prompt and follow the instructions.

We shall assume that students already are proficient with Unix and C.

SITN students can access the Unix workstations remotely via dial-in (try 650-325-1010) or telnet. If you have access to an Oracle-9 system including PL/SQL and Pro*C, you may use that. We have to be strict about which system you use, not because we love Oracle, but because we are going to be exploring some very specific capabilities of this system, and it will cause problems for both you and us if you do not have all these features. We cannot make any exceptions for problems incurred by using your own computing facilities rather than those provided by Stanford.

Everyone must have a leland account in order to use the class Oracle database system for the PDA. To obtain a leland ID, telnet to open.stanford.edu and use login name open. If you are an SITN student but do not yet have a Stanford ID, you need to talk to your SITN contact and get one before trying to open a leland account.

Textbooks

The text for the course is Database Systems: A First Course (Second Edition) by J. D. Ullman and J. Widom. However, if you are planning eventually to take CS245, you should instead get Database Systems: The Complete Book by H. Garcia-Molina, J. D. Ullman, and J. Widom. The former is the first 10 chapters of the latter.

The first two chapters can be downloaded for free if you are not sure you are going to take the course.

Since we are going to be using the Oracle system, you may also wish to purchase one of several popular Oracle manuals.

Students may also wish to purchase a manual for the SQL standard, although standard SQL is not quite identical to the version of SQL supported by Oracle. Three recommended books are:

  1. SQL:1999 - Understanding Relational Language Components, by J. Melton, A. R. Simon, and J. Gray, Morgan-Kaufmann, 2001.

  2. A Guide to the SQL Standard, by C. J. Date and H. Darwen, Addison-Wesley, 1999. It is more succinct than the Melton-Simon-Gray book, but I personally find it a more useful summary of the SQL language.

  3. SQL3 Complete, Really, by Peter Gulutzan and Trudy Pelzer, CMP Books. This book is really fat, but it is fairly complete.

Class Mailing List

Stanford will set up a list cs145-aut0304-students @ lists.stanford.edu. If you are registered for CS145, you should appear on this list automatically. If you want to get class announcements but are not registered for the class, send mail to majordomo @ lists.stanford.edu with a body:

subscribe cs145-aut0304-guests

Email Consulting

If you need a quick answer to a question, try sending email to cs145-aut0304-staff @ lists.stanford.edu. This list forwards to the TAs and instructors, and with luck you'll get a reply in a few minutes.

However, problems and bug reports regarding the OTC system mechanics (not question contents) should be sent to otchelp @ db.stanford.edu.

Please do not use cs145-aut0304-students for questions.

Course Requirements

Project

A feature (or bug?) of CS145 is that everyone writes their own "personal database application" (PDA). You do some work on the project each week: first selecting your application, then designing the database, obtaining your data and loading it into a real database management system, and finally writing a number of SQL queries and C programs with embedded SQL queries, and exercising other features of SQL.

The first PDA assignment will be due Thursday, Oct. 10, but must be preceded by a review of your design by one of the course staff. Subsequent parts will generally be due on Thursdays, with the exception of Thanksgiving.

No late work will be accepted. However, each student is allowed one extension of at most 48 hours. This amount of time cannot be divided among assignments; it applies to one assignment only.
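The late policy above can be read as a simple rule. The sketch below is one interpretation, not an official checker: the function name is_accepted and its parameters are hypothetical, and it assumes each assignment has a precise due time.

```python
from datetime import datetime, timedelta

MAX_EXTENSION = timedelta(hours=48)  # the 48-hour limit stated in the policy


def is_accepted(due, submitted, extension_available):
    """Return True if a submission would be accepted under the stated policy.

    extension_available is True if the student has not yet used
    the single extension (hypothetical bookkeeping, for illustration).
    """
    if submitted <= due:
        return True
    # Late work is accepted only by spending the one extension,
    # and only within 48 hours of the original due time.
    return extension_available and (submitted - due) <= MAX_EXTENSION


# Hypothetical due time, chosen only to exercise the rule.
due = datetime(2000, 1, 6, 17, 0)
print(is_accepted(due, due + timedelta(hours=30), True))   # within the extension
print(is_accepted(due, due + timedelta(hours=30), False))  # extension already used
```

Because the 48 hours cannot be divided, the sketch treats the extension as a single yes-or-no resource rather than a balance of hours.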

OTC Homework

We are going to use the On-Line Testing Center (OTC) to give periodic assignments. These will be either multiple-choice questions to answer, or later in the course, SQL, relational algebra, JDBC, or XQuery queries to write. You will be given a week or more to log in and do each assignment. There may be an assignment due before each class; they tend to be small, so the total work should not overwhelm you.

OTC homework is somewhat different from what you may be used to. Although you see multiple-choice questions, there is really an underlying "long-answer" question behind each. You should work out the long-answer question and have the answer in front of you. Sometimes, the "long answer" is really an algorithm for solving instances of the question quickly. The particular choices of answer that you get in effect form a sample of the long answer, so we can check you really were able to work out the problem.

You should work an assignment until you get a perfect score. Each time you open the assignment, you get a different set of answers from which to choose, and the order of the questions may differ. However, if you have worked out the underlying long-answer problem, you should be prepared to identify the correct answer quickly.

Getting Your OTC Account

In order to use OTC, you need to sign up for a user ID. At the first class, we'll pass around sheets that have a selection of ID's and "tokens" (initial passwords). Pick an unused one, cross it out, and write your name clearly next to it. (If you are concerned about privacy, use a nickname that you can give us if we ever need to remind you of your ID.) Then, immediately log on to OTC and change your password. Also, choose a "nickname," which will become your login name. Typically, your first or last name will do, but we cannot allow two people to choose the same nickname, so it is "first-come-first-served." Note that passwords need to be at least 10 letters long and have at least one digit.
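If you want to check a candidate password before logging on, the stated rule can be sketched as follows. This is an illustration only, not the check OTC itself performs; the function name is hypothetical, and it assumes that "10 letters" means 10 characters in total.

```python
def meets_otc_password_rule(password):
    """Check the stated rule: at least 10 characters and at least one digit.

    Assumption: "10 letters" in the text means 10 characters in total.
    """
    return len(password) >= 10 and any(ch.isdigit() for ch in password)


print(meets_otc_password_rule("databases1"))       # 10 characters, one digit
print(meets_otc_password_rule("relationalmodel"))  # long enough, but no digit
```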

If you miss signing up in class, look for a sign-up sheet on the door of 433 Gates.

If you are a TV student, send email to or call Ms. Weden, and she will send you the ID and token.

Mechanics of OTC Homework

To find out what assignments are due, and when, either log into OTC or check the Class Home Page. You use your nickname and chosen password the second and subsequent times that you log on to OTC. Look in the frame on the left for Homeworks (later, we'll also be looking for Labs and Tests). Clicking on Homeworks gets you the Homeworks screen, a list of all currently available homework assignments. Choose "Open Homework" for the assignment you want to do.

You may wish to try the homework once, not worrying about what you get right or wrong. When you submit your homework, you generally get some advice about the questions you got wrong, e.g., an explanation of why your answer is wrong, a hint, or an outline of the solution process. We only record the score from the last time you submit the homework, so there is no harm whatsoever in trying and not doing well. We only want you (eventually) to understand how to do all the questions. After the due date, you can look at your most recent submission (choose "View Submissions" from the Homeworks screen) and see a general explanation for the problem, as well as information about anything you still got wrong.

Be advised, however, that there is a 10-minute interval between openings. Thus, you cannot blast away at random, hoping eventually to get the questions all correct.

Exams

Midterm: The midterm will be on November 4. We shall also (probably) use the OTC for the midterm. The format will be somewhat different from homeworks, and everyone will get the same questions and the same choices (although not necessarily in the same order). You can take the exam from wherever you wish, but it must be during the class period.

Final: The final exam is on December 10, 7-10PM, at a place yet to be determined, but quite possibly Skilling Aud. It will be written, not OTC. All students will have to come to campus, with the exception of remote SITN students, i.e., those whose place of work is more than about 50 miles away (e.g., Livermore is "local"; Santa Rosa is "remote").

Grading Policy

The approximate weights of the four components are:

| Component | Weight |
| --- | --- |
| Project | 35% |
| OTC Homework | 15% |
| Midterm | 15% |
| Final | 35% |
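To see how these approximate weights combine, here is a small arithmetic sketch. The function name, the 0-100 scale for each component, and the example scores are all assumptions made for illustration; the actual grading procedure may differ.

```python
# Approximate weights from the table above, as fractions of the course total.
WEIGHTS = {"project": 0.35, "otc_homework": 0.15, "midterm": 0.15, "final": 0.35}


def weighted_score(scores):
    """Combine per-component scores (each assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"project": 90, "otc_homework": 100, "midterm": 80, "final": 85}
print(round(weighted_score(example), 2))
```

With these hypothetical scores the sum is 0.35 x 90 + 0.15 x 100 + 0.15 x 80 + 0.35 x 85 = 31.5 + 15 + 12 + 29.75 = 88.25.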

Honor-Code Policy

The basic presumption is that the work you do is your own. Occasionally, especially when working problem sets or writing programs (but never on exams!), it may be necessary to ask someone for help. You are permitted to do so, provided you meet the following two conditions.
  1. You acknowledge the help on the work you hand in.

  2. You understand the work that you hand in, so that you could explain the reasoning behind the parts of the work done for you by another.

Any other assistance by another person constitutes a violation of the honor code and will be treated as such.

If you have any questions about what this policy means, please discuss the matter with the instructor now. On the first homework, we shall ask everyone to acknowledge that they have read the above material.