CS346 - Spring 2011
Database System Implementation

RedBase Project
Project FAQ (look here first!)

Project Part                   Handout            Due Date
Paged File Component           PF Specification   supplied
Record Management Component    RM Specification   Sunday April 10
Indexing Component             IX Specification   Sunday April 24
System Management Component    SM Specification   Sunday May 1
Query Language Component       QL Specification   Sunday May 22
Personal Extension             EX Specification   Proposal due Wed. May 11
Demos                                             Thu.-Fri. June 2-3

Supporting Documents
Logistics: Setting Up, Testing, Submission Process, and Grading
Using Valgrind
Policy on memory use
RedBase Statistics Tracker (optional)

Project Overview
The focal point of the course is the RedBase project. RedBase stands for Relational Database, and also alludes to Stanford's color. (We know, Stanford's color is really Cardinal, but CardBase doesn't have as much of a ring to it.) RedBase is a complete single-user relational database management system. It involves a significant amount of coding, and the project must be completed by each individual student -- teams are not permitted. The project is highly structured, but there is enough slack in the specification so that creativity is both allowed and required. The basic project is divided into four parts:

  1. The Record Management (RM) Component: In this part you will implement a set of functions for managing unordered files of database records. This component will rely on a Paged File (PF) component that we will provide. The Paged File component performs low-level file I/O at the granularity of pages.

  2. The Indexing (IX) Component: In this part you will implement a facility for building indexes on records stored in unordered files. Your indexing facility will be based on B+ trees. The Indexing component will rely on the Paged File component.

  3. The System Management (SM) Component: In this part you will implement various database and system utilities, including data definition commands and catalog management. The System Management component will rely on the Record Management and Indexing components from Parts 1 and 2. It also will use a command-line parser, which we will provide.

  4. The Query Language (QL) Component: In this part you will implement RQL -- the RedBase Query Language. RQL consists of user-level data manipulation commands, both queries and updates. The Query Language component will rely on the three components from Parts 1-3, and it will use the command-line parser that we are providing.
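
To make the layering in Parts 1-4 concrete, here is a minimal C++ sketch of a record layer sitting on top of a page-granularity file layer. Every name in it (ToyPagedFile, ToyRecordFile, RecordId, kPageSize) is hypothetical and chosen for illustration; it is not the provided PF interface or the RM specification. It also assumes fixed-length records, an append-only file, a 4096-byte page, and an in-memory vector standing in for a disk file.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical names throughout; this is NOT the provided PF or RM interface.
constexpr std::size_t kPageSize = 4096;  // assumed page size
using Page = std::array<char, kPageSize>;

// Stand-in for a paged-file layer: every access moves one whole page.
// Pages live in memory here; a real layer would read and write a disk file.
class ToyPagedFile {
public:
    // Appends a zero-filled page and returns its page number.
    std::size_t allocatePage() {
        pages_.push_back(Page{});
        return pages_.size() - 1;
    }
    void readPage(std::size_t pageNum, Page& out) {
        out = pages_.at(pageNum);  // throws std::out_of_range on a bad page number
        ++ioCount_;
    }
    void writePage(std::size_t pageNum, const Page& in) {
        pages_.at(pageNum) = in;
        ++ioCount_;
    }
    std::size_t pageCount() const { return pages_.size(); }
    std::size_t ioCount() const { return ioCount_; }  // page reads plus page writes

private:
    std::vector<Page> pages_;
    std::size_t ioCount_ = 0;
};

// Identifies a record by the page it lives on and its slot within that page.
struct RecordId {
    std::size_t page;
    std::size_t slot;
};

// Stand-in for a record layer: fixed-length records packed into page slots.
// Append-only (no deletion or update) to keep the sketch short.
class ToyRecordFile {
public:
    ToyRecordFile(ToyPagedFile& pf, std::size_t recordSize)
        : pf_(pf), recordSize_(recordSize) {
        if (recordSize == 0 || recordSize > kPageSize)
            throw std::invalid_argument("record size must fit in one page");
        slotsPerPage_ = kPageSize / recordSize;
    }

    // Copies recordSize bytes from data into the next free slot.
    RecordId insert(const char* data) {
        const std::size_t slot = count_ % slotsPerPage_;
        if (slot == 0) lastPage_ = pf_.allocatePage();  // no page yet, or last one is full
        Page page;
        pf_.readPage(lastPage_, page);
        std::memcpy(page.data() + slot * recordSize_, data, recordSize_);
        pf_.writePage(lastPage_, page);
        ++count_;
        return RecordId{lastPage_, slot};
    }

    // Copies the record identified by rid into out (recordSize bytes).
    void get(const RecordId& rid, char* out) {
        if (rid.slot >= slotsPerPage_) throw std::out_of_range("bad slot");
        Page page;
        pf_.readPage(rid.page, page);
        std::memcpy(out, page.data() + rid.slot * recordSize_, recordSize_);
    }

private:
    ToyPagedFile& pf_;
    std::size_t recordSize_;
    std::size_t slotsPerPage_ = 0;
    std::size_t lastPage_ = 0;
    std::size_t count_ = 0;
};
```

In this sketch the record layer never touches storage directly: it only asks the lower layer for whole pages, which is the same dependency direction the four parts describe. With 1024-byte records, four records fit on a page, so a fifth insert allocates a second page. The ioCount() counter is only there to show that each record operation translates into a small number of page reads and writes.
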
In addition to the basic project, each student will design and implement a significant extension to RedBase. We expect that students will get ideas about extensions as the course progresses. Possibilities include aspects of record management, long fields (BLOBs), object management, text management, sorting, indexing, join algorithms, clustering, statistics and query optimization, query language extensions, OLAP, XML, concurrency control, recovery, security and authorization, compression, networking, versioning, external functions, stored procedures, views, integrity constraints, triggers, user and application interfaces, web integration, etc. (We're certainly open to additional suggestions.) Each student will submit a proposal for their project extension. Students will get feedback on their proposal, then will implement their extension as the fifth and final part of the project. Complete projects will be demonstrated to the instructors during finals week.

RedBase I/O Efficiency Contest
As the old saying goes, the three most important aspects of a database management system are efficiency, efficiency, and efficiency. To encourage you to take efficiency into consideration as you develop your RedBase system, we will be conducting a RedBase Efficiency Contest when the QL component is complete. While there are several important efficiency measures in a DBMS, we will focus on I/O performance. We will measure each student's RedBase system on a set of benchmark queries and updates in the RQL language and will count the number of I/O's -- the fewer the better, of course. All students enter the contest automatically when they submit their QL component, unless they prefer to be excluded. The prizes are:

Late Policy
The late policy follows. There will be absolutely no exceptions to this late policy, so please don't even ask! It's crucial that students stay on schedule in this course -- RedBase is a very big project.

Computer Accounts
You will implement RedBase using the Linux machines on the second floor of Sweet Hall (the corn machines, the myth machines, etc.). Directory /usr/class/cs346 will contain files and subdirectories for the class.

Students with access to their own workstations or Linux PCs are welcome to try to use them, but you will need to copy all provided software from the Stanford workstations. While we will do our best to ensure that the code we provide is portable, we cannot guarantee portability across all platforms. Likewise, while we may attempt to help with platform-specific problems, our focus will be on the Linux machines.

Your programs will be submitted electronically and they will be tested on a corn machine. It will be your responsibility to ensure that your programs compile and run correctly on that platform before submitting them.

More on Programming
We will provide code for the Paged File (PF) component of RedBase and for some commonly-used routines in other components. We will also provide a command-line parser that you will use for Parts 3 and 4 of the project. Specifications for the code we provide, along with specifications for each component that you will implement, will be given as object-oriented interfaces in the C++ programming language. We will help you get started with your programming by providing sample Makefiles, header files, etc. In addition, for some of the project parts we will provide test suites in advance of the due date, although these tests will not be comprehensive.
