CS346 - Spring 2011
Database System Implementation

RedBase Part 2: The Indexing Component
Due Sunday April 24

Introduction
The second part of the RedBase system you will implement is the Indexing (IX) component. The IX component provides classes and methods for managing persistent indexes over unordered data records stored in paged files. Each data file may have any number of (single-attribute) indexes associated with it. The indexes ultimately will be used to speed up processing of relational selections, joins, and condition-based update and delete operations. Like the data records themselves, the indexes are stored in paged files. Hence, in implementing the IX component you will use the PF component similarly to the way you used it for Part 1. In the overall RedBase architecture, you can think of the IX and RM components as sitting side by side above the PF component.

The indexing technique you will implement in the IX component is B+ trees. B+ trees will be reviewed in class, they were covered in CS245, and they are discussed in detail in most comprehensive database textbooks. Because a "perfect" implementation of B+ trees turns out to be quite complex, we are allowing some simplifications, as discussed in the Implementation Details section below.

All class names, return codes, constants, etc. in this component should begin with the prefix IX. Each B+ tree index can be stored in one paged file from the PF component. Some specific implementation suggestions are given later in this document, but you should be aware that we're giving away fewer details for this component than we did for the RM component.

Note: You can certainly find pseudocode and perhaps even software packages for B+ trees available publicly. In fact, a previous CS346 TA, based on his work for the class, wrote a paper specifying B+ tree deletion algorithms (pdf). You are welcome to use anything you find, as long as you provide proper acknowledgment when you turn in this part of the project. However, we do warn against simply copying available code and then trying to modify it to fit the RedBase specification. That approach is very likely to be more difficult than using available code or pseudocode for algorithmic ideas and reference.

IX Interface
The IX interface you will implement consists of three classes: the IX_Manager class, the IX_IndexHandle class, and the IX_IndexScan class. In addition, there is an IX_PrintError routine for printing messages associated with nonzero IX return codes. To obtain an initial header file with the public method declarations for this interface (along with links for some other files mentioned below), run the setup script described in the RedBase Logistics document with argument "2" (for project part 2). As usual, all IX component public methods (except constructors and destructors) should return 0 if they complete normally and a nonzero return code otherwise.

IX_Manager Class

The IX_Manager class handles the creation, deletion, opening, and closing of indexes. Your program should create exactly one instance of this class. All necessary initialization of the IX component should take place within the constructor for the IX_Manager class. Note that this constructor takes as a parameter the instance of the PF_Manager class, which you should already have created (refer to the PF and RM documents). Any necessary clean-up in the IX component should take place within the destructor for the IX_Manager class.
class IX_Manager {
  public:
       IX_Manager   (PF_Manager &pfm);              // Constructor
       ~IX_Manager  ();                             // Destructor
    RC CreateIndex  (const char *fileName,          // Create new index
                     int        indexNo,
                     AttrType   attrType,
                     int        attrLength);
    RC DestroyIndex (const char *fileName,          // Destroy index
                     int        indexNo);
    RC OpenIndex    (const char *fileName,          // Open index
                     int        indexNo,
                     IX_IndexHandle &indexHandle);
    RC CloseIndex   (IX_IndexHandle &indexHandle);  // Close index
};

RC CreateIndex (const char *fileName, int indexNo, AttrType attrType, int attrLength)

This method creates an index numbered indexNo on the data file named fileName. You may assume that clients of this method will ensure that the indexNo parameter is unique and nonnegative for each index created on a file. Thus, indexNo can be used along with fileName to generate a unique file name (e.g., "fileName.indexNo") that you can use for the PF component file storing the new index. The type and length of the attribute being indexed are described by parameters attrType and attrLength, respectively. As in the RM component, attrLength should be 4 for attribute types INT or FLOAT, and it should be between 1 and MAXSTRINGLEN for attribute type STRING. This method should establish an empty index by creating the PF component file and initializing it appropriately.
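The file-naming and parameter rules above can be sketched as two small helpers. This is only an illustration: the names IX_IndexFileName and IX_ValidAttr are hypothetical, and the AttrType and MAXSTRINGLEN definitions are stand-ins (with an assumed value) for the ones RedBase already provides, included so the sketch compiles on its own.

```cpp
#include <cassert>
#include <string>

// Stand-ins for definitions RedBase supplies; the value of MAXSTRINGLEN
// is an assumption made for this sketch.
enum AttrType { INT, FLOAT, STRING };
const int MAXSTRINGLEN = 255;

// Hypothetical helper: builds the PF file name "fileName.indexNo".
std::string IX_IndexFileName(const char *fileName, int indexNo) {
    assert(fileName != nullptr && indexNo >= 0);
    return std::string(fileName) + "." + std::to_string(indexNo);
}

// Hypothetical helper: true if attrLength is legal for attrType.
bool IX_ValidAttr(AttrType attrType, int attrLength) {
    switch (attrType) {
        case INT:
        case FLOAT:
            return attrLength == 4;
        case STRING:
            return attrLength >= 1 && attrLength <= MAXSTRINGLEN;
    }
    return false;  // unknown attribute type
}
```

CreateIndex, DestroyIndex, and OpenIndex could all derive the PF file name through the same helper, so the three methods cannot disagree about where an index lives.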

RC DestroyIndex (const char *fileName, int indexNo)

This method should destroy the index numbered indexNo on the data file named fileName by destroying the PF component file used to store the index.

RC OpenIndex (const char *fileName, int indexNo, IX_IndexHandle &indexHandle)

This method should open the index numbered indexNo on the data file named fileName by opening the PF component file used to store the index. If the method is successful, the indexHandle object should become a handle for the open index. The index handle is used to insert entries into and delete entries from the index (see the IX_IndexHandle methods below), and it can be passed into the IX_IndexScan::OpenScan method (see below) for performing a scan using the index. As with RM component files, clients should be able to open an index more than once for reading using a different indexHandle object each time. However, you may make the assumption (without checking it) that if a client is modifying an index, then no other clients are using an indexHandle to read or modify that index.

RC CloseIndex (IX_IndexHandle &indexHandle)

This method should close the open index referred to by indexHandle by closing the PF component file used to store the index.

IX_IndexHandle Class

The IX_IndexHandle class is used to insert and delete index entries, and to force pages of an index's file to disk. To perform these operations, a client first creates an instance of this class and passes it to the IX_Manager::OpenIndex method described above.
class IX_IndexHandle {
  public:
       IX_IndexHandle  ();                             // Constructor
       ~IX_IndexHandle ();                             // Destructor
    RC InsertEntry     (void *pData, const RID &rid);  // Insert new index entry
    RC DeleteEntry     (void *pData, const RID &rid);  // Delete index entry
    RC ForcePages      ();                             // Copy index to disk
};

RC InsertEntry (void *pData, const RID &rid)

For this and the following two methods, it is an error to call the method on an IX_IndexHandle object that does not refer to an open index. This method should insert a new entry into the index associated with IX_IndexHandle. Parameter pData points to the attribute value to be inserted into the index, and parameter rid identifies the record with that value to be added to the index. Hence, this method effectively inserts an entry for the pair (*pData,rid) into the index. (The index should contain only the record's RID, not the record itself.) If the indexed attribute is a character string of length n, then you may assume that *pData is exactly n bytes long; the same holds for parameter *pData in the next method. This method should return a nonzero code if there is already an entry for (*pData,rid) in the index.

RC DeleteEntry (void *pData, const RID &rid)

This method should delete the entry for the (*pData,rid) pair from the index associated with IX_IndexHandle. Although clients of the IX component typically will ensure that DeleteEntry is not called for entries that are not in the index, for debugging purposes you may want to return a (positive) error code if such a call is made.
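To make the return-code behavior of InsertEntry and DeleteEntry concrete, here is a minimal in-memory stand-in for an index on an INT attribute. It is not a B+ tree and uses no PF pages; it only models which calls succeed. ToyRID, ToyIntIndex, and the two return-code names and values are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <set>
#include <tuple>

typedef int RC;
// Hypothetical codes; the text only requires "nonzero" for a duplicate
// insert and suggests a positive code for deleting a missing entry.
const RC IX_DUPLICATEENTRY = 1;
const RC IX_ENTRYNOTFOUND  = 2;

// Simplified stand-in for the RM component's RID class.
struct ToyRID { int pageNum; int slotNum; };

// In-memory model of an index on an INT attribute (NOT a B+ tree).
class ToyIntIndex {
  public:
    RC InsertEntry(void *pData, const ToyRID &rid) {
        // set::insert reports whether the (key, rid) pair was new.
        return entries.insert(MakeEntry(pData, rid)).second
                   ? 0 : IX_DUPLICATEENTRY;
    }
    RC DeleteEntry(void *pData, const ToyRID &rid) {
        // set::erase returns the number of elements removed (0 or 1).
        return entries.erase(MakeEntry(pData, rid)) == 1
                   ? 0 : IX_ENTRYNOTFOUND;
    }
  private:
    typedef std::tuple<int, int, int> Entry;  // (key, pageNum, slotNum)
    static Entry MakeEntry(void *pData, const ToyRID &rid) {
        assert(pData != nullptr);
        int key;
        std::memcpy(&key, pData, sizeof key);  // pData may be unaligned
        return Entry(key, rid.pageNum, rid.slotNum);
    }
    std::set<Entry> entries;
};
```

Note that the same key may appear with different RIDs; only an exact repeat of the (*pData,rid) pair is rejected.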

RC ForcePages ()

This method should copy to disk all pages associated with the IX_IndexHandle. The index page contents are forced to disk by calling PF_FileHandle::ForcePages for the index file.

IX_IndexScan Class

The IX_IndexScan class is used to perform condition-based scans over the entries of an index.
class IX_IndexScan {
  public:
       IX_IndexScan  ();                                 // Constructor
       ~IX_IndexScan ();                                 // Destructor
    RC OpenScan      (const IX_IndexHandle &indexHandle, // Initialize index scan
                      CompOp      compOp,
                      void        *value,
                      ClientHint  pinHint = NO_HINT);           
    RC GetNextEntry  (RID &rid);                         // Get next matching entry
    RC CloseScan     ();                                 // Terminate index scan
};

RC OpenScan (const IX_IndexHandle &indexHandle, CompOp compOp, void *value, ClientHint pinHint = NO_HINT)

This method should initialize a condition-based scan over the entries in the open index referred to by parameter indexHandle. Once underway, the scan should produce the RIDs of all records whose indexed attribute value compares in the specified way with the specified value. Parameters compOp and value are exactly as in the RM_FileScan::OpenScan method (including the possibility that compOp=NO_OP and value is a null pointer, indicating a complete scan); please refer to the RM Component document for details. The only exception is that for B+ tree scans, you may choose to disallow comparison operator NE_OP (not-equal). You will need to cast parameter value into the appropriate type for the attribute (or, in the case of an integer or float, copy it into a separate variable to avoid alignment problems), as in the RM component. Also as in method IX_IndexHandle::InsertEntry, if the indexed attribute is a character string of length n, then you may assume that value is exactly n bytes long. As in RM component file scans, optional parameter pinHint is included so that higher-level RedBase components using an IX component index scan can suggest a specific page-pinning strategy for the IX component to use during the index scan, to achieve maximum efficiency. Exploiting this parameter, either now or later, is entirely optional.
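One way to handle the casting and alignment concern described above is to copy integer and float values into local variables before comparing them. The sketch below assumes stand-in definitions of AttrType and CompOp (RedBase supplies the real ones), and the helper names IX_CompareKeys and IX_Satisfies are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-ins for the RedBase definitions, repeated so this compiles alone.
enum AttrType { INT, FLOAT, STRING };
enum CompOp { NO_OP, EQ_OP, NE_OP, LT_OP, GT_OP, LE_OP, GE_OP };

// Hypothetical helper: returns <0, 0, or >0 as key a is less than,
// equal to, or greater than key b.
int IX_CompareKeys(const void *a, const void *b,
                   AttrType type, int attrLength) {
    assert(a != nullptr && b != nullptr);
    switch (type) {
        case INT: {
            int x, y;  // copy out to avoid unaligned reads
            std::memcpy(&x, a, sizeof x);
            std::memcpy(&y, b, sizeof y);
            return (x > y) - (x < y);
        }
        case FLOAT: {
            float x, y;
            std::memcpy(&x, a, sizeof x);
            std::memcpy(&y, b, sizeof y);
            return (x > y) - (x < y);
        }
        case STRING:
            // Both operands are exactly attrLength bytes long.
            return std::strncmp(static_cast<const char *>(a),
                                static_cast<const char *>(b),
                                static_cast<std::size_t>(attrLength));
    }
    return 0;
}

// Hypothetical helper: does an index key satisfy "key compOp value"?
bool IX_Satisfies(const void *key, CompOp compOp, const void *value,
                  AttrType type, int attrLength) {
    if (compOp == NO_OP) return true;  // complete scan, value unused
    assert(value != nullptr);
    int c = IX_CompareKeys(key, value, type, attrLength);
    switch (compOp) {
        case EQ_OP: return c == 0;
        case NE_OP: return c != 0;
        case LT_OP: return c < 0;
        case GT_OP: return c > 0;
        case LE_OP: return c <= 0;
        case GE_OP: return c >= 0;
        case NO_OP: return true;
    }
    return false;
}
```

A B+ tree scan would typically use the comparison both to find the starting leaf and to decide when to stop, rather than testing every entry. An implementation that disallows NE_OP would reject it in OpenScan instead of evaluating it here.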

RC GetNextEntry (RID &rid)

This method should set output parameter rid to be the RID of the next record in the index scan. This method should return IX_EOF (a positive return code that you define) if there are no index entries left satisfying the scan condition. You may assume that IX component clients will not close the corresponding open index while a scan is underway.
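From the client's side, the IX_EOF convention leads to a simple loop: call GetNextEntry until it returns a nonzero code, then treat IX_EOF as normal termination and anything else as an error. The sketch below uses a hypothetical ToyScan class over a precomputed list of matches in place of a real IX_IndexScan; ToyRID, DrainScan, and the value chosen for IX_EOF are likewise assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int RC;
const RC IX_EOF = 1;  // assumption: any positive value you define

// Simplified stand-in for the RM component's RID class.
struct ToyRID { int pageNum; int slotNum; };

// Stand-in for IX_IndexScan that walks a precomputed list of matches.
class ToyScan {
  public:
    explicit ToyScan(const std::vector<ToyRID> &m) : matches(m), next(0) {}
    RC GetNextEntry(ToyRID &rid) {
        if (next >= matches.size()) return IX_EOF;  // nothing left
        rid = matches[next++];
        return 0;
    }
  private:
    std::vector<ToyRID> matches;
    std::size_t next;
};

// Client-side loop: collect RIDs until IX_EOF; pass other codes through.
RC DrainScan(ToyScan &scan, std::vector<ToyRID> &out) {
    ToyRID rid;
    RC rc;
    while ((rc = scan.GetNextEntry(rid)) == 0)
        out.push_back(rid);
    assert(rc != 0);                // the loop exits only on a nonzero code
    return rc == IX_EOF ? 0 : rc;   // IX_EOF is the normal way to finish
}
```

Calling GetNextEntry again after the scan is exhausted keeps returning IX_EOF in this sketch, which is a convenient property for a real scan as well.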

RC CloseScan ()

This method should terminate the index scan.

IX_PrintError

void IX_PrintError (RC rc);

This routine should write a message associated with the nonzero IX return code rc onto the Unix stderr output stream. This routine has no return value.
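A possible shape for this routine is a lookup from return code to message, followed by a write to stderr. The return-code names, values, and message strings below are hypothetical, as is the IX_ErrorMessage helper; only the IX_PrintError signature comes from the interface above.

```cpp
#include <cassert>
#include <iostream>
#include <string>

typedef int RC;
// Hypothetical IX return codes, defined here only for the sketch.
const RC IX_EOF            = 1;
const RC IX_DUPLICATEENTRY = 2;
const RC IX_ENTRYNOTFOUND  = 3;

// Hypothetical helper: maps a nonzero IX return code to a message.
std::string IX_ErrorMessage(RC rc) {
    switch (rc) {
        case IX_EOF:            return "end of index scan";
        case IX_DUPLICATEENTRY: return "entry already in index";
        case IX_ENTRYNOTFOUND:  return "entry not found in index";
        default:                return "unrecognized IX return code";
    }
}

// Writes the message for a nonzero IX return code to stderr.
void IX_PrintError(RC rc) {
    assert(rc != 0);  // 0 means normal completion, so nothing to print
    std::cerr << "IX: " << IX_ErrorMessage(rc)
              << " (rc=" << rc << ")\n";
}
```

Separating the lookup from the printing keeps the messages easy to test without capturing stderr.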

Implementation Details

Miscellaneous