CS145 Lecture Notes -- Indexes

Primary mechanism for users/applications to get improved performance on a database
Many interesting implementation issues (CS245, CS346)

Singular: "Index"
Plural: "Indexes" or "Indices"

Index on attribute R.A:

Creates an additional persistent data structure stored with the database
Can dramatically speed up certain operations:
- Find all R tuples where R.A = v
- Find all R and S tuples where R.A = S.B
- Find all R tuples where R.A > v (sometimes, depending on index type)
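
The dependence on index type can be sketched in plain Python. This is a toy illustration, not how any DBMS stores its indexes: the relation `R`, its rows, and the helper names `lookup_eq` and `lookup_gt` are all made up for the example. A hash-style index answers `R.A = v` directly, while `R.A > v` needs an index that keeps values in order.

```python
import bisect
from collections import defaultdict

# Hypothetical relation R with attributes (A, B); the rows are made up.
R = [(5, "x"), (2, "y"), (9, "z"), (2, "w"), (7, "v")]

# Hash-style index on R.A: maps each A value to the positions of matching tuples.
hash_index = defaultdict(list)
for pos, (a, _) in enumerate(R):
    hash_index[a].append(pos)

# Ordered index on R.A: (value, position) pairs kept sorted by value.
sorted_index = sorted((a, pos) for pos, (a, _) in enumerate(R))


def lookup_eq(v):
    """Answer R.A = v through the hash-style index, without scanning R."""
    return [R[pos] for pos in hash_index.get(v, [])]


def lookup_gt(v):
    """Answer R.A > v through the ordered index: binary search, then read the tail."""
    start = bisect.bisect_right(sorted_index, (v, float("inf")))
    return [R[pos] for _, pos in sorted_index[start:]]


eq_result = lookup_eq(2)
gt_result = lookup_gt(5)
print(eq_result)
print(gt_result)
```

The hash-style index has no notion of order, so it offers no shortcut for the range condition; the ordered index supports both kinds of lookup at the cost of a logarithmic search.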

(picture of unordered relations and indexed attributes)

Example

    SELECT *
    FROM Student
    WHERE name = 'Mary'
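
One way to see an index being used for this query is to ask a DBMS for its query plan. The sketch below uses SQLite through Python's standard `sqlite3` module. The `Student(ID, name, GPA)` schema, the sample rows, and the index name `StudentName` are assumptions made for illustration, and the plan text differs across SQLite versions and across other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and sample rows; the notes do not spell these out.
conn.execute("CREATE TABLE Student (ID INTEGER, name TEXT, GPA REAL)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Mary", 3.9), (2, "John", 3.2), (3, "Mary", 3.4)],
)

QUERY = "SELECT * FROM Student WHERE name = 'Mary'"


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(QUERY)  # no index yet: typically a scan of Student
conn.execute("CREATE INDEX StudentName ON Student(name)")
plan_after = plan(QUERY)   # with the index: typically a search using StudentName

rows = conn.execute(QUERY).fetchall()
print(plan_before)
print(plan_after)
print(rows)
```

The query returns the same tuples either way; only the access path changes.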

Indexes are built on single attributes or combinations of attributes.

Question: What data structures are used for indexes?

1.

2.

Example

    SELECT *
    FROM Student
    WHERE name = 'Mary' AND GPA > 3.5

Could use:
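
As a rough way to explore the candidates, the sketch below creates one index at a time in SQLite and prints the plan chosen for the query above. The index names, the assumed `Student(ID, name, GPA)` schema, and the helper `plan_with` are hypothetical. Which index a real system prefers depends on its optimizer and on statistics about the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema; the notes do not spell it out.
conn.execute("CREATE TABLE Student (ID INTEGER, name TEXT, GPA REAL)")

QUERY = "SELECT * FROM Student WHERE name = 'Mary' AND GPA > 3.5"

# Candidate indexes to try one at a time; the index names are hypothetical.
CANDIDATES = {
    "StudentName": "Student(name)",
    "StudentGPA": "Student(GPA)",
    "StudentNameGPA": "Student(name, GPA)",
}


def plan_with(index_name, columns):
    """Create one candidate index, capture the query plan, then drop the index."""
    conn.execute(f"CREATE INDEX {index_name} ON {columns}")
    details = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
    conn.execute(f"DROP INDEX {index_name}")
    return details


plans = {name: plan_with(name, cols) for name, cols in CANDIDATES.items()}
for name, details in plans.items():
    print(name, "->", details)
```

Because each candidate is dropped before the next one is created, every plan reflects exactly one available index.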

Indexes can also speed up joins.

Example

    SELECT * 
    FROM Student, Apply
    WHERE Student.ID = Apply.ID

Could use:
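
The effect of an index on this join can be sketched the same way in SQLite. The column `Apply.college`, the sample rows, and the index name `ApplyID` are assumptions for illustration. SQLite's automatic transient indexes are switched off here so that the plan without a user-created index is a plain nested scan; other systems might choose a hash join or a sort-merge join instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Disable SQLite's automatic (transient) indexes so the "before" plan
# shows what happens when no index is available at all.
conn.execute("PRAGMA automatic_index = OFF")

# Assumed schemas and rows; only the ID attributes appear in the notes.
conn.execute("CREATE TABLE Student (ID INTEGER, name TEXT, GPA REAL)")
conn.execute("CREATE TABLE Apply (ID INTEGER, college TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Mary", 3.9), (2, "John", 3.2)],
)
conn.executemany(
    "INSERT INTO Apply VALUES (?, ?)",
    [(1, "Stanford"), (1, "MIT"), (2, "Cornell")],
)

QUERY = "SELECT * FROM Student, Apply WHERE Student.ID = Apply.ID"


def plan():
    """Return the 'detail' column of SQLite's plan for the join query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


plan_before = plan()
conn.execute("CREATE INDEX ApplyID ON Apply(ID)")
plan_after = plan()

joined = conn.execute(QUERY).fetchall()
print(plan_before)
print(plan_after)
print(len(joined), "joined tuples")
```

With the index in place, the system can scan one relation and probe the index for matching tuples of the other, instead of comparing every pair of tuples.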

Question: What are the disadvantages of creating an index?

1.

2.

3.

Choosing which indexes to create is a difficult and very important design issue. The decision depends on the size of the tables, the data distributions, and most importantly the query/update load.

DBMS vendors are introducing "physical design advisors."
- Input: database and workload
- Output: suggested indexes
These advisors generally work very well.

To create an index on R.A:

  CREATE INDEX IndexName ON R(A)

To create one index on the combination (R.A1, R.A2, ..., R.An):

  CREATE INDEX IndexName ON R(A1, ..., An)

To destroy an index:

  DROP INDEX IndexName

To create an index that also enforces R.A as a key:

  CREATE UNIQUE INDEX IndexName ON R(A)
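
The statements above can be tried out in SQLite through Python's `sqlite3` module. This is a minimal sketch: the table `R(A, A1, A2)`, the index names, and the helper `index_names` are made up for the example, and the catalog table `sqlite_master` is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical relation R with three integer attributes.
conn.execute("CREATE TABLE R (A INTEGER, A1 INTEGER, A2 INTEGER)")


def index_names():
    """Names of the indexes SQLite's catalog currently records for R."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'R'"
    )
    return sorted(name for (name,) in rows)


# Single-attribute and multi-attribute indexes.
conn.execute("CREATE INDEX IndexA ON R(A)")
conn.execute("CREATE INDEX IndexA1A2 ON R(A1, A2)")
after_create = index_names()

# Destroying an index removes it from the catalog; the data in R is untouched.
conn.execute("DROP INDEX IndexA")
after_drop = index_names()

# A unique index makes the system reject a second tuple with the same A value.
conn.execute("CREATE UNIQUE INDEX UniqueA ON R(A)")
conn.execute("INSERT INTO R VALUES (1, 10, 20)")
try:
    conn.execute("INSERT INTO R VALUES (1, 30, 40)")  # duplicate value of A
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(after_create, after_drop, duplicate_rejected)
```

In SQLite the rejected insert surfaces as `sqlite3.IntegrityError`; other systems report the violation through their own error mechanisms.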