CS145 Lecture Notes -- Indexes
- Primary mechanism for users/applications to get improved performance on a database
- Many interesting implementation issues (CS245, CS346)
Plural: "Indexes" or "Indices"
Index on attribute R.A:
(picture of unordered relations and indexed attributes)
- Creates additional persistent data structure stored with the database
- Can dramatically speed up certain operations:
- Find all R tuples where R.A = v
- Find all R and S tuples where R.A = S.B
- Find all R tuples where R.A > v (sometimes, depending on index type)
WHERE name = 'Mary'
- Without index: Scan all Student tuples
- With index: Go "directly" to tuples with name='Mary'
Indexes are built on single attributes or combinations of
Question: What data structures are used for indexes?
WHERE name = 'Mary' and GPA > 3.5
- Index on Student.name
- Index on Student.GPA
- Index on (Student.name,Student.GPA)
Indexes can also speed up joins.
FROM Student, Apply
WHERE Student.ID = Apply.ID
- Index on Apply.ID
- Index on Student.ID
- Both indexes together (in certain cases)
Question: What are the disadvantages of creating an index?
- Choosing which indexes to create is a difficult and very
important design issue. The decision depends on size of tables, data
distributions, and most importantly query/update load.
- DBMS vendors are introducing "physical design advisors."
Generally work very well.
- Input: database and workload
- Output: suggested indexes
For one index on R.A:
CREATE INDEX IndexName ON R(A)
For one index on (R.A1, R.A2, ..., R.An):
CREATE INDEX IndexName ON R(A1, ..., An)
To destroy an index:
DROP INDEX IndexName
Will enforce R.A as a key:
CREATE UNIQUE INDEX IndexName ON R(A)