Why MySQL?
The accounts we were given on the Oracle server had a 50MB quota.
Unfortunately, the database I wanted to load consisted of
about 100 works, at 2 to 5 MB each. Factoring in indicies and such, the
total disk space required to store this data was well over 1 GB.
Figuring that the CS department might balk at a request for so much
space for a class project, I decided to run the project on my own
box. And not wanting to pay for an Oracle license, I decided to run an
open-source alternative, MySQL and
port my SQL code to work under MySQL.
I originally thought that the task of porting to MySQL would be relatively simple. Unfortunately, MySQL does not support some essential functionality. Specifically related to this project, MySQL doesn't support subqueries, views, stored procedures, triggers, and foreign keys. Some workarounds are discussed in MySQL's list of missing functionality. Additionally, I was unable to find a way to hint the MySQL planner, so certain operations that should have used indicies apparently were not doing so, causing severe performance problems. As a result, I was forced to change the database schema slightly, and the MySQL schema I used for creating the web application was as follows:
CREATE TABLE Author ( name varchar(128) NOT NULL PRIMARY KEY, birth integer NOT NULL, death integer NOT NULL, description text ); CREATE TABLE AuthorCriticized ( url varchar(255) NOT NULL, name varchar(128) NOT NULL, PRIMARY KEY (url, name) ); CREATE TABLE Criticism ( title varchar(255), critic varchar(128), url varchar(255) NOT NULL PRIMARY KEY, synopsis text, publication varchar(255), keywords varchar(255) ); CREATE TABLE PageContainsWord ( word mediumint NOT NULL, page mediumint NOT NULL, PRIMARY KEY (word, page) ); CREATE TABLE Period ( region varchar(64) NOT NULL, era varchar(32) NOT NULL, PRIMARY KEY (region, era) ); CREATE TABLE PeriodLived ( region varchar(64) NOT NULL, era varchar(32) NOT NULL, name varchar(128) NOT NULL, PRIMARY KEY (region, era, name) ); CREATE TABLE PeriodWritten ( region varchar(64) NOT NULL, era varchar(32) NOT NULL, work integer NOT NULL, PRIMARY KEY (region, era, work) ); CREATE TABLE Syn ( root integer NOT NULL, child integer NOT NULL, PRIMARY KEY (root, child) ); CREATE TABLE UniquePage ( id mediumint NOT NULL PRIMARY KEY, work smallint NOT NULL, page smallint NOT NULL, global_start integer NOT NULL, global_end integer NOT NULL ); CREATE UNIQUE INDEX uniquepage_idx1 ON UniquePage(global_start); CREATE TABLE Word ( id integer NOT NULL PRIMARY KEY, word varchar(32) NOT NULL ); CREATE INDEX word_idx1 ON Word(word); CREATE TABLE WordDetails ( gpos integer NOT NULL PRIMARY KEY, context char(24) ); CREATE TABLE WordInstance ( word mediumint NOT NULL, gpos integer NOT NULL, prev mediumint NOT NULL, PRIMARY KEY (word, gpos) ); CREATE TABLE Work ( id integer NOT NULL PRIMARY KEY, title varchar(255), year integer, length integer, gutenberg_id varchar(20), include integer ); CREATE TABLE WorkCriticized ( url varchar(255) NOT NULL, work integer NOT NULL, PRIMARY KEY (url, work) ); CREATE TABLE WrittenBy ( work integer NOT NULL, name varchar(128) NOT NULL, PRIMARY KEY (work, name) );
In retrospect, I probably should have used PostgreSQL, which is also open-source, and supports subqueries and most of the functionality that Oracle supports. But I didn't know about it at the time, so I ended up doing the MySQL port.