|
|
Bulk-loading refers to the process of loading data specified
in lines of a file directly into a database system, rather than
executing a large series of INSERT statements. We
first describe the data file format for bulk-loading into SQLite
databases, then the loading process.
The data file consists of a sequence of lines, each one specifying
one tuple to be loaded into an existing table of the database. Each
line lists values for the attributes of the table, in the order the
attributes were declared when the table was created. Any string can be
used as the separator between attribute values; by default, it is
"|".
As an example, suppose we are bulk-loading into a table
Student(ID,name), where ID is an integer and
name is a string. If we specify "|" as our
separator, our data file (call it students.dat) might look
like:
123|Alice
456|Bob
789| Carol
As the result of loading file students.dat, the following
tuples are inserted into table Student:
(123,'Alice') (456,'Bob') (789,' Carol')Warning: Notice that the third line of students.dat has a blank after separator "|". This blank is not ignored by the loader, i.e., string "Carol" is loaded with a leading blank, as demonstrated by the third tuple above. It is a common (and frustrating!) mistake to leave blanks before or after separators and then wonder why string values are not matching as you expect.
Bulk-loading can be done via the SQLite special command .import, as follows:
.separator <separator>
.import <loadFile> <tableName>
For example, if we want to load a file name.dat into table Name,
where attributes are comma-separated, we would execute:
.separator ,
.import name.dat Name
Keep in mind that you should use an attribute separator string that will never
appear in your data, so separators aren't confused with data fields.
Note that primary keys and other constraints on the table are enforced during bulk-loading; if any tuples violate these constraints, the load will fail.
Unfortunately, there is no way to specify null values in a load file such that SQLite will actually load a null value. Thus, loading data with null values must be done in a two-step process: first load the data with some stand-in value for null (e.g. 'NULL'), then update the table to replace instances of 'NULL' with null.
Consider a Name table with attributes first, middle, and last, and suppose we want to bulk-load the names "Mary J. Blige" and "Lady Gaga". We create our load file name.dat as:
Mary|J.|Blige
Lady|NULL|Gaga
where "|" is our attribute separator. We then load this data into
table Name and set null values properly by executing:
.separator |
.import name.dat Name
update Name set middle = null where middle = 'NULL';