(a) Since the blocks are stored sequentially, we only need a pointer to the first block. The first index block holds a block pointer to the first file block, followed by key values. The remaining index blocks hold only key values.
We can fit ⌊(2048 - 8) / 12⌋ = 170 keys in the first index block and ⌊2048 / 12⌋ = 170 keys in each of the remaining blocks.
The total number of blocks we need is 1 + ⌈(500000 - 170) / 170⌉ = 2942 blocks.
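The arithmetic in (a) can be checked with a short script. This is only a sketch: the function name `contiguous_index_blocks` and the constant names are illustrative, and the sizes (2048-byte blocks, 12-byte keys, 8-byte block pointers) are the figures used in the calculation above.

```python
BLOCK_SIZE = 2048    # bytes per block (figure used above)
KEY_SIZE = 12        # bytes per key
POINTER_SIZE = 8     # bytes per block pointer


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for non-negative a and positive b."""
    return -(-a // b)


def contiguous_index_blocks(num_keys: int) -> int:
    """Index blocks needed when only the first index block stores a pointer."""
    first_block_keys = (BLOCK_SIZE - POINTER_SIZE) // KEY_SIZE  # floor(2040 / 12)
    other_block_keys = BLOCK_SIZE // KEY_SIZE                   # floor(2048 / 12)
    if num_keys <= first_block_keys:
        return 1
    return 1 + ceil_div(num_keys - first_block_keys, other_block_keys)


print(contiguous_index_blocks(500_000))  # 2942
```

Integer floor division is used for both the floor and the ceiling so that no floating-point rounding can affect the block count.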
(b) Since the blocks are not contiguous, we need a block pointer for every block. Each entry will be 8 bytes (pointer) + 12 bytes (key) = 20 bytes.
We can fit ⌊2048 / 20⌋ = 102 key-pointer pairs in a block.
The total number of blocks we need is ⌈500000 / 102⌉ = 4902 blocks.
(c) Since blocks are not contiguous, we need a pointer for each block. We can fit ⌊2048 / 20⌋ = 102 key-pointer pairs in a block.
With one entry for each of the 4902 blocks counted in (b), the total number of blocks we need is ⌈4902 / 102⌉ = 49 blocks.
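Parts (b) and (c) apply the same calculation twice: first to the 500000 entries, then to the 4902 blocks that result. A minimal sketch, with the hypothetical helper `blocks_for_entries` and the 20-byte entry size taken from the text above:

```python
BLOCK_SIZE = 2048     # bytes per block
ENTRY_SIZE = 12 + 8   # key plus block pointer, in bytes


def blocks_for_entries(num_entries: int) -> int:
    """Blocks needed to store num_entries key-pointer pairs."""
    per_block = BLOCK_SIZE // ENTRY_SIZE   # floor(2048 / 20) = 102
    return -(-num_entries // per_block)    # integer ceiling


first_level = blocks_for_entries(500_000)        # the count from (b)
second_level = blocks_for_entries(first_level)   # the count from (c)
print(first_level, second_level)                 # 4902 49
```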
(d) This question can be answered in two ways, depending on whether you want a block pointer or a record pointer for each key. If duplicate keys exist, it is better to use record pointers instead of block pointers, for simplicity of implementation.
Solution with block pointers:
Size of entry = 12 (key) + 8 (pointer) = 20 bytes
We can fit ⌊2048 / 20⌋ = 102 key-pointer pairs in a block.
Number of blocks needed: ⌈10000000 / 102⌉ = 98040 blocks.
Solution with record pointers:
Size of entry = 12 (key) + 9 (pointer) = 21 bytes
We can fit ⌊2048 / 21⌋ = 97 key-pointer pairs in a block.
Number of blocks needed: ⌈10000000 / 97⌉ = 103093 blocks.
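The two variants of (d) differ only in the pointer size, so they can be compared side by side. The sketch below assumes the sizes quoted above (12-byte keys, 8-byte block pointers, 9-byte record pointers); the function name `dense_index_blocks` is illustrative.

```python
def dense_index_blocks(num_records: int, key_size: int, pointer_size: int,
                       block_size: int = 2048) -> tuple[int, int]:
    """Return (entries per block, blocks needed) for one entry per record."""
    entries_per_block = block_size // (key_size + pointer_size)
    blocks = -(-num_records // entries_per_block)  # integer ceiling
    return entries_per_block, blocks


for label, pointer_size in (("block pointers", 8), ("record pointers", 9)):
    per_block, blocks = dense_index_blocks(10_000_000, 12, pointer_size)
    print(f"{label}: {per_block} entries per block, {blocks} blocks")
```

The extra pointer byte drops the fan-out from 102 to 97 entries per block, which is where the roughly 5% increase in index size comes from.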
(e) Since the index blocks are contiguous, we only need a pointer to the first block and the key values.
We can fit ⌊(2048 - 8) / 12⌋ = 170 keys in the first index block and ⌊2048 / 12⌋ = 170 keys in each of the remaining blocks.
Solution with block pointers:
The total number of blocks we need is 1 + ⌈(98040 - 170) / 170⌉ = 577 blocks.
Solution with record pointers:
The total number of blocks we need is 1 + ⌈(103093 - 170) / 170⌉ = 607 blocks.
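Both totals in (e) follow from one formula applied to the two block counts from (d). A small sketch, assuming the same 2048-byte blocks, 12-byte keys, and a single 8-byte pointer in the first index block; `sparse_contiguous_blocks` is a hypothetical name:

```python
def sparse_contiguous_blocks(num_keys: int, block_size: int = 2048,
                             key_size: int = 12, pointer_size: int = 8) -> int:
    """Blocks for a key-only index whose first block also holds one pointer."""
    first = (block_size - pointer_size) // key_size   # keys in the first block
    rest = block_size // key_size                     # keys in every later block
    remaining = max(num_keys - first, 0)
    return 1 + -(-remaining // rest)                  # 1 + integer ceiling


print(sparse_contiguous_blocks(98_040))    # 577 (block-pointer variant)
print(sparse_contiguous_blocks(103_093))   # 607 (record-pointer variant)
```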
Note 1: Diagrams by courtesy of Frank Luo.
Note 2: Alternate solutions are possible depending on how the keys are redistributed after splitting a node. All valid solutions were given full credit.
(a) The root must have at least 2 children. Each child is a leaf and has at least ⌊(n+1) / 2⌋ record pointers.
Minimum number of record pointers = 2 * ⌊(n+1) / 2⌋
(b) Again, the root has 2 children. Each non-leaf node has at least ⌈(n+1) / 2⌉ pointers, so there are 2 * ⌈(n+1) / 2⌉ leaf nodes. Each leaf node has ⌊(n+1) / 2⌋ record pointers.
Minimum number of record pointers = 2 * ⌈(n+1) / 2⌉ * ⌊(n+1) / 2⌋
(c) Similar to (b), with j levels,
Minimum number of record pointers = 2 * (⌈(n+1) / 2⌉)^(j-2) * ⌊(n+1) / 2⌋
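The general formula in (c) can be evaluated directly for small cases. This is a sketch only: `min_record_pointers` is an illustrative name, n is the same node parameter used above, and the formula is assumed to apply for j of at least 2.

```python
def min_record_pointers(n: int, levels: int) -> int:
    """Minimum record pointers in a B+ tree with the given number of levels.

    Implements 2 * ceil((n+1)/2)^(levels-2) * floor((n+1)/2), for levels >= 2.
    """
    if levels < 2:
        raise ValueError("the formula assumes at least 2 levels")
    min_leaf = (n + 1) // 2             # floor((n+1)/2) record pointers per leaf
    min_internal = -(-(n + 1) // 2)     # ceil((n+1)/2) pointers per non-leaf node
    return 2 * min_internal ** (levels - 2) * min_leaf


# For n = 4: floor(5/2) = 2 and ceil(5/2) = 3.
print([min_record_pointers(4, j) for j in (2, 3, 4)])  # [4, 12, 36]
```

With levels = 2 and levels = 3 the function reduces to the expressions given in (a) and (b).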
(d) From (c), a B+ tree with j levels has at least 2 * (⌈(n+1) / 2⌉)^(j-2) * ⌊(n+1) / 2⌋ records.
r ≥ 2 * (⌈(n+1) / 2⌉)^(j-2) * ⌊(n+1) / 2⌋
Taking logarithms of both sides:
log r ≥ log 2 + (j - 2) * log(⌈(n+1) / 2⌉) + log(⌊(n+1) / 2⌋)
j - 2 ≤ (log r - log 2 - log(⌊(n+1) / 2⌋)) / log(⌈(n+1) / 2⌉)
j ≤ 2 + (log r - log 2 - log(⌊(n+1) / 2⌋)) / log(⌈(n+1) / 2⌉)
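As a sanity check on the bound in (d), the closed form can be compared with a direct search over j. The sketch below is illustrative: the function names and the sample values r = 1000000 and n = 100 are assumptions, and it requires n of at least 2 so that the logarithm in the denominator is nonzero.

```python
import math


def height_bound(r: int, n: int) -> float:
    """Closed-form upper bound on j; assumes n >= 2 and r >= 2 * floor((n+1)/2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    min_leaf = (n + 1) // 2             # floor((n+1)/2)
    min_internal = -(-(n + 1) // 2)     # ceil((n+1)/2)
    return 2 + (math.log(r) - math.log(2) - math.log(min_leaf)) / math.log(min_internal)


def max_levels(r: int, n: int) -> int:
    """Largest j whose minimum record count does not exceed r (direct search)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    min_leaf = (n + 1) // 2
    min_internal = -(-(n + 1) // 2)
    j = 2
    # The minimum for j + 1 levels is 2 * min_internal^(j-1) * min_leaf.
    while 2 * min_internal ** (j - 1) * min_leaf <= r:
        j += 1
    return j


r, n = 1_000_000, 100
print(height_bound(r, n), max_levels(r, n))
```

The integer search avoids floating-point rounding, while the closed form shows the same limit once its value is rounded down.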
Common errors:
* Treating the root like an ordinary non-leaf node with at least ⌈(n+1) / 2⌉ pointers, or allowing it to have only 1 pointer to a child node.
* Calculation errors. In particular, log 2 and ln 2 are not 1. Although log_2 2 = 1, that 1 must not be dropped, because log (a * b) = log a + log b.