Solutions for Section 19.1
Solutions for Section 19.2
Solutions for Section 19.3
Solutions for Section 19.4
Solutions for Section 19.5
Solutions for Section 19.6
Solutions for Section 19.7
If locking is strict, then the two unlocks must occur, in either order, after the second write action. Thus, there are 6 orders for strict, 2-phase locking.
If locking is not strict, the unlock of B must still occur after w1(B), but the unlock of A can occur in any of the three positions after w1(A). Thus, there are 9 orders if strictness is not required, and 3 of these are not strict.
Each transaction reads only one element, so the question is, does it read from the other? If so, then in order for the schedule to be recoverable, it must commit after the other as well. That is, the two constraints on the schedule can be expressed:
The partial orders that violate (1) look like:
o->o->o->o
|        ^
v        |
o->o->o->o
With a little effort, we can find that there are 12 orders for the eight nodes involved.
The orders that violate (2) are similar; they have the constraints:
o->o->o->o
|        ^
v        |
o->o->o->o
There are 10 of these. Since there are {8 choose 4} = 70 orders in all, 48 of them are recoverable.
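The counts above can be checked mechanically. Below is a small Python sketch (ours, not the book's) that counts the orderings of eight actions consistent with a set of precedence constraints. With only the two program-order chains it reproduces the (8 choose 4) = 70 total; adding the exercise's cross-transaction arcs to `chains` would count the violating orders.

```python
from itertools import permutations

def count_linear_extensions(n, precedes):
    """Count orderings of items 0..n-1 consistent with every pair
    (a, b) in `precedes`, meaning a must come before b."""
    count = 0
    for perm in permutations(range(n)):
        pos = {item: i for i, item in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in precedes):
            count += 1
    return count

# Two chains of four actions each: items 0-3 are the first transaction's
# actions in program order, items 4-7 the second's.
chains = [(i, i + 1) for i in range(3)] + [(i, i + 1) for i in range(4, 7)]

# With no cross-transaction constraints, the interleavings number
# (8 choose 4) = 70, as stated in the solution.
print(count_linear_extensions(8, chains))  # 70
```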
           ____________
          |            |
       A  |            v   B
T_0 ----> T1  T2 ----> T3 ----> Tf
 | |                 A ^        ^
 | |___________________|        |
 |           A                  |
 |______________________________|
Above is the polygraph. We have labeled with A the arcs due to the reading of A, and likewise for B. The two unlabeled arcs are second-phase requirements that the other writers of B --- T1 and T2 --- not interfere with the reading of B by the final ``transaction'' Tf.
Clearly, T3 must come last in a view-serializable schedule, but there is no required ordering of T1 and T2. Notice that conflict-serializability would require T1 to precede T2, because their writes of B cannot swap places.
Start with the three writes, whose order is fixed. We can place r1(A) only in the one position prior to w1(B). Next, we can place r2(A) in any of the three positions prior to w2(B). At the last step, there are five places where we can put r3(A) prior to w3(B). Thus, there are 1 * 3 * 5 = 15 conflict-equivalent schedules.
For view-serializable orders, there are the above 15, and another 15 in which w1(B) is placed after w2(B). Intuitively, we can do so because neither T1 nor T2 has its value of B read, so the order in which they write doesn't matter.
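Both counts can be verified by brute force. The following Python sketch (the action names are ours) enumerates all schedules of the six actions that respect program order, then counts those conflict-equivalent to the original (writes of B in the order w1 < w2 < w3) and those view-equivalent to it (w3(B) still the final write of B):

```python
from itertools import permutations

# Actions of the schedule: ri = ri(A), wi = wi(B) for transactions 1-3.
ACTIONS = ["r1", "w1", "r2", "w2", "r3", "w3"]

def legal(s):
    # Program order: each transaction reads A before writing B.
    return all(s.index(f"r{i}") < s.index(f"w{i}") for i in (1, 2, 3))

schedules = [s for s in permutations(ACTIONS) if legal(s)]

# Conflict-equivalent: the writes of B, the only pairwise conflicting
# actions here, must stay in the order w1 < w2 < w3.
conflict = [s for s in schedules
            if s.index("w1") < s.index("w2") < s.index("w3")]

# View-equivalent: every read of A reads the initial value regardless,
# so only the final write of B matters --- w3 must come last.
view = [s for s in schedules if s[-1] == "w3"]

print(len(conflict), len(view))  # 15 30
```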
T3 ----> T2
At the next step, T2 must wait for T1, which has a lock on C. The waits-for graph is:
T3 ----> T2 ----> T1
Next, w4(A) causes T4 to wait for T1. There is still no cycle, and the waits-for graph is:
T3 ----> T2 ----> T1 <---- T4
Finally, T1 tries to write D. It cannot get the exclusive lock on D that it needs, because T3 holds a shared lock on D. If we allowed T1 to wait, the graph would be:
T3 ----> T2 ----> T1 <---- T4
 ^                |
 |________________|
Thus, T1 must abort and relinquish its locks. At that time, T2 and T4 can each get the locks they need. When T2 finishes, T3 can proceed.
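The scheduler's decision can be sketched as a cycle test on the waits-for graph. Below is a minimal Python illustration (our own, with edges written as (waiter, holder) pairs) showing that the graph of this example is acyclic until T1 is allowed to wait for T3:

```python
def has_cycle(edges):
    """Detect a cycle in a waits-for graph given as (waiter, holder) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)

    def reachable(start, target, seen):
        # DFS: can we get from `start` back to `target`?
        for nxt in graph.get(start, []):
            if nxt == target or (nxt not in seen and
                                 reachable(nxt, target, seen | {nxt})):
                return True
        return False

    return any(reachable(a, a, {a}) for a in graph)

# Waits-for edges from the example: T3 ----> T2 ----> T1 <---- T4.
edges = [("T3", "T2"), ("T2", "T1"), ("T4", "T1")]
print(has_cycle(edges))                    # False: no deadlock yet

# Letting T1 wait for T3's lock on D would close the cycle, so the
# scheduler must abort T1 instead.
print(has_cycle(edges + [("T1", "T3")]))   # True
```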
Initially, T1 locks A and T2 locks B and then requests a lock on A. T2 therefore waits for T1. Meanwhile, T3 starts and requests a lock on B. Thus, T3 waits for T2, and the waits-for graph is:
T3 ----> T2 ----> T1
When T1 requests a lock on B, it would cause a cycle, so T1 is rolled back. Now, T2 completes and releases its locks. The lock on A is given to T1 and the lock on B is given to T3. Next, T3 requests a lock on A, but has to wait for T1. Also, T4 starts up and requests a lock on B, thus being forced to wait for T3. The waits-for graph is now:
T4 ----> T3 ----> T1
The above graph is just like the first, except T4 and T3 have replaced T3 and T2, respectively. As long as there is a supply of transactions like T2, T3, and T4, they can prevent T1 from finishing ever.
Notice that the transactions in the example above request locks on A and B in different orders. Perhaps forcing transactions to request locks in a fixed order will solve the problem. Indeed, if we follow the policy that when a lock becomes available, it is given to a waiting transaction on a first-come-first-served basis, then any transaction that is waiting for a lock will eventually become the transaction that has been waiting for the lock the longest. At that point, the next time the lock becomes available, it will be granted to that transaction, which thus makes some progress. The above observation is the germ of an inductive proof that requesting locks in order plus first-come-first-served will allow every transaction eventually to complete.
However, if the scheduler is able to give locks to any transaction it wishes, then it is easy to make a transaction starve. Suppose T1 wants a lock on A when some other transaction has that lock. As long as there is always another transaction besides T1 waiting for a lock on A, and the scheduler always chooses to give the lock to some transaction other than T1 when the lock becomes available, T1 never makes any progress.
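The contrast between the two granting policies can be seen in a toy simulation. The single-lock model below is invented for illustration: in each round a fresh rival joins the wait queue, the lock is released once, and the policy picks which waiter receives it.

```python
from collections import deque

def run(policy, rounds=5):
    """Toy single-lock simulation. T1 waits from the start; each round a
    new rival Tk also wants the lock, and `policy` picks which waiter is
    granted it on release. Returns True if T1 ever acquires the lock."""
    waiters = deque(["T1"])
    for k in range(2, rounds + 2):
        waiters.append(f"T{k}")          # a new rival arrives
        winner = policy(waiters)         # holder releases; grant the lock
        waiters.remove(winner)
        if winner == "T1":
            return True
    return False

fcfs = lambda w: w[0]                               # longest waiter first
biased = lambda w: next(t for t in w if t != "T1")  # always bypass T1

print(run(fcfs))    # True: first-come-first-served grants to T1 at once
print(run(biased))  # False: T1 starves while rivals keep arriving
```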
Last, let us consider deadlock resolution by timeout. If the duration before a timeout can vary even slightly, then the example given above for deadlock resolution by detecting cycles will also serve here. That is, even though T2 has been waiting slightly longer for a lock than T1, if we allow that T1 might time out first, then the sequence of events described above could also occur with timeout-based deadlock resolution.
If timeouts occur for any transaction at exactly t seconds from when it first started to wait, then we need a simple modification of the above example. Assume that T1 is faster at requesting its second lock than any of the other transactions. Then, after T1 gets a lock on A and T2 gets a lock on B, T1 next requests its lock on B and starts to wait. Slightly later, T2 requests a lock on A. T3 requests its lock on B just before T1 times out, so when that timeout occurs, T2 completes; its A-lock passes to the restarted T1 and its B-lock to T3. When T1 next requests a lock on B and starts to wait, the cycle can repeat with T3 and a new T4 playing the roles of T2 and T3.
Now, consider the value of creating a copy of R at site i. On the positive side, the queries generated at site i that used to have to be answered remotely, at a cost of 10cqi, can now be answered for cqi, a positive benefit of 9cqi.
However, the additional copy of R has to be updated. The cost is dui for the updates generated at site i itself and U - 10dui for the updates generated at all other sites. Recall that U represents the cost of applying every generated update remotely, so subtracting 10dui leaves exactly the cost of the updates that do not originate at i.
In order for the benefit of creating a copy at i to outweigh the additional update cost, we must have 9cqi > dui + (U - 10dui), that is, 9cqi > U - 9dui, or equivalently, cqi + dui > U/9.
Thus, the optimal selection of sites at which to maintain copies of R is all sites for which the above inequality holds. Roughly, we favor those sites at which there is a lot of activity, but the larger U is (i.e., the greater the update rate), the less likely we are to want to create many copies.
There is one special case: what if the inequality fails for all i; that is, it is not cost effective to create any copies? As long as all the qi's are zero, that's OK. The proverbial ``write-only database'' needs no copies! However, in realistic cases, where queries exist but a heavy update rate makes the inequality fail everywhere, we shall be forced to maintain one copy. The correct choice is whichever i has the largest value of cqi + dui.
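The rule, including the one-copy special case, can be sketched in a few lines. The assumption here, suggested by the factors of 10 in this solution, is that a remote operation costs 10 times a local one; the site numbers and rates below are made up for illustration.

```python
# Sketch of the replication rule. c and d are the costs of a local
# query and local update; q[i] and u[i] are per-site rates (invented).
c, d = 1.0, 2.0
q = {1: 50, 2: 5, 3: 80}
u = {1: 10, 2: 30, 3: 15}

# U: the cost if every generated update to R were applied remotely.
U = sum(10 * d * ui for ui in u.values())

# A copy at site i pays off when the query savings exceed the extra
# update cost: 9*c*q[i] > U - 9*d*u[i], i.e. c*q[i] + d*u[i] > U/9.
copies = [i for i in q if c * q[i] + d * u[i] > U / 9]

# Special case: no site qualifies, but queries exist, so keep one copy
# at the site with the largest value of c*q[i] + d*u[i].
if not copies and any(q.values()):
    copies = [max(q, key=lambda i: c * q[i] + d * u[i])]

print(copies)  # [3]
```

With these numbers U = 1100, so U/9 is about 122 and no site reaches it; the fallback keeps the single copy at site 3, which maximizes cqi + dui.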
In the second phase, the coordinator can send commit messages to the other two sites in either order. There are thus 6 * 2 = 12 possible message sequences.
just as for the shared + increment system discussed in the section. We also cannot allow a shared and exclusive lock at the same time, which tells us:
s + x > n
Likewise, we cannot have an exclusive and increment lock at the same time, so:
x + i > n
Finally, we cannot have a shared and increment lock at the same time, so:
s + i > n
For example, we could require that any type of lock requires a majority. We cannot allow both s and i to be n/2 or less, because of the last inequality. But we could let one of s and i be 1, while the other, and x, are both n.
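A small checker makes it easy to test candidate quorums. Here s, x, and i are the numbers of copies (out of n) needed for shared, exclusive, and increment locks, and the four conditions are the ones stated in this solution:

```python
def valid(s, x, i, n):
    """Check the lock-quorum conditions for n copies of an element."""
    return (2 * x > n and   # two exclusive locks cannot coexist
            s + x > n and   # shared and exclusive conflict
            x + i > n and   # exclusive and increment conflict
            s + i > n)      # shared and increment conflict

n = 10
print(valid(6, 6, 6, n))     # True: every lock type needs a majority
print(valid(1, n, n, n))     # True: s = 1, with x and i both n
print(valid(n, n, 1, n))     # True: symmetrically, i = 1 works too
print(valid(5, n, 5, n))     # False: s and i both n/2 violates s + i > n
```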
To see that this compensation works, we can consider all the possible cases above and check that the file system, after compensation, is the same as it would be had f never been written. If f was overwritten before the compensation, then we surely leave the file system in the same state as if f were never written. If f was not overwritten, then the compensation correctly leaves the file with that name being f', if f' existed, or leaves the file system with no file of that name.
There are two interesting issues. First, in order for there to be a compensating transaction, it seems we must remember both the time at which the installation occurred (so we can tell, when the uninstallation occurs, whether the file f has been overwritten) and the file f' itself, at least until the file is overwritten and the new value is guaranteed never to be uninstalled (i.e., the saga that wrote it has finished). These features are not commonly available.
Second, note that file systems do not support serializability. Thus, it is entirely possible that the uninstalled file f has been read, and its contents have influenced other files or the behavior of the system.
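The case analysis above can be made concrete with a toy model, using a Python dict as the ``file system''; the names and log structure are illustrative only:

```python
# Toy model of install/uninstall compensation for a saga step.
def install(fs, name, f):
    """Write file f under `name`, logging the old value f' (if any)."""
    old = fs.get(name)
    fs[name] = f
    return {"name": name, "installed": f, "old": old}

def uninstall(fs, log):
    """Compensate: if f is still in place, restore f' (or remove the
    name); if f was overwritten later, leave the newer file alone."""
    name = log["name"]
    if fs.get(name) == log["installed"]:   # f not yet overwritten
        if log["old"] is None:
            del fs[name]
        else:
            fs[name] = log["old"]

# Case 1: f replaced an existing f' and is then uninstalled.
fs = {"app": "f-prime"}
log = install(fs, "app", "f")
uninstall(fs, log)
print(fs)   # {'app': 'f-prime'} --- as if f were never written

# Case 2: f was later overwritten; compensation changes nothing.
fs = {}
log = install(fs, "app", "f")
fs["app"] = "g"
uninstall(fs, log)
print(fs)   # {'app': 'g'}
```

In both cases the final state matches a history in which f was never installed, which is exactly the property a compensating transaction must provide.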