Solutions to Sample 3


Problem 2
---------
  (a) The outcomes are, in order, ABORT, IMPOSSIBLE, ABORT, COMMIT, COMMIT, COMMIT, IMPOSSIBLE.
  (b) The outcomes are, in order, BLOCK, COMMIT, ABORT, IMPOSSIBLE.


Problem 3
---------
   The problem statement states that the hash function partitions the relations into "exactly 
   4 parts". It is not clear whether this only means that there are non-zero number of R and S 
   tuples sent to each Ji or whether R and S are in fact split equally into 4 pieces. I am going
   to assume that the latter is the case.

  (a) For the worst case, we assume that the tuples of R and S are ordered in such a way that
      all tuples that hash to a given value are bunched together. In other words, in step 1,
      C1 first sends 1000 tuples one after another to J1, 1000 tuples to J2, and so on. Similarly 
      with step 2. In addition, we assume that the join degrades into a cartesian product at each 
      of the Ji's. Specficially, we assume that the 1000 tuples of R that hash to a given Ji have 
      identical join attribute values and that this value is identical to the join attribute values
      of the 100 tuples of S that hash to the same Ji. Then,
  
      For step 1: Each tuple gets sent from C1 to the Ji's at the rate of 1 every 2 msec, i.e., at
                  the rate of the slower Ji's.  So 2 * 4000 = 8000 msec.

      For step 2: Each tuple of S takes 2 msec to be received, hashed, and checked against the hash
                  table. In addition, because of our assumption, each tuple generates 1000 matches.
                  So an additional 1000 msec to send answers to C2. So 1002 msec per S tuple or 
                  1002 * 400 = 400800 msec overall.

      Hence, 400800 + 8000 = 408800 msec.

  (b) For this case, we assume that the tuples are ordered in such a way that C1 sends tuples in
      round-robin fashion to J1, J2, J3 and J4. In other words, tuples number 1,5,9, etc. hash to 
      J1; tuples 2,6,10, etc. hash to J2; and so on. We assume this for both R and S. In addition, 
      we assume that the result of the join is empty (note that we can assume whatever we want about
      join selectivites); so each probe of the hash table using an S tuple generates no matches. Then,
 
      For step 1: The Ji's can together receive tuples at the rate of 4 * 1/2 = 2 per msec whereas
      C1 can only pump out tuples at 1 per msec. So 1 * 4000 = 4000 msec.
   
      For step 2: Once again, since the answer set is empty, each Ji can receive at 1 tuple per 2 msec
      which means that the Ji's together can receive at 4*0.5 = 2 per msec. However, C1 can only
      send out tuples at 1 per msec. So 1 * 400 = 400 msec.

      Hence, 4000 + 400 = 4400 msec.