From szhan@stanford.edu Wed Jul 12 01:09:55 2000
----------------------------------------------------------

CS245  Summer 2000
 
Assignment 1 Solution
due in class on Monday July 10th
 
State all assumptions.
 
Problem 1.  (40 points)
Consider a 5.25 inch disk with 16 double-surfaced platters rotating at
5280 rpm.  It has a usable capacity of 16 gigabytes (2**34 bytes) stored on
1024 cylinders. Assume 12% of each track is used as overhead.
 
a. What is the burst bandwidth this disk could support reading a
single block from one track?
   
   #Bytes/Track = Capacity / (#cylinders * #platters * 2)
                = 2**34 / (2**10 * 2**4 * 2)
                = 2**19 Bytes
   Time/revolution = 1/5280 minute = 1/88 second

   Suppose a sector has X bytes.
     #sector/track = 2**19 / X
     total time spent per sector+gap =  (1/88 second) / (2**19/X) 
                               = X / (88 * 2**19) second
     time spent per sector = 0.88 * total time for a sector and a gap
                           = X / (100 * 2**19) second
     BB = bytes per sector / time spent per sector
        = X / (X / (100 * 2**19))
        = 100 * 2**19 B/s
        = 100 * 2**19 / 2**20 MB/s
        = 50 MB/s
          -------
b. What is the sustained bandwidth this disk could support reading an
entire track?

    SB = bytes per track / time per revolution
       = 2**19 bytes / (1/88) seconds
       = 2**19 * 88 B/s
       = (2**19 * 88) / (2**20) MB/s
       = 44 MB/s 
        ---------
c. What is the average rotational latency, assuming it is not
necessary to start at the beginning of the track?
 
    Avg. rotational latency = 0.5 * time per revolution
                            = 0.5 * (1/88) seconds
                            = 5.68ms
                             --------
d. Assuming the average seek time is 10 ms, what is the average time
to fetch a 4-Kbyte (2**12 bytes) sector?
 
    Transfer time = (2**12)/(100 * 2**19) seconds = 0.078 ms

    Fetch time = seek time + avg. rotational time + transfer time
               = 10 + 5.68 + 0.078 
               = 15.758 ms
                -----------
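
The arithmetic for parts a through d can be re-derived with a short script.
This is only a sketch: the variable names are illustrative, and it follows
the solution's assumptions that 1 MB = 2**20 bytes and that the 12% overhead
is gap time during which no data passes under the head. Because it does not
round the intermediate terms, the fetch time comes out near 15.76 ms rather
than the 15.758 ms obtained above from rounded terms.

```python
# Sketch of the Problem 1 arithmetic; all names here are illustrative.
CAPACITY = 2**34        # usable bytes
CYLINDERS = 2**10
SURFACES = 16 * 2       # 16 double-surfaced platters
RPM = 5280
OVERHEAD = 0.12         # fraction of each track used as overhead (gaps)
SEEK_MS = 10.0          # average seek time given in part d
SECTOR = 2**12          # 4-Kbyte sector
MB = 2**20              # the solution takes 1 MB = 2**20 bytes

bytes_per_track = CAPACITY // (CYLINDERS * SURFACES)
rev_time = 60 / RPM     # seconds per revolution (1/88 s)

# a. Data is read only during the non-gap 88% of a revolution.
burst_bw = bytes_per_track / (rev_time * (1 - OVERHEAD))   # bytes/second

# b. A whole track, gaps included, takes one full revolution.
sustained_bw = bytes_per_track / rev_time                  # bytes/second

# c. On average the head waits half a revolution.
rot_latency_ms = 0.5 * rev_time * 1000

# d. Seek + rotational latency + transfer of one sector at the burst rate.
transfer_ms = SECTOR / burst_bw * 1000
fetch_ms = SEEK_MS + rot_latency_ms + transfer_ms

print(f"burst     = {burst_bw / MB:.1f} MB/s")
print(f"sustained = {sustained_bw / MB:.1f} MB/s")
print(f"latency   = {rot_latency_ms:.2f} ms")
print(f"fetch     = {fetch_ms:.2f} ms")
```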

Problem 2. (15 points)
Consider the "new" Megatron 747 disk, whose properties are defined
in Examples 2.1 and 2.3 in the textbook.
Suppose that we know that the last I/O request accessed cylinder 2000.

a.  What is the expected (average) number of cylinders that will be
traveled due to the very next I/O request to this disk?
      
    The next I/O request could be for any of the cylinders 1, 2, ..., 8192
    with equal likelihood. The number of cylinders traveled in these cases
    would be 1999, 1998, ..., 1, 0, 1, 2, ..., 6192.
    Hence the average number of cylinders traveled would be
       
          (1999 + ... + 1 + 0 + 1 + ... + 6192)/8192
        = ((1999*2000/2) + (6192*6193/2))/8192
        = 2584.54
          -------
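
As a cross-check, the average can also be computed by brute force over all
target cylinders. This is a minimal sketch with illustrative names, assuming
cylinders are numbered 1 through 8192 as in the solution above.

```python
# Sketch: brute-force check of the expected travel distance from cylinder 2000.
CYLINDERS = 8192   # cylinder count used in the solution above
HEAD = 2000        # cylinder of the last I/O request

# Distance to every equally likely target cylinder 1..8192.
distances = [abs(c - HEAD) for c in range(1, CYLINDERS + 1)]
expected_distance = sum(distances) / CYLINDERS

# Closed form from the solution: two arithmetic series.
closed_form = (1999 * 2000 / 2 + 6192 * 6193 / 2) / CYLINDERS

print(round(expected_distance, 2), round(closed_form, 2))
```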
b.  What is the expected block access time, again given that the
head is on cylinder 2000 initially?
 
      seek time = 1 + 2584.54/500 = 6.17ms
      access time = seek time + avg. rotation time + avg. transfer time 
                  = 6.17 + 7.81 + 0.5 
                  = 14.48ms
                    -------
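
Plugging the average distance into the seek formula is valid because the
formula is linear in the distance, so E[1 + n/500] = 1 + E[n]/500. The sketch
below repeats the calculation. The helper name seek_ms is illustrative, the
7.81 ms and 0.5 ms figures are the ones used in the solution, and, like the
solution, it charges the 1 ms startup even for the single zero-distance case.

```python
# Sketch of the part b arithmetic; seek_ms is an illustrative helper name.
ROTATION_MS = 7.81   # average rotational latency used in the solution
TRANSFER_MS = 0.5    # average transfer time used in the solution

def seek_ms(cylinders_traveled):
    """Seek model from the solution: 1 ms startup plus 1 ms per 500 cylinders."""
    return 1 + cylinders_traveled / 500

expected_distance = 2584.54   # result of part a
seek = seek_ms(expected_distance)
access_ms = seek + ROTATION_MS + TRANSFER_MS

print(f"seek = {seek:.2f} ms, access = {access_ms:.2f} ms")
```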

Problem 3. (15 points)
Suppose that we are scheduling I/O requests for the new Megatron 747 disk
(Examples 2.1, 2.2). Recall that the average seek, latency and transfer
times are 6.5, 7.8, and 0.5 milliseconds.
Initially the head is at cylinder 4000, and then the following requests come in:
   time =  0; request for block on cylinder 4000 arrives
   time =  2; request for block on cylinder 6000 arrives
   time =  12; request for block on cylinder 1000 arrives
   time = 17; request for block on cylinder  7000 arrives
   time = 26; request for block on cylinder  3000 arrives

a. If we use the elevator scheduling algorithm, at what time
is each request serviced completely?
 
   avg. latency + transfer time = 7.8 + 0.5 = 8.3 ms
   seek time = 1 + n/500 ms, where n = number of cylinders traveled

   Request 1: Cylinder 4000

              access time =  8.3ms
                since the disk head is already on cylinder 4000
                no seek is needed.
              completion time = 8.3 ms
                                ------
   Request 2: Cylinder 6000

             access time = seek time + 8.3 
                         = 1 + 2000/500 + 8.3   
                         = 13.3 ms
             Completion time = 8.3 + 13.3 = 21.6 ms
                                            -------
   At this time, Request 4 (cylinder 7000) and Request 3 (cylinder 1000) are
   pending. The disk head continues outward to serve Request 4 (cylinder 7000)
   and defers Request 3 (cylinder 1000) until the reverse pass.
   Request 4: Cylinder 7000

             access time = seek time + 8.3
                         = 1 + 1000/500 + 8.3
                         = 11.3 ms
             completion time = 21.6 + 11.3 = 32.9 ms
                                             -------
   Request 5: Cylinder 3000

            access time = seek time + 8.3
                        = 1 + 4000/500 + 8.3 = 17.3 ms
            completion time = 32.9 + 17.3 = 50.2 ms
                                            -------
   Request 3: Cylinder 1000

            access time = seek time + 8.3
                        = 1 + 2000/500 + 8.3 = 13.3ms
            completion time = 50.2 + 13.3 = 63.5ms
                                            ------
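
The schedule above can be reproduced with a small simulation. This is a
sketch under stated assumptions, not a general disk scheduler: the function
and variable names are illustrative, the head initially sweeps toward higher
cylinder numbers, a scheduling decision is made only when the head becomes
free, and a request on the current cylinder incurs no seek.

```python
# Sketch: elevator scheduling for the five requests of Problem 3.
LATENCY_PLUS_TRANSFER = 8.3   # ms, average latency + transfer time

def seek_ms(n):
    """1 + n/500 ms for n cylinders traveled; no seek when n is 0."""
    return 0.0 if n == 0 else 1 + n / 500

def elevator(requests, head, direction=+1):
    """requests: list of (arrival_ms, cylinder).
    Returns {request number: completion time in ms}."""
    pending = dict(enumerate(requests, start=1))
    now, done = 0.0, {}
    while pending:
        # Requests that have already arrived when the head is free.
        ready = {i: cyl for i, (t, cyl) in pending.items() if t <= now}
        if not ready:
            now = min(t for t, _ in pending.values())   # idle until next arrival
            continue
        # Requests lying in the current direction of travel.
        ahead = {i: c for i, c in ready.items() if (c - head) * direction >= 0}
        if not ahead:
            direction = -direction                      # start the reverse pass
            continue
        nxt = min(ahead, key=lambda i: abs(ahead[i] - head))
        now += seek_ms(abs(ahead[nxt] - head)) + LATENCY_PLUS_TRANSFER
        head = ahead[nxt]
        done[nxt] = round(now, 1)
        del pending[nxt]
    return done

REQUESTS = [(0, 4000), (2, 6000), (12, 1000), (17, 7000), (26, 3000)]
print(elevator(REQUESTS, head=4000))
```

The returned dictionary lists requests in service order, so Request 3 appears
last even though it arrived third.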
b. If we use a first-come-first-served scheduler,
at what time is each request serviced fully?

   Request 1: Cylinder 4000

              access time =  8.3ms
                since the disk head is already on cylinder 4000
                no seek is needed.
              completion time = 8.3 ms
                                ------
   Request 2: Cylinder 6000

             access time = seek time + 8.3 
                         = 1 + 2000/500 + 8.3   
                         = 13.3 ms
             Completion time = 8.3 + 13.3 = 21.6 ms
                                            -------
   Request 3: Cylinder 1000

            access time = seek time + 8.3
                        = 1 + 5000/500 + 8.3 = 19.3ms
            completion time = 21.6 + 19.3 = 40.9 ms
   
   Request 4: Cylinder 7000

             access time = seek time + 8.3
                         = 1 + 6000/500 + 8.3
                         = 21.3 ms
             completion time = 40.9 + 21.3 = 62.2 ms
                                             -------
   Request 5: Cylinder 3000

            access time = seek time + 8.3
                        = 1 + 4000/500 + 8.3 = 17.3 ms
            completion time = 62.2 + 17.3 = 79.5ms
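
For comparison, a first-come-first-served run over the same requests can be
sketched as below. The names are illustrative, and the same assumptions apply
as in the solution: seek time is 1 + n/500 ms, no seek is charged when the
head is already on the requested cylinder, and each access adds 8.3 ms of
latency plus transfer time.

```python
# Sketch: first-come-first-served scheduling for the requests of Problem 3.
LATENCY_PLUS_TRANSFER = 8.3   # ms, average latency + transfer time

def seek_ms(n):
    """1 + n/500 ms for n cylinders traveled; no seek when n is 0."""
    return 0.0 if n == 0 else 1 + n / 500

def fcfs(requests, head):
    """requests: list of (arrival_ms, cylinder) in arrival order.
    Returns the completion time in ms of each request, in that order."""
    now, done = 0.0, []
    for arrival, cylinder in requests:
        now = max(now, arrival)          # wait if the request has not arrived yet
        now += seek_ms(abs(cylinder - head)) + LATENCY_PLUS_TRANSFER
        head = cylinder
        done.append(round(now, 1))
    return done

REQUESTS = [(0, 4000), (2, 6000), (12, 1000), (17, 7000), (26, 3000)]
print(fcfs(REQUESTS, head=4000))
```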
   
   

Problem 4.  (10 points)
Discuss the pros and cons of using fixed-format, fixed-length records.
Give at least 2 points for each and justify the points you give.
  
   (issues involved with complexity of implementation, space usage and
    efficiency, flexibility, etc.)

Problem 5.  (20 points)
You are designing a file system for a medical application.  Each
patient record has 10 fields that always occur (e.g., name, patient
number) and 40 fields that may or may not be relevant or known for a
patient (e.g., number of children given birth to, cholesterol level).
Assume that each of the optional fields is relevant or known for a
particular patient with probability p.  The values for all fields are
a fixed size of 10 bytes.

You are considering two options:
  (i)  A fixed format record.
  (ii) A variable format record where all fields are tagged.
       Each tag is N bytes.

a. What is the expected size of a record for each option?
   Your answer may be a function of p and N.
     
      Fixed format record:  (10+40)*10 = 500Bytes
      Variable format record: 10*(10+N)+40*p*(10+N) Bytes
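
The two size formulas can be written as small functions to try out different
values of p and N. This is a sketch; the function and parameter names are
illustrative, and the defaults are the field counts and the 10-byte field
size from the problem statement.

```python
# Sketch: expected record size under each option of Problem 5.
def fixed_size(required=10, optional=40, field_bytes=10):
    """Fixed format: every field gets a slot whether or not it is used."""
    return (required + optional) * field_bytes

def variable_size(p, tag_bytes, required=10, optional=40, field_bytes=10):
    """Variable format: each stored field carries a tag; an optional field
    is stored with probability p, so optional * p of them are expected."""
    return (required + optional * p) * (field_bytes + tag_bytes)

print(fixed_size())             # 500 bytes
print(variable_size(0.75, 2))   # expected bytes for p = 0.75, N = 2
```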
      
b. Given that N = 2 bytes, for what range of p values is the fixed-format
   option best?
        500 < 10*12+40*p*12 
     -> 500 < 120 + 480*p
     -> (380/480) < p <= 1
     ->  19/24 < p <= 1

   Given that  p=0.75, for what range of N values is the variable-format
   option best?
 
        500 > 10*(10+N) + 40*0.75*(10+N)   
     -> 500 > 100 + 10*N + 300 + 30*N
     -> 100 > 40 * N
     -> 100/40 > N
     -> 2.5 > N
     Therefore, N=1 or N=2
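
Both break-even points follow from setting the two expected sizes equal,
500 = (10 + 40*p) * (10 + N), and solving for one variable. The sketch below
does this with exact fractions; the function names are illustrative, and the
tag size N is assumed to be a whole number of bytes of at least 1.

```python
# Sketch: break-even values for Problem 5b, using exact rational arithmetic.
from fractions import Fraction

FIXED = 500   # bytes, fixed-format record size

def breakeven_p(tag_bytes):
    """p at which the variable format's expected size equals 500 bytes."""
    return (Fraction(FIXED, 10 + tag_bytes) - 10) / 40

def breakeven_n(p):
    """Tag size N at which the variable format's expected size equals 500 bytes."""
    return Fraction(FIXED) / (10 + 40 * Fraction(p)) - 10

# Fixed format is best for p strictly above this threshold (N = 2).
print(breakeven_p(2))

# Variable format is best for N strictly below this threshold (p = 3/4).
limit = breakeven_n(Fraction(3, 4))
print(limit, [n for n in range(1, 10) if n < limit])
```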
 
(c. Think of a third option to group the fixed/optional fields into a
   record and specify your design. ?  Hybrid format...) -> needed?

----------------------------------