Stanford Distributed Systems and Networks Quals
1996 Problem Set
Problem 1 (30 Points)
Let's say you've been given a bunch of wireless network
devices. The wireless devices communicate via radio
frequencies. The radios communicate peer-to-peer only. Multiple
pairs of radios near each other can communicate simultaneously
without much interference, since each conversation may use a
different frequency. The radios synchronize with each other by
scanning frequencies listening for beacons from other radios
that contain information about what frequencies to use to
communicate with them. There is no shared broadcast frequency,
so only one radio at a time can hear another radio. The
frequency a radio uses is not fixed, and in fact it changes
periodically to avoid hogging a particular frequency for too
long or to avoid interference on a frequency already in
use. Radios send data to each other in the form of packets, and
each radio packet has a header that specifies the source and
destination physical network addresses. A radio will ignore an
incoming packet that doesn't have its physical address as the
destination.
Some researchers have already written device drivers
for the radios so that you can run TCP/IP over them in a manner
similar to running TCP/IP over an Ethernet subnet. However, they
have been unable to implement ARP and need your help. ARP (the
Address Resolution Protocol) provides a mechanism for a host to
figure out another host's physical network address given its
Internet (IP) address. This is done via a broadcast on an
Ethernet subnet. The broadcasted request packet contains the IP
address. The response contains the appropriate physical network
address and can be sent directly back (unicast) to the
requesting host. Generally the host whose IP address is
provided is the one to answer the request, but you can change
this if you'd like.
PART A: Without broadcast ability, how would you
implement ARP over this radio network? Explain the motivation
behind your design choices. You can assume the following things:
- Each host knows its own IP address and the physical address
of its radio interface.
- Each host is able to obtain from its radio a list of the
hardware addresses of other radios that are within hearing
range. It cannot by itself obtain addresses that are out of its
range.
- Hosts can forward packets to other hosts.
- All the radios are on the same IP subnet.
- You can't change how the radios operate.
PART B: How would you implement ARP if a host were not
able to obtain a list of the hardware addresses of its
neighbors? For this question assume that all radios have the
hardware address of a particular host, and that all of them are
within range of this particular host. Explain the motivation
behind your design.
Problem 2 (25 Points)
Consider the following two distributed file systems. Each has a
server and some workstation clients of the file server. In both
systems, clients may cache file data on a block-by-block basis
in their main memories.
Cache coherency for file system A:
If a client wants to read data for a file and it does
not have the data cached, then it must execute a read operation
from the server. If the data is cached, then it may or may not
have to reread the data from the server, as follows. The client
will check when it last obtained the file's "last-modified
date" from the server. If it has been over 30 seconds since
the client obtained this information, then it re-requests it
from the server. If the file's last-modified date is more recent
than when the client last refreshed its cached data for the
file, then the client invalidates that data in its cache and
rereads the data from the file server. (The client will also
have to check the last-modified data if it wants to overwrite
only part of the data it has cached for a file, since it may
need to re-read the cached data before modifying it.)
Any data the client writes to a file is cached but is
simultaneously written through to the server's disk. The write
does not complete until the data has safely reached the server's
disk. The server updates the file's last modified data when it
does the write. The server is stateless, in that it does
not keep track of any information about which clients are
caching which files.
Cache coherency for file system B:
The server is stateful, in that it does
keep track of information about which clients are caching which
files. The server implements "multiple readers or single
writer" cache coherency amongst clients using tokens, as
follows. A client with a write token is allowed to read
and write a file in its cache. A client with a read token
is allowed to read a file from its cache. A client without a
token must obtain the correct type of token from the server
before it may cache a file. If a client wishes to read a block
of a file, and the file is not in its cache, the client makes a
read request of the server. The server checks if some client has
a write token for the file. If so, it revokes the client's write
token and asks that client to flush back any modified data it
has cached for the file. The server can then respond to the
client asking to read the file. It gives the reading client a
read token and sends it the data it desires. The client may
continue to read from the data it has cached for the file for as
long as it has the read token for that file.
If a client asks to write a file, and it does not have a
write token, it makes a write request of the server. As above,
the server then checks to see if any client has a write
token. If so it asks that client to flush back its modified data
and revokes its write token. The server also revokes any read
tokens for the file. Clients whose read or write tokens are
revoked must invalidate their cached data for the file. The
client receiving the write token can then read and write the
file in its cache for as long as it has the write token. Data it
modifies is not written from its cache unless the server revokes
its write token or unless the client runs out of room in its
cache and needs to write data back to the server before removing
it from the cache. The server permits multiple read tokens to
exist for a single file. However, only one write token for a
file may exist. While a write token exists, then no read token
may exist for that file.
Compare the following aspects of the two file
systems:
- The amount of cache coherency provided: Will clients always
see the most recent data that has been written for a file
someplace in the system, or is it possible for a client to read
stale data?
- Performance: How do the systems compare in terms of latency
of file I/O operations? Consider the amount of write-sharing of
files between clients in the workload.
- Crash recovery overhead: What must the server do when
recovering from a crash in each of the systems to continue
providing whatever level of cache coherency is provided under
normal operation?
Problem 3 (10 Points)
This question looks at what layer of a system is appropriate for
performing certain tasks. For each of the following cases,
please comment on whether you feel the work is being performed
at the appropriate level of the system and why. If you think the
answer is "it depends," then explain what it depends
upon and why.
Case 1: You've got a cellular telephone and you'd like
to protect your top-secret calls form eavesdroppers. The
cellular telephone has an encrypt button that encrypts data over
the link radio link from the phone to the base station.
Case 2: You're using some radio network devices that
may lose lots packets depending on the amount of radio
interference in the area. To handle this, a radio acknowledges
every packet it receives. If the sending radio doesn't receive
an acknowledgement, it re-sends the packet.
Case 3: You're using the TCP protocol to transfer large
files. TCP guarantees reliable, sequential packet delivery. If
packets are lost due to congestion, the protocol will retransmit
them. Packets have sequence numbers so that they can be
ordered. The application you're using also has a file checksum
that it transfers with the file, so that the destination
application can compare this checksum with the file it
receives. If the checksums don't match, the file may be
corrupted and the application can ask for the file to be
retransmitted.