Threads and Input/Output in the Synthesis Kernel
Henry Massalin and Calton Pu
One-line summary: Repeated applications of Amdahl's law (optimize the
common case) lead to a fast system.
- Principle of Frugality: use the least powerful solution to a given
problem. (Occam's razor for OS.)
- Quaspace: quasi-address space - threads of execution live in these.
They are subspaces of the physical address space, protection done by kernel
with MMU. No virtual memory!
- Quaject: quasi-object - encapsulation of procedures and data.
Threads, hardware abstractions, device servers, etc. Composed of monitors,
queues, schedulers, switches, pumps, and gauges, connected with "quaject
interfacer" and created with "quaject creator" (allocation,
factorization, and optimization).
- Kernel code synthesis: kernel code is dynamically synthesized on a
per-thread basis to be specialized and thus short and fast.
- Synchronization: Synthesis kernel is highly concurrent, so optimize
the common case of synchronization.
- Code Isolation - synthesize code for manipulating parts of data
structures; each thread has its own code for updating its own thread-table
entries, thus no synchronization is needed.
- Procedure Chaining - serializes execution of conflicting threads by
changing return addresses on the stack. E.g., a signal arriving in the middle
of an interrupt handler. (Q - how do you know when something is potentially
- Optimistic Synchronization - similar to RAS, but with actual rollback
instead of atomic commit. Make changes assuming no synchronization is needed;
if a conflict is detected at the end (how??), roll back the changes (how??)
and retry (then what about starvation/perpetual conflict??)
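The optimistic pattern above can be sketched with C11 atomics. This is only an illustration under stated assumptions, not the Synthesis code: `shared_cell`, `optimistic_update`, and `add_one` are hypothetical names, conflict detection is done with a compare-and-swap, "rollback" here just means discarding the privately computed result, and the retry bound is one possible answer to the starvation question rather than something the paper is known to do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared word that several threads update without locks. */
typedef struct {
    atomic_long value;
} shared_cell;

/* Example transformation to apply to the shared value. */
long add_one(long x) { return x + 1; }

/*
 * Optimistically apply f to the cell: compute the new value assuming no
 * interference, then commit only if the cell still holds the value we read.
 * On conflict the tentative result is thrown away and recomputed from the
 * fresh value. Returns false after max_retries conflicts (assumed policy).
 */
bool optimistic_update(shared_cell *c, long (*f)(long),
                       int max_retries, int *conflicts)
{
    assert(c != NULL && f != NULL);
    long seen = atomic_load(&c->value);
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        long wanted = f(seen);            /* tentative, private work */
        /* On failure the CAS stores the current value back into 'seen'. */
        if (atomic_compare_exchange_strong(&c->value, &seen, wanted))
            return true;
        if (conflicts != NULL)
            (*conflicts)++;
    }
    return false;
}
```

In the uncontended case the loop body runs once and costs one load plus one compare-and-swap, which is the common case the notes say the kernel optimizes for.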
- Optimistic queues - optimize for specific context.
- Single-producer/single-consumer: synchronization only needed if queue is
- Multiple-producer/single-consumer: need synchronization across producers
only. Atomic multiple-insertion done by reserving space in queue atomically,
then filling in with data and setting a valid flag for consumer.
- Single-producer/multiple-consumer: similar
- Multiple-producer/multiple-consumer: need full synchronization, so do it
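The multiple-producer/single-consumer scheme described above (reserve space atomically, fill it in, then set a valid flag) might look roughly like the following. It is a sketch assuming C11 atomics and a fixed-size ring; `mpsc_queue`, `mpsc_put`, `mpsc_get`, and `QSIZE` are hypothetical names, and the real Synthesis queues are synthesized 68020 code rather than anything like this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QSIZE 8   /* hypothetical capacity */

typedef struct {
    int           data[QSIZE];
    atomic_bool   valid[QSIZE]; /* set by a producer once a slot is filled */
    atomic_size_t head;         /* next slot to reserve (producers) */
    atomic_size_t tail;         /* next slot to read (single consumer) */
} mpsc_queue;

void mpsc_init(mpsc_queue *q)
{
    assert(q != NULL);
    for (size_t i = 0; i < QSIZE; i++)
        atomic_init(&q->valid[i], false);
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Insert n items as one group. Only the reservation needs to be atomic. */
bool mpsc_put(mpsc_queue *q, const int *items, size_t n)
{
    for (;;) {
        /* Read tail before head so that tail <= start always holds. */
        size_t tail  = atomic_load(&q->tail);
        size_t start = atomic_load(&q->head);
        if (start - tail + n > QSIZE)
            return false;                       /* not enough free slots */
        if (atomic_compare_exchange_weak(&q->head, &start, start + n)) {
            for (size_t i = 0; i < n; i++) {    /* slots are now ours */
                size_t slot = (start + i) % QSIZE;
                q->data[slot] = items[i];
                atomic_store(&q->valid[slot], true);
            }
            return true;
        }
    }
}

/* Single consumer: no synchronization with other consumers is needed. */
bool mpsc_get(mpsc_queue *q, int *out)
{
    size_t tail = atomic_load(&q->tail);
    size_t slot = tail % QSIZE;
    if (!atomic_load(&q->valid[slot]))
        return false;               /* empty, or producer still filling */
    *out = q->data[slot];
    atomic_store(&q->valid[slot], false);
    atomic_store(&q->tail, tail + 1);
    return true;
}
```

Producers contend only on the `head` counter; the copy into the reserved slots proceeds without further coordination, and the consumer treats a slot whose valid flag is still clear as "not ready yet".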
- Threads: threads carry with them synthesized kernel code - custom
I/O (open), interrupt handlers, error trap and signal handlers. User threads
actually execute in the kernel on kernel calls - quaspaces used to isolate
them. Waiting threads are procedure chained to the appropriate event code
(such as tty driver queue) instead of having a general blocked code that needs
traversal. (Interaction with priorities??)
Context switches - don't save floating point registers unless thread
actually uses them. (Illegal instruction traps to detect this, restart context
switch code in this case.) Lack of a thread dispatcher - chain together
threads directly in a circular queue, using procedure chaining on thread's
synthesized context switch procedure. Vector tables associated with threads
for interrupt handlers.
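A user-level simulation of the lazy floating-point save and the dispatcher-less circular queue, to make the idea concrete. Everything here is an assumption for illustration: `cpu_state`, `tte`, `fp_first_use_trap`, and `context_switch` are hypothetical names, the register counts are made up, and a `uses_fp` flag stands in for what Synthesis does by resynthesizing the thread's own context-switch code after the trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NINT 4   /* hypothetical register counts for the simulation */
#define NFP  2

typedef struct {
    long   r[NINT];
    double f[NFP];
} cpu_state;

typedef struct tte {
    long        saved_r[NINT];
    double      saved_f[NFP];
    bool        uses_fp;   /* set on the thread's first FP instruction */
    struct tte *next;      /* circular ready queue, no central dispatcher */
} tte;

/* Stands in for the illegal-instruction trap taken on first FP use. */
void fp_first_use_trap(tte *t)
{
    assert(t != NULL);
    t->uses_fp = true;
}

/* Switch from cur directly to cur->next; returns the thread now running. */
tte *context_switch(cpu_state *cpu, tte *cur)
{
    assert(cpu != NULL && cur != NULL && cur->next != NULL);

    memcpy(cur->saved_r, cpu->r, sizeof cpu->r);
    if (cur->uses_fp)                         /* skip FP save otherwise */
        memcpy(cur->saved_f, cpu->f, sizeof cpu->f);

    tte *next = cur->next;
    memcpy(cpu->r, next->saved_r, sizeof cpu->r);
    if (next->uses_fp)                        /* skip FP restore otherwise */
        memcpy(cpu->f, next->saved_f, sizeof cpu->f);
    return next;
}
```

Threads that never touch the FPU pay only for the integer registers, and the switch goes straight to the next entry in the ring instead of returning to a scheduler loop.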
Signals and interrupts - alter the thread's TTE so that the thread jumps to
the signal handler when reactivated. Error traps - must be synchronous, so actually have
error trap handler copy a kernel stack frame onto the user stack, modify
return address of kernel stack to point to user error signal procedure, and
execute a return from exception.
Scheduling is done by changing the CPU quantum assigned to a thread, and
reordering the ready queue (the circular TTE queue).
- I/O: Data flows from hardware devices to quaspaces along a logical
channel called a "stream". Multiple stages of device servers and filters are
connected with this abstraction, which is a highly-optimized producer/consumer
using optimistic queues.
Signals and interrupt handling - synthesized interrupt handlers that run in
the context of the currently executing thread (!). Signals are delayed using
procedure chaining.
- passive consumer, active producer (or vice versa): procedure calls, no
- both passive: need pump quaject
- both active: optimistic queue to mediate.
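The both-passive case can be illustrated with a small sketch: neither endpoint initiates a transfer, so a pump supplies the control flow by pulling from the producer and pushing to the consumer. All names here (`get_fn`, `put_fn`, `pump`, `string_source`, `buffer_sink`) are hypothetical, and in Synthesis the pump is a quaject with its own thread rather than a plain function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Passive producer: hands out one byte when asked; false means drained. */
typedef bool (*get_fn)(void *src, char *out);
/* Passive consumer: accepts one byte when called. */
typedef void (*put_fn)(void *dst, char c);

/* The pump is the only active party; returns the number of bytes moved. */
size_t pump(get_fn get, void *src, put_fn put, void *dst)
{
    assert(get != NULL && put != NULL);
    size_t moved = 0;
    char c;
    while (get(src, &c)) {
        put(dst, c);
        moved++;
    }
    return moved;
}

/* Example endpoints for the sketch. */
typedef struct { const char *text; size_t pos; } string_source;
typedef struct { char buf[16]; size_t len; } buffer_sink;

bool string_get(void *src, char *out)
{
    string_source *s = src;
    if (s->text[s->pos] == '\0')
        return false;
    *out = s->text[s->pos++];
    return true;
}

void buffer_put(void *dst, char c)
{
    buffer_sink *b = dst;
    if (b->len < sizeof b->buf)     /* silently drop on overflow */
        b->buf[b->len++] = c;
}
```

When one side is already active, it can call the other side's `get` or `put` directly, which is the plain procedure-call case in the list above.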
- Measurements: implemented a SunOS emulator on top of Synthesis and
compared real SunOS with the emulated SunOS. Synthesis was much faster (big
surprise). Synthesis had a much smaller kernel (another big surprise) except
when many threads were running.
Relevance: This paper proves that Amdahl's law can be taken to an
- Kernel optimized for low-memory, few processes - not relevant for modern
workstations or PCs.
- Comparison with SunOS is unfair, as SunOS can do much more than Synthesis
(e.g. virtual memory...)
- Kernel written in 680x0 assembly as "fast prototyping language" - poor
methodology, not at all a prototyping language. Makes for a very delicate
- This paper was written in a world unto itself - look at references