next up previous
Next: Stanford University Infrastructure Up: Projects Previous: TDS Health Care System

Digital's Virtual Memory System (VMS)

VMS ([#!IB-G874037!#], [#!IB-86414!#], [#!IB-G874038!#]) is an operating system build for the (Virtual Address eXtension) VAX instruction set hardware in the late 1970s. The operating system had a strong following in the 1980s with a decline of usage in the 1990s. The original team was very small, only a dozen or so very experienced individuals. The team quickly grew to several hundred, and remained at that level for over a decade. For performance reasons, the product was built in highly-structured assembly code. VMS was materialized by tens of millions of lines of code and took days to compile. VMS was built to be the leading-edge technology operating system targeted at leading-edge technology customers. People would say, ``Digital was an engineering company, run by engineers, for engineers.'' One only had to make a technology argument to add a new feature to VMS. The Digital engineers built the system for themselves and then shared the system, for a fee, with others.

VMS was one of the first major operating systems to support virtual memory, 32 bit address space, 128 bit floating point number precision, and a complex instruction set.

The software engineering methodology was best-first similar to the WaterSluice. The requirement committee would prioritize requirements and establish a list of requirements to be placed in the next version. For the most part, this list would be placed under change control. The architecture group, the small core of original developers, would establish the design and high level development plans.

The engineering teams would implement the architecture in a two week cycle. One week, called the red week, new development would be added to the code base. On Friday, the code would be frozen and over the weekend the regression suites would test the new features and make sure that already established features would not break. The next week, the blue week, the engineering teams concentrated on bug fixes. This short cycle consisting of build-a-week, followed by test-a-week, would quickly converge to the next version of the operating system.

A large regression test suite was maintained. Not only testing for operating system features, but the regression testing of all applications, numbering several hundred, were also included. One of the best indications of a stable operating system are stable applications on top of the operating system.

The engineering team would always be using their current build. Several months before customer alpha release, the corporation would be placed on the new release. An alpha customer would get a product that has been deployed for several months to thousands of machines. The beta release would include bug fixes to the alpha release. By the time the new version was released it was already relatively high in quality.

If you include development and internal testing, both regressing testing, and internal operational testing, each year thousands of person years were involved in VMS releases and development.

The version release of the product reflected the weekly two phase cycle. Versions ending in even numbers, 1.0, 1.2, 1.4, etc., had new features while versions ending in odd numbers, 1.1, 1.3, 1.5, etc., had bug fixes. The odd versions actually included the new code for the next even version, but this code was disabled or running in shadow mode only. Just the presence of new code, even disabled, would introduced bugs associated with memory management, locking, and race conditions.

There were several decisions which eventually lead to the decrease in popularity of VMS. First, the VAX instruction set was a member of the complex instruction set families. In the late 1970s, instruction sets were getting more and more functionality. This made the compilers easy to write. A compiler was not much more then a pattern matcher with rewrite rules. As compilers got smarter, this allowed for instruction sets to become simpler, and a new family, the RISC, of simple instruction sets was introduced. This was welcome news to the hardware makers because it let the hardware developers concentrate on speed and performance and let the compiler handle complexity of translating algorithms and data structures into sequences of simple instructions.

Second, VMS was written in VAX assembly language. This made translating the millions of line of code a daunting task.

Third, architecture design of VMS had a major flaw. An operating system needs to lock critical sections. In fact, the correctness of just about every line of code is highly influenced by correct locking. VMS used a highly non-standard feature of interrupt priority levels to get the affect of locking. There were only 32 such levels, leading to only 32 major locks. The physical lock table was only 32 by 32 while the effective lock table was more like thousands by thousands. Every lock was highly overloaded in meaning. This did not scale and made VMS more complex. After a decade of development, VMS was moved to the alpha chip set, but only after the alpha chip set was modified to support interrupt priority levels. Had the original team separated out the locking mechanism to a more general architecture, this dependency could have been avoided.

Fourth, only a few people understood the total picture of the locking scheme. The effective lock table for VMS was in their heads. This knowledge was never really captured in writing and really reflected years of experience of working with the architecture of VMS.

See tables 7.5 and 7.6 starting on page [*].

Table 7.5: Survey Part 1: Basic Properties VMS
Question Response
Analysis? none..poor..fair.. good..great
Design? none..poor..fair.. good..great
Implementation? none..poor..fair.. good..great
Testing? none..poor..fair..good.. great
Cycles of ADIT? one..few..several.. frequent
Priority? no.. yes
Versions? no.. yes
Change Order Control? no.. yes
Internal Prototype? no.. yes
External Prototype? no.. yes
Alpha Release? no.. yes
Beta Release? no.. yes
Duration? 20 years
Effort? greater than 10,000 person years

Table 7.6: Survey Part 2: Change Control VMS
Question Response
Did a new introduced requirement ever negatively affect accomplished work? never.. seldom..often..frequent
Did an architectural change ever negatively affect accomplished work? never.. seldom..often..frequent
Are new features introduced up to product release? never.. seldom..often..frequent
Is there a dedicated period of quality assurance before the product is released? no.. yes

next up previous
Next: Stanford University Infrastructure Up: Projects Previous: TDS Health Care System
Ronald LeRoi Burback