Spring (operating system)

Spring was an experimental microkernel-based object-oriented operating system developed at Sun Microsystems in the early 1990s. Using technology substantially similar to concepts developed in the Mach kernel, Spring concentrated on providing a richer programming environment supporting multiple inheritance and other features. Spring was also more cleanly separated from the operating systems it would host, divorcing it from its Unix roots and even allowing several operating systems to be run at the same time. Development faded out in the mid-1990s, but several ideas and some code from the project were later reused in the Java programming language libraries.

History

Spring started in a roundabout fashion in 1987, after Sun decided to move to UNIX System V from their earlier BSD Unix-based SunOS. Seeing the need to provide some sort of compatibility for their existing customers, Sun started looking at ways to merge the two systems together. During their licensing discussions, they found AT&T was equally interested in a version of Unix combining the two. Both companies saw this as a good opportunity to bring the system "up to date" as well, and decided to form a joint project to create an object-oriented version of Unix. However, after only a few meetings, the project died.

Sun decided to keep the team together and instead explore a system on the leading edge. In addition to combining Unix flavours, the new system would also be able to run just about any other system as well, and do so in a distributed fashion. The system was first running in a "complete" fashion in 1993, and produced a series of research papers. In 1994 a "research quality" release was made under a non-commercial license, but it is unclear how widely this was used. The team broke up and moved to other projects within Sun, using some of the Spring concepts on a variety of other projects.

Background

The Spring project started soon after the release of Mach 3. In earlier versions Mach was simply a modified version of existing BSD Unix kernels, but in Mach 3 the Unix services were separated out and run as a user-space program like any other, a concept Mach referred to as a server. Data that would normally be private in the kernel under a traditional Unix system was now passed between the servers and user programs using an inter-process communication (IPC) system, ending in ports that both programs held. Mach implemented these ports in the kernel, using virtual memory to move data from program to program, relying on the memory management unit (MMU) and the copy-on-write algorithm to do so with reasonable performance.

In its ultimate development, an OS on Mach would consist of a number of such servers, each handling a specific task. Examples would include the file system or network stack. The operating system server in such a system would be quite small, providing services unique to that OS, and forwarding most other calls to other servers. Since the OS was running on top of a single set of common servers, several OS servers could be run at the same time, allowing a single system to "natively" support DOS, Unix and other operating systems at the same time.

This capability was particularly exciting to companies like IBM, who were already supporting several different systems, and saw Mach as a way to combine these with common underlying code. In fact this was not so easy. Mach made several decisions at a low level that made any system running on it Unix-like to some degree. Most notable was a security system that was modelled on the fairly inflexible inherited model of Unix programs. Additionally the IPC system proved to be a major performance problem, although the nature of this issue didn't become clear until later. The performance was so poor that many commercial projects to port existing operating systems to Mach, notably IBM's Workplace OS, were eventually abandoned.

Rationale

Although Sun was also interested in supporting multiple operating systems, their needs were nowhere near as pressing as those of IBM or Apple. By this point in time they had already moved platforms from their early 68k-based machines to their SPARC-based lineup, and their UNIX System V-based Solaris operating system was taking over from their BSD-based SunOS. Sun's concerns were somewhat more subtle: keeping developers interested in what was really "just another Unix", and allowing their system to scale downwards onto smaller devices such as set-top boxes. A microkernel-based system would be particularly useful in this latter role.

Spring concentrated on "programmability": making the system easier to develop on. The primary addition in this respect was the development of a rich interface definition language (IDL), which exported interfaces with considerably more information than the one used in Mach. In addition to functions and their parameters, Spring's interfaces also included information about what errors could be raised and the namespace they belonged to. Given a proper language, programs, including operating system servers, could import multiple interfaces and combine them as if they were objects native to that language -- notably C++. Some time later the Spring IDL was adopted with minor changes as the CORBA IDL.
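
As a rough illustration of the idea, and not Spring's actual IDL or its C++ binding, the following Python sketch models two interfaces that declare the error they can raise, plus one object that combines both through multiple inheritance. All names here (Readable, Nameable, EndOfData, NamedBuffer) are hypothetical.

```python
from abc import ABC, abstractmethod


class EndOfData(Exception):
    """Hypothetical error declared as part of the Readable interface."""


class Readable(ABC):
    """Interface: read(n) returns up to n bytes or raises EndOfData."""

    @abstractmethod
    def read(self, n: int) -> bytes: ...


class Nameable(ABC):
    """Interface: name() returns a human-readable name."""

    @abstractmethod
    def name(self) -> str: ...


class NamedBuffer(Readable, Nameable):
    """One object implementing both interfaces via multiple inheritance."""

    def __init__(self, label: str, data: bytes) -> None:
        self._label = label
        self._data = data

    def read(self, n: int) -> bytes:
        if not self._data:
            raise EndOfData(self._label)
        chunk, self._data = self._data[:n], self._data[n:]
        return chunk

    def name(self) -> str:
        return self._label


buf = NamedBuffer("greeting", b"hello")
first = buf.read(3)
rest = buf.read(10)
```

A caller that only knows about Readable can use buf without caring that it is also Nameable, which is the kind of interface combination the paragraph above describes.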

Spring also explored a number of specific software advances in file systems, virtual memory and IPC performance. The result was a single Unix-like system with much better reported performance than Mach. Some of these changes are detailed below.

Description

The Sun engineers used non-standard terminology for a number of common components, which makes discussing the system somewhat confusing. For instance, Mach tasks are referred to as domains, ports as doors and the kernel as the nucleus.

The Nucleus

The Spring kernel was divided into two parts: a virtual memory system and the nucleus. Although the nucleus represents only one portion of the Mach kernel, the two are analogous enough to be considered the same thing.

The Spring kernel includes only the most basic functionality and state needed to support user-side applications. Primarily this includes state to maintain lists of running programs (domains) and their threads, as well as the communications links between them (doors).

The Spring kernel is not multi-threaded. Normally this would preclude it from use in realtime settings, but it is not clear that is the case. Normally kernels need to be threaded in order to ensure a long-running task, say disk access, won't tie up the system when a more important call is received. Under Spring the kernel almost immediately hands off the vast majority of requests to the servers, so under this model it is only the servers that, in theory, need to be threaded.

IPC Model

One major difference between Mach and Spring was the IPC system. In Mach, the system was arranged as a set of one-way asynchronous pipes between programs, ports, similar to Unix pipes in concept (which is where it came from). In programming, however, the most common method of communications is the procedure call, or call/return, which Mach did not support directly. Call/return semantics could only be supported via additional code in higher-level libraries based on the underlying ports mechanism, thereby adding complexity.

Spring instead directly supported call/return semantics in the basic communications system. This resulted in a change of terminology from ports in Mach to doors in Spring. Doors were known only to the kernel; programs were handed a "handle" to the door with an identifier that was unique to that program. The system worked similarly to ports for the initial message: messages sent to a door were examined by the nucleus in order to find the target application and translate the door handle, but the nucleus also recorded a small amount of information from the caller in order to be able to return data quickly. This sped up the return by about 40%.

Additionally the Mach model was asynchronous -- the call would return if and when the server had data. This followed the original Unix model of pipes, which allowed other programs to run if the server was busy. However, for a call/return system this has serious drawbacks, because the task scheduler has to run to select the next program to be serviced. Ideally this would be the server being called, but it did not have to be. Under Spring, IPC is synchronous: control is immediately passed to the server without running the scheduler, improving the round-trip time in the common case when the server can immediately return.
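
The door model described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Spring's nucleus: the class and method names (Nucleus, create_door, grant, call) are invented for illustration, domains are plain strings, and a "server" is just a Python callable. It shows only two properties from the text: handles are meaningful per domain, and a call passes control straight to the server and returns its reply.

```python
from typing import Callable, Dict, Tuple


class Nucleus:
    """Toy nucleus: owns all doors and hands out per-domain handles."""

    def __init__(self) -> None:
        self._doors: Dict[int, Callable[[object], object]] = {}
        # (domain, handle) -> door id; a handle only means something
        # inside the domain it was granted to.
        self._handles: Dict[Tuple[str, int], int] = {}
        self._next_door = 0

    def create_door(self, server_entry: Callable[[object], object]) -> int:
        door_id = self._next_door
        self._next_door += 1
        self._doors[door_id] = server_entry
        return door_id

    def grant(self, domain: str, door_id: int) -> int:
        """Give `domain` a handle for the door, numbered locally."""
        handle = sum(1 for (d, _h) in self._handles if d == domain)
        self._handles[(domain, handle)] = door_id
        return handle

    def call(self, domain: str, handle: int, request: object) -> object:
        """Synchronous call/return: translate the handle, run the server
        immediately, and hand its reply back to the caller."""
        door_id = self._handles[(domain, handle)]  # KeyError if not granted
        return self._doors[door_id](request)


nucleus = Nucleus()
upper_door = nucleus.create_door(lambda text: text.upper())
length_door = nucleus.create_door(len)

h_editor = nucleus.grant("editor", upper_door)
nucleus.grant("shell", length_door)
h_shell = nucleus.grant("shell", upper_door)

reply = nucleus.call("editor", h_editor, "ping")
```

The same door ends up behind handle 0 in the "editor" domain and handle 1 in the "shell" domain, so a handle leaked from one domain is of no use in another.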

Under Mach, the virtual memory system, supported by the memory management unit (MMU), was expected to provide a lightweight solution to "copying" data, by simply mapping the same data in memory into the two programs. In reality this solution was not at all efficient, as many MMUs had design features that made this mapping either slow, or simply impossible.

Unlike Mach's one-size-fits-all solution to IPC, Spring used a variety of methods to physically pass data between programs. One of these, the bulk-path, was basically identical to Mach's ports and messages, but in practice the bulk-path was the least common message type. For smaller messages Spring provided the vanilla-path, which directly copied the data from one space to another, something that proved to be faster than memory mapping in the real world for less than 5k of data.

Most interesting of all was the fast-path, which allowed for extremely fast invocations -- at least when running on SPARC-based platforms. The fast-path used a unique "half-trap" to avoid much of the context switching overhead that plagued Mach systems. Instead of saving out all of the processor state, the normal procedure in the case of a trap into the kernel, Spring only saved out the top 16 SPARC registers -- a number that was defined by specific implementation details of the SPARC. The other portions of the register stack were rendered invisible to the receiver using the SPARC's Window Invalid Mask (WIM) register, providing some level of security. The fast-path is considerably more similar to a classic procedure call within a single application, which uses register windows on the SPARC, adding some MMU work to move the context from one program to another.

The fast-path was only available for calls passing simple values that didn't have to be translated, no door references for instance, with up to 16 values in total. Although this would seem to be quite limiting, the fast-path was actually used by the vast majority of calls in Spring, generally over 80% of the calls and about 60% of the returns. Returns often respond with large blocks of data, for instance, a disk block, explaining why the returns more often used the other IPC systems.
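
The choice between the three paths can be summarised as a small decision function. The sketch below is an assumption-laden reading of the text rather than Spring's real logic: it treats integers as "simple values", uses exactly 5 × 1024 bytes as the "5k" cutoff, and sends any message with a door reference down the vanilla-path or bulk-path. The names choose_path and DoorRef are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union

FAST_PATH_MAX_VALUES = 16          # from the text: up to 16 simple values
VANILLA_PATH_MAX_BYTES = 5 * 1024  # "less than 5k"; exact cutoff is assumed


@dataclass
class DoorRef:
    """Stand-in for a door reference, which must be translated."""
    handle: int


Arg = Union[int, bytes, DoorRef]


def choose_path(args: List[Arg]) -> str:
    """Pick an IPC path for a message, following the rules in the text."""
    only_simple = all(isinstance(a, int) for a in args)
    if only_simple and len(args) <= FAST_PATH_MAX_VALUES:
        return "fast-path"
    payload = sum(len(a) for a in args if isinstance(a, bytes))
    if payload < VANILLA_PATH_MAX_BYTES:
        return "vanilla-path"
    return "bulk-path"


small_call = choose_path([1, 2, 3])
disk_block_reply = choose_path([b"\0" * 8192])
```

Under these assumptions a call with three integer arguments takes the fast-path, while an 8 KB disk block coming back in a reply takes the bulk-path, matching the asymmetry between calls and returns noted above.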

On 32-bit SPARC V8 systems, a complete round-trip call using the fast-path took just over 100 instructions, making it many times faster than a typical Mach call. It remains unclear whether or not the fast-path could be implemented on other machines, so the overall performance improvement of Spring is difficult to compare with Mach, which was typically measured on IA-32 systems. Specifically a full syscall took under 20 µsec on a 486DX-50 for existing BSD Unix systems, and 114 µsec under Mach. This led to a performance hit of 50% or more, and doomed most Mach projects. In contrast, Spring boasted an IPC time of only 11 µsec on a SPARCstation 2, using the fast-path.

Virtual Memory

Another key area of improvement in Spring was the implementation of the virtual memory (VM) system, also part of the kernel. Virtual memory is a system that ties together the physical RAM in a machine, the MMU and the disk system to create the illusion that every program on the system has its own block of RAM equal to the maximum the machine can support. For the majority of recent history that has been 2 or 4 GB (31 or 32 bits’ worth), and only recently has it been practical to buy a machine with that much physical RAM. The VM system creates the illusion of more by using the hard disk as a backing store, an area of much slower memory used to offload inactive portions of RAM.

In traditional Unix systems VM is a part of the kernel, as are the disk and memory handlers it ties together. Under Mach the decision of where to place the VM system is not so obvious: although the kernel is in control of RAM and the MMU, the disk handlers are part of external client programs. To solve this problem Mach 3 introduced a new two-layer VM system, with control of the actual VM system in the kernel, which would then ask an external client-space pager to interact with the disk system to physically copy memory around. Unfortunately this proved to be a serious performance issue, requiring several trips in and out of the kernel (and the context switches along with them) as the various layers of the VM system called each other.

Spring had the advantage of being able to examine what went wrong with the Mach model and fix it. The result was a much more cleanly separated system of address spaces in programs, mapped by the VM into various memory objects, which were in turn managed by a pager for backing store handling. When a program made a request for data the request was passed to the VM system in the kernel, which would find the appropriate pager and ask it to create and set up an appropriate memory object. In exchange the pager was passed a cache manager from the VM, which was responsible for keeping track of clean/dirty status of the local cache of that memory object. Implementation details added considerable complexity to this model, but most of this was hidden. In the end the basic system had pagers that were in charge of the memory, and address spaces which were in charge of the caches. The two had well defined interfaces allowing them to pass commands back and forth to keep their data in sync.

This split in duties led to one very real performance improvement. Since programs could share the memory objects, and microkernel systems like Spring are based on the idea of copying memory around, Spring allowed programs sharing memory in this fashion to share it in the VM system as well. Thus under Mach if a network file server is handing data to a program both programs will end up using up memory in the VM system, whereas under Spring the two would naturally share the same memory objects, as the pager implementing that memory object would simply return another handle to the same memory. Only inside the VM would they be considered different objects, and would be handled by separate cache managers. Therefore the data would only be cached in RAM once. In theory this could lead to considerably better real-world RAM usage.
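
The sharing behaviour can be illustrated with a toy pager that returns the same memory object whenever two address spaces ask for the same backing data. This is only a sketch of the idea: the class names and the bind/map methods are invented here, and the real Spring interfaces (including cache managers) are considerably richer.

```python
from typing import Dict


class MemoryObject:
    """A chunk of backing data that address spaces can map."""

    def __init__(self, name: str, data: bytes) -> None:
        self.name = name
        self.data = data


class Pager:
    """Toy pager: hands out one memory object per backing file."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files
        self._objects: Dict[str, MemoryObject] = {}
        self.loads = 0  # times data was fetched from backing store

    def bind(self, name: str) -> MemoryObject:
        if name not in self._objects:
            self.loads += 1
            self._objects[name] = MemoryObject(name, self._files[name])
        return self._objects[name]


class AddressSpace:
    """Toy address space: maps addresses to memory objects."""

    def __init__(self) -> None:
        self.mappings: Dict[int, MemoryObject] = {}

    def map(self, addr: int, obj: MemoryObject) -> None:
        self.mappings[addr] = obj


pager = Pager({"/etc/motd": b"welcome"})
file_server_space = AddressSpace()
client_space = AddressSpace()

file_server_space.map(0x1000, pager.bind("/etc/motd"))
client_space.map(0x8000, pager.bind("/etc/motd"))

shared = file_server_space.mappings[0x1000] is client_space.mappings[0x8000]
```

Both address spaces end up pointing at one object, and the backing data is loaded once, which is the RAM saving the paragraph above describes.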

Additionally, the use of external pagers with a well defined API allowed the system to be cleanly separated when this was needed. Spring also allowed programs themselves to state which pager would be best suited to their needs, including themselves, allowing Spring programs to easily implement private VM systems for known workloads. For applications like file servers, web servers and database management systems, custom VMs and file systems often lead to dramatically improved performance.

Name Service

Most operating systems include a variety of naming services. The most basic example is a file system, in which the files are internally referred to by a "handle", a small number, while a separate directory gives the files names that the users interact with. The same name/identifier split is true for many other parts of the typical Unix system: printers are named in the /etc/printcap file, small numbers and strings in the environment variables, and network locations in DNS. Each of these systems provided its own names, with a custom API, making the different objects appear completely different even in concept.

Other systems had attempted to add naming systems to existing Unix systems, but generally these were "covers" over the existing functionality that simply collected up all the names from these various services and presented them in one collection. Because they relied on knowing about the underlying system layout, they tended to be rather inflexible, not making it easy for new services to be added. These seem to have seen little use.

Only in a completely new operating system could one hope to provide a universal service. For instance, Plan 9 used the file system as a universal naming service; everything from printers to windows could be accessed by name through the file system. This is an extension of the original Unix concepts, one that had slowly disappeared as more and more functionality had been added over the years.

Interestingly Mach did not have a naming service of any sort for its ports. This proved to be a serious problem, because programs had to know in advance what servers they had to call in order to ask the kernel to provide a port. This meant that replacing functionality was much more difficult than it should have been; a new printer server needed to sit on the same ports as the old one for instance: there would be no way to run two side-by-side for development. If ports were instead referred to by name, servers could sit on different ports and simply use the same name. Needless to say, the addition of a name server was considered highly important under Spring.

Spring's approach essentially inverted the Plan 9 system: under Spring the file system was one example of a server that used the single unified name service. The same service could be used to name files on disk, environment variables, hardware devices, programs and even objects inside programs. The system was hierarchical; only the system namespace was directly supported, by a server that started at boot time. Other servers would then "bind" the names they knew into the system: the printer server would produce a list of printers, and the file system would bind in the directories of attached disks. In this way a mapping of all the objects on the system was built up, potentially at runtime, and could be accessed in a file-like fashion very similar to Plan 9. All of these could be accessed using a single API, although the system also provided a variety of stub libraries to make it appear as classical services as well, notably in the Unix emulation server.
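
A hierarchical name service of this kind can be sketched as nested tables of bindings. The following is a minimal model, assuming slash-separated paths and using strings as stand-ins for doors; the Context class and its bind and resolve methods are illustrative names, not Spring's API.

```python
from typing import Dict


class Context:
    """A group of names; contexts can themselves be bound by name."""

    def __init__(self) -> None:
        self._bindings: Dict[str, object] = {}

    def bind(self, name: str, obj: object) -> None:
        self._bindings[name] = obj

    def resolve(self, path: str) -> object:
        """Walk a slash-separated path one component at a time."""
        current: object = self
        for part in path.strip("/").split("/"):
            if not isinstance(current, Context):
                raise KeyError(path)
            current = current._bindings[part]
        return current


root = Context()  # stands in for the system namespace started at boot

# A printer server binds the printers it knows about.
printers = Context()
printers.bind("laser", "printer-door-1")
root.bind("printers", printers)

# A file system binds the directories of an attached disk.
disk = Context()
disk.bind("motd", "file-door-7")
root.bind("fs", disk)

# A user-built context can collect otherwise unrelated objects.
mine = Context()
mine.bind("favourite-printer", root.resolve("/printers/laser"))
mine.bind("notes", root.resolve("/fs/motd"))
root.bind("mine", mine)

found = root.resolve("/mine/notes")
```

Printers and files are reached through the same resolve call, and the same object can appear under more than one name.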

The name service was also the central location for security and permissioning. Since doors, the real accessors in Spring, were handed out by the name service, the server included a complete access control list-based permission checking system. So in addition to providing permissions on the file system, under Spring any object could be controlled using the same set of permissions and user interface. Contrast this with Windows NT for instance, which includes about a dozen permissioning systems (file system, DCOM, SQL access, IIS, etc.), all of which have to be set up separately. In order to improve performance, the system included the concept of trust, allowing nameservers to assume requests from other servers were valid. For instance, if a user asked the file server to access a file, the system nameserver would pass along the request to the file system, which would immediately honor it. However, since the user was not known, the ACLs would be checked against the file being accessed.
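
Because doors are obtained through the name service, a permission check at resolve time covers every kind of object. The sketch below shows that single choke point in the simplest possible form. It is an assumption-heavy illustration: principals are plain strings, an ACL is just a set of allowed principals, the trust mechanism between servers is not modelled, and GuardedNameServer and AccessDenied are hypothetical names.

```python
from typing import Dict, Set, Tuple


class AccessDenied(Exception):
    """Raised when a principal is not on the ACL for a name."""


class GuardedNameServer:
    """Toy name server that checks an ACL before handing out a door."""

    def __init__(self) -> None:
        # name -> (door, principals allowed to obtain it)
        self._entries: Dict[str, Tuple[object, Set[str]]] = {}

    def bind(self, name: str, door: object, allowed: Set[str]) -> None:
        self._entries[name] = (door, set(allowed))

    def resolve(self, name: str, principal: str) -> object:
        door, allowed = self._entries[name]
        if principal not in allowed:
            raise AccessDenied(f"{principal} may not access {name}")
        return door


names = GuardedNameServer()
names.bind("/printers/laser", "printer-door-1", {"alice", "bob"})
names.bind("/fs/payroll", "file-door-9", {"alice"})

alice_door = names.resolve("/fs/payroll", "alice")
```

The same resolve call and the same ACL format protect a printer and a file alike, which is the uniformity the paragraph above contrasts with per-subsystem permission schemes.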

Groups of related names were known as contexts. Contexts were also names, and thus similar to the file system concept of a directory. Users could build their own contexts out of seemingly unrelated objects; printers using completely separate drivers (servers) could be collected into a single list, a file could have different names in different places (or for different users), or more interestingly a single domain could be built up containing every personal file in it for searching purposes. In this manner Spring allowed file directories to be "unioned", a useful feature lacking from traditional Unixen.

Spring did not include a built-in object persistence system, but the name service was persistent and could be used to find objects in this sort of manner. To some degree the series of servers started during boot time provided a persistent name space that survived boots, as they copied their names into the same server. In theory the system could allow the name server to provide a "lazy launch" system, not starting the networking server until someone requests it for instance, but it does not appear it included this functionality. In fact the separation of name spaces would allow this to be separated out to the service that actually implemented the naming of doors, making implementation considerably easier.

File System

Recall that the Spring VM allowed any program to define what pager it should use. Additionally the Spring system was based on a single universal naming system. These two concepts were combined to produce the Spring file system.

Key to the Spring file system's operation was tight integration with the VM system. Since it was "known" that the VM system would be managing the local cache of the data from the file system, the file system was reduced to a command structure only, and was its own pager. That is, the file system was responsible for loading and saving data from memory objects when needed, but caching of that data would be handled for it by the VM. As mentioned before, this means that under Spring a file only exists in RAM in one place, no matter how it is being shared by the programs in the system.

Spring used two sorts of file systems: a local file system which was similar to most common Unix systems, and a caching file system (CFS) for network devices. The caching system demonstrates the utility of Spring's VM/pager split: using the same physical memory from the VM that it would normally have to use, the CFS short-circuited all read requests to the local cache and did lazy write-backs every 30 seconds to the source file system. This would be particularly notable if common Unix directories were being loaded over the network, the normal setup for labs of workstations. Most Unix systems use similar caching mechanisms for the same performance reasons, but would end up using RAM twice, once in the cache, and again in the programs using it. The CFS also cached names from the remote system, making the initial directory traversal and open requests much faster.
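
The read short-circuit and lazy write-back behaviour can be modelled with a small cache in front of a "remote" dictionary. This is a sketch under stated assumptions, not the CFS implementation: the CachingFS class, its tick method and the explicit clock value passed to it are invented for illustration, and only the 30-second interval comes from the text.

```python
from typing import Dict, Set

WRITE_BACK_INTERVAL = 30.0  # seconds, from the text


class CachingFS:
    """Toy caching file system in front of a remote source."""

    def __init__(self, remote: Dict[str, bytes]) -> None:
        self._remote = remote  # stands in for the source file system
        self._cache: Dict[str, bytes] = {}
        self._dirty: Set[str] = set()
        self._last_flush = 0.0
        self.remote_reads = 0

    def read(self, name: str) -> bytes:
        """Serve from the local cache, fetching remotely only on a miss."""
        if name not in self._cache:
            self.remote_reads += 1
            self._cache[name] = self._remote[name]
        return self._cache[name]

    def write(self, name: str, data: bytes) -> None:
        """Update the cache now; the remote copy is updated lazily."""
        self._cache[name] = data
        self._dirty.add(name)

    def tick(self, now: float) -> None:
        """Flush dirty entries once the write-back interval has elapsed."""
        if now - self._last_flush >= WRITE_BACK_INTERVAL:
            for name in self._dirty:
                self._remote[name] = self._cache[name]
            self._dirty.clear()
            self._last_flush = now


remote = {"/usr/lib/config": b"v1"}
cfs = CachingFS(remote)

cfs.read("/usr/lib/config")
cfs.read("/usr/lib/config")        # second read is served locally
cfs.write("/usr/lib/config", b"v2")
cfs.tick(10.0)                     # too early: remote still holds v1
before_flush = remote["/usr/lib/config"]
cfs.tick(30.0)                     # interval elapsed: dirty data written back
after_flush = remote["/usr/lib/config"]
```

Passing the time in explicitly keeps the example deterministic; a real system would drive the flush from a timer.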

The Spring file system is also a name service context provider, lazily mapping directories from the on-disk structure into new contexts in the name service. These could then be accessed using the universal naming API, or alternately via a Unix emulation library that presented them as a traditional Unix file system.

Note that Spring's use of the term file system is somewhat confusing. In normal usage the term refers to a particular way to physically store files on a disk.

Unix emulation

Spring also needed to support existing Unix applications, the basis of Sun's business. To do this, Spring shipped with two key extensions: a Unix process server that mimicked a full Unix, and a rewrite of the standard libc library called libue that redirected Unix kernel requests to various servers. For instance, a Unix application that required file or network services would be directed to the associated Spring server, while one that wanted to list the currently running programs would be directed to the Unix process server. The process server was also responsible for handling signals, a concept that had no analog under Spring -- nor did it really need one, as signals are essentially an inflexible single-purpose IPC mechanism.
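
The redirection performed by libue can be pictured as a routing table from Unix-style calls to the servers that handle them. The sketch below is purely illustrative: the routing table, the two server functions and libue_call are hypothetical, and nothing here reflects libue's real interface.

```python
from typing import Callable, Dict


def file_server(call: str, *args: object) -> str:
    """Stand-in for the Spring file server."""
    return f"file-server handled {call}"


def process_server(call: str, *args: object) -> str:
    """Stand-in for the Unix process server."""
    return f"process-server handled {call}"


# Hypothetical routing table: which server answers which Unix-style call.
ROUTES: Dict[str, Callable[..., str]] = {
    "open": file_server,
    "read": file_server,
    "getpid": process_server,
    "kill": process_server,  # signals go to the Unix process server
}


def libue_call(call: str, *args: object) -> str:
    """Forward a Unix-style call to the server registered for it."""
    try:
        server = ROUTES[call]
    except KeyError:
        raise NotImplementedError(call) from None
    return server(call, *args)


opened = libue_call("open", "/etc/motd")
signalled = libue_call("kill", 42, 9)
```

An application linked against such a library keeps calling open and kill as before, while the work is done by separate servers.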

Running Unix applications under Spring required that they be re-linked against libue, and the system shipped with the majority of basic Unix utilities and an X11 server pre-rolled. However this process was neither invisible nor guaranteed to work; Spring documents note that "many" applications will run unmodified, but fail to mention what sort of problem areas the developer should expect.

Subcontracts

Although not directly related to Spring per se, the Sun engineers working on the project found that existing mechanisms for supporting different flavors of calls were not well defined. In order to provide a richer interface, they developed the concept of subcontracts.

Other Systems

Sun have added a "Unixified" version of Doors to Solaris.

In the years since the Spring system work ended, work on operating systems in general has essentially ended. With the market quickly stratifying into a Windows and Linux dominated world, there appears to be only niches open to any other system. Additionally the poor performance of Mach 3 seems to have taken the wind out of the sails of many projects.

Nevertheless there have been some newer systems. One in particular, the L4 microkernel, shares a number of features with Spring's kernel. In particular it also uses a synchronous call/return system for IPC, and has a similar VM model. L4 has, so far, concentrated almost solely on the kernel itself, though; there is nothing analogous to Spring's naming service, security model or file system.
