Talk:Microkernel

Quote

Is this quote true?

"However, the total amount of effort in maintaining any kernel code is sufficiently high that the improvements in maintainability generated by a microkernel architecture are marginal."

It seems to me that maintainability trades off with performance. Maintainability does not trade with the microkernel design, unless you're trying to pretzel the code into giving unnaturally good performance.

That quote seems a little NPOV, especially considering I run an OS using the Mach microkernel -- Darwin. While I know Darwin is not pure microkernel (other parts of the kernel share address space with Mach), it does not seem to have an adverse effect on kernel maintainability.

(I would have to say, "no" regarding that quote, having used QNX and written drivers for it. The QNX kernel changes very little from year to year, while new drivers and services are added regularly. For example, FireWire and USB support were added with no kernel changes.

I've recently updated the QNX article to add some material on QNX's technology, which has some subtle design features that have escaped the academic microkernel authors. One thing that's very clear about microkernels - the initial design really matters. If you get that wrong, the project never recovers. If you get it right, you don't have to change the kernel much thereafter. --Nagle 06:45, 6 March 2006 (UTC))

AIX is not microkernel based

AIX is not based on a microkernel. The closest commercial server UNIX to a microkernel is Tru64, and that's EOLed.

There was quite a bit of discussion about this at www.realworldtech.com, and all the big commercial UNIXes (Solaris, HP-UX, AIX, IRIX) are NOT based on microkernels.

Merging with "Kernel" article

A merging of the article "microkernel" with the "kernel" article was proposed because of the possibility of producing "redundant information" between the two. Redundancy did exist at the time of the proposal. Moving the material permanently to "kernel" would have resulted in a redirect from "microkernel" and avoidance of such redundancy.

However, many agreed the microkernel concept is "best discussed as a topic by itself" and warranted "independent discussion". One was hesitant to add material on microkernels to the "kernel" article because it consequently "would seriously weigh down" the latter.

The article originally also had material that slighted microkernels, rather than explaining them. Since then, more descriptive material on microkernels found on the L4 kernel and Mach kernel pages has been transferred to the microkernel article. The material at "Kernel" has been moved to the "microkernel" article, and redundant edits should now be avoided.

The merger has been averted for now, but the article still needs other work.


Article perspective

IMHO, the article seems to be written from a very anti-MK perspective. The system is more stable "in theory" (but not, presumably, in practice), the file abstraction does not work with restartable servers (so therefore having separate servers must be the wrong approach (and getting rid of an abstraction that doesn't work is out of the question)), and you have to use (obviously horribly complicated) database techniques, etc. Not to mention that microkernels "generally underperform" and even "sometimes dramatically". At the least, I think whoever makes that sort of claims should be able to back them up. Really the only thing I would keep is the list of MK-based operating systems. The rest should be wiped out, and the article restarted. But please keep the article, maybe downgrade to a stub, and save the trouble of undeleting it later. magetoo 22:04, 20 November 2005 (UTC)

I'm going to try to rewrite this, a little at a time. I agree it needs work. Please watch and let me know what you think. --Nagle 03:15, 7 March 2006 (UTC)

The original discussion in the article probably deserves being preserved and kept alongside the helpful technical information you're contributing. Note the article's condition from 20 November 2005 is much different than now. --71.161.197.151 03:48, 7 March 2006 (UTC)

I've done some rework; about 2/3 of the original material is still there, although I've added subheads and moved it around. I took out the section on "UNIX programs being constructed out of little components with pipes", which really isn't relevant to the microkernel issue. I left in the negative comments on microkernel performance; although I don't generally agree with them, they are conventional wisdom.

Is there anything from November 2005 that needs to come back?

What I'm trying to do here, as someone who's worked with and on microkernels, is to clarify how we got to where we are, why microkernels have problems, and how those problems can be solved.

The style now seems a little choppy. --Nagle 05:02, 7 March 2006 (UTC)

The Unix commentary was useful for explaining the ideological (as opposed to technical) motivations for microkernels. Its inclusion may have been a result of the article's Mach-bias. I'm going to bring back some of it. --65.19.87.53 05:09, 16 March 2006 (UTC)

Today's rearrangement isn't bad, but now the first paragraph of "Background" needs work. A good introductory background paragraph is needed.

The UNIX pipe issue is interesting. Actually, UNIX pipes are a good example of how not to do interprocess communication. You can make one-way pipelines with them, but that's about it. They're badly matched to client-server communication, because they're unidirectional. Classic microkernel design mistake: "What the application needs is a subroutine call. What the operating system usually provides is an I/O operation". The single most important feature of a microkernel is a good mechanism for local client/server communication.

Pipes in UNIX work the way they do because, in PDP-11 UNIX, often only one program could fit in the machine at a time, and pipes were real files, 4K long, treated as circular buffers. When you only have 256K of memory on the machine, you have to make some compromises.

Have another go at "Background", please. If you're stuck after a few days, I'll take a look at it again. --Nagle 06:27, 16 March 2006 (UTC)
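To illustrate the point above about pipes being a poor fit for client/server calls, here is a minimal sketch in Python. It assumes ordinary one-way pipes as exposed by os.pipe(); the names serve_once and call are hypothetical and exist only for this example. To get subroutine-like behaviour (send a request, wait for the reply), the caller has to set up two pipes and pair them by hand.

```python
import os
import threading


def serve_once(request_fd: int, reply_fd: int) -> None:
    """Hypothetical server: read one request, reply with it upper-cased."""
    request = os.read(request_fd, 4096)
    os.write(reply_fd, request.upper())


def call(message: bytes) -> bytes:
    """Emulate a subroutine-style call on top of one-way pipes."""
    # Each os.pipe() returns (read end, write end) of a one-way channel,
    # so a request/reply exchange needs one pipe per direction.
    req_r, req_w = os.pipe()  # client -> server
    rep_r, rep_w = os.pipe()  # server -> client
    server = threading.Thread(target=serve_once, args=(req_r, rep_w))
    server.start()
    try:
        os.write(req_w, message)      # send the request
        reply = os.read(rep_r, 4096)  # block until the reply arrives
    finally:
        server.join()
        for fd in (req_r, req_w, rep_r, rep_w):
            os.close(fd)
    return reply
```

A microkernel IPC primitive of the send/receive/reply kind would presumably collapse the two pipes and the manual pairing into a single blocking call; this sketch only shows what the pipe-based workaround looks like.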

Minor revert to change QNX-related interprocess communication note from "performing quite favorably" back to "incurring some extra copying costs", for a more Wikipedia:Neutral point of view. Personally, as a QNX programmer, I'd agree with "performing quite favorably", but that's an opinion. --Nagle 08:42, 17 March 2006 (UTC)

The anti-performance bias has been here for a long time now, and I think it ought to be rectified. As was stated, the article seems very anti-MK biased, while MKs like Coyotos are addressing many of the IPC-related performance concerns that people have been fighting with. Also, as a side note, the note in the QNX article about QNX 'proving that microkernels will always be outperformed by monokernels' needs to be removed. QNX does implement good synchronous IPC for real-time applications, but justifying any of this would require a treatment of scheduling strategies. Also, while QNX does strict copying for IPC, L4 and lately Coyotos (which adopted many L4 strategies for performance) use either registers or, in the case of Coyotos, a sort of 'dedicated sharing buffer' whose ownership is passed back and forth, doing this only when a capability is called which requires I/O. In fact, I think most of the references to Mach (which is widely known in the MK community for its 'let's write in everything for everyone' approach, is notorious for security holes, and is even more widely known for its award-winning lack of performance) should be replaced with more information on capability-based MKs and L4. Caps/L4 represent a more accurate view of MK paradigms and the advantages of microkernels than does Mach. Mach was designed to replace monokernels in systems without modification, and as such is practically a monokernel itself. Yes, it deserves mentioning as it is an example of a microkernel; however, because it strays so far from MK theory it should NOT be the prime example. Final note: Singularity may merit a mention, but it uses self-modifying code, which greatly limits its portability (pipelining issues with SMC, and SMC/executable data structures are pretty worthless on RISC), not to mention the fact that, as far as I know, MS never really did anything with it. --24.98.124.237 11:55, 9 December 2006 (UTC) Haplo

Hi, guys. A lot has changed since November 2005, it seems. The fact that my own comments don't even make any sense anymore proves that, I think. I just want to thank you all for improving the article to the point where it's actually pretty interesting to read. Thanks! -- magetoo 15:24, 21 January 2007 (UTC)

As said by 24.98.124.237 in December 2006, there's still a need to discuss MKs based on capability-based addressing. --BMF81 13:02, 16 July 2007 (UTC)

Adding Singularity

Shouldn't Singularity be on the list of microkernels? 7-nov-2005 20:33 CET


I would concur with the addition of Singularity and the merging of the topics until the size of the content dictates otherwise. dru

Online reference

I STILL don't know which template to invoke, so I apologize in advance. This article was directly quoted in another article at the following web page: http://lowendmac.com/musings/05/1214.html --JohnDBuell 17:06, 15 December 2005 (UTC)

Relationship with hypervisors

This article and the hypervisor article ought to say more about each other's subjects. IBM's VM and Xen are really microkernels. (Not sure about Parallels Workstation). Also, what Xen calls "paravirtualization", Parallels calls a "hypercall", and IBM VM calls a "DIAGNOSE code" are all really the same thing, a system call to the operating system below.

Nope, neither VM nor Xen are microkernels, they are hypervisors. And para-virtualization isn't the same as a hypercall (although it tends to employ hypercalls). I have added a brief explanation of the similarities with hypervisors at the beginning (and will remove the nonsense about KeyKOS (neither widely deployed nor ever to my knowledge on IBM mainframes) and VM (not a microkernel)). Heiser 01:56, 6 August 2007 (UTC)

Actually, KeyKOS did run on IBM mainframes (the 370 family).

Loose references to Mach in the article

Hi Guys, this is my first comment in such a discussion list. So, sorry if I did it wrong.

I've noticed that the article has some loose references to the Mach system. Later I discovered that the information from the article about Mach was merged into the Microkernel one. So, probably those references were left out of context.

Regards,

Giulliano

I tried getting them all when I merged, but I'm sure I missed a few. Thanks. --71.241.136.108 07:04, 22 March 2006 (UTC)

The "Microkernel servers" section needs a rewrite in more generic terms. Now the references to Mach are gone, but the Mach approach remains, as if the only one. The description of "ports" is entirely Mach-specific. Not all microkernels do their interprocess communication through the file system; many don't even have a file system inside the kernel. --Nagle 07:55, 22 March 2006 (UTC)

Yes, that section does need that work. In my move, I assumed the introductory material on Mach was the paradigm for all microkernels. --71.241.136.108 16:19, 22 March 2006 (UTC)

Thanks. We may need more "compare and contrast" material. Some elements are common to all microkernels, and some aren't. Then there's the terminology problem. As I mentioned above under "Hypervisors", the same concepts appear in different systems under totally different names. A good goal for Wikipedia articles in areas like that is to talk about the concepts in a unified way, while pointing out the differences between implementations. That's useful to readers. It's tough getting such an overview from vendor literature. --Nagle 18:22, 22 March 2006 (UTC)

I agree. I hope the article is in a form that people would be willing and could easily contribute such material. I don't even think the "Dinosaur book" has good information on microkernels, and there are no textbooks on microkernels. --71.241.136.108 20:27, 22 March 2006 (UTC)

There's Tanenbaum and Woodhull's "Minix 3" book. But really, the best available material on how to design a good microkernel is the documentation for QNX. That's not widely distributed, and it's not heavy on the theory. Getting scheduler and interprocess communication right is absolutely critical to performance. Most of the academic microkernels don't get it right, but QNX does, which is why QNX is used successfully in time-critical real-time systems. Yet about the only reference to this I can give is 'install QNX, start up "help", and look under "synchronous message passing".' I've written a little on this in the QNX article. --Nagle 06:09, 25 March 2006 (UTC)

Edits by User:Uncle G

Uncle G (talk · contribs) has been adding {{citeneeded}} and {{original research}} tags all over the Microkernel article. The first time he did this, I added cites for most of his tags, and removed the ones where you could find a cite by following nearby links. Then he added even more such tags. This I reverted as vandalism. What do others think? Do we need still more citations? --John Nagle 15:35, 11 July 2006 (UTC)

  • Untrue. You added no citations. You simply removed the tags with edit summaries saying that you had added citations when in fact you had not. Here is one such edit where you did this. This article requires citations. It is not exempt from the Wikipedia:Verifiability and Wikipedia:No original research policies any more than any other article is. There is a lot of analysis and discussion in this article that needs to be sourced. That you consider the requesting of cited sources, on an article that cited no sources at all, to be "vandalism" indicates that you have completely the wrong idea. Please familiarize yourself with the policies and with Wikipedia:Cite sources. Uncle G 16:41, 11 July 2006 (UTC)
Even in the diff you quote, I added two citations, as the record shows. I also removed some material there for which I couldn't immediately find a citation. This explicitly answered all your complaints, as the diff makes clear. The article as a whole has dozens of citations. I'm not happy with the "security" section, which does need cites and a rewrite, so I left your tag on that one.
As a specific example of improper use of a {{citationneeded}}, see your edit to the section on "Kernel bloat". The actual numbers on the sizes of various operating systems are given in the article Source lines of code, in a table, and there's a source for that table (Tanenbaum's book) in that article. Even after I'd added that citation, you put a {{citationneeded}} tag back in. You need to follow the links and read the linked articles before inserting tags demanding citations. Thanks. --John Nagle 17:09, 11 July 2006 (UTC)

The references section is still a mess because of the vandalism, but we'll deal with that. --John Nagle 20:42, 11 July 2006 (UTC)

Still, I agree with John Nagle that {{original research}} is unwarranted: there are piles of literature on these topics out there. Therefore I am replacing the last two such occurrences (in sections Performance and Security) with {{citations needed}}. DomQ 09:52, 9 January 2007 (UTC)

Kernel code size

For example Linux 2.6 contains about 2.5 million lines of source code in the kernel (of about 30 million in total), while Windows XP is estimated at twice that.

What about ReactOS? It would be interesting to compare different implementations of the NT kernel. - Sikon 15:57, 28 August 2006 (UTC)

Advantages/disadvantage contradiction?

The article says:

However, part of the system state is lost with the failing server, and it is generally difficult to continue execution of applications, or even of other servers with a fresh copy. For example, if a server responsible for TCP/IP connections is restarted, applications could be told the connection was "lost" and reconnect to the new instance of the server.

That sounds like a contradiction to me. Can anyone explain/correct? -- Beland 19:19, 21 March 2007 (UTC)
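One possible reading of the quoted passage, offered only as a sketch: the restarted server really does lose its state, but an application written to expect a "lost" connection can reconnect and carry on. The Python below simulates that under stated assumptions; NetServer, System, Client and ConnectionLost are all hypothetical names invented for this illustration, not taken from any real microkernel.

```python
import itertools


class ConnectionLost(Exception):
    """Raised when a handle refers to state the current server does not have."""


class NetServer:
    """Hypothetical user-level network server holding per-connection state."""

    _epochs = itertools.count()  # each new instance gets a distinct epoch

    def __init__(self) -> None:
        self.epoch = next(NetServer._epochs)
        self.bytes_sent = {}  # handle -> bytes sent on that connection

    def connect(self):
        handle = (self.epoch, len(self.bytes_sent))
        self.bytes_sent[handle] = 0
        return handle

    def send(self, handle, data: bytes) -> int:
        if handle not in self.bytes_sent:
            raise ConnectionLost(handle)
        self.bytes_sent[handle] += len(data)
        return self.bytes_sent[handle]


class System:
    """Stands in for whatever restarts a failed server."""

    def __init__(self) -> None:
        self.server = NetServer()

    def restart_server(self) -> None:
        self.server = NetServer()  # fresh copy: all connection state is gone


class Client:
    """Application that treats a lost connection as a cue to reconnect."""

    def __init__(self, system: System) -> None:
        self.system = system
        self.handle = system.server.connect()
        self.reconnects = 0

    def send(self, data: bytes) -> int:
        try:
            return self.system.server.send(self.handle, data)
        except ConnectionLost:
            # The old handle is meaningless to the new instance; start over.
            self.handle = self.system.server.connect()
            self.reconnects += 1
            return self.system.server.send(self.handle, data)
```

In this sketch the byte counter starts again from zero after the restart, which is the "state is lost" half of the passage, while the client keeps running, which is the "reconnect to the new instance" half. Whether that is what the article's authors meant is a separate question.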

Image improvement

  • The kernel should be at the bottom of the diagram, since it is the "low level" code.
  • Shouldn't the thick white arrows be labelled "IPC" as well? It's not clear what the dotted black arrow is supposed to connect, as one of its endpoints is floating between objects.
  • There should be more than one server represented.
  • It would be useful to put functional labels on the various boxes or subsections thereof, such as "web browser", "network device server", "filesystem server", "scheduler", "memory management", etc.
  • Adding actual system hardware components as separate boxes "below" the kernel would illustrate that applications cannot talk directly to the hardware, but that their access is mediated by the servers and kernel.

-- Beland 20:36, 21 March 2007 (UTC)

I've replaced the image with a more suitable one, in line with rewriting the initial paragraphs. Heiser 23:41, 5 August 2007 (UTC)

I put the old image farther down in the article, simply because it matches the style of the images in articles such as Monolithic kernel and Exokernel. It would probably be good for them to be standardized- it's easy to tell the difference between structures when you're looking at the diagrams side by side. Ideally, the diagrams on the mentioned articles ought to be changed to look like the new one here, but it hasn't happened yet. 130.101.100.104 (talk) 15:35, 7 December 2007 (UTC)

I'm sorry, but this image adds nothing but confusion.

What is it supposed to show? My interpretation of this image (does anyone read it differently?) is that most data exchange is between the kernel and servers, and there is much less between the servers and "software". And what is "software" anyway? The kernel is software, so are "servers", so what does "software" refer to? Even if "software" was replaced by "applications", the image simply sends the wrong message.

In a reasonably-designed microkernel system (and this includes at least L4, QNX and Integrity) there is in fact very little data transfer between the kernel and anything else. The kernel for the most part simply acts as a message-passing engine, i.e. passes data between user-land address spaces. I suspect that the author of the image had the misconception that I/O happens inside the kernel (thus demonstrating that they don't understand microkernels). Microkernel I/O happens in user-level device drivers (which constitute a subset of the "servers").

The image is technically highly faulty and misleading and should be removed.

-- Heiser (talk) 07:36, 8 December 2007 (UTC)

I undid the change as per reasoning above. FWIW, the diagrams on the monolithic kernel and exokernel pages are pretty nonsensical too.

-- Heiser (talk) 05:06, 11 December 2007 (UTC)

Performance Section

I've recently done some work on the IPC section which had the effect of getting rid of the tags there. Now the only tagged section is performance. However, this needs a more significant overhaul, essentially a re-write IMHO. It contains some outright nonsense. This is probably going to be a couple of full days' work, so I'm not sure when I'll find the time. However, before doing so I wanted to ensure I'm not stepping on someone's toes. -- Gernot Heiser 3:35, 6 June 2007 (UTC)

As threatened I've re-written the performance section. I removed a lot of irrelevant stuff (a microkernel has no device drivers and therefore doesn't page to disk, that's the job of user-level servers; the Unix pipe/sockets stuff is totally irrelevant, and that QNX chooses simplicity over performance isn't relevant here either). Heiser 06:06, 6 August 2007 (UTC)

User 134.96.184.219 has (without discussion) replaced my careful "the proof is still in the pudding" statement by the unsubstantiated claim that QNX and Integrity are "high-performance multi-server systems". I'd like to see proof of that. Just because they are used commercially doesn't mean they are "high-performance" (in the sense that they have essentially the same performance as single-server systems). Furthermore, those systems exhibit very coarse granularity (according to the "Servers" section, QNX has a server for "file systems"). Such coarse granularity proves very little.

To my knowledge of the literature, a demonstration of a high-performance multi-server system is still outstanding. The closest to my knowledge was the IBM SawMill project, but that died before delivering something that could be properly analysed. Heiser 09:45, 11 August 2007 (UTC)

Security

This section, while well written for an editorial or a school paper, is not very neutral or objective. This section reads more like a report which is attempting to convince the reader of the better security and stability of micro vs monolithic. I don't see how this section can be repaired as regrettably the entire section leads to prove the thesis of micro's dominance in the security realm. While it states potential "downsides" to micro, the section quickly dismisses them and then goes on to argue micro strengths at great length.


I disagree with the above (anonymous) comment. The basic argument in the article is right: from the security point of view a microkernel-based system (in Liedtke's sense) is an implementation of the principle of least authority (POLA). I have other problems with that section: it is wordy, convoluted and repetitive, and full of irrelevant stuff. It could/should be written in about 1/4 of the words. However, as it's essentially correct I won't bother trying to re-write it myself. Heiser 22:23, 5 August 2007 (UTC)

I have moved an earlier paragraph discussing security issues into this section, where it makes more sense (and corrected it in the process). The rest of the section should be dramatically cut down to a paragraph or two, but I leave this to someone else. Heiser 06:24, 6 August 2007 (UTC)

Thinking about it some more, I propose to really junk most of this stuff, and instead reference more up-to-date security work (kernels with formal API descriptions and proofs of implementation correctness). Heiser 22:51, 6 August 2007 (UTC)

As there were no comments, I've done as proposed. Heiser 00:54, 10 October 2007 (UTC)

Inter-process communication

I've re-written the last paragraph, as it didn't make too much sense. However, I feel that the references to UNIX sockets and POSIX IPC are completely irrelevant to this page and should be removed. Heiser 01:18, 6 August 2007 (UTC)

I've revised the sync/async discussion and removed the irrelevant Unix/Posix stuff Heiser 17:10, 5 October 2007 (UTC)

Drivers

I've moved this material from Performance (where it didn't fit at all) into a separate section, also removed a lot of irrelevant material. I'm highly sceptical about the claim that MTS had user-level drivers; if someone really thinks this is true they should provide a reference. (The web page linked from the MTS article contains no indication of this. And its credibility isn't very high anyway, given the blatantly incorrect claim that this was the first virtual-memory operating system.)

That dubious claim about user-level drivers in MTS has been tagged for two weeks without a response. Unless someone comes up with some indication of its veracity, I'll remove it, as I strongly suspect that it is wrong. Heiser 04:44, 25 August 2007 (UTC)

Before deleting "irrelevant material", it is helpful to read up on the subject. I've cited the paper on the Michigan Terminal System from the 1972 Spring Joint Computer Conference. "Using this technique it was possible to completely rewrite and install the DSR (Device Service Routine) for magnetic tapes without any time on a dedicated machine; all development was done under regular production MTS without any adverse effects on the rest of the system." I actually have a hard copy of those proceedings, but you could have found that with Google. The key to this was that IBM I/O channels supported the concept of giving an application restricted control of a peripheral. For example, a program might be limited to certain disk cylinders. --John Nagle 07:21, 25 August 2007 (UTC)
All I asked was for backing up that claim (eg the MTS wikipedia page contains nothing about it). Thanks for adding the reference. However, it isn't fair to say that it could be found with Google. The title yes, but not the actual paper -- in fact, I looked around a fair bit and couldn't find it. Probably requires a physical trip to the library. The 1975 paper in Proceedings of the IEEE is easier to obtain, but doesn't explicitly state that drivers are user level (although reading it one suspects that they are). Anyway, thanks for adding this useful info. Heiser 06:42, 26 August 2007 (UTC)

Too much L4?

There's quite a bit of material about L4 in the article, which is a research OS used by almost nobody. L4 is interesting, but not that notable. --John Nagle 16:33, 6 October 2007 (UTC)

Used by almost nobody?
L4 (in the OKL4 version) is commercially supported and deployed, running in tens of millions of mobile phones. It's also used for teaching in many universities (including, it seems, just about every Korean university).
Not noteworthy?
L4 has defined the state of the art in microkernels for the last 15 years, and is continuing to do so. Can you name one innovation in microkernels in the last 15 years that didn't originate in L4?
Heiser 00:35, 7 October 2007 (UTC)
Also, the article says that the design of QNX IPC follows that of L4. Actually, QNX dates from the 1980s, and L4 from the 1990s, so it's the other way round. --John Nagle (talk) 00:59, 10 April 2008 (UTC)
QNX was around longer than L4, but fast IPC was pioneered by Liedtke and L4. At the time, L4's IPC was five times faster than QNX's (see Table 2 in [Liedtke 93], note that it was still called L3 then, but it's really L4). Presumably, QNX has learned from L4 in the 15 years since. The description of QNX IPC in the QNX page (not invoking the scheduler) is exactly the scheme introduced by Liedtke under the name "direct context switch" in the 1993 SOSP paper, as one of several mechanisms to make IPC fast. It was considered novel enough to be accepted in the most prestigious OS conference. Heiser (talk) 01:25, 14 April 2008 (UTC)

OS - small programs called servers

"This allows the operating system to be built from a number of small programs called servers" We need more details on this. Anyone? Amberved (talk) 15:42, 21 March 2008 (UTC)

Cleanup required?

The article was tagged with "cleanup" since July last year. It has undergone massive revisions since. I have now removed the cleanup tag. If someone thinks that further cleanup is required I suggest they are more specific. Heiser 00:58, 10 October 2007 (UTC)

What's "dubious"

I'm removing the "dubious" tag in the performance section, as I don't see what the issue is with this perfectly obvious statement. Heiser (talk) 22:03, 9 April 2008 (UTC)

For one thing, the actual paper cited [1] doesn't say that. In fact, in section 2.1, the cited paper says that "grafting" extra functionality onto a monolithic kernel can be 6 to 80 times worse than L4 interprocess communication. --John Nagle (talk) 00:55, 10 April 2008 (UTC)
For one thing, the paper you link isn't the paper cited in that paragraph of the article. What you linked is a paper from HotOS'97, the one cited is from the Sep'96 issue of CACM.
That paper doesn't explicitly state it either, for the simple reason that it's totally obvious. The explanation is actually given in the article: two mode switches (monolithic kernel) vs four mode switches and two context switches. 2*A is always less than 4*A+2*B for positive A, B. It's so obvious that even Tanenbaum (Modern OS, 3rd ed) doesn't bother to state it explicitly. The closest I could find is the sentence "the main problem with this approach, and with microkernels in general, is the performance hit all the extra context switches cause."
The other part is also obvious: the kernel always has access to all memory, hence a monolithic system can directly access client buffers. In a microkernel system, the client buffer is in a different address space than the server, and not normally accessible.
So I'm really at a loss what's "dubious" here. Whoever put that tag there in the first place obviously has problems with the most basic OS concepts.
Heiser (talk) 01:38, 14 April 2008 (UTC)
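The switch-counting argument above can be written out as a small cost model. This is only a sketch: A stands for the cost of one mode switch and B for one context switch, as in the comment, and the numeric values used are arbitrary placeholders, not measurements of any real system. The function names are invented for this illustration.

```python
def monolithic_call_cost(mode_switch: float) -> float:
    """One system call: enter the kernel and return (2 mode switches)."""
    return 2 * mode_switch


def microkernel_call_cost(mode_switch: float, context_switch: float) -> float:
    """Request to a user-level server and reply back via IPC:
    4 mode switches plus 2 context switches."""
    return 4 * mode_switch + 2 * context_switch


def extra_cost(mode_switch: float, context_switch: float) -> float:
    """Difference between the two; algebraically this is 2*A + 2*B."""
    return (microkernel_call_cost(mode_switch, context_switch)
            - monolithic_call_cost(mode_switch))


# Placeholder costs in arbitrary units (assumptions, not benchmark data).
A, B = 100.0, 400.0
print(monolithic_call_cost(A), microkernel_call_cost(A, B), extra_cost(A, B))
```

Because the difference is 2*A + 2*B, it is positive whenever A and B are positive, which is all the comment claims. The model says nothing about how large A and B are in practice, or about buffer copying, which the next paragraph of the comment treats separately.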

Motivations

I'm just doing some revision for an OS subject, and as far as accessibility goes, I found it hard to work out why you would choose to implement a system with microkernels. After reading through the article it seems that security is the main reason, but it took me a while to work that out and I can't tell if there are any other advantages. Motivations/advantages should at least be mentioned in the introduction and possibly have their own section if there are enough. —Preceding unsigned comment added by 128.250.99.115 (talk) 01:43, 11 June 2008 (UTC)