Operating system-level virtualization

From Wikipedia, the free encyclopedia

Operating system-level virtualization is a server virtualization method in which the kernel of an operating system allows multiple isolated user-space instances instead of just one. Such instances (often called containers, virtualization engines (VE), virtual private servers (VPS), or jails) may look and feel like a real server from the point of view of their owners.

On Unix-like operating systems, this technology can be thought of as an advanced implementation of the standard chroot mechanism. In addition to isolation mechanisms, the kernel often provides resource management features to limit the impact of one container's activities on the other containers.
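
As a rough illustration of the chroot mechanism mentioned above, the following Python sketch runs a command in a child process whose root directory has been changed. The function name run_in_chroot and the use of exit code 127 to signal failure are choices made for this example, not part of any container implementation. The call needs root privileges to succeed, and it confines only the file system view; the process, network, and resource isolation that containers add on top is not shown.

```python
import os
import sys


def run_in_chroot(new_root: str, argv: list[str]) -> int:
    """Run argv in a child process whose root directory is new_root.

    Returns the child's exit code, or 127 if chroot or exec failed
    (for example, when the caller lacks root privileges).
    """
    pid = os.fork()
    if pid == 0:
        # Child: confine the file system view, then replace the process image.
        try:
            os.chroot(new_root)
            os.chdir("/")  # do not keep a working directory outside the new root
            os.execvp(argv[0], argv)  # only returns by raising on failure
        except OSError as exc:
            print(f"chroot/exec failed: {exc}", file=sys.stderr)
        finally:
            # Never let the child fall back into the parent's code path.
            os._exit(127)
    # Parent: wait for the child and translate its wait status.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

Called as root with a directory that holds a minimal userland, for example run_in_chroot("/srv/jail", ["/bin/sh"]), the child sees that directory as /. The path /srv/jail is hypothetical.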

Uses

Operating system-level virtualization is commonly used in virtual hosting environments, where it is useful for securely allocating finite hardware resources among a large number of mutually distrusting users. System administrators may also use it, to a lesser extent, for consolidating server hardware by moving services from separate hosts into containers on a single server.

Other typical scenarios include separating several applications into separate containers for improved security, hardware independence, and added resource management features. The improved security provided by the use of a chroot mechanism, however, is far from ironclad.[1]

OS-level virtualization implementations that are capable of live migration can be used for dynamic load balancing of containers between nodes in a cluster.
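
The load-balancing idea can be sketched as a planning step that decides which container to move. This is a toy greedy heuristic written for illustration only: the function plan_migration, the load figures, and the node and container names are all hypothetical, and the migration itself (checkpointing and restoring the container) is implementation-specific and not shown.

```python
from typing import Optional


def plan_migration(
    nodes: dict[str, dict[str, int]]
) -> Optional[tuple[str, str, str]]:
    """Pick one container to move from the busiest node to the idlest one.

    nodes maps node name -> {container name -> load}. Returns
    (container, source, target), or None if no single move helps.
    """
    if len(nodes) < 2:
        return None
    totals = {name: sum(loads.values()) for name, loads in nodes.items()}
    src = max(totals, key=totals.get)
    dst = min(totals, key=totals.get)
    gap = totals[src] - totals[dst]
    # Moving a container of load x changes the gap to |gap - 2x|,
    # so only containers with 0 < x < gap narrow it.
    candidates = {c: x for c, x in nodes[src].items() if 0 < x < gap}
    if not candidates:
        return None
    # The ideal container has a load of exactly half the gap.
    best = min(candidates, key=lambda c: abs(candidates[c] - gap / 2))
    return best, src, dst


cluster = {
    "n1": {"web": 40, "db": 25, "cache": 5},
    "n2": {"mail": 10},
}
print(plan_migration(cluster))  # ('db', 'n1', 'n2')
```

A real scheduler would also weigh migration cost, memory, and placement constraints; the sketch considers a single load number per container.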

Overhead

This form of virtualization usually imposes little or no overhead, because programs in a virtual partition use the operating system's normal system call interface and do not need to be emulated or run in an intermediate virtual machine, as is the case with whole-system virtualizers (such as VMware ESXi and QEMU) or paravirtualizers (such as Xen and UML). It also does not require hardware assistance to perform efficiently.

Flexibility

Operating system-level virtualization is not as flexible as other virtualization approaches, since it cannot host a guest operating system different from the host's, or a different guest kernel. For example, with Linux, different distributions are fine, but other operating systems such as Windows cannot be hosted. This limitation is partially overcome in Solaris by its branded zones feature, which can run an environment within a container that emulates an older Solaris 8 or 9 version on a Solaris 10 host. (A Linux branded zone was also announced and implemented for some Linux kernels, but was later abandoned.)

Storage

Some operating-system virtualizers provide file-level copy-on-write mechanisms. (Most commonly, a standard file system is shared between partitions, and partitions that change the files automatically create their own copies.) This approach is easier to back up, more space-efficient, and simpler to cache than the block-level copy-on-write schemes common on whole-system virtualizers. Whole-system virtualizers, however, can work with non-native file systems and can create and roll back snapshots of the entire system state.
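
The file-level copy-on-write behaviour described above can be sketched in user space as follows. This is a simplified model, not how any particular virtualizer implements it: the class CowView, its methods, and the directory layout are hypothetical names chosen for the example. Each partition reads from a shared base tree until it modifies a file, at which point it gets a private copy.

```python
import shutil
import tempfile
from pathlib import Path


class CowView:
    """One partition's view: a shared base tree plus a private layer."""

    def __init__(self, base: Path, private: Path) -> None:
        self.base = base
        self.private = private

    def read(self, rel: str) -> str:
        # Prefer the private copy; otherwise fall through to the shared file.
        own = self.private / rel
        return (own if own.exists() else self.base / rel).read_text()

    def append(self, rel: str, text: str) -> None:
        own = self.private / rel
        own.parent.mkdir(parents=True, exist_ok=True)
        # Copy the shared file into the private layer on first modification.
        if not own.exists() and (self.base / rel).exists():
            shutil.copy2(self.base / rel, own)
        with own.open("a") as fh:
            fh.write(text)


def demo() -> tuple[str, str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        base = root / "base"
        base.mkdir()
        (base / "motd").write_text("hello\n")
        a = CowView(base, root / "a")
        b = CowView(base, root / "b")
        a.append("motd", "from a\n")  # only partition a gets a private copy
        return a.read("motd"), b.read("motd"), (base / "motd").read_text()
```

In demo(), partition a sees its appended line, while partition b and the shared base file are unchanged. Unmodified files exist only once on disk, which is where the space saving comes from.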

Implementations

Columns from "File system isolation" onward list features.

| Mechanism | Operating system | License | Available since/between | File system isolation | Copy on Write | Disk quotas | I/O rate limiting | Memory limits | CPU quotas | Network isolation | Partition checkpointing and live migration | Root privilege isolation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chroot | most UNIX-like operating systems | varies by operating system | 1982 | Partial[2] | No | No | No | No | No | No | No | No |
| iCore Virtual Accounts | Windows XP | Proprietary/Freeware | 2008 | Yes | No | Yes | No | No | No | No | No | ? |
| Linux-VServer (security context) | Linux | GNU GPL v.2 | 2001 | Yes | Yes | Yes | Yes[3] | Yes | Yes | Partial[4] | No | Partial[5] |
| LXC | Linux | GNU GPL v.2 | 2008 | Partial[6] | Partial. Yes with Btrfs. | Partial. Yes with LVM or Disk quota. | Yes | Yes | Yes | Yes | No | No[7][8][9] |
| OpenVZ | Linux | GNU GPL v.2 | 2005 | Yes | No | Yes | Yes[10] | Yes | Yes | Yes[11] | Yes | Yes[12] |
| Parallels Virtuozzo Containers | Linux, Windows | Proprietary | 2001 | Yes | Yes | Yes | Yes[13] | Yes | Yes | Yes[11] | Yes | Yes |
| Solaris Containers | Solaris and OpenSolaris | CDDL | 2005 | Yes | Partial. Yes with ZFS | Yes | Partial. Yes with Illumos.[14] | Yes | Yes | Yes[15] | No[16] | Yes[17] |
| FreeBSD Jail | FreeBSD | BSD | 1998 | Yes | Yes (ZFS) | Yes[18] | No | Yes[19] | Yes | Yes | No | Yes[20] |
| sysjail | OpenBSD, NetBSD | BSD | no longer supported as of 03-03-2009 | Yes | No | No | No | No | No | Yes | No | ? |
| WPARs | AIX | Proprietary | 2007 | Yes | No | Yes | Yes | Yes | Yes | Yes[21] | Yes[22] | ? |
| HP-UX Containers (SRP) | HPUX | Proprietary | 2007 | Yes | No | Partial. Yes with logical volumes | Yes | Yes | Yes | Yes | Yes | ? |
| Sandboxie | Windows | Proprietary/Shareware | 2004 | Yes | Yes | No | ? | ? | ? | ? | ? | ? |
| Docker | Linux (using LXC) | Apache License 2.0 | 2013 | Yes | Yes | Not directly | Not directly | Yes | Yes | Yes | No | No |

References

  1. "How to break out of a chroot() jail". 2002. Retrieved 7 May 2013. 
  2. The root user can easily escape from a chroot. Chroot was never intended to be used as a security mechanism.
  3. Utilizing the CFQ scheduler, you get a separate queue per guest.
  4. Networking is based on isolation, not virtualization.
  5. 14 user capabilities are considered safe within a container. The rest cannot be granted to processes within that container without allowing those processes to potentially interfere with things outside that container. Linux-VServer Paper, Secure Capabilities.
  6. Due to the lack of user namespace separation in the Linux kernel, it is currently possible to escape from Linux containers.
  7. The root user inside a container has unrestricted access to anything accessible within the container, including files in the /sys and /proc file systems Ubuntu Wiki LXC Security, Gentoo Wiki LXC.
  8. LXC must be combined with other technologies for mandatory access control to improve isolation between containers.
  9. Ubuntu 12.04 improves isolation using AppArmor on the containers, but does not claim improved security: "the goal of the Apparmor policy is not to stop malicious actions but rather to stop accidental harm of the host by the guest". Ubuntu Server Guide 12.04, Section 5.8 (page 359).
  10. Available since kernel 2.6.18-028stable021. Implementation is based on CFQ disk I/O scheduler, but it is a two-level schema, so I/O priority is not per-process, but rather per-container. See OpenVZ wiki: I/O priorities for VE for details.
  11. Each container can have its own IP addresses, firewall rules, routing tables and so on. Three different networking schemes are possible: route-based, bridge-based, and assigning a real network device (NIC) to a container.
  12. Each container may have root access without possibly affecting other containers.
  13. Available since version 4.0, January 2008.
  14. Pijewski, Bill. "Our ZFS I/O Throttle". 
  15. See OpenSolaris Network Virtualization and Resource Control and Network Virtualization and Resource Control (Crossbow) FAQ for details.
  16. Cold migration (shutdown-move-restart) is implemented.
  17. Non-global zones are restricted so they may not affect other zones via a capability-limiting approach. The global zone may administer the non-global zones. (Oracle Solaris 11.1 Administration, Oracle Solaris Zones, Oracle Solaris 10 Zones and Resource Management E29024.pdf, pages 356--360. Available within archive)
  18. Check the "allow.quotas" option and the "Jails and File Systems" section on the FreeBSD jail man page for details.
  19. "Hierarchical_Resource_Limits - FreeBSD Wiki". Wiki.freebsd.org. 2012-10-27. Retrieved 2014-01-15. 
  20. "3.5. Limiting your program's environment". Freebsd.org. Retrieved 2014-01-15. 
  21. Available since TL 02. See Fix pack information for: WPAR Network Isolation for details.
  22. See Live Application Mobility in AIX 6.1.

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.