OpenVZ
From Wikipedia, the free encyclopedia
| Developer | Community project, supported by SWsoft |
| --- | --- |
| OS | Linux |
| Platform | x86, x86-64, IA-64, PowerPC, SPARC |
| Use | OS-level virtualization |
| License | GNU GPL v2 |
| Website | openvz.org |
OpenVZ is an operating system-level virtualization technology based on the Linux kernel and operating system. OpenVZ allows a physical server to run multiple isolated operating system instances, known as Virtual Private Servers (VPS) or Virtual Environments (VE).
Compared to virtual machine technologies such as VMware and paravirtualization technologies such as Xen, OpenVZ is limited in that it requires both the host and guest OS to be Linux (although different Linux distributions can be used in different VEs). However, OpenVZ claims a performance advantage: according to its website[1], it carries only a 1-3% performance penalty compared to standalone servers.
OpenVZ is the basis of Virtuozzo, a proprietary software product from SWsoft, Inc. OpenVZ itself is licensed under the GPL version 2.
OpenVZ consists of a custom Linux kernel and a set of user-level tools.
Kernel
The OpenVZ kernel is a Linux kernel, modified to add support for OpenVZ Virtual Environments (VE). The modified kernel provides virtualization, isolation, resource management, and checkpointing.
Virtualization and isolation
Each VE is a separate entity, and behaves largely as a physical server would. Each has its own
- Files: system libraries, applications, a virtualized /proc and /sys, virtualized locks, etc.
- Users and groups: each VE has its own root user, as well as other users and groups.
- Process tree: a VE sees only its own processes (starting from init). PIDs are virtualized, so the init PID is 1, as it should be (see the sketch after this list).
- Network: a virtual network device, which allows a VE to have its own IP addresses, as well as its own set of netfilter (iptables) and routing rules.
- Devices: if needed, any VE can be granted access to real devices such as network interfaces, serial ports, disk partitions, etc.
- IPC objects: shared memory, semaphores, messages.
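As a brief sketch of the process isolation (the output below is illustrative, not taken from a real system), a process listing inside a VE shows only that VE's own processes, with init reported as PID 1:

    # Inside a VE, only the VE's own processes are visible,
    # and init appears as PID 1 (hypothetical output):
    $ ps ax
      PID TTY      STAT   TIME COMMAND
        1 ?        Ss     0:00 init [3]
      812 ?        Ss     0:00 syslogd
      843 ?        Ss     0:00 sshd
      901 pts/0    R+     0:00 ps ax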
Resource management
OpenVZ resource management consists of three components: a two-level disk quota, a fair CPU scheduler, and user beancounters. These settings can be changed at VE runtime, so there is no need to reboot.
Two-level disk quota
Each VE can have its own disk quota, measured in terms of disk blocks and inodes (roughly, the number of files). Within the VE, the standard tools can be used to set per-user and per-group UNIX disk quotas.
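As a hedged example (the VE ID and quota values are made up; the --diskspace soft:hard syntax is the one shown under vzctl set below), the per-VE quota is set from the host, while per-user quotas are managed inside the VE with the usual tools:

    # On the host: give VE 101 a soft and a hard disk space limit
    # (values and units here are purely illustrative).
    vzctl set 101 --diskspace 1000000:1100000 --save
    # Inside the VE: the standard UNIX quota tools still work for
    # per-user limits, e.g. edquota -u someuser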
CPU scheduler
The CPU scheduler in OpenVZ is a two-level implementation of a fair-share scheduling strategy.
On the first level, the scheduler decides which VE to give the CPU time slice to, based on per-VE cpuunits values. On the second level, the standard Linux scheduler decides which process to run within that VE, using standard Linux process priorities.
It is possible to assign different cpuunits values to each VE; real CPU time is then distributed proportionally to these values.
Strict limits, such as 10% of total CPU time, are also possible.
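For illustration (the VE IDs and values are hypothetical; --cpuunits and --cpulimit are the vzctl parameters assumed here for shares and strict limits), the two mechanisms might be used like this:

    # Give VE 101 twice the CPU share of VE 102 (relative cpuunits values):
    vzctl set 101 --cpuunits 2000 --save
    vzctl set 102 --cpuunits 1000 --save
    # Optionally cap VE 102 at 10% of total CPU time (a strict limit):
    vzctl set 102 --cpulimit 10 --save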
User Beancounters
User beancounters are a set of per-VE counters, limits, and guarantees. There are about 20 parameters, chosen to cover all aspects of VE operation and to prevent a single VE from monopolizing system resources.
The controlled resources are mainly memory and various in-kernel objects such as IPC shared memory segments and network buffers. Each resource is visible in /proc/user_beancounters and has five values associated with it: current usage, maximum usage (over the lifetime of the VE), barrier, limit, and fail counter. The meaning of barrier and limit is parameter-dependent; in short, they can be thought of as a soft limit and a hard limit. If any resource hits its limit, its fail counter is increased, so the VE owner can detect problems by monitoring /proc/user_beancounters inside the VE.
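A hedged illustration of what this looks like in practice (kmemsize and numproc are real beancounter parameters, but the numbers and exact column layout shown here are made up):

    # Inside a VE (or on the Hardware Node), current values can be inspected:
    cat /proc/user_beancounters
    #   resource    held   maxheld   barrier     limit  failcnt
    #   kmemsize 1836871   2101654  11055923  11377049        0
    #   numproc       23        29       240       240        0
    #   ...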
Checkpointing and live migration
A live migration and checkpointing feature was released for OpenVZ in mid-April 2006. It makes it possible to move a VE from one physical server to another without shutting the VE down. The process is known as checkpointing: a VE is frozen and its whole state is saved to a file on disk. This file can then be transferred to another machine, where the VE can be unfrozen (restored); the delay is only a few seconds. Because the state is preserved completely, the pause may appear to users as an ordinary computational delay.
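A minimal sketch of how this is typically driven from the command line (the chkpnt/restore subcommands and the vzmigrate helper are assumed here; the VE ID, dump file path, and host name are hypothetical):

    # Checkpoint VE 101 to a dump file, then restore it (possibly on another node):
    vzctl chkpnt 101 --dumpfile /tmp/ve101.dump
    vzctl restore 101 --dumpfile /tmp/ve101.dump
    # Or let the migration helper do the freeze/copy/restore steps in one go:
    vzmigrate --online destination.example.com 101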
User-level tools
OpenVZ comes with command-line tools to manage VEs (vzctl), as well as tools to manage software inside VEs (vzpkg).
vzctl
This is a simple high-level command-line tool to manage a VE. A short example session follows the command list below.
- vzctl create VEID [--ostemplate <name>] [--config <name>]
- This command creates a new Virtual Environment with the numeric ID VEID, based on the specified OS template (a Linux distribution) and with resource management parameters taken from the specified configuration sample. Both the --ostemplate and --config parameters are optional; defaults for them are given in the global configuration file.
- vzctl start VEID
- Starts a given VE. Start means creating a Virtual Environment context within the kernel, setting all the resource management parameters and running VE's /sbin/init in that context.
- vzctl stop VEID
- Stops a given VE. A VE can also be stopped (or rebooted) by its owner using standard /sbin/halt or /sbin/reboot commands.
- vzctl exec VEID <command>
- Executes a command inside a given VE. For example, to see a list of processes inside VE 102, use vzctl exec 102 ps ax.
- vzctl enter VEID
- Opens a shell in the VE. This is useful if, say, sshd is dead in that VE and you need to troubleshoot.
- vzctl set VEID --parameter <value> [...] [--save]
- Sets a parameter for a VE. There are many different parameters. For example, to add an IP address to a VE, use vzctl set VEID --ipadd x.x.x.x --save. To set a VE's disk quota, use vzctl set VEID --diskspace soft:hard --save. To set a VE's kernel memory barrier and limit, use vzctl set VEID --kmemsize barrier:limit --save.
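Putting these commands together, a typical session might look like the following (the VE ID, template name, and IP address are illustrative):

    # Create a VE from a Fedora Core 5 template, give it an address, and start it:
    vzctl create 101 --ostemplate fedora-core-5
    vzctl set 101 --ipadd 192.168.0.101 --save
    vzctl start 101
    # Run a command inside it, or open a shell:
    vzctl exec 101 ps ax
    vzctl enter 101
    # Stop it when done:
    vzctl stop 101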
Templates and vzpkg
Templates are precreated images used to create a new VE. Basically, a template is a set of packages, and a template cache is a tarball of a chroot environment with those packages installed. During vzctl create, the tarball is unpacked. Thanks to the template cache technique, a new VE can be created in seconds.
vzpkg is a set of tools that facilitates template cache creation. It currently supports rpm- and yum-based repositories. To create a template of, say, the Fedora Core 5 distribution, you specify a set of (yum) repositories that carry FC5 packages and a set of packages to be installed. Pre- and post-install scripts can also be employed to further optimize or modify the template cache. All of the above (repositories, package lists, scripts, GPG keys, etc.) form the template metadata. Given template metadata, a template cache can be created automatically by running the vzpkgcache utility, which downloads and installs the listed packages into a temporary VE and packs the result as a template cache.
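As a hedged sketch of that last step (invocation details beyond the utility's name are not spelled out in this article), once the template metadata is in place the cache build reduces to:

    # Build (or rebuild) template caches from the installed template metadata;
    # the utility downloads the listed packages into a temporary VE and packs
    # the result as a template cache.
    vzpkgcache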
Template caches for non-RPM distributions can be created as well, although this is a more manual process; for example, a separate HOWTO gives detailed instructions on how to create a Debian template cache.
The following template caches (a.k.a. precreated templates) are currently (as of July 2006) available:
- Fedora Core 3, 4, and 5
- CentOS 4 (4.3)
- Gentoo 2006.0 (20060317)
- openSUSE 10
- SUSE 9.3
- Debian 3.1 (sarge)
- Ubuntu 6.06
- Slackware 10.2
Distinctive features of OpenVZ
Scalability
As OpenVZ employs a single kernel model, it is as scalable as the 2.6 Linux kernel; that is, it supports up to 64 CPUs and up to 64 GB of RAM. A single virtual environment can scale up to the whole physical box, i.e. use all the CPUs and all the RAM.
Density
OpenVZ is able to host hundreds of Virtual Environments on decent hardware (the main limitations are RAM and CPU).
The graph shows how a VE's Apache web server response time depends on the number of VEs. Measurements were done on a machine with 768 MB of RAM; each VE ran the usual set of processes: init, syslogd, crond, sshd, and Apache. The Apache daemons served static pages, which were fetched by http_load, and the first response time was measured. As the number of VEs grows, the response time increases because of RAM shortage and excessive swapping.
In this scenario it is possible to run up to 120 such VEs on 768 MB of RAM. The result extrapolates roughly linearly, so about 320 such VEs could run on a box with 2 GB of RAM.
Mass-management
An owner (root) of an OpenVZ physical server (also known as a Hardware Node) can see all VE processes and files, which makes mass-management scenarios possible. Consider a case where VMware or Xen is used for server consolidation: to apply a security update to 10 virtual servers, you have to log in to each one and run the update procedure, just as you would with ten real physical servers.
In the OpenVZ case, you can run a simple shell script that updates all (or a selected subset of) VEs at once.
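A minimal sketch of such a script (it assumes yum-based guests and that vzlist -1 prints the IDs of running VEs; both are assumptions, not claims from this article):

    #!/bin/sh
    # Apply a package update to every running VE in one pass.
    # Assumes: vzlist -1 lists the IDs of running VEs, and the guests use yum.
    for VEID in $(vzlist -1); do
        echo "Updating VE $VEID"
        vzctl exec "$VEID" yum -y update
    done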
See also
- Linux-VServer
- FreeBSD Jails
- Solaris Containers
- Operating system-level virtualization
- Comparison of virtual machines
- Virtuozzo
- EasyVZ, an OpenVZ management GUI
- HyperVM, web-based distributed management software