Universal Virtual Computer

From Wikipedia, the free encyclopedia

A Universal Virtual Computer (UVC) is much like a virtual machine (VM) in that it creates a layer between the underlying computer platform and the software running on top of it. This layer offers portability between platforms: software developed for the VM can run on any computer system for which an implementation of that VM exists.

In contrast to ordinary VMs, however, the UVC is designed to be universal: it offers a platform-independent layer whose specification is intended never to change. Programs developed for the UVC are therefore meant to remain runnable at any time, now and in the future.

Conceptual model

The UVC is part of a broader concept called the UVC-based preservation method. This method was invented by R.A. Lorie [1] (IBM Research Center in Almaden) and allows digital objects (such as text documents, spreadsheets, images and sound) to be reconstructed in their original appearance at any time in the future, using a combination of emulation and migration. It is intended as a long-term solution to the main problem in the field of digital preservation: how can digital objects be kept accessible over the long term?

The central idea of the UVC-based preservation method is that digital objects preserved in an archive can be reconstructed at any time in the future without losing their meaning. The method consists of four components:

- Universal Virtual Computer (UVC)

- Logical Data Schema (LDS) with type description

- UVC program (format decoder)

- Logical Data Viewer

Together with the original data, these components make it possible to reconstruct the meaning of each digital object. The UVC is the heart of the system. Like the Java Virtual Machine and the Common Language Runtime, it emulates hardware that does not physically exist, and it will run as a software application on a future platform. Because we cannot know today which hardware will be available in the future, the UVC must be implemented at the time a document is to be retrieved from the repository. The UVC is the platform on which programs written specifically for it can run. One such UVC program, the format decoder, decodes the file format of a digital object and returns tagged elements, each of which holds specific information about the content of the data. Together these elements form the Logical Data View (LDV) of the data, which resembles XML. The LDV is an instance of the Logical Data Schema (LDS), the blueprint that describes the elementary parts of a specific information type.

All these components are controlled by a Logical Data Viewer. This program also runs on future hardware and therefore has to be created at retrieval time as well. The viewer starts the UVC and feeds it the data of the digital object together with a format decoder developed for the UVC. It then retrieves the LDV from the UVC and reconstructs a representation of the original object’s meaning. This process is depicted in figure 1.

Figure 1: The UVC-based preservation method.
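The retrieval flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ToyUVC`, `view` and `text_decoder` are hypothetical, an LDV element is modelled as a simple (tag, value) pair, and the "UVC program" is an ordinary Python function rather than preserved UVC instructions.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical model: an LDV element is a (tag, value) pair.
Element = Tuple[str, str]
# Hypothetical model: a format decoder maps raw bytes to LDV elements.
Decoder = Callable[[bytes], Iterable[Element]]


class ToyUVC:
    """Stand-in for a UVC implementation built at retrieval time."""

    def run(self, decoder: Decoder, data: bytes) -> List[Element]:
        # A real UVC would interpret a preserved UVC program; here the
        # "program" is simply a Python callable applied to the data.
        return list(decoder(data))


def view(uvc: ToyUVC, decoder: Decoder, data: bytes) -> str:
    """Logical Data Viewer: start the UVC, collect the LDV, render it."""
    ldv = uvc.run(decoder, data)
    # Render the LDV in an XML-like form, one element per line.
    return "\n".join(f"<{tag}>{value}</{tag}>" for tag, value in ldv)


def text_decoder(data: bytes) -> Iterable[Element]:
    """Hypothetical decoder for a made-up format: one text record per line."""
    for line in data.decode("ascii").splitlines():
        yield ("line", line)


rendered = view(ToyUVC(), text_decoder, b"hello\nworld")
print(rendered)
```

The viewer is the only part that talks to the outside world; the decoder and the data are both inputs to the UVC, which mirrors the division of roles in the method.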

The figure distinguishes between preservation time (the present) and retrieval time (the future). Different steps must be taken at each of these times:

At preservation time (present)

Step 1 – To view a digital object in the future, we must understand its structure (its logical form). A detailed description is therefore needed that states what the logical view looks like and what it means. This logical view is returned by the UVC in the future and has to be interpreted by the viewer, so future developers must be able to understand it. As mentioned above, executing the UVC returns this information as elementary tags. All tags are defined in a blueprint called the Logical Data Schema (LDS), which explains the tags that the UVC returns for a particular type of digital object. Here a type is a particular group of files, such as images, sound or spreadsheets. The advantage of the LDS is that the same schema can be used for all formats of a type. Knowing how the LDV is structured is not enough, however. To make clear what the elements mean, the LDS also describes the meaning of the important ones. Consider an image in which each pixel is described by the colors red, green and blue. For each color a code value is returned, such as red: 230, green: 17 and blue: 0. Without an understanding of the scale and spectrum of these values, the colors cannot be reconstructed in their authentic appearance. The LDS for images is based on the RGB color model, as this is the model used by computer screens and most image formats. To calibrate the model, it is referenced to a general spectrum of colors by means of the CIE chromaticity diagram. Other color models, such as Hue-Saturation-Lightness (the HLS color space), can also be referenced to this spectrum.
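To illustrate what referencing RGB values to the CIE chromaticity diagram can look like, the sketch below converts a pixel to CIE xy coordinates. It assumes that the values are 8-bit sRGB samples; the text does not say which RGB primaries or scale the LDS for images actually specifies, so this choice and the function names are assumptions made for the example only.

```python
from typing import Optional, Tuple


def srgb_to_linear(c8: int) -> float:
    """Undo the sRGB transfer curve for one 8-bit channel (assumed encoding)."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb_to_cie_xy(r: int, g: int, b: int) -> Optional[Tuple[float, float]]:
    """Map an sRGB pixel to CIE 1931 xy chromaticity coordinates."""
    rl, gl, bl = (srgb_to_linear(v) for v in (r, g, b))
    # Linear sRGB to CIE XYZ (D65 white point).
    x_cap = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y_cap = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z_cap = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    total = x_cap + y_cap + z_cap
    if total == 0:
        return None  # black has no defined chromaticity
    return (x_cap / total, y_cap / total)


# The pixel used as an example in the text: red 230, green 17, blue 0.
print(rgb_to_cie_xy(230, 17, 0))
```

With a reference of this kind, a future viewer does not need to know anything about today's displays: the code values are tied to a device-independent description of color.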

Step 2 – Having a digital object and a description of the elements returned by the UVC is still not enough to reconstruct the object’s meaning. The UVC has to know how to decipher the format of the digital object. A UVC format decoder therefore has to be written that decodes the format and transforms it into a Logical Data View (LDV), using the elements defined by the LDS. It is important that this decoder is written at preservation time, because once a format becomes obsolete, knowledge of it may be lost or misunderstood. A decoder has to be written for each format, which demands considerable effort, but once a decoder is available it can be applied to every object in that format.
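The relationship between formats, decoders and the LDS can be sketched as follows. Both file formats here are invented for the example, as are the tag names `width`, `height`, `red`, `green` and `blue`; a real decoder would be written as a UVC program, not in Python. The point of the sketch is that two decoders for two formats of the same type yield the same LDV.

```python
from typing import List, Sequence, Tuple

# Hypothetical model: an LDV element is a (tag, value) pair.
Element = Tuple[str, int]


def _to_ldv(width: int, height: int, samples: Sequence[int]) -> List[Element]:
    """Emit the elements that the (hypothetical) image LDS defines."""
    if len(samples) != width * height * 3:
        raise ValueError("sample count does not match image dimensions")
    ldv: List[Element] = [("width", width), ("height", height)]
    for i in range(0, len(samples), 3):
        ldv.append(("red", samples[i]))
        ldv.append(("green", samples[i + 1]))
        ldv.append(("blue", samples[i + 2]))
    return ldv


def decode_binary(data: bytes) -> List[Element]:
    """Made-up format A: width byte, height byte, then R, G, B bytes per pixel."""
    return _to_ldv(data[0], data[1], list(data[2:]))


def decode_text(data: bytes) -> List[Element]:
    """Made-up format B: the same numbers as whitespace-separated decimals."""
    numbers = [int(token) for token in data.decode("ascii").split()]
    return _to_ldv(numbers[0], numbers[1], numbers[2:])


# The same 2x1 image stored in both formats.
ldv_a = decode_binary(bytes([2, 1, 230, 17, 0, 0, 0, 255]))
ldv_b = decode_text(b"2 1 230 17 0 0 0 255")
print(ldv_a == ldv_b)
```

Because both decoders target one schema, a single viewer for the image type can display objects from either format.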

Step 3 – Finally, future developers have to know how to construct a UVC that can execute the format decoder for a particular object format. Developing a UVC now does not guarantee that it will still be operational in the future, so it must be made clear how software developers in the distant future can create a new one themselves. The UVC is designed as a general-purpose computer that runs on future hardware. Its architecture conforms to the current Von Neumann architecture but is very flexible. For instance, it assumes an unlimited amount of virtual memory and has no fixed-size registers, which differs from today’s computers. To reproduce a UVC in the future, a description of this design must be carefully preserved. This could be done as a document in a digital repository, but also as a hard copy on paper.

Current status
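The two architectural properties mentioned above, registers without a fixed size and an unlimited address space, can be illustrated with a toy interpreter. The instruction names (`LOADC`, `MUL`, `SUB`, `STORE`, `JNZ`) are invented for this sketch and are not the actual UVC instruction set. As a further simplification, the program is kept outside the data memory, unlike in a true Von Neumann design.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


def run(program: List[tuple], max_steps: int = 10_000
        ) -> Tuple[DefaultDict[str, int], DefaultDict[int, int]]:
    """Execute a toy program and return (registers, memory)."""
    regs: DefaultDict[str, int] = defaultdict(int)  # Python ints have no fixed width
    mem: DefaultDict[int, int] = defaultdict(int)   # sparse memory, any address allowed
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            return regs, mem
        op, *args = program[pc]
        pc += 1
        if op == "LOADC":      # load a constant into a register
            regs[args[0]] = args[1]
        elif op == "MUL":      # first register *= second register
            regs[args[0]] *= regs[args[1]]
        elif op == "SUB":      # first register -= second register
            regs[args[0]] -= regs[args[1]]
        elif op == "STORE":    # memory[address register] = value register
            mem[regs[args[0]]] = regs[args[1]]
        elif op == "JNZ":      # jump to instruction index if register is non-zero
            if regs[args[0]] != 0:
                pc = args[1]
        else:
            raise ValueError(f"unknown instruction: {op}")
    raise RuntimeError("step limit exceeded")


program = [
    ("LOADC", "acc", 1),
    ("LOADC", "two", 2),
    ("LOADC", "count", 100),
    ("LOADC", "one", 1),
    ("MUL", "acc", "two"),        # index 4: loop body starts here
    ("SUB", "count", "one"),
    ("JNZ", "count", 4),
    ("LOADC", "addr", 10 ** 30),  # an address far beyond any physical memory
    ("STORE", "addr", "acc"),
]
regs, mem = run(program)
print(regs["acc"] == 2 ** 100)
```

The register holding 2^100 would overflow a 64-bit machine register, and the store address exceeds any real address space; a future implementer has to provide both abstractions in software on top of whatever hardware then exists.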

In 2001 the National Library of the Netherlands (Koninklijke Bibliotheek) [2] and IBM started the Long-Term Preservation (LTP) project. The project’s main objective is to investigate the strategy and functionality needed to preserve digitally stored information for the long term, which in this context means hundreds of years. In 2002 the study delivered its findings in a series of six reports [3]. One of the conclusions was that the UVC is a good candidate for maintaining access to digital objects over the long term.

Starting in 2003, the second part of the LTP project was a proof of concept of the UVC-based preservation method [4]. As a result, a demonstration tool for the UVC was made freely available to the general public in 2004 [5]. With this undertaking, the LTP project aims to draw attention to the problems in the field of digital preservation and to take the first steps towards a solution.

References

[1] R.A. Lorie, inventor of the UVC-based preservation method: R.A. Lorie

[2] Koninklijke Bibliotheek, the National Library of the Netherlands: KB web site

[3] The Long-Term Preservation (LTP) study at the National Library of the Netherlands and IBM: LTP / DIAS project

[4] The UVC project at the National Library of the Netherlands and IBM: UVC for JPEG

[5] Development of a Universal Virtual Computer (UVC) for long-term preservation of digital objects (abstract)

See also