Marshalling (computer science)

In computer science, marshalling or marshaling is the process of transforming the memory representation of an object into a data format suitable for storage or transmission. It is typically used when data must be moved between different parts of a computer program or from one program to another. Marshalling is similar to serialization and is used to communicate with remote objects by passing them an object in serialized form. It simplifies complex communication by allowing custom or complex objects, rather than only primitives, to be exchanged. The reverse of marshalling is called unmarshalling (or demarshalling, analogous to deserialization).
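As a minimal sketch of this round trip, the following Python example marshals a small object into bytes and unmarshals it again. The `Point` class, the function names, and the choice of JSON as the transmission format are assumptions made for illustration; real systems use many different formats.

```python
import json


class Point:
    """Hypothetical application object used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


def marshal_point(p):
    # Turn the in-memory object into bytes suitable for storage or transmission.
    return json.dumps({"x": p.x, "y": p.y}).encode("utf-8")


def unmarshal_point(data):
    # Rebuild an equivalent object from the received bytes.
    fields = json.loads(data.decode("utf-8"))
    return Point(fields["x"], fields["y"])


wire = marshal_point(Point(3, 4))
restored = unmarshal_point(wire)
```

The restored object is a copy: it compares equal to the original but is a distinct object in memory.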

Usage

Marshalling is used within implementations of various remote procedure call (RPC) mechanisms, where it is necessary for transporting data between processes or between threads. In Microsoft's Component Object Model (COM), interface pointers must be marshalled when crossing COM apartment boundaries[1] (that is, crossing between instances of the COM library).[2] In the .NET Framework, the conversion between an unmanaged type and a CLR type, as in the P/Invoke process, also requires marshalling.[3]
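The role of marshalling in an RPC mechanism can be sketched in Python as follows. This is a simplified illustration, not the design of any particular RPC system: the message layout, the `PROCEDURES` table, and all function names are hypothetical, and the "client" and "server" run in the same process.

```python
import json


def add(a, b):
    return a + b


# Hypothetical table of procedures the "server" side is willing to run.
PROCEDURES = {"add": add}


def marshal_call(name, *args):
    # Client side: pack the procedure name and arguments into one message.
    return json.dumps({"proc": name, "args": list(args)}).encode("utf-8")


def handle_message(message):
    # Server side: unmarshal the request, run the procedure, marshal the result.
    request = json.loads(message.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def unmarshal_result(reply):
    # Client side: unpack the reply into an ordinary value.
    return json.loads(reply.decode("utf-8"))["result"]


reply = handle_message(marshal_call("add", 2, 3))
total = unmarshal_result(reply)
```

In a real deployment the byte strings would cross a process, thread, or network boundary between the two sides.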

Additionally, marshalling is used extensively within scripts and applications that use the XPCOM technologies provided within the Mozilla application framework. The Mozilla Firefox browser is a popular application built with this framework, which also allows scripting languages to use XPCOM through XPConnect (Cross-Platform Connect).

Example

In the Microsoft Windows family of operating systems, the device drivers for Direct3D are entirely kernel-mode drivers. The user-mode portion of the API is handled by the DirectX runtime provided by Microsoft.

This is an issue because calling kernel-mode operations from user mode requires a system call, which forces the CPU to switch to kernel mode. The switch is a slow operation, taking on the order of microseconds to complete.[4] During this time, the CPU is unable to perform any other operations. Minimizing the number of such switches can therefore improve performance substantially.

Linux OpenGL drivers are split in two: a kernel driver and a user-space driver. The user-space driver translates OpenGL commands into machine code to be submitted to the GPU. To reduce the number of system calls, the user-space driver implements marshalling. If the GPU's command buffer is full of rendering data, the API can store the requested rendering call in a temporary buffer and, when the command buffer is close to being empty, switch to kernel mode and add a number of stored commands all at once.
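The batching idea can be sketched in Python as follows. This is a simulation under stated assumptions, not real driver code: `FakeKernelDriver` and `UserSpaceDriver` are hypothetical classes, a method call stands in for a system call, and the queue is flushed when it reaches a fixed size rather than when a GPU command buffer runs low.

```python
class FakeKernelDriver:
    """Stand-in for the kernel-mode driver; counts simulated mode switches."""

    def __init__(self):
        self.mode_switches = 0
        self.executed = []

    def submit(self, commands):
        # Each call models one expensive user-to-kernel transition.
        self.mode_switches += 1
        self.executed.extend(commands)


class UserSpaceDriver:
    """Queues rendering calls and hands them to the kernel driver in batches."""

    def __init__(self, kernel, batch_size=4):
        self.kernel = kernel
        self.batch_size = batch_size
        self.pending = []

    def draw(self, command):
        # Store the call in a temporary buffer instead of submitting it at once.
        self.pending.append(command)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Submit all stored commands in a single simulated system call.
        if self.pending:
            self.kernel.submit(self.pending)
            self.pending = []


kernel = FakeKernelDriver()
driver = UserSpaceDriver(kernel, batch_size=4)
for i in range(10):
    driver.draw(f"draw_call_{i}")
driver.flush()  # submit whatever is still queued
```

With a batch size of 4, the ten rendering calls reach the simulated kernel driver in three submissions instead of ten, and their order is preserved.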

Comparison with serialization

The term "marshal" is considered to be synonymous with "serialize" in the Python standard library,[5] but the terms are not synonymous in the Java-related RFC 2713:

To "marshal" an object means to record its state and codebase(s)[note 1] in such a way that when the marshalled object is "unmarshalled", a copy of the original object is obtained, possibly by automatically loading the class definitions of the object. You can marshal any object that is serializable or remote. Marshalling is like serialization, except marshalling also records codebases. Marshalling is different from serialization in that marshalling treats remote objects specially. (RFC 2713)

To "serialize" an object means to convert its state into a byte stream in such a way that the byte stream can be converted back into a copy of the object.
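The distinction drawn in the quotation can be loosely illustrated in Python. This is only an analogy, not Java's mechanism: a module name stands in for the codebase URLs, the envelope layout and function names are hypothetical, and the special treatment of remote objects is not modelled.

```python
import importlib
import json
from types import SimpleNamespace


def serialize_object(obj):
    # Serialization: record only the object's state.
    return json.dumps(vars(obj)).encode("utf-8")


def marshal_object(obj):
    # Marshalling in the quoted sense: record the state and also where the
    # class definition can be found (a module name stands in for a codebase).
    cls = type(obj)
    envelope = {
        "codebase": cls.__module__,
        "class": cls.__qualname__,
        "state": vars(obj),
    }
    return json.dumps(envelope).encode("utf-8")


def unmarshal_object(data):
    envelope = json.loads(data.decode("utf-8"))
    # Load the class definition automatically from the recorded location.
    module = importlib.import_module(envelope["codebase"])
    cls = getattr(module, envelope["class"])
    # Assumption: the class accepts its state as keyword arguments.
    return cls(**envelope["state"])


original = SimpleNamespace(name="widget", size=3)
restored = unmarshal_object(marshal_object(original))
state_only = json.loads(serialize_object(original))
```

The serialized form carries only the field values, so the receiver must already know which class to rebuild; the marshalled form carries enough information for the receiver to locate the class itself.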

Notes

  1. "Codebase" here is used in its Java-specific meaning, to refer to a list of URLs where the object code can be loaded from, rather than in the more general meaning of codebase which refers to source code.

References