Bytecode
From Wikipedia, the free encyclopedia
Bytecode is a binary representation of an executable program designed to be executed by a virtual machine rather than by dedicated hardware. Since it is processed by software, it is usually more abstract than machine code. Different parts of a program are often stored in separate files, similar to object modules.
Bytecode is so called because, historically, most instruction sets had one-byte opcodes, each followed by zero or more parameters such as registers or memory addresses. It is a form of output code used by programming language implementations to reduce dependence on specific hardware (the same binary code can be executed across different platforms) and to ease interpretation. After compilation to bytecode, the resulting output may be used as the input of a compiler targeting machine code, or executed directly on a virtual machine.
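The layout described above (a one-byte opcode followed by zero or more parameters) can be illustrated with a small sketch. The instruction set here, including the names `OP_PUSH`, `OP_ADD`, and `OP_HALT` and the 4-byte little-endian operand, is invented for illustration and does not correspond to any real virtual machine.

```python
import struct

# Hypothetical instruction set, invented for this sketch.
OP_PUSH = 0x01  # one-byte opcode followed by a 4-byte little-endian integer
OP_ADD = 0x02   # one-byte opcode, no operands
OP_HALT = 0xFF  # one-byte opcode, no operands

HAS_OPERAND = {OP_PUSH: True, OP_ADD: False, OP_HALT: False}
NAMES = {OP_PUSH: "PUSH", OP_ADD: "ADD", OP_HALT: "HALT"}


def assemble(instructions):
    """Encode (opcode, operand-or-None) pairs into a flat bytes object."""
    out = bytearray()
    for opcode, operand in instructions:
        out.append(opcode)
        if HAS_OPERAND[opcode]:
            out += struct.pack("<i", operand)
    return bytes(out)


def disassemble(code):
    """Decode a bytes object back into (name, operand-or-None) pairs."""
    decoded = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        offset += 1
        operand = None
        if HAS_OPERAND[opcode]:
            (operand,) = struct.unpack_from("<i", code, offset)
            offset += 4
        decoded.append((NAMES[opcode], operand))
    return decoded


code = assemble([(OP_PUSH, 2), (OP_PUSH, 40), (OP_ADD, None), (OP_HALT, None)])
print(code.hex())         # the raw bytes, two hex digits per byte
print(disassemble(code))  # the same program, decoded again
```

Because the encoding is just a sequence of bytes with a fixed meaning, the same `code` value could be written to a file and decoded on any platform that agrees on the format.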
Compared to source code (intended to be human-readable), bytecode is less abstract, more compact, and more computer-centric. For example, bytecode encodes the results of semantic analysis, such as the scope of each variable access (that is, whether the variable is global or local). Because this analysis is done once at compile time rather than repeated during execution, performance is usually better than direct interpretation of source code.
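As one concrete illustration of scope being resolved at compile time, CPython's standard `dis` module can show which load instruction the compiler chose for each variable. This is a sketch using CPython as an example; the source snippet and the names `above`, `value`, and `threshold` are made up here, and the exact opcode names vary between CPython versions (for instance, the local load may appear as a variant of `LOAD_FAST`).

```python
import dis
import types

# A made-up snippet: "value" is a local parameter, "threshold" is a global.
SOURCE = """
threshold = 10

def above(value):
    return value > threshold
"""

module_code = compile(SOURCE, "<example>", "exec")

# The compiled function body is stored as a constant of the module's code object.
function_code = next(
    const for const in module_code.co_consts if isinstance(const, types.CodeType)
)

# Map each variable name to the load instruction the compiler selected for it.
loads = {
    ins.argval: ins.opname
    for ins in dis.get_instructions(function_code)
    if ins.opname.startswith(("LOAD_FAST", "LOAD_GLOBAL"))
}

for name, opname in sorted(loads.items()):
    print(f"{name}: {opname}")
```

The interpreter never has to search for `value` by name at run time: the choice between a local slot and a global lookup is already recorded in the bytecode.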
A bytecode program is normally executed by decoding and executing its instructions one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators or "just-in-time" (JIT) compilers, translate bytecode into machine language as necessary at runtime: this ties the virtual machine to a particular hardware platform, but does not affect the portability of the bytecode itself. For example, Java and C# code is typically stored in bytecode format, and a JIT compiler then translates the bytecode to machine code before execution. This introduces a delay before a program runs, while the bytecode is compiled to native machine code, but improves execution speed considerably compared to interpretation, normally by several times.
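The one-instruction-at-a-time execution model can be sketched as a fetch, decode, and execute loop. The toy stack machine below is an assumption made for illustration: its four opcodes (`PUSH`, `ADD`, `MUL`, `HALT`) and its single-byte operands are hypothetical and are not taken from any of the virtual machines named in this article.

```python
# Hypothetical one-byte opcodes for a toy stack machine (not a real VM's set).
PUSH, ADD, MUL, HALT = 0, 1, 2, 3


def run(code):
    """Execute bytecode one instruction at a time; return the top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while True:
        opcode = code[pc]  # fetch
        pc += 1
        if opcode == PUSH:  # decode and execute
            stack.append(code[pc])  # the operand is the byte after the opcode
            pc += 1
        elif opcode == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode} at offset {pc - 1}")


# Bytecode for the expression (2 + 3) * 4.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)  # prints 20
```

The loop uses nothing beyond ordinary language features, which is why an interpreter of this kind is easy to port. A JIT compiler would instead emit native instructions for each opcode, which is where the dependence on a particular machine comes from.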
Because of its performance advantage, many language implementations today execute a program in two phases, first compiling the source code into bytecode and then passing the bytecode to the virtual machine. Accordingly, there are virtual machines for Java, Perl, PHP, Python, Forth, and Tcl. The reference implementation of the Ruby programming language before version 1.9 instead more closely resembled a pure interpreter, since it worked by walking the abstract syntax tree derived from the source code.
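The two phases can be observed directly in CPython, taken here as one example of such an implementation, through the built-in functions `compile` and `exec`. The source string and the variable name `result` are made up for this sketch.

```python
# Phase 1: compile source text into a code object that holds bytecode.
source = "result = sum(n * n for n in range(4))"
code_object = compile(source, "<example>", "exec")

# The raw bytecode is an ordinary bytes value attached to the code object.
print(type(code_object.co_code).__name__, len(code_object.co_code) > 0)

# Phase 2: hand the compiled code to the virtual machine for execution.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])  # prints 14
```

The code object can be executed repeatedly without repeating the first phase, which is the reuse that CPython's cached `.pyc` files rely on.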
Examples
- O-code of the BCPL programming language
- p-code of the UCSD Pascal implementation of the Pascal programming language
- Bytecodes of many implementations of the Smalltalk programming language
- Java bytecode, which is executed by the Java virtual machine
- EiffelStudio for the Eiffel programming language
- Managed code such as Microsoft .NET Common Intermediate Language, executed by the .NET Common Language Runtime (CLR)
- Byte Code Engineering Library
- Scheme48, an implementation of Scheme that uses a bytecode interpreter
- CLISP, an implementation of Common Lisp that compiles only to bytecode
- CMUCL, an implementation of Common Lisp that can compile either to bytecode or to native code; the bytecode is much more compact
- Icon programming language
- OCaml programming language, which optionally compiles to a compact bytecode form
- Parrot virtual machine
- Infocom used the Z-machine to make its software applications more portable.
- C to Java Virtual Machine compilers