Java bytecode

From Wikipedia, the free encyclopedia

Java bytecode is the instruction set that the Java virtual machine executes. Each opcode is one byte long, although some instructions take additional operand bytes; not all of the 256 possible opcode values are used. Sun Microsystems, the original creator of the Java programming language, the Java virtual machine and other components of the Java Runtime Environment, has set aside a number of values to be permanently unimplemented.[citation needed]
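
Because Java's byte type is signed, a tool that reads class-file bytes has to mask each opcode to obtain a value between 0 and 255. The following is a minimal sketch of that step. The class and method names are hypothetical, and the three reserved values are the ones this article's opcode table lists as breakpoint, impdep1 and impdep2.

```java
/** Hypothetical helper for illustration; not part of any JDK API. */
final class OpcodeByte {
    /** Interprets a raw (signed) Java byte as an opcode value in 0..255. */
    static int toOpcode(byte raw) {
        return raw & 0xFF;
    }

    /** True for the values reserved as breakpoint (ca), impdep1 (fe) and impdep2 (ff). */
    static boolean isReserved(int opcode) {
        return opcode == 0xCA || opcode == 0xFE || opcode == 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xB1;                                // the "return" opcode
        System.out.println(toOpcode(raw));                     // prints 177
        System.out.println(isReserved(toOpcode((byte) 0xCA))); // prints true
    }
}
```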

Relation to Java

A Java programmer does not need to be aware of or understand Java bytecode at all. However, as suggested in the IBM developerWorks journal, "Understanding bytecode and what bytecode is likely to be generated by a Java compiler helps the Java programmer in the same way that knowledge of assembler helps the C or C++ programmer."[1]

Generating bytecode

The most common language targeting the Java virtual machine by producing Java bytecode is Java. Originally only one compiler existed, the javac compiler from Sun Microsystems, which compiles Java source code to Java bytecode. Because the specifications for Java bytecode are now available, other parties have supplied compilers that produce Java bytecode. Examples of other compilers include:

  • Jikes, developed by IBM and implemented in C++, compiles from the Java programming language to Java bytecode
  • Espresso, compiles from the Java programming language to Java bytecode (Java 1.0 only)
  • GNU Compiler for Java (GCJ), compiles from the Java programming language to Java bytecode; it can also compile to native machine code and is available as part of the GNU Compiler Collection (GCC)

Some projects provide Java assemblers that allow Java bytecode to be written by hand. Assembler code may also be generated by machine, for example by a compiler targeting the Java virtual machine. Notable Java assemblers include:

  • Jasmin, takes textual descriptions of Java classes, written in a simple assembler-like syntax using the Java virtual machine instruction set, and generates a Java class file [2]
  • Jamaica, a macro assembly language for the Java virtual machine. Java syntax is used for class or interface definitions, and method bodies are specified using bytecode instructions. [3]

Others have developed compilers that target the Java virtual machine from programming languages other than Java.

Bytecode execution

Java bytecode is designed to be executed in a Java virtual machine. Several virtual machines are available today, both free and commercial.

Further information: Java virtual machine

If executing Java bytecode in a Java virtual machine is not desirable, a developer can also compile Java source code or Java bytecode directly to native machine code with tools such as the GNU Compiler for Java.

Example

Consider the following Java code.

outer:
 for (int i = 2; i < 1000; i++) {
   for (int j = 2; j < i; j++) {
     if (i % j == 0)
       continue outer;
   }
   System.out.println(i);
 }

A Java compiler might translate the Java code above into bytecode as follows, assuming the code was placed in a method:

 Code:
  0:   iconst_2
  1:   istore_1
  2:   iload_1
  3:   sipush  1000
  6:   if_icmpge       44
  9:   iconst_2
  10:  istore_2
  11:  iload_2
  12:  iload_1
  13:  if_icmpge       31
  16:  iload_1
  17:  iload_2
  18:  irem             // remainder
  19:  ifne    25
  22:  goto    38
  25:  iinc    2, 1
  28:  goto    11
  31:  getstatic       #84; //Field java/lang/System.out:Ljava/io/PrintStream;
  34:  iload_1
  35:  invokevirtual   #85; //Method java/io/PrintStream.println:(I)V
  38:  iinc    1, 1
  41:  goto    2
  44:  return
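
To make the control flow of the listing easier to follow, the sketch below steps through it by hand with an explicit operand stack and a local-variable array. It is only an illustration, not how a real virtual machine is implemented. The class name is hypothetical, the getstatic/invokevirtual pair is modelled by appending to a list instead of printing, and only the instructions that occur in the listing are handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hand-written walk through the listing above; a teaching sketch, not a JVM. */
final class ListingSimulator {
    /** Runs the listing and returns the values that would be passed to println. */
    static List<Integer> run() {
        int[] locals = new int[3];                  // slot 1 holds i, slot 2 holds j
        Deque<Integer> stack = new ArrayDeque<>();  // operand stack
        List<Integer> printed = new ArrayList<>();  // stands in for System.out
        int pc = 0;                                 // offset of the next instruction
        while (true) {
            switch (pc) {
                case 0:  stack.push(2); pc = 1; break;              // iconst_2
                case 1:  locals[1] = stack.pop(); pc = 2; break;    // istore_1
                case 2:  stack.push(locals[1]); pc = 3; break;      // iload_1
                case 3:  stack.push(1000); pc = 6; break;           // sipush 1000
                case 6: { int v2 = stack.pop(), v1 = stack.pop();   // if_icmpge 44
                          pc = (v1 >= v2) ? 44 : 9; break; }
                case 9:  stack.push(2); pc = 10; break;             // iconst_2
                case 10: locals[2] = stack.pop(); pc = 11; break;   // istore_2
                case 11: stack.push(locals[2]); pc = 12; break;     // iload_2
                case 12: stack.push(locals[1]); pc = 13; break;     // iload_1
                case 13: { int v2 = stack.pop(), v1 = stack.pop();  // if_icmpge 31
                           pc = (v1 >= v2) ? 31 : 16; break; }
                case 16: stack.push(locals[1]); pc = 17; break;     // iload_1
                case 17: stack.push(locals[2]); pc = 18; break;     // iload_2
                case 18: { int v2 = stack.pop(), v1 = stack.pop();  // irem
                           stack.push(v1 % v2); pc = 19; break; }
                case 19: pc = (stack.pop() != 0) ? 25 : 22; break;  // ifne 25
                case 22: pc = 38; break;                            // goto 38
                case 25: locals[2] += 1; pc = 28; break;            // iinc 2, 1
                case 28: pc = 11; break;                            // goto 11
                case 31: pc = 34; break;                            // getstatic (object reference not modelled)
                case 34: stack.push(locals[1]); pc = 35; break;     // iload_1
                case 35: printed.add(stack.pop()); pc = 38; break;  // invokevirtual println
                case 38: locals[1] += 1; pc = 41; break;            // iinc 1, 1
                case 41: pc = 2; break;                             // goto 2
                case 44: return printed;                            // return
                default: throw new IllegalStateException("unexpected offset " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run().subList(0, 5));
    }
}
```

Running main prints [2, 3, 5, 7, 11], the first values the original loop would print. Note how "continue outer" becomes the plain "goto 38" at offset 22, which jumps straight to the increment of i.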

The Java bytecodes

See Sun's Java Virtual Machine Specification[4] for more detailed descriptions.

Mnemonic | Opcode (in hex) | Other bytes | Stack [before]→[after] | Description
A
aaload 32 arrayref, index → value loads onto the stack a reference from an array
aastore 53 arrayref, index, value → stores a reference into an array
aconst_null 01 → null pushes a null reference onto the stack
aload 19 index → objectref loads a reference onto the stack from a local variable #index
aload_0 2a → objectref loads a reference onto the stack from local variable 0
aload_1 2b → objectref loads a reference onto the stack from local variable 1
aload_2 2c → objectref loads a reference onto the stack from local variable 2
aload_3 2d → objectref loads a reference onto the stack from local variable 3
anewarray bd indexbyte1, indexbyte2 count → arrayref creates a new array of references of length count and component type identified by the class reference index (indexbyte1 << 8 + indexbyte2) in the constant pool
areturn b0 objectref → [empty] returns a reference from a method
arraylength be arrayref → length gets the length of an array
astore 3a index objectref → stores a reference into a local variable #index
astore_0 4b objectref → stores a reference into local variable 0
astore_1 4c objectref → stores a reference into local variable 1
astore_2 4d objectref → stores a reference into local variable 2
astore_3 4e objectref → stores a reference into local variable 3
athrow bf objectref → [empty], objectref throws an error or exception (notice that the rest of the stack is cleared, leaving only a reference to the Throwable)
B
baload 33 arrayref, index → value loads a byte or boolean value from an array
bastore 54 arrayref, index, value → stores a byte or boolean value into an array
bipush 10 byte → value pushes a byte onto the stack as an integer value
C
caload 34 arrayref, index → value loads a char from an array
castore 55 arrayref, index, value → stores a char into an array
checkcast c0 indexbyte1, indexbyte2 objectref → objectref checks whether an objectref is of a certain type, the class reference of which is in the constant pool at index (indexbyte1 << 8 + indexbyte2)
D
d2f 90 value → result converts a double to a float
d2i 8e value → result converts a double to an int
d2l 8f value → result converts a double to a long
dadd 63 value1, value2 → result adds two doubles
daload 31 arrayref, index → value loads a double from an array
dastore 52 arrayref, index, value → stores a double into an array
dcmpg 98 value1, value2 → result compares two doubles (result is 1 if either value is NaN)
dcmpl 97 value1, value2 → result compares two doubles (result is -1 if either value is NaN)
dconst_0 0e → 0.0 pushes the constant 0.0 onto the stack
dconst_1 0f → 1.0 pushes the constant 1.0 onto the stack
ddiv 6f value1, value2 → result divides two doubles
dload 18 index → value loads a double value from a local variable #index
dload_0 26 → value loads a double from local variable 0
dload_1 27 → value loads a double from local variable 1
dload_2 28 → value loads a double from local variable 2
dload_3 29 → value loads a double from local variable 3
dmul 6b value1, value2 → result multiplies two doubles
dneg 77 value → result negates a double
drem 73 value1, value2 → result gets the remainder from a division between two doubles
dreturn af value → [empty] returns a double from a method
dstore 39 index value → stores a double value into a local variable #index
dstore_0 47 value → stores a double into local variable 0
dstore_1 48 value → stores a double into local variable 1
dstore_2 49 value → stores a double into local variable 2
dstore_3 4a value → stores a double into local variable 3
dsub 67 value1, value2 → result subtracts a double from another
dup 59 value → value, value duplicates the value on top of the stack
dup_x1 5a value2, value1 → value1, value2, value1 inserts a copy of the top value into the stack two values from the top
dup_x2 5b value3, value2, value1 → value1, value3, value2, value1 inserts a copy of the top value into the stack two (if value2 is double or long it takes up the entry of value3, too) or three values (if value2 is neither double nor long) from the top
dup2 5c {value2, value1} → {value2, value1}, {value2, value1} duplicates the top two stack words (two values, if value1 is neither double nor long; a single value, if value1 is double or long)
dup2_x1 5d value3, {value2, value1} → {value2, value1}, value3, {value2, value1} duplicate two words and insert beneath third word (see explanation above)
dup2_x2 5e {value4, value3}, {value2, value1} → {value2, value1}, {value4, value3}, {value2, value1} duplicate two words and insert beneath fourth word
F
f2d 8d value → result converts a float to a double
f2i 8b value → result converts a float to an int
f2l 8c value → result converts a float to a long
fadd 62 value1, value2 → result adds two floats
faload 30 arrayref, index → value loads a float from an array
fastore 51 arrayref, index, value → stores a float in an array
fcmpg 96 value1, value2 → result compares two floats (result is 1 if either value is NaN)
fcmpl 95 value1, value2 → result compares two floats (result is -1 if either value is NaN)
fconst_0 0b → 0.0f pushes 0.0f on the stack
fconst_1 0c → 1.0f pushes 1.0f on the stack
fconst_2 0d → 2.0f pushes 2.0f on the stack
fdiv 6e value1, value2 → result divides two floats
fload 17 index → value loads a float value from a local variable #index
fload_0 22 → value loads a float value from local variable 0
fload_1 23 → value loads a float value from local variable 1
fload_2 24 → value loads a float value from local variable 2
fload_3 25 → value loads a float value from local variable 3
fmul 6a value1, value2 → result multiplies two floats
fneg 76 value → result negates a float
frem 72 value1, value2 → result gets the remainder from a division between two floats
freturn ae value → [empty] returns a float from method
fstore 38 index value → stores a float value into a local variable #index
fstore_0 43 value → stores a float value into local variable 0
fstore_1 44 value → stores a float value into local variable 1
fstore_2 45 value → stores a float value into local variable 2
fstore_3 46 value → stores a float value into local variable 3
fsub 66 value1, value2 → result subtracts two floats
G
getfield b4 index1, index2 objectref → value gets a field value of an object objectref, where the field is identified by field reference in the constant pool index (index1 << 8 + index2)
getstatic b2 index1, index2 → value gets a static field value of a class, where the field is identified by field reference in the constant pool index (index1 << 8 + index2)
goto a7 branchbyte1, branchbyte2 [no change] goes to another instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
goto_w c8 branchbyte1, branchbyte2, branchbyte3, branchbyte4 [no change] goes to another instruction at branchoffset (signed int constructed from unsigned bytes branchbyte1 << 24 + branchbyte2 << 16 + branchbyte3 << 8 + branchbyte4)
I
i2b 91 value → result converts an int into a byte
i2c 92 value → result converts an int into a character
i2d 87 value → result converts an int into a double
i2f 86 value → result converts an int into a float
i2l 85 value → result converts an int into a long
i2s 93 value → result converts an int into a short
iadd 60 value1, value2 → result adds two ints together
iaload 2e arrayref, index → value loads an int from an array
iand 7e value1, value2 → result performs a bitwise and on two integers
iastore 4f arrayref, index, value → stores an int into an array
iconst_m1 02 → -1 loads the int value -1 onto the stack
iconst_0 03 → 0 loads the int value 0 onto the stack
iconst_1 04 → 1 loads the int value 1 onto the stack
iconst_2 05 → 2 loads the int value 2 onto the stack
iconst_3 06 → 3 loads the int value 3 onto the stack
iconst_4 07 → 4 loads the int value 4 onto the stack
iconst_5 08 → 5 loads the int value 5 onto the stack
idiv 6c value1, value2 → result divides two integers
if_acmpeq a5 branchbyte1, branchbyte2 value1, value2 → if references are equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_acmpne a6 branchbyte1, branchbyte2 value1, value2 → if references are not equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpeq 9f branchbyte1, branchbyte2 value1, value2 → if ints are equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpne a0 branchbyte1, branchbyte2 value1, value2 → if ints are not equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmplt a1 branchbyte1, branchbyte2 value1, value2 → if value1 is less than value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpge a2 branchbyte1, branchbyte2 value1, value2 → if value1 is greater than or equal to value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpgt a3 branchbyte1, branchbyte2 value1, value2 → if value1 is greater than value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmple a4 branchbyte1, branchbyte2 value1, value2 → if value1 is less than or equal to value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifeq 99 branchbyte1, branchbyte2 value → if value is 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifne 9a branchbyte1, branchbyte2 value → if value is not 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
iflt 9b branchbyte1, branchbyte2 value → if value is less than 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifge 9c branchbyte1, branchbyte2 value → if value is greater than or equal to 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifgt 9d branchbyte1, branchbyte2 value → if value is greater than 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifle 9e branchbyte1, branchbyte2 value → if value is less than or equal to 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifnonnull c7 branchbyte1, branchbyte2 value → if value is not null, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifnull c6 branchbyte1, branchbyte2 value → if value is null, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
iinc 84 index, const [No change] increment local variable #index by signed byte const
iload 15 index → value loads an int value from a variable #index
iload_0 1a → value loads an int value from variable 0
iload_1 1b → value loads an int value from variable 1
iload_2 1c → value loads an int value from variable 2
iload_3 1d → value loads an int value from variable 3
imul 68 value1, value2 → result multiplies two integers
ineg 74 value → result negates an int
instanceof c1 indexbyte1, indexbyte2 objectref → result determines if an object objectref is of a given type, identified by class reference index in constant pool (indexbyte1 << 8 + indexbyte2)
invokeinterface b9 indexbyte1, indexbyte2, count, 0 objectref, [arg1, arg2, ...] → invokes an interface method on object objectref, where the interface method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2); count is the number of arguments to pop from the stack frame, including the object on which the method is being called, and must always be greater than or equal to 1
invokespecial b7 indexbyte1, indexbyte2 objectref, [arg1, arg2, ...] → invoke instance method on object objectref, where the method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2)
invokestatic b8 indexbyte1, indexbyte2 [arg1, arg2, ...] → invoke a static method, where the method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2)
invokevirtual b6 indexbyte1, indexbyte2 objectref, [arg1, arg2, ...] → invoke virtual method on object objectref, where the method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2)
ior 80 value1, value2 → result bitwise int or
irem 70 value1, value2 → result int remainder
ireturn ac value → [empty] returns an integer from a method
ishl 78 value1, value2 → result int shift left
ishr 7a value1, value2 → result int arithmetic (sign-preserving) shift right
istore 36 index value → store int value into variable #index
istore_0 3b value → store int value into variable 0
istore_1 3c value → store int value into variable 1
istore_2 3d value → store int value into variable 2
istore_3 3e value → store int value into variable 3
isub 64 value1, value2 → result int subtract
iushr 7c value1, value2 → result int logical (unsigned) shift right
ixor 82 value1, value2 → result bitwise int xor
J
jsr a8 branchbyte1, branchbyte2 → address jump to subroutine at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2) and place the return address on the stack
jsr_w c9 branchbyte1, branchbyte2, branchbyte3, branchbyte4 → address jump to subroutine at branchoffset (signed int constructed from unsigned bytes branchbyte1 << 24 + branchbyte2 << 16 + branchbyte3 << 8 + branchbyte4) and place the return address on the stack
L
l2d 8a value → result converts a long to a double
l2f 89 value → result converts a long to a float
l2i 88 value → result converts a long to an int
ladd 61 value1, value2 → result add two longs
laload 2f arrayref, index → value load a long from an array
land 7f value1, value2 → result bitwise and of two longs
lastore 50 arrayref, index, value → store a long to an array
lcmp 94 value1, value2 → result compares two long values
lconst_0 09 → 0L pushes the long 0 onto the stack
lconst_1 0a → 1L pushes the long 1 onto the stack
ldc 12 index → value pushes a constant #index from a constant pool (String, int, float or class type) onto the stack
ldc_w 13 indexbyte1, indexbyte2 → value pushes a constant #index from a constant pool (String, int, float or class type) onto the stack (wide index is constructed as indexbyte1 << 8 + indexbyte2)
ldc2_w 14 indexbyte1, indexbyte2 → value pushes a constant #index from a constant pool (double or long) onto the stack (wide index is constructed as indexbyte1 << 8 + indexbyte2)
ldiv 6d value1, value2 → result divide two longs
lload 16 index → value load a long value from a local variable #index
lload_0 1e → value load a long value from local variable 0
lload_1 1f → value load a long value from local variable 1
lload_2 20 → value load a long value from local variable 2
lload_3 21 → value load a long value from local variable 3
lmul 69 value1, value2 → result multiplies two longs
lneg 75 value → result negates a long
lookupswitch ab <0-3 bytes padding>, defaultbyte1, defaultbyte2, defaultbyte3, defaultbyte4, npairs1, npairs2, npairs3, npairs4, match-offset pairs... key → a target address is looked up from a table using a key and execution continues from the instruction at that address
lor 81 value1, value2 → result bitwise or of two longs
lrem 71 value1, value2 → result remainder of division of two longs
lreturn ad value → [empty] returns a long value
lshl 79 value1, value2 → result bitwise shift left of a long value1 by value2 positions
lshr 7b value1, value2 → result bitwise shift right of a long value1 by value2 positions, signed (arithmetic)
lstore 37 index value → store a long value in a local variable #index
lstore_0 3f value → store a long value in local variable 0
lstore_1 40 value → store a long value in local variable 1
lstore_2 41 value → store a long value in local variable 2
lstore_3 42 value → store a long value in local variable 3
lsub 65 value1, value2 → result subtract two longs
lushr 7d value1, value2 → result bitwise shift right of a long value1 by value2 positions, unsigned
lxor 83 value1, value2 → result bitwise exclusive or of two longs
M
monitorenter c2 objectref → enter monitor for object ("grab the lock" - start of synchronized() section)
monitorexit c3 objectref → exit monitor for object ("release the lock" - end of synchronized() section)
multianewarray c5 indexbyte1, indexbyte2, dimensions count1, [count2,...] → arrayref creates a new array of dimensions dimensions with elements of type identified by class reference in constant pool index (indexbyte1 << 8 + indexbyte2); the size of each dimension is given by count1, [count2, etc.]
N
new bb indexbyte1, indexbyte2 → objectref creates new object of type identified by class reference in constant pool index (indexbyte1 << 8 + indexbyte2)
newarray bc atype count → arrayref creates new array with count elements of primitive type identified by atype
nop 00 [No change] performs no operation
P
pop 57 value → discards the top value on the stack
pop2 58 {value2, value1} → discards the top two values on the stack (or one value, if it is a double or long)
putfield b5 indexbyte1, indexbyte2 objectref, value → set field to value in an object objectref, where the field is identified by a field reference index in constant pool (indexbyte1 << 8 + indexbyte2)
putstatic b3 indexbyte1, indexbyte2 value → set static field to value in a class, where the field is identified by a field reference index in constant pool (indexbyte1 << 8 + indexbyte2)
R
ret a9 index [No change] continue execution from address taken from a local variable #index (the asymmetry with jsr is intentional)
return b1 → [empty] return void from method
S
saload 35 arrayref, index → value load short from array
sastore 56 arrayref, index, value → store short to array
sipush 11 byte1, byte2 → value pushes a signed short (byte1 << 8 + byte2) onto the stack as an integer value
swap 5f value2, value1 → value1, value2 swaps two top words on the stack (note that value1 and value2 must not be double or long)
T
tableswitch aa [0-3 bytes padding], defaultbyte1, defaultbyte2, defaultbyte3, defaultbyte4, lowbyte1, lowbyte2, lowbyte3, lowbyte4, highbyte1, highbyte2, highbyte3, highbyte4, jump offsets... index → continue execution from an address in the table at offset index
W
wide c4 opcode, indexbyte1, indexbyte2 or iinc, indexbyte1, indexbyte2, countbyte1, countbyte2 [same as for corresponding instructions] executes opcode, where opcode is either iload, fload, aload, lload, dload, istore, fstore, astore, lstore, dstore, or ret, but assumes the index is 16 bits; or executes iinc, where the index is 16 bits and the constant to increment by is a signed 16-bit short
Unused
breakpoint ca reserved for breakpoints in Java debuggers; should not appear in any class file
impdep1 fe reserved for implementation-dependent operations within debuggers; should not appear in any class file
impdep2 ff reserved for implementation-dependent operations within debuggers; should not appear in any class file
(no name) cb-fd these values are currently unassigned for opcodes and are reserved for future use
xxxunusedxxx ba this opcode is reserved "for historical reasons"
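
Many rows of the table combine operand bytes with shorthand such as indexbyte1 << 8 + indexbyte2. The sketch below shows one way to write that decoding in Java. Parentheses and masking are needed there, because + binds more tightly than << and because Java bytes are signed. The class and method names are hypothetical, and the sample bytes are derived from the example listing earlier in this article (constant pool index #84, and the "goto 2" at offset 41, whose relative offset is -39).

```java
/** Hypothetical helpers for combining operand bytes; not part of any JDK API. */
final class OperandDecoding {
    /** Unsigned 16-bit value, e.g. a constant pool index. */
    static int unsignedIndex(byte b1, byte b2) {
        return ((b1 & 0xFF) << 8) | (b2 & 0xFF);
    }

    /** Signed 16-bit branch offset, as used by goto and the if* instructions. */
    static int branchOffset(byte b1, byte b2) {
        return (short) (((b1 & 0xFF) << 8) | (b2 & 0xFF));
    }

    /** Signed 32-bit branch offset, as used by goto_w and jsr_w. */
    static int wideBranchOffset(byte b1, byte b2, byte b3, byte b4) {
        return ((b1 & 0xFF) << 24) | ((b2 & 0xFF) << 16) | ((b3 & 0xFF) << 8) | (b4 & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(unsignedIndex((byte) 0x00, (byte) 0x54)); // prints 84
        int offset = branchOffset((byte) 0xFF, (byte) 0xD9);
        System.out.println(offset);                                  // prints -39
        System.out.println(41 + offset);                             // prints 2, the branch target
    }
}
```

Branch offsets are relative to the address of the branching instruction itself, which is why the disassembly shows the absolute target 2 while the encoded operand is -39.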

Support for Dynamic Languages

Main article: JVM Languages

The Java Virtual Machine currently has no built-in support for dynamically typed languages, because the existing JVM instruction set is statically typed.[5]

JSR 292 (Supporting Dynamically Typed Languages on the Java™ Platform)[6] proposes to add a new invokedynamic instruction at the JVM level, to allow method invocation relying on dynamic type checking (instead of the existing, statically type-checked invokevirtual instruction).
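
Without such an instruction, a call whose receiver type is unknown at compile time has to be resolved by other means at run time. One possibility, shown here purely as an illustration, is the standard java.lang.reflect API, which looks a method up by name on the receiver's run-time class. The class and method names DynamicCallSketch and call are hypothetical, and the sketch does not describe how any particular JVM language is implemented.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of a run-time method lookup via reflection. */
final class DynamicCallSketch {
    /** Calls a public no-argument method, chosen by name at run time. */
    static Object call(Object receiver, String methodName) {
        try {
            Method method = receiver.getClass().getMethod(methodName);
            return method.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            // Lookup or invocation failed; report it as an unchecked exception.
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("x");
        System.out.println(call("bytecode", "length")); // prints 8
        System.out.println(call(list, "size"));         // prints 1
    }
}
```

The same call site works for a String and an ArrayList even though the two classes share no common declaration of these methods, which is the kind of invocation a statically typed invokevirtual cannot express directly.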

References

See also

External links