Class (file format)
From Wikipedia, the free encyclopedia
In the Java programming language, source files (.java files) are compiled into class files, which have a .class extension. Because Java is designed to be platform-independent, the compiler translates source code into bytecode and stores it in .class files. If a source file declares more than one class, each class is compiled into a separate .class file. These .class files can be loaded by any Java Virtual Machine (JVM).
Since JVMs are available for many platforms, a .class file compiled on one platform will execute on a JVM running on another platform. This is what makes Java platform-independent.
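As a small illustration of the one-file-per-class rule described above, consider a hypothetical source file named Shapes.java (the file name and every class name in it are invented for this sketch). Compiling it with javac should produce three class files: Shape.class, Circle.class and Shape$Origin.class, the last because a nested class is named after its enclosing class.

```java
// Hypothetical file Shapes.java; all names are illustrative only.
class Shape {
    // A nested class is compiled to its own file, Shape$Origin.class.
    static class Origin {
        static final int X = 0;
        static final int Y = 0;
    }

    double area() {
        return 0.0;
    }
}

// A second top-level class in the same source file gets Circle.class.
class Circle extends Shape {
    final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() {
        return Math.PI * radius * radius;
    }
}
```

The binary name reported by `Shape.Origin.class.getName()` ends in `Shape$Origin`, which matches the name of the generated file.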
As of 2006, modifications to the class file format are being considered under Java Specification Request (JSR) 202.
Structure
Table representation of the class file format
The structure of the class file format can be visualized by a table as follows:
byte offset in file | value | item | size | description |
---|---|---|---|---|
0x00000000 to 0x00000003 | 0xCAFEBABE (bytes CA FE BA BE; binary 1100 1010 1111 1110 1011 1010 1011 1110) | magic number | 4 bytes | magic number used to identify the file as conforming to the class file format |
0x00000004 to 0x00000005 | u2 minor_version | minor version number | 2 bytes | minor version number of the class file format being used |
0x00000006 to 0x00000007 | u2 major_version | major version number | 2 bytes | major version number of the class file format being used |
0x00000008 to 0x00000009 | u2 constant_pool_count (one plus the number of entries in the constant pool table) | constant pool count | 2 bytes | number of entries in the constant pool table, plus one |
0x0000000A onward | cp_info constant_pool[constant_pool_count - 1] | constant pool | variable length (entries differ in size, so the total is the sum of the individual entry sizes) | constant pool table |
new_address1 = 0x0000000A + sizeof(constant_pool) | u2 access_flags | access flags | 2 bytes | access flags for the class |
new_address1 + 2 | u2 this_class | this class | 2 bytes | constant pool index identifying this class |
new_address1 + 4 | u2 super_class | super class | 2 bytes | constant pool index identifying the superclass |
new_address1 + 6 | u2 interfaces_count | interface count | 2 bytes | number of entries in the interface table |
new_address1 + 8 | u2 interfaces[interfaces_count] | interface table | variable length (interface count * 2 bytes) | table of superinterfaces of this class |
new_address2 = new_address1 + 8 + sizeof(interfaces) | u2 fields_count | field count | 2 bytes | number of entries in the field table |
new_address2 + 2 | field_info fields[fields_count] | field table | variable length (sum of the sizes of the field_info entries) | table of fields of this class |
new_address3 = new_address2 + 2 + sizeof(fields) | u2 methods_count | method count | 2 bytes | number of entries in the method table |
new_address3 + 2 | method_info methods[methods_count] | method table | variable length (sum of the sizes of the method_info entries) | table of methods of this class |
new_address4 = new_address3 + 2 + sizeof(methods) | u2 attributes_count | attribute count | 2 bytes | number of entries in the attribute table |
new_address4 + 2 | attribute_info attributes[attributes_count] | attribute table | variable length (sum of the sizes of the attribute_info entries) | attribute table |
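
The fixed-size items at the start of the table can be read with a few lines of Java. The following is a minimal sketch, not a reference implementation: the class name ClassFileHeader and its field names are invented here, and it relies on java.nio.ByteBuffer defaulting to big-endian byte order, which matches the byte order of multi-byte items in a class file.

```java
import java.nio.ByteBuffer;

// Illustrative helper; the class and field names are not part of any standard API.
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Reads the first 10 bytes: magic, minor_version, major_version, constant_pool_count.
    static ClassFileHeader parse(byte[] classFile) {
        ByteBuffer buf = ByteBuffer.wrap(classFile);   // big-endian by default
        int magic = buf.getInt();                      // u4 at offset 0
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;           // u2 at offset 4
        int major = buf.getShort() & 0xFFFF;           // u2 at offset 6
        int cpCount = buf.getShort() & 0xFFFF;         // u2 at offset 8
        return new ClassFileHeader(minor, major, cpCount);
    }
}
```

For example, passing the ten bytes CA FE BA BE 00 00 00 34 00 10 (arbitrary sample data) yields a minor version of 0, a major version of 52 and a constant pool count of 16. An array shorter than ten bytes causes ByteBuffer to throw a BufferUnderflowException.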
C programming language representation of the class file format
The structure of the class file format can be described using C-like notation as follows (the variable-length arrays make this a sketch of the layout rather than a compilable C declaration):
    struct Class_File_Format {
        u4 magic_number;          // unsigned, 4 byte (32 bit) number that
                                  // indicates the start of a class file;
                                  // the actual value is defined in the Java
                                  // Virtual Machine Specification as
                                  // 0xCAFEBABE in hexadecimal, which equals
                                  // 1100 1010 1111 1110 1011 1010 1011 1110
                                  // in binary, and 3,405,691,582 in decimal

        u2 minor_version;         // unsigned, 2 byte (16 bit) minor version number
        u2 major_version;         // unsigned, 2 byte (16 bit) major version number

        u2 constant_pool_count;   // unsigned, 2 byte (16 bit) number
                                  // indicating the number of entries
                                  // in the constant pool table, plus one

        // the constant pool table
        cp_info constant_pool[constant_pool_count - 1];

        u2 access_flags;
        u2 this_class;
        u2 super_class;

        u2 interfaces_count;      // unsigned, 2 byte (16 bit) number
                                  // indicating the number of entries
                                  // in the table of superinterfaces
                                  // of this class

        // the table of superinterfaces of this class
        u2 interfaces[interfaces_count];

        u2 fields_count;          // unsigned, 2 byte (16 bit) number
                                  // indicating the number of entries in
                                  // the table of fields of this class

        // the table of fields of this class
        field_info fields[fields_count];

        u2 methods_count;         // unsigned, 2 byte (16 bit) number
                                  // indicating the number of entries in
                                  // the table of methods of this class

        // the table of methods of this class
        method_info methods[methods_count];

        u2 attributes_count;      // unsigned, 2 byte (16 bit) number
                                  // indicating the number of
                                  // attributes in the attributes table

        // the attributes table
        attribute_info attributes[attributes_count];
    };
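Because cp_info entries differ in size, the offset of access_flags cannot be computed by multiplication; a reader has to walk the constant pool entry by entry. The sketch below shows one way to do that in Java. The class and method names are hypothetical, and the table of tags and payload sizes is written from the Java Virtual Machine Specification as understood here (tags 15 to 20 come from editions later than the second edition), so it should be checked against the edition in use.

```java
import java.nio.ByteBuffer;

// Illustrative sketch; class and method names are hypothetical.
final class ConstantPoolWalker {

    // Advances buf past the whole constant pool. buf must be positioned at
    // the first cp_info entry (byte offset 10 in the file).
    static void skipConstantPool(ByteBuffer buf, int constantPoolCount) {
        // Valid indices run from 1 to constantPoolCount - 1.
        for (int index = 1; index < constantPoolCount; index++) {
            int tag = buf.get() & 0xFF;            // u1 tag selects the entry layout
            int payload;
            switch (tag) {
                case 1:                            // Utf8: u2 length, then that many bytes
                    payload = buf.getShort() & 0xFFFF;
                    break;
                case 3: case 4:                    // Integer, Float
                case 9: case 10: case 11:          // Fieldref, Methodref, InterfaceMethodref
                case 12:                           // NameAndType
                case 17: case 18:                  // Dynamic, InvokeDynamic
                    payload = 4;
                    break;
                case 5: case 6:                    // Long, Double
                    payload = 8;
                    index++;                       // these occupy two constant pool slots
                    break;
                case 7: case 8:                    // Class, String
                case 16:                           // MethodType
                case 19: case 20:                  // Module, Package
                    payload = 2;
                    break;
                case 15:                           // MethodHandle
                    payload = 3;
                    break;
                default:
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
            buf.position(buf.position() + payload);
        }
    }

    // Returns { access_flags, this_class, super_class, interfaces_count }.
    static int[] readClassInfo(byte[] classFile) {
        ByteBuffer buf = ByteBuffer.wrap(classFile);   // big-endian by default
        buf.position(8);                               // skip magic and version numbers
        int constantPoolCount = buf.getShort() & 0xFFFF;
        skipConstantPool(buf, constantPoolCount);
        int[] info = new int[4];
        for (int i = 0; i < info.length; i++) {
            info[i] = buf.getShort() & 0xFFFF;         // four consecutive u2 items
        }
        return info;
    }
}
```

The sketch does not validate the magic number or interpret the entries; it only demonstrates why the constant pool must be traversed before the later items can be located.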
Trivia
Class files are identified by the following 4 byte header (in hexadecimal): CA FE BA BE.
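A check for this header can be sketched in Java as follows. The helper name looksLikeClassFile is hypothetical; the four bytes are combined most significant byte first, which is the order in which they appear in the file.

```java
// Illustrative helper; the names are hypothetical.
final class MagicNumber {
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // True if the array starts with the four bytes CA FE BA BE.
    static boolean looksLikeClassFile(byte[] bytes) {
        if (bytes == null || bytes.length < 4) {
            return false;
        }
        // Mask each byte to avoid sign extension, then assemble big-endian.
        int value = ((bytes[0] & 0xFF) << 24)
                  | ((bytes[1] & 0xFF) << 16)
                  | ((bytes[2] & 0xFF) << 8)
                  |  (bytes[3] & 0xFF);
        return value == CLASS_MAGIC;
    }
}
```

Since Java's int is signed, `Integer.toUnsignedLong(MagicNumber.CLASS_MAGIC)` is needed to obtain the unsigned decimal value 3,405,691,582 quoted in the structure comments.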
The history of this magic number was explained by James Gosling:
"We used to go to lunch at a place called St Michael's Alley. According to local legend, in the deep dark past, the Grateful Dead used to perform there before they made it big. It was a pretty funky place that was definitely a Grateful Dead Kinda Place. When Jerry died, they even put up a little Buddhist-esque shrine. When we used to go there, we referred to the place as Cafe Dead. Somewhere along the line it was noticed that this was a HEX number. I was re-vamping some file format code and needed a couple of magic numbers: one for the persistent object file, and one for classes. I used CAFEDEAD for the object file format, and in grepping for 4 character hex words that fit after "CAFE" (it seemed to be a good theme) I hit on BABE and decided to use it. At that time, it didn't seem terribly important or destined to go anywhere but the trash-can of history. So CAFEBABE became the class file format, and CAFEDEAD was the persistent object format. But the persistent object facility went away, and along with it went the use of CAFEDEAD - it was eventually replaced by RMI."
References
- The Java™ Virtual Machine Specification, Second Edition, the official defining document of the Java Virtual Machine (including the class file format) as specified by Sun Microsystems. Available online at http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html and in print as ISBN 0-201-43294-3. Both the first and second editions are freely available for viewing and download at http://java.sun.com/docs/books/vmspec/
- JSR 202 Java Class File Specification Update
- James Gosling, private communication to Bill Bumgarner: http://bbum.pycs.net/2003/01/28.html (dead link)