Java Code To Byte Code - Part One (reference)

Published: 2020-12-14 06:17:43 | Category: Java | Source: collected from the web


Understanding how Java code is compiled into byte code and executed on a Java Virtual Machine (JVM) is critical because it helps you understand what is happening as your program executes. This understanding not only ensures that language features make logical sense but also makes it possible to understand the trade-offs and side effects when making certain decisions.

This article explains how Java code is compiled into byte code and executed on the JVM. To understand the internal architecture of the JVM and the different memory areas used during byte code execution, see my previous article on JVM internals.

This article is split into three parts, with each part subdivided into sections. Each section can be read in isolation; however, the concepts generally build on each other, so it is easiest to read the sections in order. Each section covers different Java code structures and explains how they are compiled and executed as byte code, as follows:

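Part 1 - Basic Programming Concepts (this article)
- variables (local variables, fields, constant fields, static variables)
- conditionals (if-else, switch)
- loops (while-loop, for-loop, do-while-loop)
Part 2 - Object Orientation And Safety (next article)
- method invocation
- new (objects and arrays)
Part 3 - Metaprogramming (future article)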
- generics
- annotations
- reflection

This article includes many code examples and shows the corresponding typical byte code that is generated. The number that precedes each instruction (or opcode) in the byte code indicates its byte position. For example, an instruction such as 1: iconst_1 is only one byte in length, as there is no operand, so the following byte code would be at 2. An instruction such as 1: bipush 5 would take two bytes, one byte for the opcode bipush and one byte for the operand 5. In this case the following byte code would be at 3, as the operand occupies the byte at position 2.
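The position arithmetic described above can be sketched in a few lines of Java. This is only an illustration: the class name BytePositions and its small length table are hypothetical and cover just a handful of opcodes, whereas a real class file would be decoded by a tool such as javap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only: it knows the total length
// (opcode plus operands) of a few instructions and derives each byte position.
class BytePositions {

    // Total instruction length in bytes for a handful of opcodes.
    static final Map<String, Integer> LENGTH = Map.of(
            "iconst_1", 1,  // opcode only
            "istore_0", 1,  // opcode only
            "bipush", 2,    // opcode + one-byte operand
            "return", 1);   // opcode only

    // Returns the byte position of every instruction, starting at startPosition.
    static List<Integer> positions(int startPosition, String... mnemonics) {
        List<Integer> result = new ArrayList<>();
        int position = startPosition;
        for (String mnemonic : mnemonics) {
            result.add(position);
            position += LENGTH.get(mnemonic);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(positions(1, "iconst_1", "return")); // [1, 2]
        System.out.println(positions(1, "bipush", "return"));   // [1, 3]
    }
}
```

The instruction after a one-byte iconst_1 at position 1 lands at 2, while the instruction after a two-byte bipush at position 1 lands at 3.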

The Java Virtual Machine (JVM) has a stack-based architecture. When each method is executed, including the initial main method, a frame is created on the stack which has a set of local variables. The array of local variables contains all the variables used during the execution of the method, including a reference to this, all method parameters, and other locally defined variables. For class methods (i.e. static methods) the method parameters start from zero; for instance methods, however, the zero slot is reserved for this.

A local variable can be a boolean, byte, char, short, int, float, long, double, reference or returnAddress.

All types take a single slot in the local variable array except long and double, which both take two consecutive slots because these types are double width (64-bit instead of 32-bit).
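As a rough illustration of these slot rules, the following sketch assigns slots to the parameters of a method. The class SlotLayout and its string-based type names are assumptions made for this example; they are not part of any JVM API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: assigns local variable slots to method parameters.
class SlotLayout {

    // paramTypes are Java type names such as "int", "long" or "String";
    // paramNames are the matching parameter names.
    static Map<String, Integer> slots(boolean isStatic, String[] paramTypes, String[] paramNames) {
        Map<String, Integer> layout = new LinkedHashMap<>();
        int next = 0;
        if (!isStatic) {
            layout.put("this", next++); // slot 0 is reserved for this in instance methods
        }
        for (int i = 0; i < paramTypes.length; i++) {
            layout.put(paramNames[i], next);
            boolean doubleWidth = paramTypes[i].equals("long") || paramTypes[i].equals("double");
            next += doubleWidth ? 2 : 1; // long and double take two consecutive slots
        }
        return layout;
    }

    public static void main(String[] args) {
        // Instance method: void m(int a, long b, String c)
        System.out.println(slots(false,
                new String[] {"int", "long", "String"},
                new String[] {"a", "b", "c"})); // {this=0, a=1, b=2, c=4}
    }
}
```

Because b is a long it occupies slots 2 and 3, which is why c ends up in slot 4.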

When a new variable is created the operand stack is used to store the value of the new variable. The value of the new variable is then stored into the local variables array in the correct slot. If the variable is not a primitive value then the local variable slot only stores a reference. The reference points to an object stored in the heap.

For example:

int i = 5;

is compiled to:

0: bipush      5
2: istore_0

bipush adds a byte as an integer to the operand stack; in this case 5 is added to the operand stack.

istore_0 is one of a group of opcodes with the format istore_<n>; they all store an integer into a local variable. The <n> refers to the location in the local variable array that is being stored and can only be 0, 1, 2 or 3. Another opcode, istore, is used for locations higher than 3; it takes an operand for the location in the local variable array.

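In memory, when this is executed, the value 5 is first pushed onto the operand stack and is then popped from it and stored in the chosen slot of the local variable array.

The class file also contains a local variable table for each method. If this code were included in a method, that method's local variable table in the class file would contain an entry for i giving its slot, name and type signature.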

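A field (or class variable) is stored on the heap as part of a class instance (or object). Information about the field is added into the field_info array in the class file as shown below.

ClassFile {
    u4              magic;
    u2              minor_version;
    u2              major_version;
    u2              constant_pool_count;
    cp_info         constant_pool[constant_pool_count - 1];
    u2              access_flags;
    u2              this_class;
    u2              super_class;
    u2              interfaces_count;
    u2              interfaces[interfaces_count];
    u2              fields_count;
    field_info      fields[fields_count];
    u2              methods_count;
    method_info     methods[methods_count];
    u2              attributes_count;
    attribute_info  attributes[attributes_count];
}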

In addition, if the variable is initialized, the byte code to do the initialization is added into the constructor.

When the following Java code is compiled:

public class SimpleClass {

    public int simpleField = 100;

}

an extra section appears when you run javap, demonstrating the field added to the field_info array:

public int simpleField;
    Signature: I
    flags: ACC_PUBLIC

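The byte code for the initialization is added into the constructor (the instructions at positions 4 to 7), as follows:

public SimpleClass();
  Signature: ()V
  flags: ACC_PUBLIC
  Code:
    stack=2, locals=1, args_size=1
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: aload_0
       5: bipush        100
       7: putfield      #2                  // Field simpleField:I
      10: return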

aload_0 loads an object reference from a local variable slot onto the top of the operand stack. Although no constructor is shown in the Java code, the initialization code for instance variables (fields) is actually executed in the default constructor created by the compiler. As a result the first local variable actually points to this, therefore the aload_0 opcode loads the this reference onto the operand stack. aload_0 is one of a group of opcodes with the format aload_<n>; they all load an object reference onto the operand stack. The <n> refers to the location in the local variable array that is being accessed but can only be 0, 1, 2 or 3. There are other similar opcodes for loading values that are not an object reference: iload_<n>, lload_<n>, fload_<n> and dload_<n>, where i is for int, l is for long, f is for float and d is for double. Local variables with an index higher than 3 can be loaded using iload, lload, fload, dload and aload; these opcodes all take a single operand that specifies the index of the local variable to load.

invokespecial is used to invoke instance initialization methods as well as private methods and methods of a superclass of the current class. It is part of a group of opcodes that invoke methods in different ways, which includes invokedynamic, invokeinterface, invokespecial, invokestatic and invokevirtual. The invokespecial instruction in this code is invoking the superclass constructor, i.e. the constructor of java.lang.Object.

bipush adds a byte as an integer to the operand stack; in this case 100 is added to the operand stack.

putfield takes a single operand that references a field in the run time constant pool, in this case the field called simpleField. The aload_0 previously added the object that contains the field and the bipush previously added the 100 to the operand stack. The putfield then removes (pops) both of these values from the operand stack. The end result is that the field simpleField on the this object is updated with the value 100.
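One observable consequence of the field initialization living in the constructor, after the call to the superclass constructor, can be sketched as follows. The classes Base, Derived and FieldInitOrder are hypothetical and exist only for this illustration.

```java
// Illustration only: Base and Derived are hypothetical classes, not from the article.
class Base {
    final int seenDuringSuperConstructor;

    Base() {
        // Runs before the field initializers of any subclass have executed.
        seenDuringSuperConstructor = readField();
    }

    int readField() {
        return -1;
    }
}

class Derived extends Base {
    public int simpleField = 100;

    // The compiler-generated constructor behaves like:
    //   Derived() { super(); this.simpleField = 100; }

    @Override
    int readField() {
        return simpleField;
    }
}

class FieldInitOrder {
    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringSuperConstructor); // 0, the initializer had not run yet
        System.out.println(d.simpleField);                // 100
    }
}
```

The superclass constructor sees the default value 0 because the bipush and putfield instructions for simpleField only run after invokespecial returns.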

The putfield opcode has a single operand that references the second position in the constant pool. The JVM maintains a per-type constant pool, a run time data structure that is similar to a symbol table although it contains more data. Byte codes in Java require data; often this data is too large to store directly in the byte codes, so instead it is stored in the constant pool and the byte code contains a reference to the constant pool. When a class file is created it has a section for the constant pool as follows:

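Constant pool:
   #1 = Methodref          #4.#16         //  java/lang/Object."<init>":()V
   #2 = Fieldref           #3.#17         //  SimpleClass.simpleField:I
   #3 = Class              #13            //  SimpleClass
   #4 = Class              #19            //  java/lang/Object
   #5 = Utf8               simpleField
   #6 = Utf8               I
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               LocalVariableTable
  #12 = Utf8               this
  #13 = Utf8               SimpleClass
  #14 = Utf8               SourceFile
  #15 = Utf8               SimpleClass.java
  #16 = NameAndType        #7:#8          //  "<init>":()V
  #17 = NameAndType        #5:#6          //  simpleField:I
  #18 = Utf8               LSimpleClass;
  #19 = Utf8               java/lang/Object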

A constant field with the final modifier is flagged as ACC_FINAL in the class file.

For example:

public class SimpleClass {

    public final int simpleField = 100;

}

The field description is augmented with ACC_FINAL:

public final int simpleField = 100;
    Signature: I
    flags: ACC_PUBLIC, ACC_FINAL
    ConstantValue: int 100

The initialization in the constructor is however unaffected:

4: aload_0
5: bipush        100
7: putfield      #2                  // Field simpleField:I

A static class variable with the static modifier is flagged as ACC_STATIC in the class file as follows:

public static int simpleField;
    Signature: I
    flags: ACC_PUBLIC, ACC_STATIC

The byte code for initialization of static variables is not found in the instance constructor <init>. Instead static fields are initialized as part of the class constructor <clinit>, using the putstatic opcode instead of the putfield opcode.

  Code:
    stack=1, locals=0, args_size=0
       0: bipush        100
       2: putstatic     #2                  // Field simpleField:I
       5: return
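The final and static flags described above can also be observed at run time through reflection. The sketch below is only an illustration: the classes FlagsExample and FieldFlags and their field names are made up for this example, while Field.getModifiers() and Modifier.toString() are standard java.lang.reflect calls.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical class for illustration; the field names are not from the article.
class FlagsExample {
    public int instanceField = 100;
    public final int finalField = 100;
    public static int staticField = 100;
}

class FieldFlags {

    // Returns the modifiers of the named field as text, e.g. "public final".
    static String flags(String fieldName) {
        try {
            Field field = FlagsExample.class.getDeclaredField(fieldName);
            return Modifier.toString(field.getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("unknown field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(flags("instanceField")); // public
        System.out.println(flags("finalField"));    // public final
        System.out.println(flags("staticField"));   // public static
    }
}
```

The modifier bits reported here are the same access flags that javap prints as ACC_PUBLIC, ACC_FINAL and ACC_STATIC.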

Conditional flow control, such as if-else statements and switch statements, works by using an instruction that compares two values and branches to another byte code.

Loops, including for-loops and while-loops, work in a similar way except that they typically also include a goto instruction that causes the byte code to loop. do-while-loops do not require any goto instruction because their conditional branch is at the end of the byte code. For more detail on loops see the loops section.

Some opcodes can compare two integers or two references and then perform a branch in a single instruction. Comparisons between other types such as doubles, longs or floats are a two-step process. First the comparison is performed and 1, 0 or -1 is pushed onto the operand stack. Next a branch is performed based on whether the value on the operand stack is greater than, less than or equal to zero.

First the if-else statement will be explained as an example and then the different types of instructions used for branching will be covered in more detail.

The following code example shows a simple if-else comparing two integer parameters.

public int greaterThen(int intOne, int intTwo) {
    if (intOne > intTwo) {
        return 0;
    } else {
        return 1;
    }
}

This method results in the following byte code:

0: iload_1
1: iload_2
2: if_icmple     7
5: iconst_0
6: ireturn
7: iconst_1
8: ireturn

First the two parameters are loaded onto the operand stack using iload_1 and iload_2. if_icmple then compares the top two values on the operand stack. This opcode branches to byte code 7 if intOne is less than or equal to intTwo. Notice this is the exact opposite of the test in the if condition in the Java code, because if the byte code test is successful execution branches to the else-block, whereas in the Java code if the test is successful execution enters the if-block. In other words if_icmple is testing whether the if condition is not true and jumping over the if-block. The body of the if-block is byte code 5 and 6; the body of the else-block is byte code 7 and 8.
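To make the inverted test easier to follow, here is a toy interpreter that steps through the byte code of greaterThen one instruction at a time. It is a sketch only: it assumes the if-block returns 0 and the else-block returns 1, it understands just the opcodes used here, and the class name IfElseByteCode is hypothetical. A real JVM is not implemented this way.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreter, for illustration only, covering just the opcodes of greaterThen.
class IfElseByteCode {

    static int greaterThen(int intOne, int intTwo) {
        // Slot 0 would hold this for an instance method; it is unused here.
        int[] locals = {0, intOne, intTwo};
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // byte position of the next instruction
        while (true) {
            switch (pc) {
                case 0: stack.push(locals[1]); pc = 1; break; // iload_1
                case 1: stack.push(locals[2]); pc = 2; break; // iload_2
                case 2: {                                      // if_icmple 7
                    int value2 = stack.pop();
                    int value1 = stack.pop();
                    pc = (value1 <= value2) ? 7 : 5; // branch to the else-block or fall through
                    break;
                }
                case 5: stack.push(0); pc = 6; break;          // iconst_0
                case 6: return stack.pop();                    // ireturn
                case 7: stack.push(1); pc = 8; break;          // iconst_1
                case 8: return stack.pop();                    // ireturn
                default: throw new IllegalStateException("unexpected position " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(greaterThen(5, 3)); // 0
        System.out.println(greaterThen(3, 5)); // 1
    }
}
```

The only decision point is position 2, where the branch is taken exactly when the Java condition intOne > intTwo is false.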

The following code example shows a slightly more complex example which requires a two-step comparison.

public int greaterThen(float floatOne, float floatTwo) {
    int result;
    if (floatOne > floatTwo) {
        result = 1;
    } else {
        result = 2;
    }
    return result;
}

This method results in the following byte code:

 0: fload_1
 1: fload_2
 2: fcmpl
 3: ifle          11
 6: iconst_1
 7: istore_3
 8: goto          13
11: iconst_2
12: istore_3
13: iload_3
14: ireturn

In this example first the two parameter values are pushed onto the operand stack using fload_1 and fload_2. This example is different from the previous example because of the two-step comparison. fcmpl is first used to compare floatOne and floatTwo and push the result onto the operand stack as follows:

floatOne > floatTwo -> 1
floatOne = floatTwo -> 0
floatOne < floatTwo -> -1
floatOne or floatTwo = NaN -> -1

Next ifle is used to branch to byte code 11 if the result from fcmpl is <= 0.

This example is also different from the previous example in that there is only a single return statement at the end of the method; as a result a goto is required at the end of the if-block to prevent the else-block from also being executed. The goto branches to byte code 13, where iload_3 is then used to push the result stored in local variable slot 3 to the top of the operand stack so that it can be returned by the ireturn instruction.

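As well as comparing numeric values there are comparison opcodes for reference equality i.e. ==, for comparison to null i.e. == null and != null, and for testing an object's type i.e. instanceof.

if_icmp<cond> eq ne lt le gt ge

This group of opcodes compares the top two integers on the operand stack and branches to a new byte code location as specified by the operand. The <cond> can be:

eq - equals
ne - not equals
lt - less than
le - less than or equal
gt - greater than
ge - greater than or equal

if_acmp<cond> eq ne

These two opcodes test whether two references are equal (eq) or not equal (ne) and branch to a new byte code location as specified by the operand.

ifnonnull ifnull

These two opcodes test whether a reference is null or not null and branch to a new byte code location as specified by the operand.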

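lcmp

This opcode compares the top two long values on the operand stack and pushes a value onto the operand stack as follows:

if value1 > value2 -> push 1
if value1 = value2 -> push 0
if value1 < value2 -> push -1

fcmp<cond> l g dcmp<cond> l g

This group of opcodes compares two float or double values and pushes a value onto the operand stack as follows:

if value1 > value2 -> push 1
if value1 = value2 -> push 0
if value1 < value2 -> push -1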

The difference between the two types of opcode ending in l or g is how they handle NaN. The fcmpg and dcmpg instructions push the int value 1 onto the operand stack whereas fcmpl and dcmpl push -1 onto the operand stack. This ensures that when testing two values, if either of them is Not A Number (NaN) then the test will not be successful. For example, if testing if x > y (where x and y are doubles) then dcmpl is used so that if either value is NaN the int value -1 is pushed onto the operand stack. The next opcode will be an ifle instruction, which branches if the value is less than or equal to 0. As a result, if either x or y was NaN the ifle would branch over the if-block, preventing the code in the if-block from being executed.

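instanceof

This opcode pushes an int result of 1 onto the operand stack if the object at the top of the operand stack is an instance of the class specified. The operand for this opcode is used to specify the class by providing an index into the constant pool. If the object is null or not an instance of the specified class then the int result 0 is added to the operand stack.

if<cond> eq ne lt le gt ge

This group of opcodes compares the int value at the top of the operand stack with zero and branches to a new byte code location as specified by the operand.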
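The NaN handling of the l and g comparison opcodes described above can be modelled in plain Java. This is a sketch for illustration: the class FloatCompare and its method names are made up, and the methods only mimic the values the opcodes push. The same reasoning applies to dcmpl and dcmpg with doubles.

```java
// Illustration only: models the results that fcmpl and fcmpg push onto the operand stack.
class FloatCompare {

    static int fcmpl(float value1, float value2) {
        if (value1 > value2) return 1;
        if (value1 == value2) return 0;
        return -1; // value1 < value2, or at least one value is NaN
    }

    static int fcmpg(float value1, float value2) {
        if (value1 < value2) return -1;
        if (value1 == value2) return 0;
        return 1; // value1 > value2, or at least one value is NaN
    }

    // x > y: fcmpl followed by ifle, which skips the if-block when the result is <= 0.
    static boolean greaterThan(float x, float y) {
        boolean branchOverIfBlock = fcmpl(x, y) <= 0;
        return !branchOverIfBlock;
    }

    // x < y: fcmpg followed by ifge, which skips the if-block when the result is >= 0.
    static boolean lessThan(float x, float y) {
        boolean branchOverIfBlock = fcmpg(x, y) >= 0;
        return !branchOverIfBlock;
    }

    public static void main(String[] args) {
        System.out.println(greaterThan(Float.NaN, 1f)); // false
        System.out.println(lessThan(Float.NaN, 1f));    // false
        System.out.println(greaterThan(2f, 1f));        // true
    }
}
```

Choosing the l variant for greater-than tests and the g variant for less-than tests means a NaN operand always makes the branch skip the if-block.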

The type of a Java switch expression must be char, byte, short, int, Character, Byte, Short, Integer, String or an enum type. To support switch statements the JVM uses two special instructions called tableswitch and lookupswitch, which both only work with integer values. The use of only integer values is not a problem because char, byte, short and enum types can all be internally promoted to int. Support for String was also added in Java 7, which will be covered below. tableswitch is typically a faster opcode; however, it also typically takes more memory. tableswitch works by listing all potential case values between the minimum and maximum case values. The minimum and maximum values are also provided so that the JVM can immediately jump to the default-block if the switch variable is not in the range of listed case values. Values for case statements that are not provided in the Java code are also listed, but point to the default-block, to ensure all values between the minimum and maximum are provided. For example, take the following switch statement:

public int simpleSwitch(int intOne) {
    switch (intOne) {
        case 0:
            return 3;
        case 1:
            return 2;
        case 4:
            return 1;
        default:
            return -1;
    }
}

This produces the following byte code:

 0: iload_1
 1: tableswitch   {
         default: 42
             min: 0
             max: 4
               0: 36
               1: 38
               2: 42
               3: 42
               4: 40
    }
36: iconst_3
37: ireturn
38: iconst_2
39: ireturn
40: iconst_1
41: ireturn
42: iconst_m1
43: ireturn

The tableswitch instruction has values for 0, 1 and 4 to match the case statements provided in the code, which each point to the byte code for their respective code block. The tableswitch instruction also has values for 2 and 3; as these are not provided as case statements in the Java code, they both point to the default code block. When the instruction is executed the value at the top of the operand stack is checked to see if it is between the minimum and maximum. If the value is not between the minimum and maximum, execution jumps to the default branch, which is byte code 42 in the above example. To ensure the default branch value can be found in the tableswitch instruction it always comes first (after any required padding for alignment). If the value is between the minimum and maximum it is used to index into the tableswitch and find the correct byte code to branch to; for example, for value 1 above the execution would branch to byte code 38. The following diagram shows how this byte code would be executed:

[Figure: Java byte code for switch using tableswitch]

If the values in the case statements were too far apart (i.e. too sparse) this approach would not be sensible, as it would take too much memory. Instead, when the cases of the switch are sparse a lookupswitch instruction is used. A lookupswitch instruction lists the byte code to branch to for each case statement but it does not list all possible values. When executing the lookupswitch the value at the top of the operand stack is compared against each value in the lookupswitch to determine the correct branch address. With a lookupswitch the JVM therefore searches (looks up) the correct match in a list of matches; this is a slower operation than for the tableswitch, where the JVM just indexes the correct value immediately. When a switch statement is compiled the compiler must trade off memory efficiency against performance to decide which opcode to use. For the following code the compiler produces a lookupswitch:

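public int simpleSwitch(int intOne) {
    switch (intOne) {
        case 10:
            return 1;
        case 20:
            return 2;
        case 30:
            return 3;
        default:
            return -1;
    }
}

This produces the following byte code:

 0: iload_1
 1: lookupswitch  {
         default: 42
           count: 3
              10: 36
              20: 38
              30: 40
    }
36: iconst_1
37: ireturn
38: iconst_2
39: ireturn
40: iconst_3
41: ireturn
42: iconst_m1
43: ireturn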

To allow efficient search algorithms (more efficient than linear search) the number of matches is provided and the matches are sorted. The following diagram shows how this would be executed:

[Figure: Java byte code for switch using lookupswitch]
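The difference between indexing and searching described above can be sketched in Java as follows. The class SwitchDispatch, its parameters and the branch targets used in main are assumptions for illustration only. Binary search is one search strategy that sorted matches make possible, not necessarily what a particular JVM does.

```java
import java.util.Arrays;

// Illustration only: models how the two switch instructions pick a branch target.
class SwitchDispatch {

    // tableswitch: one target per value in [min, max], found by direct indexing.
    static int tableSwitch(int value, int min, int max, int[] targets, int defaultTarget) {
        if (value < min || value > max) {
            return defaultTarget; // out of range, jump straight to the default-block
        }
        return targets[value - min];
    }

    // lookupswitch: sorted match values with parallel targets, found by searching.
    static int lookupSwitch(int value, int[] sortedMatches, int[] targets, int defaultTarget) {
        int index = Arrays.binarySearch(sortedMatches, value);
        return index >= 0 ? targets[index] : defaultTarget;
    }

    public static void main(String[] args) {
        // Hypothetical branch targets for cases 0, 1 and 4 (2 and 3 go to the default).
        int[] table = {36, 38, 42, 42, 40};
        System.out.println(tableSwitch(1, 0, 4, table, 42)); // 38
        System.out.println(tableSwitch(3, 0, 4, table, 42)); // 42

        // Hypothetical sparse cases 10, 20 and 30.
        int[] matches = {10, 20, 30};
        int[] targets = {36, 38, 40};
        System.out.println(lookupSwitch(20, matches, targets, 42)); // 38
        System.out.println(lookupSwitch(15, matches, targets, 42)); // 42
    }
}
```

The table needs one entry for every value between the minimum and maximum, while the lookup list only needs one entry per case, which is the memory versus speed trade-off the compiler weighs.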

String switch

In Java 7 the switch statement added support for the String type. Although the existing opcodes for switches only support int, no new opcodes were added. Instead a switch for the String type is done in two stages. First the hashcode of the value at the top of the operand stack is compared with the hashcode of the value for each case statement. This is done using either a lookupswitch or a tableswitch (depending on the sparsity of the hashcode values). This causes a branch to byte code that calls String.equals() to perform an exact match. The result of String.equals() is used to record which case matched, and a second switch instruction (a tableswitch in the example below) then branches to the code for the correct case statement.

public int simpleSwitch(String stringOne) {
    switch (stringOne) {
        case "a":
            return 0;
        case "b":
            return 2;
        case "c":
            return 3;
        default:
            return 4;
    }
}

The class file containing the byte code for this String switch statement also contains the following constant pool values, which are referenced by that byte code. See the section on the run time constant pool in the JVM internals article for more detail about constant pools.

Constant pool:
  #2 = Methodref          #25.#26        //  java/lang/String.hashCode:()I
  #3 = String             #27            //  a
  #4 = Methodref          #25.#28        //  java/lang/String.equals:(Ljava/lang/Object;)Z
  #5 = String             #29            //  b
  #6 = String             #30            //  c

 #25 = Class              #33            //  java/lang/String
 #26 = NameAndType        #34:#35        //  hashCode:()I
 #27 = Utf8               a
 #28 = NameAndType        #36:#37        //  equals:(Ljava/lang/Object;)Z
 #29 = Utf8               b
 #30 = Utf8               c

 #33 = Utf8               java/lang/String
 #34 = Utf8               hashCode
 #35 = Utf8               ()I
 #36 = Utf8               equals
 #37 = Utf8               (Ljava/lang/Object;)Z

Notice the amount of byte code required to perform this switch, including two tableswitch instructions and several invokevirtual instructions used to call String.equals(). See the section on method invocation in the next article for more detail on invokevirtual. The following diagrams show how this would be executed for the input "b".

[Figure: Java byte code for switch on String - part 1]

[Figure: Java byte code for switch on String - part 2]

[Figure: Java byte code for switch on String - part 3]
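The two stages of a String switch can be approximated at the source level. The sketch below is not the compiler's actual output: the class StringSwitchSketch and the variable names tmp and index are invented for illustration, and the return values are arbitrary.

```java
// A source-level approximation of the two-stage String switch, for illustration only.
class StringSwitchSketch {

    static int simpleSwitch(String stringOne) {
        String tmp = stringOne;
        int index = -1;

        // Stage 1: switch on the hashcode, then confirm the match with String.equals().
        switch (tmp.hashCode()) {
            case 97: // "a".hashCode()
                if (tmp.equals("a")) index = 0;
                break;
            case 98: // "b".hashCode()
                if (tmp.equals("b")) index = 1;
                break;
            case 99: // "c".hashCode()
                if (tmp.equals("c")) index = 2;
                break;
        }

        // Stage 2: switch on the small index to reach the body of the matching case.
        switch (index) {
            case 0:
                return 0;
            case 1:
                return 2;
            case 2:
                return 3;
            default:
                return 4;
        }
    }

    public static void main(String[] args) {
        System.out.println(simpleSwitch("b")); // 2
        System.out.println(simpleSwitch("z")); // 4
    }
}
```

The equals check in stage 1 is needed because two different strings can share a hashcode, so a hashcode match alone does not prove that the case matches.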

The hashcode values for different cases can collide, as for the strings "FB" and "Ea", which both have a hashcode of 2236. This is handled by slightly altering the flow of the equals calls. Take the following example:

public int simpleSwitch(String stringOne) {
    switch (stringOne) {
        case "FB":
            return 0;
        case "Ea":
            return 4;
        default:
            return -1;
    }
}

In the byte code generated for this method, a failed String.equals() test for one of the strings branches to another invocation of String.equals() for the other string, instead of branching straight to the second switch instruction as in the previous example, which had no colliding hashcode values.
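The collision handling can again be approximated at the source level. This is a sketch under stated assumptions: the class HashCollisionSketch and the variables tmp and index are hypothetical, the return values are arbitrary, and the order in which the two strings are tested is not meant to mirror the compiler.

```java
// Source-level approximation of a String switch with colliding hashcodes, for illustration only.
class HashCollisionSketch {

    static int simpleSwitch(String stringOne) {
        String tmp = stringOne;
        int index = -1;

        // A single hashcode entry is shared by both strings, so the equals calls are chained.
        switch (tmp.hashCode()) {
            case 2236: // "FB".hashCode() and "Ea".hashCode()
                if (tmp.equals("Ea")) {
                    index = 1;
                } else if (tmp.equals("FB")) {
                    index = 0;
                }
                break;
        }

        switch (index) {
            case 0:
                return 0;
            case 1:
                return 4;
            default:
                return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("FB".hashCode() == "Ea".hashCode()); // true
        System.out.println(simpleSwitch("FB")); // 0
        System.out.println(simpleSwitch("Ea")); // 4
    }
}
```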

Conditional flow control, such as if-else statements and switch statements, works by using an instruction that compares two values and branches to another byte code. For more detail on conditionals see the conditionals section.

Loops, including for-loops and while-loops, work in a similar way except that they typically also include a goto instruction that causes the byte code to loop. do-while-loops do not require any goto instruction because their conditional branch is at the end of the byte code.

Some opcodes can compare two integers or two references and then perform a branch in a single instruction. Comparisons between other types such as doubles, longs or floats are a two-step process. First the comparison is performed and 1, 0 or -1 is pushed onto the operand stack. Next a branch is performed based on whether the value on the operand stack is greater than, less than or equal to zero. For more detail on the different types of instructions used for branching see above.

while-loops consist of a conditional branch instruction such as if_icmpge or if_icmplt (as described above) and a goto statement. The conditional instruction branches the execution to the instruction immediately after the loop and therefore terminates the loop if the condition is not met. The final instruction in the loop is a goto that branches the byte code back to the beginning of the loop, ensuring the byte code keeps looping until the conditional branch is met, as follows:

public void whileLoop() {
    int i = 0;
    while (i < 10) {
        i++;
    }
}

Is compiled to:

 0: iconst_0
 1: istore_1
 2: iload_1
 3: bipush        10
 5: if_icmpge     14
 8: iinc          1, 1
11: goto          2
14: return

The if_icmpge instruction tests if the local variable in position 1 (i.e. i) is greater than or equal to 10; if it is, the instruction jumps to byte code 14, finishing the loop. The goto instruction keeps the byte code looping until the if_icmpge condition is met, at which point execution branches to the return instruction immediately after the end of the loop. The iinc instruction is one of the few instructions that update a local variable directly without having to load or store values in the operand stack. In this example the iinc instruction increases local variable 1 (i.e. i) by 1.

for-loops and while-loops use an identical pattern in byte code. This is not surprising because all while-loops can easily be re-written as an identical for-loop. The simple while-loop above could, for example, be re-written as a for-loop that produces exactly identical byte code, as follows:

public void forLoop() {
    for (int i = 0; i < 10; i++) {
    }
}
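The branch structure shared by while-loops and for-loops can be made visible in Java source by writing the loop with an explicit exit test. This is only a sketch: the class LoopShapes is hypothetical, the methods return i so the result can be checked, and the limit of 10 is an assumption for illustration.

```java
// Illustration only: three loops that count to the same value.
class LoopShapes {

    // Written so that each line corresponds to one branch-related instruction.
    static int whileAsBranches() {
        int i = 0;
        while (true) {           // start of the loop, the target of the goto
            if (i >= 10) break;  // conditional branch out of the loop (if_icmpge)
            i++;                 // iinc
        }                        // goto back to the start of the loop
        return i;
    }

    static int whileLoop() {
        int i = 0;
        while (i < 10) {
            i++;
        }
        return i;
    }

    static int forLoop() {
        int i;
        for (i = 0; i < 10; i++) {
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(whileAsBranches()); // 10
        System.out.println(whileLoop());       // 10
        System.out.println(forLoop());         // 10
    }
}
```

Note that the exit test in whileAsBranches is the negation of the loop condition i < 10, just as the branch instruction tests the opposite of the Java condition.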

do-while-loop

do-while-loops are also very similar to for-loops and while-loops except that they do not require the goto instruction, as the conditional branch is the last instruction and is used to loop back to the beginning.

public void doWhileLoop() {
    int i = 0;
    do {
        i++;
    } while (i < 10);
}

Results in the following byte code:

 0: iconst_0
 1: istore_1
 2: iinc          1, 1
 5: iload_1
 6: bipush        10
 8: if_icmplt     2
11: return

[Figure: Java byte code for do-while loop - part 1]

[Figure: Java byte code for do-while loop - part 2]
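Because the conditional branch of a do-while-loop sits at the end, the body always runs once before the condition is first tested. The following sketch counts iterations to show this; the class DoWhileShape and its methods are hypothetical, and the limit of 10 is an assumption for illustration.

```java
// Illustration only: counts how many times each loop body runs.
class DoWhileShape {

    // Condition tested before the body, so the body may never run.
    static int whileIterations(int start) {
        int i = start;
        int iterations = 0;
        while (i < 10) {
            i++;
            iterations++;
        }
        return iterations;
    }

    // Condition tested after the body, so the body runs at least once.
    static int doWhileIterations(int start) {
        int i = start;
        int iterations = 0;
        do {
            i++;
            iterations++;
        } while (i < 10);
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(whileIterations(10));   // 0
        System.out.println(doWhileIterations(10)); // 1
        System.out.println(doWhileIterations(0));  // 10
    }
}
```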

The next two articles will cover the following topics:

Part 2 - Object Orientation And Safety (next article)
- try-catch-finally
- method calls (and parameters)
- new (objects and arrays)
Part 3 - Metaprogramming (future article)
- generics
- annotations
- reflection

For more detail on the internal architecture of the JVM and the different memory areas used during byte code execution, see my previous article on JVM internals.

James D Bloom

http://blog.jamesdbloom.com/JavaCodeToByteCode_PartOne.html

(Editor: Li Datong)
