
The Java serialization algorithm revealed---reference

Published: 2020-12-14 06:17:37 | Category: Java | Source: compiled from the web

Serialization is the process of saving an object's state to a sequence of bytes; deserialization is the process of rebuilding those bytes into a live object. The Java Serialization API provides a standard mechanism for developers to handle object serialization. In this tip, you will see how to serialize an object, and why serialization is sometimes necessary. You'll learn about the serialization algorithm used in Java, and see an example that illustrates the serialized format of an object. By the time you're done, you should have a solid knowledge of how the serialization algorithm works and what entities are serialized as part of the object at a low level.

Why is serialization required?

In today's world, a typical enterprise application has multiple components and is distributed across various systems and networks. In Java, application data is represented as objects; if two Java components want to communicate with each other, there needs to be a mechanism to exchange data. One way to achieve this is to define your own protocol and transfer an object. This means that the receiving end must know the protocol used by the sender to re-create the object, which would make it very difficult to talk to third-party components. Hence, there needs to be a generic and efficient protocol to transfer objects between components. Serialization is defined for this purpose, and Java components use this protocol to transfer objects.

Figure 1 shows a high-level view of client/server communication, where an object is transferred from the client to the server through serialization.

Figure 1. A high-level view of serialization in action

How to serialize an object

In order to serialize an object, you need to ensure that the class of the object implements the java.io.Serializable interface, as shown in Listing 1.

Listing 1. Implementing Serializable

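    import java.io.Serializable;

    class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }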

In Listing 1, the only thing you had to do differently from creating a normal class is implement the java.io.Serializable interface. The Serializable interface is a marker interface; it declares no methods at all. It tells the serialization mechanism that the class can be serialized.
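To see what the marker interface buys you, the following minimal sketch tries to serialize two otherwise identical objects, only one of which implements Serializable. The class names MarkerInterfaceDemo, PlainPoint, and SerializablePoint are hypothetical and exist only for this illustration; the stream is written to memory rather than to a file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class MarkerInterfaceDemo {
    // Hypothetical class that does NOT implement Serializable.
    static class PlainPoint {
        int x = 1;
    }

    // Same shape, but tagged with the marker interface.
    static class SerializablePoint implements Serializable {
        int x = 1;
    }

    // Returns true if writeObject() accepts the object, false if it rejects it.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            // Thrown when the object's class lacks the Serializable marker.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("PlainPoint:        " + canSerialize(new PlainPoint()));
        System.out.println("SerializablePoint: " + canSerialize(new SerializablePoint()));
    }
}
```

The only difference between the two classes is the implements clause, which is all the serialization mechanism checks before it agrees to write the object.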

Now that you have made the class eligible for serialization, the next step is to actually serialize the object. That is done by calling the writeObject() method of the java.io.ObjectOutputStream class, as shown in Listing 2.

Listing 2. Calling writeObject()

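    public static void main(String args[]) throws IOException {
        FileOutputStream fos = new FileOutputStream("temp.out");
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        TestSerial ts = new TestSerial();
        oos.writeObject(ts);   // runs the serialization algorithm
        oos.flush();
        oos.close();
    }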

Listing 2 stores the state of the TestSerial object in a file called temp.out. The call oos.writeObject(ts); actually kicks off the serialization algorithm, which in turn writes the object to temp.out.

To re-create the object from the persistent file,you would employ the code in Listing 3.

Listing 3. Recreating a serialized object

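    public static void main(String args[]) throws IOException, ClassNotFoundException {
        FileInputStream fis = new FileInputStream("temp.out");
        ObjectInputStream oin = new ObjectInputStream(fis);
        TestSerial ts = (TestSerial) oin.readObject();   // readObject() returns Object
        System.out.println("version=" + ts.version);
        oin.close();
    }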

In Listing 3, the object's restoration occurs with the oin.readObject() method call. This method call reads in the raw bytes that we previously persisted and creates a live object that is an exact replica of the original object graph. Because readObject() can read any serializable object, a cast to the correct type is required.

Executing this code will print version=100 on the standard output.
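If you want to try the write and read steps together without touching the file system, the round trip can be sketched against in-memory streams. This is an illustration rather than the article's own listing: RoundTripDemo, Sample, toBytes(), and fromBytes() are hypothetical names, and Sample is a stand-in that models only the version field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {
    // Stand-in for the article's TestSerial class (hypothetical, one field only).
    static class Sample implements Serializable {
        public byte version = 100;
    }

    // Serializes obj into a byte array instead of a file.
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuilds a live object from the bytes produced by toBytes().
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream oin = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return oin.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Sample original = new Sample();
        original.version = 42;   // change the state so the copy is easy to recognize

        Sample copy = (Sample) fromBytes(toBytes(original));   // cast is required
        System.out.println("version=" + copy.version);
        System.out.println("same instance: " + (copy == original));
    }
}
```

The restored object carries the same state as the original but is a distinct instance, which is what "replica" means in the paragraph above.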

The serialized format of an object

What does the serialized version of the object look like? Remember, the sample code in the previous section saved the serialized version of the TestSerial object into the file temp.out. Listing 4 shows the contents of temp.out, displayed in hexadecimal. (You need a hexadecimal editor to see the output in hexadecimal format.)

Listing 4. Hexadecimal form of TestSerial

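    AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
    73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
    63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
    70 00 64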

If you look again at the actual TestSerial object, you'll see that it has only two byte members, as shown in Listing 5.

Listing 5. TestSerial's byte members

    public byte version = 100;
    public byte count = 0;
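If you don't have a hexadecimal editor at hand, a few lines of Java can produce a similar dump. The sketch below is one possible helper, not part of the original listings: HexDumpDemo, Sample, toHex(), and serialize() are hypothetical names, and Sample models only the version field.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class HexDumpDemo {
    // Stand-in for TestSerial (hypothetical, one field only).
    static class Sample implements Serializable {
        public byte version = 100;
    }

    // Formats bytes as two-digit uppercase hex, 16 bytes per line.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(i % 16 == 0 ? '\n' : ' ');
            }
            // Mask with 0xFF so negative bytes print as 80..FF.
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Serializes obj into memory so the bytes can be inspected directly.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(toHex(serialize(new Sample())));
    }
}
```

To dump an existing file such as temp.out instead, you could pass the result of Files.readAllBytes(Paths.get("temp.out")) to toHex(). The class name and serialVersionUID bytes in your dump will differ from the article's, because they depend on the class being serialized.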

Java's serialization algorithm

By now, you should have a pretty good knowledge of how to serialize an object. But how does the process work under the hood? In general, the serialization algorithm does the following:

  • It writes out the metadata of the class associated with an instance.
  • It recursively writes out the description of the superclass until it finds java.lang.Object.
  • Once it finishes writing the metadata information, it then starts with the actual data associated with the instance. But this time, it starts from the topmost superclass.
  • It recursively writes the data associated with the instance, starting from the topmost superclass and ending with the most-derived class.
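The metadata walk in the first two steps can be approximated from ordinary code with java.io.ObjectStreamClass, which exposes the descriptor the algorithm writes for each class. The sketch below is an illustration under assumptions: DescriptorWalkDemo, Base, Part, Child, and describe() are hypothetical names, and the loop stops as soon as ObjectStreamClass.lookup() returns null, which it does for classes that are not serializable (java.lang.Object among them).

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class DescriptorWalkDemo {
    // Hypothetical classes: a serializable superclass, a contained object, and a subclass.
    static class Base implements Serializable { int baseVersion = 10; }
    static class Part implements Serializable { int partVersion = 11; }
    static class Child extends Base {
        int version = 66;
        Part part = new Part();
    }

    // Lists "ClassName[field:typeCode, ...]" from the most-derived class upward.
    static List<String> describe(Class<?> type) {
        List<String> result = new ArrayList<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            ObjectStreamClass desc = ObjectStreamClass.lookup(c);
            if (desc == null) {
                break;   // not serializable: the descriptor chain ends here
            }
            StringBuilder sb = new StringBuilder(c.getSimpleName()).append('[');
            ObjectStreamField[] fields = desc.getFields();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(fields[i].getName()).append(':').append(fields[i].getTypeCode());
            }
            result.add(sb.append(']').toString());
        }
        return result;
    }

    public static void main(String[] args) {
        describe(Child.class).forEach(System.out::println);
    }
}
```

The descriptors come out most-derived class first, which is the order in which the metadata is written; the instance data is then written in the opposite direction, superclass first.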

I've written a different example object for this section that covers the main cases: a class with a serializable superclass and a field that refers to another serializable object. The new sample object to be serialized is shown in Listing 6.

Listing 6. Sample serialized object

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    class parent implements Serializable {
        int parentVersion = 10;
    }

    class contain implements Serializable {
        int containVersion = 11;
    }

    public class SerialTest extends parent implements Serializable {
        int version = 66;
        contain con = new contain();

        public int getVersion() {
            return version;
        }

        public static void main(String args[]) throws IOException {
            FileOutputStream fos = new FileOutputStream("temp.out");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            SerialTest st = new SerialTest();
            oos.writeObject(st);
            oos.flush();
            oos.close();
        }
    }

This example is a straightforward one. It serializes an object of type SerialTest, which is derived from parent and has a container object, contain. The serialized format of this object is shown in Listing 7.

Listing 7. Serialized form of sample object

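    AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
    73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
    76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
    4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
    65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
    0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
    00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
    61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
    0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
    70 00 00 00 0B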

Figure 2. An outline of the serialization algorithm

Let's go through the serialized format of the object in detail and see what each byte represents. Begin with the serialization protocol information:

  • AC ED: STREAM_MAGIC. Specifies that this is a serialization protocol.
  • 00 05: STREAM_VERSION. The serialization version.
  • 0x73: TC_OBJECT. Specifies that this is a new Object.
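You can check these first five bytes yourself against the constants in java.io.ObjectStreamConstants. The following is a minimal sketch: StreamHeaderDemo, Sample, serialize(), and readHeader() are hypothetical names introduced for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

class StreamHeaderDemo {
    // Hypothetical serializable class; any ordinary Serializable class would do.
    static class Sample implements Serializable {
        int version = 66;
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Returns {magic, version, firstTag} read from the start of the stream.
    static int[] readHeader(byte[] stream) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            int magic = in.readUnsignedShort();    // AC ED
            int version = in.readUnsignedShort();  // 00 05
            int firstTag = in.readUnsignedByte();  // 73 for a new object
            return new int[] {magic, version, firstTag};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] h = readHeader(serialize(new Sample()));
        // STREAM_MAGIC is declared as a (negative) short, so mask it before comparing.
        System.out.println("magic ok:   " + (h[0] == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)));
        System.out.println("version ok: " + (h[1] == ObjectStreamConstants.STREAM_VERSION));
        System.out.println("tag ok:     " + (h[2] == ObjectStreamConstants.TC_OBJECT));
    }
}
```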

The first step of the serialization algorithm is to write the description of the class associated with an instance. The example serializes an object of type SerialTest, so the algorithm starts by writing the description of the SerialTest class.

  • 0x72: TC_CLASSDESC. Specifies that this is a new class.
  • 00 0A: Length of the class name.
  • 53 65 72 69 61 6C 54 65 73 74: SerialTest, the name of the class.
  • 05 52 81 5A AC 66 02 F6: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This particular flag (SC_SERIALIZABLE) says that the object supports serialization.
  • 00 02: Number of fields in this class.
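The class description bullets above map directly onto DataInputStream reads: a length-prefixed name, an eight-byte identifier, one flag byte, and a two-byte field count. The sketch below reads them back for a small test class; ClassDescDemo, Sample, Header, and readClassDesc() are hypothetical names, and the code assumes the stream starts with a plain new object, as in the article's example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

class ClassDescDemo {
    // Hypothetical class with a single int field.
    static class Sample implements Serializable {
        int version = 66;
    }

    // Holder for the fixed part of a class description.
    static final class Header {
        final String className;
        final long serialVersionUID;
        final int flags;
        final int fieldCount;

        Header(String className, long serialVersionUID, int flags, int fieldCount) {
            this.className = className;
            this.serialVersionUID = serialVersionUID;
            this.flags = flags;
            this.fieldCount = fieldCount;
        }
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    static Header readClassDesc(byte[] stream) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            in.skipBytes(5);   // STREAM_MAGIC (2) + STREAM_VERSION (2) + TC_OBJECT (1)
            if (in.readUnsignedByte() != ObjectStreamConstants.TC_CLASSDESC) {
                throw new IOException("expected TC_CLASSDESC");
            }
            String name = in.readUTF();            // two-byte length, then the name
            long suid = in.readLong();             // eight-byte serialVersionUID
            int flags = in.readUnsignedByte();     // 0x02 = SC_SERIALIZABLE
            int fieldCount = in.readUnsignedShort();
            return new Header(name, suid, flags, fieldCount);
        }
    }

    public static void main(String[] args) throws IOException {
        Header h = readClassDesc(serialize(new Sample()));
        System.out.println(h.className + " suid=" + Long.toHexString(h.serialVersionUID)
                + " flags=0x" + Integer.toHexString(h.flags) + " fields=" + h.fieldCount);
    }
}
```

Because Sample is a nested class here, the name in the stream is its binary name (the value of Sample.class.getName()), not just Sample.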

Next, the algorithm writes the field int version = 66;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 07: Length of the field name.
  • 76 65 72 73 69 6F 6E: version, the name of the field.

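And then the algorithm writes the next field, contain con = new contain();. This is an object, so after the field's type code and name it writes the canonical JVM signature of the field's type.

  • 0x4C: Field type code. 4C represents "L", which stands for an object reference.
  • 00 03: Length of the field name.
  • 63 6F 6E: con, the name of the field.
  • 0x74: TC_STRING. Represents a new string.
  • 00 09: Length of the string.
  • 4C 63 6F 6E 74 61 69 6E 3B: Lcontain;, the canonical JVM signature.
  • 0x78: TC_ENDBLOCKDATA, the end of the optional block data for an object.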
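The field descriptions can be read back in the same way. The sketch below is deliberately limited: FieldDescDemo, Part, Sample, and readFields() are hypothetical names, and the parser only understands a type signature written as a fresh TC_STRING, which is what the first object field of a simple class produces. It is not a general stream parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class FieldDescDemo {
    // Hypothetical classes: one primitive field and one object field.
    static class Part implements Serializable { int partVersion = 11; }
    static class Sample implements Serializable {
        int version = 66;
        Part part = new Part();
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Returns one entry per field: "typeCode name" plus the signature for object fields.
    static List<String> readFields(byte[] stream) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            in.skipBytes(6);        // magic, version, TC_OBJECT, TC_CLASSDESC
            in.readUTF();           // class name
            in.readLong();          // serialVersionUID
            in.readUnsignedByte();  // flags
            int count = in.readUnsignedShort();

            List<String> fields = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                char typeCode = (char) in.readUnsignedByte();   // 'I', 'L', ...
                String entry = typeCode + " " + in.readUTF();   // field name
                if (typeCode == 'L' || typeCode == '[') {
                    // Object and array fields carry their JVM type signature as a string.
                    if (in.readUnsignedByte() != ObjectStreamConstants.TC_STRING) {
                        throw new IOException("this sketch only handles TC_STRING signatures");
                    }
                    entry += " " + in.readUTF();
                }
                fields.add(entry);
            }
            if (in.readUnsignedByte() != ObjectStreamConstants.TC_ENDBLOCKDATA) {
                throw new IOException("expected TC_ENDBLOCKDATA");
            }
            return fields;
        }
    }

    public static void main(String[] args) throws IOException {
        readFields(serialize(new Sample())).forEach(System.out::println);
    }
}
```

Primitive fields are listed before object fields, which is why version is described before the object-typed field both here and in the article's dump.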

The next step of the algorithm is to write the description of the parent class, which is the immediate superclass of SerialTest.

  • 0x72: TC_CLASSDESC. Specifies that this is a new class.
  • 00 06: Length of the class name.
  • 70 61 72 65 6E 74: parent, the name of the class.
  • 0E DB D2 BD 85 EE 63 7A: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag notes that the object supports serialization.
  • 00 01: Number of fields in this class.

Next, the algorithm writes the field description for the parent class. parent has one field, int parentVersion = 10;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 0D: Length of the field name.
  • 70 61 72 65 6E 74 56 65 72 73 69 6F 6E: parentVersion, the name of the field.
  • 0x78: TC_ENDBLOCKDATA, the end of block data for this object.
  • 0x70: TC_NULL, which represents the fact that there are no more superclasses because we have reached the top of the class hierarchy.

So far, the serialization algorithm has written the description of the class associated with the instance and all its superclasses. Next, it will write the actual data associated with the instance. It writes the parent class members first:

  • 00 00 00 0A: 10, the value of parentVersion.

Then it moves on to SerialTest.

  • 00 00 00 42: 66, the value of version.
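The ordering of the instance data is easy to confirm when the object has no object-typed fields, because the field values are then the last bytes in the stream. The sketch below relies on that assumption: InstanceDataDemo, Base, Child, serialize(), and trailingInts() are hypothetical names, and Child intentionally has only an int field so that nothing follows its data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

class InstanceDataDemo {
    // Hypothetical classes holding only int fields, so the stream ends with their values.
    static class Base implements Serializable { int baseVersion = 10; }
    static class Child extends Base { int version = 66; }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Decodes the last `count` four-byte values of the stream as big-endian ints.
    static int[] trailingInts(byte[] stream, int count) {
        ByteBuffer buf = ByteBuffer.wrap(stream, stream.length - 4 * count, 4 * count);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = buf.getInt();   // ByteBuffer is big-endian by default
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        int[] values = trailingInts(serialize(new Child()), 2);
        System.out.println("superclass field: " + values[0]);
        System.out.println("subclass field:   " + values[1]);
    }
}
```

The superclass value comes first and the subclass value second, each stored as four bytes with the most significant byte first, matching 00 00 00 0A followed by 00 00 00 42 in the dump.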

The next few bytes are interesting. The algorithm needs to write the information about the contain object, shown in Listing 8.

Listing 8. The contain object

    contain con = new contain();

Remember, the serialization algorithm hasn't written the class description for the contain class yet. This is the opportunity to write this description.

  • 0x73: TC_OBJECT, designating a new object.
  • 0x72: TC_CLASSDESC.
  • 00 07: Length of the class name.
  • 63 6F 6E 74 61 69 6E: contain, the name of the class.
  • FC BB E6 0E FB CB 60 C7: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag indicates that this class supports serialization.
  • 00 01: Number of fields in this class.

Next, the algorithm must write the description for contain's only field, int containVersion = 11;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 0E: Length of the field name.
  • 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: containVersion, the name of the field.
  • 0x78: TC_ENDBLOCKDATA.

Next, the serialization algorithm checks to see if contain has any parent classes. If it did, the algorithm would start writing that class; but in this case there is no superclass for contain, so the algorithm writes TC_NULL.

  • 0x70: TC_NULL.

Finally, the algorithm writes the actual data associated with contain.

  • 00 00 00 0B: 11, the value of containVersion.

Conclusion

In this tip, you have seen how to serialize an object, and learned how the serialization algorithm works in detail. I hope this article gives you more detail on what happens when you actually serialize an object.

About the author

The author has more than four years of experience in the IT industry, and has been working with Java-related technologies for more than three years. Currently, he is working as a system software engineer at the Java Technology Center, IBM Labs. He also has experience in the telecom industry.

