Java serialization - how many bytes for a character?
I have String objects in Java that I am serializing, and I am wondering about the size of each serialized character in the string.
Is it true that standard English letters (e.g. 'a' or 'g') need 1 or 2 bytes, and that special symbols such as a comma or an exclamation mark need 8 bytes?
And how many bytes does a digit (0 - 9) need in a serialized string?
Edit: the serialization is done in the following way:
socket = new Socket(host, port);
ObjectOutputStream outputStream = new ObjectOutputStream(new BufferedOutputStream(socket.getOutputStream()));
outputStream.writeObject(request);
outputStream.flush();
The deserialization is done in a similar way using ObjectInputStream.
The object being serialized (request) contains a field of type String, which can be e.g. "aaaa" or "AAAA" or "a0A3a5" etc. (i.e. upper- and lowercase letters and numbers).
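For reference, the setup described above can be reproduced without a network connection. The following is only a sketch: the `Request` class and its `text` field are hypothetical stand-ins for the real request type, and an in-memory `ByteArrayOutputStream` replaces the socket stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class RoundTripSketch {

    // Hypothetical stand-in for the request class from the question.
    static class Request implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;

        Request(String text) {
            this.text = text;
        }
    }

    // Serializes a Request into memory, reads it back, and returns the
    // String field of the deserialized copy.
    static String roundTrip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new BufferedOutputStream(buffer))) {
            out.writeObject(new Request(text));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return ((Request) in.readObject()).text;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("a0A3a5"));
    }
}
```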
You are using Java serialization, which complies with http://docs.oracle.com/javase/6/docs/platform/serialization/spec/protocol.html.
The representation of String objects consists of length information followed by the contents of the string encoded in modified UTF-8. The modified UTF-8 encoding is the same as used in the Java Virtual Machine and in the java.io.DataInput and DataOutput interfaces; it differs from standard UTF-8 in the representation of supplementary characters and of the null character. The form of the length information depends on the length of the string in modified UTF-8 encoding. If the modified UTF-8 encoding of the given String is less than 65536 bytes in length, the length is written as 2 bytes representing an unsigned 16-bit integer. Starting with the Java 2 platform, Standard Edition, v1.3, if the length of the string in modified UTF-8 encoding is 65536 bytes or more, the length is written in 8 bytes representing a signed 64-bit integer. The typecode preceding the String in the serialization stream indicates which format was used to write the String.
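The short-string format described above (a 2-byte length followed by modified UTF-8) is also what `DataOutputStream.writeUTF` produces, so it can be used to observe the encoding in isolation. This is a minimal sketch; the class and method names are made up for illustration, and the 8-byte length form for very long strings is not covered because `writeUTF` rejects strings whose encoding exceeds 65535 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ModifiedUtf8Demo {

    // Returns the number of bytes writeUTF emits for s:
    // a 2-byte length prefix plus the modified UTF-8 contents.
    static int writeUtfSize(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // Four ASCII letters: 2 + 4 bytes.
        System.out.println(writeUtfSize("aaaa"));         // prints 6
        // The null character takes 2 bytes in modified UTF-8: 2 + 2 bytes.
        System.out.println(writeUtfSize("\u0000"));       // prints 4
        // A supplementary character is stored as a surrogate pair,
        // each half encoded in 3 bytes: 2 + 6 bytes.
        System.out.println(writeUtfSize("\uD83D\uDE00")); // prints 8
    }
}
```

The last two cases are the ones where modified UTF-8 differs from standard UTF-8, which would use 1 byte for the null character and 4 bytes for the supplementary character.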
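So a String is serialized as (modified) UTF-8: ASCII characters are encoded in 1 byte each, and digits are ASCII too, so yes, they are also encoded in 1 byte each. The same holds for ASCII punctuation such as a comma or an exclamation mark.
See http://en.wikipedia.org/wiki/utf-8 for further information.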
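One way to check the per-character cost is to serialize strings with `ObjectOutputStream` into memory and subtract the size obtained for the empty string, which leaves only the bytes contributed by the characters. This is a rough sketch with hypothetical helper names; it serializes a bare String rather than a full request object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class SerializedStringSize {

    // Total size of the serialization stream for a single String
    // (stream header, typecode, length information, contents).
    static int serializedSize(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Bytes contributed by the characters alone: the fixed overhead is
    // removed by subtracting the size of the serialized empty string.
    static int contentBytes(String s) {
        return serializedSize(s) - serializedSize("");
    }

    public static void main(String[] args) {
        // Letters and digits: 1 byte each.
        System.out.println(contentBytes("a0A3a5")); // prints 6
        // Comma and exclamation mark: 1 byte each.
        System.out.println(contentBytes(",!"));     // prints 2
        // A non-ASCII letter (e with acute accent): 2 bytes.
        System.out.println(contentBytes("\u00e9")); // prints 2
    }
}
```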