Friday 15 March 2013

Java - Why does new String(a).getBytes() not give back the same bytes as a?




For example, given this array:

byte[] arr = {37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104 };

and this code:

String a = new String(arr, Charset.forName("US-ASCII"));
System.out.println(Arrays.toString(arr));
System.out.println(Arrays.toString(a.getBytes(Charset.forName("US-ASCII"))));
System.out.println(Arrays.equals(arr, a.getBytes(Charset.forName("US-ASCII"))));

The result is:

With "windows-1251":

[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104]
[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, 63]
false

With "us-ascii":

[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104]
[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, 63, 63, 63, 63, 63, 63]
false

With "utf-8":

[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104]
[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -17, -65, -67, -17, -65, -67, -17, -65, -67, -45, -121, -17, -65, -67]
false

I have tested various cases and found that the two arrays differ whenever the input contains negative numbers. I also tried "windows-1251", as shown above, but the arrays still differ. My question is:

Why? And how do I fix it?

Additional info:

I'm using JRE 8 on Windows 8.1.

Resolution: use the charset ISO-8859-1. Thanks to SLaks for the explanation and to JB Nizet for pointing out ISO-8859-1.

String a = new String(arr, Charset.forName("ISO-8859-1"));
System.out.println(Arrays.toString(arr));
System.out.println(Arrays.toString(a.getBytes(Charset.forName("ISO-8859-1"))));
System.out.println(Arrays.equals(arr, a.getBytes(Charset.forName("ISO-8859-1"))));

Result:

[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104]
[37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104]
true
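The comparison can also be wrapped in a small helper and run against several charsets at once. This is only a sketch: the class name RoundTrip and the method roundTrips are made up for illustration and are not part of the original post.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTrip {
    // Hypothetical helper: true if decoding the bytes and encoding the
    // resulting String again reproduces exactly the same bytes.
    static boolean roundTrips(byte[] bytes, Charset cs) {
        String decoded = new String(bytes, cs);
        return Arrays.equals(bytes, decoded.getBytes(cs));
    }

    public static void main(String[] args) {
        byte[] arr = {37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104};
        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1
        };
        for (Charset cs : charsets) {
            // Prints the charset name followed by whether the round trip is lossless.
            System.out.println(cs.name() + ": " + roundTrips(arr, cs));
        }
    }
}
```

For the array from the question, only ISO-8859-1 reports true, because it maps each of the 256 byte values to its own character.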

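63 is the code point for ?. The decoder substitutes a replacement character for every byte that is not valid in the encoding, and getBytes() then writes that replacement as ? (63) in the single-byte charsets, or as the three bytes -17, -65, -67 (U+FFFD) in UTF-8.

For US-ASCII, that includes every byte above 127.

For UTF-8, that includes every byte above 127 that does not follow the proper UTF-8 rules.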
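To find out up front whether a byte array is valid in a given charset, instead of silently getting replacement characters, one option is to decode with a CharsetDecoder that reports errors. The sketch below assumes that approach; the class name StrictDecode and the method isValid are hypothetical and not from the original answer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecode {
    // Hypothetical helper: true if every byte sequence in the input is valid
    // in the given charset, false if the decoder hits a malformed or
    // unmappable sequence.
    static boolean isValid(byte[] bytes, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)       // throw instead of replacing
              .onUnmappableCharacter(CodingErrorAction.REPORT)  // throw instead of replacing
              .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] arr = {37, 80, 68, 70, 45, 49, 46, 53, 13, 37, -30, -29, -49, -45, -121, -104};
        System.out.println("US-ASCII valid:   " + isValid(arr, StandardCharsets.US_ASCII));
        System.out.println("UTF-8 valid:      " + isValid(arr, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 valid: " + isValid(arr, StandardCharsets.ISO_8859_1));
    }
}
```

Unlike new String(bytes, charset), which always replaces invalid input, this reports the problem, so the caller can decide to fall back to ISO-8859-1 or to keep the data as raw bytes.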

java
