Monday, September 15, 2014

Play with Java byte array

Java's byte type is mostly used in low-level programming. Let's go through some frequently asked questions you will run into when working with Java byte arrays.
Here are some ASCII characters and their values, which are used in the demo code below:

Character   Value in hex   Value in decimal
a           0x61           97
b           0x62           98
c           0x63           99
1           0x31           49
2           0x32           50
3           0x33           51

How to initialize a byte array with values
// these two lines create identical arrays
byte[] hexArray = new byte[] {0x61,0x62,0x63,0x31,0x32,0x33};
byte[] decimalArray = new byte[] {97,98,99,49,50,51};
 

One thing to note: a Java byte is signed and ranges from -128 to 127, so if a value is >= 0x80 (128 in decimal) you need to add a cast, as in the following demo:

// these two lines create identical arrays
byte[] hexArray = new byte[] {0x61,(byte)0x82,0x63};
byte[] decimalArray = new byte[] {97,(byte)130,99};
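
To see what the cast actually does, here is a small sketch of how a cast byte behaves. The class name SignedByteDemo and the helper toUnsigned are made up for illustration; the & 0xFF mask is a common idiom for reading a byte as an unsigned value.

```java
public class SignedByteDemo {
    // Hypothetical helper: widen a signed byte to its unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x82;              // 130 does not fit in a byte, so it wraps around
        System.out.println(b);             // -126
        System.out.println(toUnsigned(b)); // 130
    }
}
```

The bit pattern stored in the array is still 0x82; only the signed interpretation changes.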
 

How to convert a byte array to String
This question is ambiguous, so let's break it down into two cases.

Case 1 {0x61,0x62,0x63} -> "abc":
Suppose you have a byte array and your goal is to get a string from the ASCII values of its bytes, meaning you want a String like "abc" from the byte array {0x61,0x62,0x63}. (If you want the output to look like "61 62 63" or "0x61 0x62 0x63" instead, see case 2 below.) The Java String class provides a constructor for this.
// English String
byte[] byteArray = new byte[] {0x61,0x62,0x63,0x31,0x32,0x33};
String output = new String(byteArray);  // decodes with the platform default charset
System.out.println("output="+output);  // output=abc123

If you have byte values >= 0x80, or you are dealing with non-English characters, you need to specify the character set in the String constructor to get the right output. The constructor that takes a character set name also requires you to catch UnsupportedEncodingException.
// non-English String
// requires: import java.io.UnsupportedEncodingException;
try {
 byte[] byteArray = new byte[]{(byte) 0xd6,(byte) 0xd0,(byte) 0xce,(byte) 0xc4};
 String output = new String(byteArray,"GB18030");
 System.out.println("output="+output);  // output=中文
} catch (UnsupportedEncodingException e) {
 e.printStackTrace();
}
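
If you would rather avoid the checked exception, the String constructor also accepts a java.nio.charset.Charset (StandardCharsets is available since Java 7). This is a minimal sketch that uses UTF-8 instead of GB18030 so that it runs on any JVM. The class name CharsetDecodeDemo and the method decodeUtf8 are made up for illustration, and the Chinese text is written with Unicode escapes so the source file encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

public class CharsetDecodeDemo {
    // Hypothetical helper: decode UTF-8 bytes without a try/catch.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of the two characters U+4E2D U+6587
        byte[] byteArray = new byte[] {
            (byte) 0xe4, (byte) 0xb8, (byte) 0xad,
            (byte) 0xe6, (byte) 0x96, (byte) 0x87
        };
        String output = decodeUtf8(byteArray);
        System.out.println(output.equals("\u4e2d\u6587")); // true
        System.out.println(output.length());               // 2
    }
}
```

For GB18030 you can pass Charset.forName("GB18030") in the same way, assuming that charset is installed in your JVM.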

Case 2 {0x61,0x62,0x63} -> "61 62 63" or "0x61 0x62 0x63":
Suppose you have a byte array and your goal is to get a string of hex values that you can print to stdout or a log file. This is quite useful when debugging. After some searching on the web, the best solution I found is from Stack Overflow: not the selected answer, but the discussion below it, which suggests javax.xml.bind.DatatypeConverter.printHexBinary(byte[] byteArray). It saves you from writing a loop that reads the bytes from the array one by one. It ships with Java itself (through Java 8; the javax.xml.bind package was removed from the JDK in Java 11 and needs a separate JAXB dependency there) and should be at least as fast as typical hand-written code.

The output of DatatypeConverter.printHexBinary(byte[] byteArray) is a string without any separators. For example, an input byte array {0x61,0x62,0x63} gives you "616263". We can then beautify that output with a regular expression that adds a space after every 2 characters.


import javax.xml.bind.DatatypeConverter;

byte[] byteArray = new byte[] {0x61,0x62,0x63,0x31,0x32,0x33};
String hexString = DatatypeConverter.printHexBinary(byteArray).replaceAll("([0-9a-fA-F]{2})", "$1 ").trim();
System.out.println(hexString);  // 61 62 63 31 32 33

You may have noticed that there is a space after the match group variable $1 in the replacement string. After replaceAll() you have the String "61 62 63 31 32 33 ": the last match leaves a space at the end of the string. The trim() call removes that trailing space.

If you need a more readable String like "0x61 0x62", with a "0x" prefix for each hex value, you can tweak the replacement part of the regular expression call: change "$1 " to "0x$1 ".
byte[] byteArray = new byte[] {0x61,0x62,0x63,0x31,0x32,0x33};
String hexString = DatatypeConverter.printHexBinary(byteArray).replaceAll("([0-9a-fA-F]{2})", "0x$1 ").trim();
System.out.println(hexString);  // 0x61 0x62 0x63 0x31 0x32 0x33
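
If javax.xml.bind is not available on your classpath, a plain loop gives the same kind of output. This is only a sketch, not the Stack Overflow solution mentioned above. The class name HexFormatDemo and the method toHex are made up for illustration.

```java
public class HexFormatDemo {
    // Hypothetical helper: format each byte as two uppercase hex digits,
    // with an optional prefix per byte and a single space between bytes.
    static String toHex(byte[] bytes, String prefix) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so that bytes >= 0x80 are treated as 128..255.
            sb.append(prefix).append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] byteArray = new byte[] {0x61, 0x62, 0x63, (byte) 0xd6};
        System.out.println(toHex(byteArray, ""));   // 61 62 63 D6
        System.out.println(toHex(byteArray, "0x")); // 0x61 0x62 0x63 0xD6
    }
}
```

Because the separator is only added between bytes, no trim() is needed at the end.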

How to convert a String to byte array
Like the previous one, this is also an ambiguous question.

Case 1 "a1b2" -> {0x61,0x31,0x62,0x32}:
Suppose you have a String and you want to get every character's ASCII value. This is what the getBytes() method of the String class does.
  String str = "a1b2";
  byte[] byteArray = str.getBytes();     // byteArray = {0x61,0x31,0x62,0x32}
  System.out.println(byteArray.length);  // 4
As in case 1 of the previous question (converting a byte array to a String), if the String contains any non-English characters, you need to pass a character set to getBytes(). getBytes(String charsetName) can throw UnsupportedEncodingException, so you have to catch it in your code.
  
  String str = "中文";
  byte[] byteArray;
  try {
   byteArray = str.getBytes("GB18030");   // byteArray = {0xD6,0xD0,0xCE,0xC4}
   System.out.println(byteArray.length);  // 4
   // logger and Utils.byteArrayToString are not defined in this post;
   // substitute your own logging and byte-to-string helper here
   logger.info(Utils.byteArrayToString(byteArray));
  } catch (UnsupportedEncodingException e) {
   e.printStackTrace();
  }
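
getBytes() also has an overload that takes a java.nio.charset.Charset and does not throw a checked exception. The small sketch below shows that the same string produces byte arrays of different lengths under different character sets. The class name EncodeDemo and the method encodedLength are hypothetical, and the Chinese text is written with Unicode escapes so the source file encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodeDemo {
    // Hypothetical helper: number of bytes a string occupies in a given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String str = "\u4e2d\u6587"; // the same two Chinese characters as in the example above
        System.out.println(encodedLength(str, StandardCharsets.UTF_8));       // 6
        System.out.println(encodedLength(str, StandardCharsets.UTF_16BE));    // 4
        System.out.println(encodedLength("a1b2", StandardCharsets.US_ASCII)); // 4
    }
}
```

This is why the bytes you get are only meaningful together with the character set that produced them.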

Case 2 "a1b2" -> {0xa1,0xb2}:
Suppose your String is a hex string, where every 2 characters stand for one byte. You can use javax.xml.bind.DatatypeConverter.parseHexBinary(String str) to achieve this gracefully.

There are also many code snippets on the web for the same purpose, but I think DatatypeConverter.parseHexBinary(String str) is the best way to do it. It's a one-line solution provided by Java itself, simple and clean.
 
  import javax.xml.bind.DatatypeConverter;
 
  String str = "a1b2";
  byte[] byteArray = DatatypeConverter.parseHexBinary(str); // byteArray={0xa1,0xb2}
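
For environments without javax.xml.bind, the parsing can be sketched by hand. The class name HexParseDemo and the method parseHex are made up for illustration, and this version only does basic input validation; it is not a drop-in copy of the library's behavior.

```java
public class HexParseDemo {
    // Hypothetical helper: turn "a1b2" into {0xa1, 0xb2}.
    static byte[] parseHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string needs an even length: " + hex);
        }
        byte[] result = new byte[hex.length() / 2];
        for (int i = 0; i < result.length; i++) {
            // Character.digit returns -1 for characters that are not hex digits.
            int high = Character.digit(hex.charAt(2 * i), 16);
            int low = Character.digit(hex.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            result[i] = (byte) ((high << 4) | low);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] byteArray = parseHex("a1b2");
        System.out.println(byteArray.length);    // 2
        System.out.println(byteArray[0] & 0xFF); // 161 (0xa1)
        System.out.println(byteArray[1] & 0xFF); // 178 (0xb2)
    }
}
```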


About The Author

Has been a senior software developer, project manager for 10+ years. Dedicate himself to Alcatel-Lucent and China Telecom for delivering software solutions.
