byte[] => String => byte[]

Does anyone know how to take a byte[] to a String and back to a byte[] and get the same bytes you started with? I figured this would just work, signed bytes notwithstanding. But it turns out that String(byte[]) converts negative bytes to unexpected characters; unexpected because I don’t know what it’s doing. Here’s an example:


public class Test {
    public static void main(String[] args) {
        try {
            byte[] b = new byte[] {0, -127, 0, 12};
            String s = new String(b, "ascii");
            printBytes(b);
            System.out.println("\ntest:" + s + ":");
            printBytes(s.getBytes("ascii"));
        } catch(Exception e) {
            e.printStackTrace();
        }
    }

    private static void printBytes(byte[] bytes) {
        System.out.print("bytes:");

        for(byte b : bytes) {
            System.out.format("%d,", b);
        }
    }
}

The output is:


bytes:0,-127,0,12,
test: {unprintable characters}
bytes:0,63,0,12,

So -127 goes to 63, which makes no sense. Does anyone have any ideas on how to go about doing this? Or what a good Google search phrase would be?

-127 is not an ASCII code. The range 0-255 is traditionally called Extended ASCII.

Right, -127 is 0x81 in hex, which is 129 as an unsigned byte; as a signed byte it’s -127.
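To make the signed/unsigned point concrete, here is a minimal sketch (the class and method names are made up for illustration). It shows that -127 and 129 are the same bit pattern, and that the 63 in the output above is the '?' character the ASCII encoder substitutes for a character it cannot represent.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SignedByteDemo {
    // Reinterprets a signed Java byte as its unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Decodes the bytes as US-ASCII and encodes the String back, as in the test above.
    static byte[] asciiRoundTrip(byte[] input) {
        String s = new String(input, StandardCharsets.US_ASCII);
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte b = -127;
        System.out.println(toUnsigned(b));                       // 129
        System.out.println(Integer.toHexString(toUnsigned(b)));  // 81

        // 0x81 is outside ASCII, so decoding yields the replacement character
        // U+FFFD, and encoding that back to ASCII yields '?' (63).
        byte[] out = asciiRoundTrip(new byte[] {0, -127, 0, 12});
        System.out.println(Arrays.toString(out));                // [0, 63, 0, 12]
    }
}
```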

Check out base 64 encoding
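For anyone following the Base64 suggestion, a rough sketch might look like the following. It assumes Java 8 or later, where java.util.Base64 is available; the class and helper names are only illustrative. Base64 turns arbitrary bytes into plain ASCII text, so the String can be stored or sent anywhere and decoded back to the identical bytes.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {
    // Encodes arbitrary binary data as an ASCII-only String.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recovers the original bytes from the Base64 text.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] original = {0, -127, 0, 12};
        String text = encode(original);
        byte[] restored = decode(text);

        System.out.println(text);
        System.out.println(Arrays.equals(original, restored)); // true
    }
}
```

The trade-off is size: the encoded text is about a third larger than the raw bytes.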

public class CastRoundTrip {
	// Round-trips every byte value through chars using plain casts; no charset is involved.
	public static void main (String[] args) throws Exception {
		byte[] bytes = new byte[256];
		for (int i = 0; i < 256; i++) {
			bytes[i] = (byte)i;
			System.out.print(bytes[i]);
			System.out.print(", ");
		}
		System.out.println();

		// (char)b sign-extends, so negative bytes land in the range 0xFF80-0xFFFF.
		StringBuilder buffer = new StringBuilder(bytes.length);
		for (byte b : bytes)
			buffer.append((char)b);

		// (byte)c keeps only the low 8 bits, which restores the original byte.
		byte[] bytes2 = new byte[buffer.length()];
		int i = 0;
		for (char c : buffer.toString().toCharArray()) {
			bytes2[i] = (byte)c;
			System.out.print(bytes2[i]);
			System.out.print(", ");
			if (bytes2[i] != bytes[i]) System.out.println("\no noes!");
			i++;
		}

		System.out.println(buffer);
	}
}

Use “ISO-8859-1” or “UTF-8” instead of “ASCII”.

ASCII’s 0-127, not 0-255.

Cas :)

I cannot stress this enough: you cannot use UTF-8 to encode/decode binary data.
You will end up with corrupted data in the decoded byte[].
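A small sketch of why that is (class and method names are illustrative, not from any library): a byte such as 0x81 on its own is not valid UTF-8, so the decoder replaces it with U+FFFD, and encoding that back produces three different bytes. ISO-8859-1 maps each of the 256 byte values to exactly one character, so it survives the round trip.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryCharsetDemo {
    // Decodes the bytes with the given charset, then encodes the String back.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // Builds an array holding every possible byte value once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] data = {0, -127, 0, 12};

        // The lone 0x81 becomes U+FFFD, which encodes as EF BF BD.
        byte[] viaUtf8 = roundTrip(data, StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(viaUtf8)); // [0, -17, -65, -67, 0, 12]

        // Every byte value survives ISO-8859-1.
        byte[] all = allByteValues();
        byte[] viaLatin1 = roundTrip(all, StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(all, viaLatin1)); // true
    }
}
```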

http://www.java-gaming.org/index.php/topic,20316.0.html

That’ll teach me to read the question again before answering. By the time I posted I was thinking in terms of String -> byte[] -> String.

Thanks guys. Yes, UTF-8 doesn’t work, but ISO-8859-1 does.

That doesn’t solve my problem, though, since what I really need to do is go from ASCII to UTF-16BE and back.

Edit: apparently


String s = new String(bytes, "UTF-16BE");

and


s.getBytes("ISO-8859-1");

does the trick after using CharsetEncoder and CharsetDecoder.
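For reference, here is one way to read the two calls above, as a sketch rather than the poster's actual code (the class and method names are made up, and the CharsetEncoder/CharsetDecoder step is not shown). The test bytes {0, -127, 0, 12} happen to be valid UTF-16BE for the two characters U+0081 and U+000C, so decoding them as UTF-16BE and encoding as ISO-8859-1 gives one byte per character, and the reverse direction restores the original four bytes. ISO-8859-1 is used instead of ASCII because 0x81 is outside the ASCII range.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf16beDemo {
    // Interprets the input as UTF-16BE text and re-encodes it as one byte per character.
    static byte[] utf16beToLatin1(byte[] utf16be) {
        String s = new String(utf16be, StandardCharsets.UTF_16BE);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Interprets the input as ISO-8859-1 text and re-encodes it as UTF-16BE (no BOM).
    static byte[] latin1ToUtf16be(byte[] latin1) {
        String s = new String(latin1, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        byte[] wide = {0, -127, 0, 12};

        byte[] narrow = utf16beToLatin1(wide);
        System.out.println(Arrays.toString(narrow));   // [-127, 12]

        byte[] restored = latin1ToUtf16be(narrow);
        System.out.println(Arrays.toString(restored)); // [0, -127, 0, 12]
    }
}
```

This only round-trips when every character is in the range U+0000 to U+00FF; anything above that has no ISO-8859-1 byte and would come back as '?'.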