7/17/2006

Binary Serialization To / From String and Encoding

Recently somebody posted on the C# language newsgroup that they couldn't figure out how to convert an object to a string (and the reverse), since all the examples showed only how to write to and read from a file.

I chimed in that I thought what the OP really meant was "how to convert a stream to a string" (as in using the BinaryFormatter for serialization), and so I posted the following sample:

Stream to string:

byte[] b = MyMemoryStream.ToArray();
string s = System.Text.Encoding.UTF8.GetString(b);


String to stream:

string s = "whatever";
byte[] b = System.Text.Encoding.UTF8.GetBytes(s);
MemoryStream ms = new MemoryStream(b);



Friend and fellow MVP Jon Skeet, who is pedantic to a fault, responded with this:

"That's a way which is almost guaranteed to lose data. Serialization with BinaryFormatter produces opaque binary data, which may very well not be a valid UTF-8 encoded string.

To convert arbitrary binary data to a string and back, I'd use Convert.ToBase64String and Convert.FromBase64String."

Jon is absolutely correct, and I suspect that many developers assume that just by choosing what one would "think" is a broad encoding, such as UTF-8, we are guaranteed data integrity. Well, we are not.
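To see the data loss for yourself, here is a minimal sketch. It assumes a few hand-picked bytes stand in for BinaryFormatter output; the class and method names (Utf8RoundTripDemo, SurvivesUtf8RoundTrip) are made up for illustration and are not from the newsgroup thread. By default Encoding.UTF8 replaces byte sequences it cannot decode with the replacement character U+FFFD instead of throwing, so the damage is silent.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical demo class; the names are illustrative only.
public static class Utf8RoundTripDemo
{
    // Decodes the bytes as UTF-8 text, re-encodes that text, and reports
    // whether the result matches the original bytes.
    public static bool SurvivesUtf8RoundTrip(byte[] original)
    {
        string s = Encoding.UTF8.GetString(original);
        byte[] roundTripped = Encoding.UTF8.GetBytes(s);
        return original.SequenceEqual(roundTripped);
    }

    public static void Main()
    {
        // Stand-in for opaque binary data: 0xFF and 0xFE can never
        // appear in well-formed UTF-8.
        byte[] opaque = { 0xFF, 0xFE, 0x00, 0x01 };

        // Bytes that really are UTF-8 text round-trip without loss.
        byte[] text = Encoding.UTF8.GetBytes("plain text");

        Console.WriteLine(SurvivesUtf8RoundTrip(opaque)); // False
        Console.WriteLine(SurvivesUtf8RoundTrip(text));   // True
    }
}
```

The first call fails because each invalid byte comes back as the three-byte encoding of U+FFFD, so the original values are gone for good.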

The correct way (MSDN documentation links first):

[MSDN] Convert.ToBase64String:


[MSDN] Convert.FromBase64String:

And, revised code sample:

Stream to string:

byte[] b = MyMemoryStream.ToArray();
string s = Convert.ToBase64String(b);


String to stream:

string s = "whatever"; // in practice, a Base64 string produced by Convert.ToBase64String
byte[] b = Convert.FromBase64String(s);
MemoryStream ms = new MemoryStream(b);
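
Putting the two halves together, a round trip might look like the sketch below. The helper names (StreamToString, StringToStream) and the sample bytes are my own assumptions for illustration; in real use the MemoryStream would hold whatever your serializer wrote into it.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper class; the names are illustrative only.
public static class Base64StreamDemo
{
    // Encodes the full contents of the stream as a Base64 string.
    public static string StreamToString(MemoryStream ms)
    {
        return Convert.ToBase64String(ms.ToArray());
    }

    // Decodes a Base64 string back into a stream positioned at the start.
    // Convert.FromBase64String throws FormatException on malformed input.
    public static MemoryStream StringToStream(string base64)
    {
        return new MemoryStream(Convert.FromBase64String(base64));
    }

    public static void Main()
    {
        // Stand-in for serialized data, including bytes that are not valid UTF-8.
        byte[] payload = { 0xFF, 0xFE, 0x00, 0x01 };

        string s = StreamToString(new MemoryStream(payload));
        MemoryStream restored = StringToStream(s);

        // Every byte survives the trip through the string.
        Console.WriteLine(payload.SequenceEqual(restored.ToArray())); // True
    }
}
```

The price is size: Base64 emits four characters for every three bytes, so the string is roughly a third larger than the data it carries.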