Mike Knepper

Strings vs. Bytes

May 16, 2014

One of my tasks this week was to make my server capable of serving image files. Ultimately this isn’t too difficult a thing to do, but it did require me to revise my understanding of a fundamental aspect of how a server works.

Early last week I was using curl to get more comfortable with HTTP requests and responses. I was delighted to realize that the communication between a server and its clients was not actually that complicated; curl suggested that at the end of the day, each was simply sending and receiving text. Of course, the server would have to parse the text to determine things like what the client is requesting, and the client would need to parse the text to do things like render it properly in a browser, but those were just details. If I could capture incoming text and generate outgoing text, I figured I would be in good shape.

This conception of HTTP requests and responses is not terribly inaccurate, except for one key point: the “stuff” being communicated back and forth is not specifically text, but data in the form of bytes. Now, one might argue that this distinction is purely semantic: after all, strings of text are just collections of bytes displayed in a certain, human-readable way. Indeed, the difference seemed trivial while I was serving up HTML pages: I had no errors printing out strings of text as my response, and the web browser had no difficulty rendering the HTML properly.

It wasn’t until I had to work with images that I understood the difference. Like HTML files, PNG/JPEG/GIF files are just collections of bytes interpreted a certain way. However, these bytes cannot be meaningfully displayed or translated as “text.” Try running curl on an image URL. The output is… not pretty.

I had described one difference between curl and a web browser as being that a web browser can interpret HTML and render it properly, while curl cannot. In fact, a more basic difference comes first: a web browser can convert bytes into text, which it can then interpret as HTML or plain text, and it can also convert bytes into an image, which it can display. Curl, on the other hand, simply writes the bytes it receives to the terminal. That works for text (with no distinction between HTML and plain text), but not for bytes representing an image file. This is not necessarily curl’s fault: the bytes in an image file are not meant to be translated into text characters; by definition, they are meant to be translated into pixels of color. Meanwhile, a terminal is only concerned with presenting text characters.

You might be able to anticipate my problem now. My Response object had three fields (String status, String version, and String body), and my Responder simply arranged that info into a single string formatted according to HTTP convention and printed it out to the client. The body field was set by locating the requested resource, opening up a BufferedReader, and reading the file line by line. This obviously fails when the requested resource is an image! A Reader decodes bytes into characters, so any bytes that are not valid in the chosen character encoding get replaced, and reading line by line discards the line-terminator bytes as well. Essentially I was trying to translate something that cannot be translated as text into text, send it as text to a client, and hope that the client could somehow translate the not-actually-translated text back into an image file. If I thought it was hard to translate an image file into text, imagine how hard it’d be for the client to receive complete nonsense and form something coherent out of it!
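To make the failure concrete, here is a minimal sketch (not my actual server code) of what happens when image bytes take a detour through a String. The class name TextRoundTrip and the method throughString are made up for illustration, and I am assuming UTF-8 as the encoding. The eight bytes are the standard signature that begins every PNG file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
class TextRoundTrip {
    // The first eight bytes of every PNG file (the PNG signature).
    static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    };

    // Decode the bytes as UTF-8 text, then encode that text back into bytes.
    // This mimics storing a file's contents in a String before sending it.
    static byte[] throughString(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] roundTripped = throughString(PNG_SIGNATURE);
        System.out.println("original:     " + Arrays.toString(PNG_SIGNATURE));
        System.out.println("after String: " + Arrays.toString(roundTripped));
        System.out.println("identical? " + Arrays.equals(PNG_SIGNATURE, roundTripped));
    }
}
```

The very first byte, 0x89, is not valid UTF-8 on its own, so the decoder swaps in a replacement character, and the bytes that come back out no longer match the ones that went in. Plain ASCII content such as HTML survives the same trip unchanged, which is why I never noticed the problem before.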

The solution was to get closer to the metal, as Uncle Bob might say. When my server locates the requested resource, it no longer tries to translate it and store it as text, but rather stores the collection of bytes as bytes. Indeed, why translate? It’s impossible for some files (like images), and for others, what benefit does the server derive from storing it as human-readable text? Unlike a human, the server is perfectly happy working with raw bytes. With my body now stored properly as bytes, I can then convert the rest of the response (the status line and headers) into bytes (with the delightfully simple getBytes() method) and send one big byte collection back to the client, which is perfectly happy receiving and interpreting raw bytes.
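As a rough sketch of the idea (again, not my real Response and Responder classes), the status line and headers can be turned into bytes with getBytes() and joined with a body that was never text in the first place. The class ByteResponse, the method build, and the particular headers are assumptions for illustration; in a real server the body would likely come from something like Files.readAllBytes and the result would be written to the socket’s OutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names and header choices are illustrative only.
class ByteResponse {
    // Joins the status line, headers, and raw body into one byte array.
    static byte[] build(String version, String status, String contentType, byte[] body) {
        String head = version + " " + status + "\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "\r\n"; // blank line separates headers from body
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(body, 0, body.length); // body bytes are copied untouched
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in for file contents: the first four bytes of a PNG file.
        byte[] body = {(byte) 0x89, 0x50, 0x4E, 0x47};
        byte[] response = build("HTTP/1.1", "200 OK", "image/png", body);
        System.out.println("response length in bytes: " + response.length);
    }
}
```

Only the head of the response ever exists as a String; the body goes from disk to the client without being decoded at any point.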