Streaming HTTP responses with Ruby and Rack
Almost every Ruby program that handles HTTP requests will use the Rack interface to do so. Rack is an interface describing what applications can expect an HTTP request to look like and how they should respond. Having a common interface for this allows any web framework (such as Rails or Sinatra) to interoperate with any of the available servers (like Unicorn, Puma, or Falcon). In this post we’ll look at a subset of HTTP responses, namely those whose body is not completely known when the response starts, or is too large to fit into memory. These responses must therefore be generated dynamically and streamed to the client.
How Rack handles responses
All Rack responses are of the form [<response code>, <hash with headers>, <response body>], where the response body can be any object that yields zero or more Strings when each is called on it. A very simple example would be an Array with one String in it, like ['Hello world!']. However, it is also possible to write your own classes with an arbitrarily complex each method. With this, you can implement more sophisticated behavior like ranged requests and streaming responses. For example, the following class streams the numbers 1 to 10 at one-second intervals:
class SlowStreamer
  # Rack calls each and sends every yielded String to the client
  def each
    (1..10).each do |i|
      yield(i.to_s + "\n")
      sleep 1
    end
  end
end
You could put a SlowStreamer.new into a Rack response and any Ruby server (e.g. Puma, Unicorn, Thin, or Falcon) will stream it for you without problems. It’s not even required to add a Content-Length header because these servers will automatically apply chunked encoding for you. This basic building block can be used for a lot of different applications.
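To make the moving parts concrete, here is a minimal sketch of a complete Rack application around such a body, together with a rough approximation of what a server does when no Content-Length is given. NumberStreamer, StreamingApp and write_chunked are names made up for this illustration; real servers implement chunked encoding internally and with more care than this.

```ruby
require 'stringio'

# Same idea as SlowStreamer, but with a configurable delay so the sketch runs instantly.
class NumberStreamer
  def initialize(delay: 0)
    @delay = delay
  end

  def each
    (1..3).each do |i|
      yield "#{i}\n"
      sleep @delay
    end
  end
end

# A Rack application is any object that responds to call(env)
# and returns the [status, headers, body] triplet.
class StreamingApp
  def call(_env)
    [200, { 'content-type' => 'text/plain' }, NumberStreamer.new]
  end
end

# Hypothetical helper: frames each yielded String as one HTTP/1.1 chunk
# (size in hex, CRLF, data, CRLF) and finishes with a zero-length chunk.
def write_chunked(body, io)
  body.each do |chunk|
    next if chunk.empty? # an empty chunk would signal the end of the response
    io << chunk.bytesize.to_s(16) << "\r\n" << chunk << "\r\n"
  end
  io << "0\r\n\r\n"
end

_status, _headers, body = StreamingApp.new.call({})
out = StringIO.new
write_chunked(body, out)
# out.string is now "2\r\n1\n\r\n2\r\n2\n\r\n2\r\n3\n\r\n0\r\n\r\n"
```

Because the server only ever holds one yielded String at a time, the client starts receiving data as soon as the first chunk is produced.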
Zip streaming
A zip file consists of a number of sections, each made up of a header, the actual file contents and then an optional footer. At the end of the file is another section called the “central directory”, which contains more metadata. All these values are just sequences of bytes, which means that they can be represented as Strings with the BINARY encoding. By yielding each of those values in turn, it’s possible to dynamically create and stream out an entire zip file in an HTTP response, and this is exactly what the ZipTricks gem provides:
body = ZipTricks::RackBody.new do |zip|
  zip.write_stored_file('mov.mp4') do |sink| # Those MPEG4 files do not compress that well
    File.open('mov.mp4', 'rb') { |source| IO.copy_stream(source, sink) }
  end
  zip.write_deflated_file('long-novel.txt') do |sink|
    File.open('novel.txt', 'rb') { |source| IO.copy_stream(source, sink) }
  end
end
[200, {}, body]
This snippet will take care of generating all the zip-specific bits you will need and yield them in turn. It is quite feasible to have a big list of files and just @files.each all of those files into the RackBody. No matter how large these files are, the amount of memory used for the response will remain small.
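The sink in the snippet above is written to, while Rack wants a body that yields. A minimal sketch of how a write-style block can be turned into an each-style body is shown below. This is not the ZipTricks implementation; BlockWriter and WriterBody are hypothetical names, and the sketch leaves out everything zip-specific.

```ruby
require 'stringio'

# Hypothetical sink: forwards everything written to it to a block.
class BlockWriter
  def initialize(&block)
    @block = block
  end

  def <<(bytes)
    # String#b returns a binary-encoded copy. The copy matters because
    # IO.copy_stream may reuse its internal buffer between writes.
    @block.call(bytes.b) unless bytes.empty?
    self
  end

  # IO.copy_stream expects its destination to respond to write
  # and to return the number of bytes written.
  def write(bytes)
    self << bytes
    bytes.bytesize
  end
end

# Hypothetical Rack body: runs the generator block on every each call
# and passes each written String straight on to the server.
class WriterBody
  def initialize(&generator)
    @generator = generator
  end

  def each(&chunk_handler)
    @generator.call(BlockWriter.new(&chunk_handler))
  end
end

source = StringIO.new('a' * 100_000)
body = WriterBody.new { |sink| IO.copy_stream(source, sink) }
total = 0
body.each { |chunk| total += chunk.bytesize } # the data is never held in one piece
```

Memory use stays bounded because each chunk is handed to the server as soon as it is written, rather than being collected first.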
Ranged requests
HTTP requests can have a Range header attached to indicate that the client does not want the entire response but only a part of it. This is fairly easy to implement if the response is a short String or even a smallish file on disk, but if the response is composed of several (possibly many) long strings or files together, then things get a lot trickier. Luckily, there’s a gem for that too. The interval_response gem can, for example, be given a list of file paths to be composed together (if you are streaming log files, for example) and will automatically compute which parts of which files are required. It will make sure that only one file at a time is opened and that only a constant amount of memory is required, no matter the size of the files.
lazy_files = log_paths.map { |path| IntervalResponse::LazyFile.new(path) }
interval_sequence = IntervalResponse::Sequence.new(*lazy_files)
response = IntervalResponse.new(interval_sequence, env)
response.to_rack_response_triplet
The trick is once again carefully choosing where and how to yield (or not, in this case) your Strings so that in the end only the parts of the body that the user actually requested are actually computed.
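The arithmetic behind "which parts of which files" can be sketched in a few lines. The helpers below are hypothetical and not part of interval_response; they assume the Range header has already been parsed into an inclusive Ruby Range of byte offsets that lies within the combined size.

```ruby
# Given the sizes of the files that make up the response and the requested
# (inclusive) byte range, return [file_index, range_within_file] pairs.
def segments_for(sizes, requested)
  segments = []
  offset = 0 # absolute position of the current file's first byte
  sizes.each_with_index do |size, index|
    first = [requested.begin, offset].max
    last = [requested.end, offset + size - 1].min
    # Files that do not overlap the request never need to be opened.
    segments << [index, (first - offset)..(last - offset)] if first <= last
    offset += size
  end
  segments
end

# Value of the Content-Range header that accompanies a 206 Partial Content response.
def content_range_header(requested, total_size)
  "bytes #{requested.begin}-#{requested.end}/#{total_size}"
end

sizes = [10, 20, 5] # three files, 35 bytes in total
segments_for(sizes, 8..12)            # => [[0, 8..9], [1, 0..2]]
content_range_header(8..12, sizes.sum) # => "bytes 8-12/35"
```

A body built on top of this would open each listed file in turn, seek to the start of the range and yield only those bytes.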
Rails and Sinatra streaming helpers
It’s fairly uncommon in the Ruby world to program directly against the Rack interface that web servers expose. Usually, we use a framework that implements a lot of the boilerplate code like routing, logging, etc. Both Rails and Sinatra have defined helpers to let you easily define streaming responses for your app.
Rails
In Rails, include ActionController::Live into any controller that needs to stream its responses, then define an action (called stream here) that writes to response.stream:
def stream
  100.times {
    response.stream.write "hello world\n"
    sleep 1
  }
ensure
  # Always close the stream, even if writing fails
  response.stream.close
end
Sinatra
In Sinatra you can call stream in any route handler. It takes a block with an out parameter representing the output, which you can << the data into:
get '/my-endpoint' do
  stream do |out|
    100.times do |i|
      out << "hello world\n"
      sleep 1
    end
  end
end
Conclusion
Rack’s choice to allow anything that responds to each to function as a response body makes streaming very straightforward. It also allows us to create dynamic response bodies very easily. A small caveat is that you should be careful when creating a response body that will yield many small strings. Most servers will assume that you are doing this intentionally and will use the write() syscall on every string yielded, potentially causing a lot of OS overhead. In one example we were able to get a 4-fold increase in requests per second just by combining the smaller chunks into bigger ones before releasing them to the server.
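One way to combine small chunks is to wrap the body in another body that buffers. The sketch below is a hypothetical BufferedBody, not the code used for the measurement above, and the 16 KB default threshold is an arbitrary assumption that you would want to tune.

```ruby
# Wraps any Rack body and only yields once enough bytes have accumulated.
class BufferedBody
  def initialize(body, min_chunk_size: 16 * 1024)
    @body = body
    @min_chunk_size = min_chunk_size
  end

  def each
    buffer = String.new # String.new without arguments is binary-encoded
    @body.each do |chunk|
      buffer << chunk.b
      next if buffer.bytesize < @min_chunk_size
      yield buffer
      buffer = String.new # start a fresh buffer, the server may keep the old one
    end
    yield buffer unless buffer.empty? # flush whatever is left
  end

  # Rack calls close on bodies that respond to it, so pass it on.
  def close
    @body.close if @body.respond_to?(:close)
  end
end

small_chunks = Array.new(10) { 'ab' } # an Array is a valid Rack body too
sizes = []
BufferedBody.new(small_chunks, min_chunk_size: 5).each { |c| sizes << c.bytesize }
# sizes is now [6, 6, 6, 2]
```

The trade-off is latency: buffered data only reaches the client once the threshold is hit, so this suits large downloads better than slow, event-like streams such as SlowStreamer.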
If you require more fancy response types such as websocket upgrades, you’ll probably want to look into socket hijacking as that allows you a lot more control over the socket. Happy streaming!