Meet Einhorn
Greg Brockman on May 24, 2012 in Engineering
Today we're happy to release Einhorn, the language-independent shared socket manager. Einhorn makes it easy to have multiple instances of an application server listen on the same port. You can also seamlessly restart your workers without dropping any requests. Einhorn requires minimal application-level support, making it easy to use with an existing project.
Motivation
The main alternatives for achieving this functionality are FastCGI (and related options such as Phusion Passenger) and Unicorn (and derivatives such as Rainbows!). In our case, using either would have required significant application changes. They also only work for applications that speak HTTP, so we decided to build a general solution.
Unicorn's architecture has a lot going for it, though. It uses a shared socket opened by a master process and then inherited by workers. This means all concurrency is handled by your operating system's scheduler. At any time, you can ask Unicorn to upgrade your workers, and it will spin up a new pool of workers before killing off the old. Unicorn can also preload your application, meaning it loads everything prior to forking so that your code is only stored in memory once.
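To make the shared-socket model concrete, here is a minimal Ruby sketch of the general pre-fork pattern. It is not Unicorn's or Einhorn's actual code: the method name prefork_demo, the worker count, and the reply text are made up for illustration, and it assumes a platform where fork is available.

```ruby
require "socket"

# Minimal pre-fork sketch: one listening socket, several forked workers.
# Returns the replies seen by a test client plus the worker PIDs.
def prefork_demo(worker_count: 3, requests: 5)
  # The master opens the socket once, before forking (port 0 = any free port).
  listener = TCPServer.new("127.0.0.1", 0)
  port = listener.addr[1]

  worker_pids = Array.new(worker_count) do
    fork do
      # Exit immediately on TERM, skipping at_exit hooks inherited from the parent.
      Signal.trap("TERM") { exit!(0) }
      # Every worker inherits the same socket and blocks in accept; the
      # kernel decides which worker receives each incoming connection.
      loop do
        client = listener.accept
        client.puts "handled by worker #{Process.pid}"
        client.close
      end
    end
  end

  # Act as a client: each connection is answered by whichever worker accepted it.
  replies = Array.new(requests) do
    TCPSocket.open("127.0.0.1", port) { |sock| sock.gets.chomp }
  end

  [replies, worker_pids]
ensure
  # Stop and reap the workers, then release the socket.
  (worker_pids || []).each do |pid|
    Process.kill("TERM", pid)
    Process.wait(pid)
  end
  listener&.close
end

replies, = prefork_demo
puts replies
```

The master never calls accept itself. It only owns the socket and the workers' lifecycle, which is what lets a new pool be started on the same socket before the old one is stopped.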
We decided to take the best features of Unicorn and roll them into a language-independent shared socket manager, which we dubbed Einhorn (the German word for unicorn).
Using Einhorn
Installing Einhorn is easy:
$ gem install einhorn
Running a process under Einhorn is as simple as:
$ einhorn -n 3 sleep 5
[MASTER 19665] INFO: Writing PID to /tmp/einhorn.pid
[MASTER 19665] INFO: Launching 3 new workers
[MASTER 19665] INFO: ===> Launched 19666
[WORKER 19666] INFO: About to exec ["/bin/sleep", "5"]
[MASTER 19665] INFO: ===> Launched 19667
[WORKER 19667] INFO: About to exec ["/bin/sleep", "5"]
[MASTER 19665] INFO: ===> Launched 19668
[WORKER 19668] INFO: About to exec ["/bin/sleep", "5"]
...
This will spawn and autorestart three copies of sleep 5. Einhorn is configured with a handful of command-line flags (run einhorn -h for usage).
Einhorn ships with a sample app, time_server, that shows how to use Einhorn's shared-socket features. To run it, cd into the example directory, and execute something like the following:
$ einhorn -m manual ./time_server srv:127.0.0.1:2345,so_reuseaddr
[MASTER 20265] INFO: Writing PID to /tmp/einhorn.pid
[MASTER 20265] INFO: Binding to 127.0.0.1:2345 with flags ["so_reuseaddr"]
[MASTER 20265] INFO: Launching 1 new workers
[MASTER 20265] INFO: ===> Launched 20266
[WORKER 20266] INFO: About to exec ["./time_server", "6"]
Called with ["6"]
[MASTER 20265] INFO: [client 2:7] client connected
[MASTER 20265] INFO: Received a manual ACK from 20266
[MASTER 20265] INFO: Up to 1 / 1 manual ACKs
[MASTER 20265] INFO: [client 2:7] client disconnected
...
Let's break down the arguments here. The -m manual flag indicates that Einhorn should wait for an explicit acknowledgement from the time_server before considering it "up". (By default, Einhorn will consider a worker up if it's been alive for one second.) When it's ready, the time_server worker connects to the Einhorn master and sends an ACK command.
The remaining arguments serve as a template of the program to run. Einhorn scans for server socket specifications of the form srv:(IP:PORT)[<,OPT>...]. When it finds one, it configures a corresponding socket and replaces the specification with the socket's file descriptor number. The specification srv:127.0.0.1:2345,so_reuseaddr is taken to mean "create a socket listening on 127.0.0.1:2345 with the SO_REUSEADDR flag set". In the above case, the opened socket had file descriptor number 6. See the README for more details on specifying server sockets.
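The substitution described above can be sketched in Ruby roughly as follows. This is an illustration of the idea, not Einhorn's implementation: the SOCKET_SPEC pattern and the helpers bind_socket_specs and adopt_listener are hypothetical names, and only the so_reuseaddr option is handled. The first helper plays the master's role (bind the socket, swap the spec for a descriptor number); the second shows how a worker written in Ruby might turn that number back into a usable server socket.

```ruby
require "socket"

# Hypothetical pattern for specs such as "srv:127.0.0.1:2345,so_reuseaddr".
SOCKET_SPEC = /\Asrv:([^:,]+):(\d+)((?:,\w+)*)\z/

# Master side (sketch): bind a socket for every spec found in argv and
# replace the spec with the socket's file descriptor number.
def bind_socket_specs(argv)
  sockets = []
  new_argv = argv.map do |arg|
    match = SOCKET_SPEC.match(arg)
    next arg unless match # ordinary arguments pass through untouched

    host = match[1]
    port = Integer(match[2])
    opts = match[3].split(",").reject(&:empty?)

    sock = Socket.new(:INET, :STREAM)
    sock.setsockopt(:SOCKET, :REUSEADDR, true) if opts.include?("so_reuseaddr")
    sock.bind(Addrinfo.tcp(host, port))
    sock.listen(128)
    sockets << sock
    sock.fileno.to_s
  end
  [new_argv, sockets]
end

# Worker side (sketch): adopt an already-listening socket from the
# descriptor number received on the command line, e.g. "6".
def adopt_listener(fd_arg)
  TCPServer.for_fd(Integer(fd_arg))
end
```

In a worker, something like adopt_listener(ARGV[0]).accept would then serve connections on the shared socket. One caveat for anyone building the master side in Ruby: descriptors are close-on-exec by default, so a real master would also have to clear that flag before exec'ing the worker.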
Features
Einhorn lets you spin up any number of worker processes (the number can be adjusted on the fly), each possessing one or more shared sockets. Einhorn can spawn a new pool of workers and gracefully kill off the old ones, allowing seamless upgrades to new versions of your code. Einhorn also gets out of your application's way: the shared sockets are just file descriptors which your application manipulates directly or manages with an existing framework. You can introspect a running Einhorn's state or send it administrative commands using its command shell, einhornsh.
If you happen to be using Ruby, Einhorn can also preload your application. Just pass -p PATH_TO_CODE and define a method einhorn_main as your workers' entry point:
$ einhorn -n 2 -p ./pool_worker.rb ./pool_worker.rb argument
[MASTER 20873] INFO: Writing PID to /tmp/einhorn.pid
[MASTER 20873] INFO: Set ARGV = ["argument"]
[MASTER 20873] INFO: Requiring ./pool_worker.rb (if this hangs, make sure your code can be properly loaded as a library)
From PID 20873: loading /home/gdb/stripe/einhorn/example/pool_worker.rb
[MASTER 20873] INFO: Successfully loaded ./pool_worker.rb
[MASTER 20873] INFO: Launching 2 new workers
[MASTER 20873] INFO: ===> Launched 20875
[WORKER 20875] INFO: About to tear down Einhorn state and run einhorn_main
[WORKER 20875] INFO: Set $0 = "./pool_worker.rb argument", ARGV = ["argument"]
[MASTER 20873] INFO: ===> Launched 20878
From PID 20875: Doing some work
[WORKER 20878] INFO: About to tear down Einhorn state and run einhorn_main
[WORKER 20878] INFO: Set $0 = "./pool_worker.rb argument", ARGV = ["argument"]
From PID 20878: Doing some work
...
As in Unicorn, this reduces memory usage and makes spawning additional workers very lightweight. Preloading is Einhorn's only language-dependent feature (and was easy to implement because Einhorn is itself written in Ruby). Adding preloading for other languages would require some architectural changes, but we might do it in the future.
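For reference, a preloadable worker file could look roughly like the sketch below. It is written to match the log output above but is not necessarily the pool_worker.rb that ships in the example directory. The work_message helper is a hypothetical stand-in for real work, and the only contract taken from the description above is that the file loads cleanly as a library and defines einhorn_main.

```ruby
# Illustrative preloadable worker (an assumption-laden sketch, not the
# shipped example file).

# Top-level code runs once, in the master, when the file is preloaded.
puts "From PID #{Process.pid}: loading #{__FILE__}"

# Hypothetical unit of work; a real worker would process jobs here.
def work_message
  "From PID #{Process.pid}: Doing some work"
end

# Entry point run in each forked worker when Einhorn is started with -p.
# Nothing at the top level calls it, so loading the file never blocks.
def einhorn_main
  loop do
    puts work_message
    sleep 1
  end
end
```

Keeping the top level free of long-running work matters here: as the log message warns, the master hangs if requiring the file never returns.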

Though Einhorn requires very little cooperation from your code, we still had to do some work to make our API servers compatible. In particular, we use Thin and EventMachine, both of which needed patching to support the use of an existing file descriptor. The relevant patches are on the master branch of our public forks of the respective projects.
These days, we use Einhorn to run all of our application servers. We also use it to run our non-web processes where we want to spawn and keep alive multiple instances. We run Einhorn under a process manager (we use daemontools, but any will work), so adding Einhorn to your existing infrastructure should just require adding einhorn to the command line of your managed processes.
We've been using Einhorn in production for a number of months now. We hope you'll find it useful as well. If you want to run a web app but can't use Unicorn, or if you have a worker process that you want to start pooling, you should check Einhorn out and let us know what you think!