In philosophy, CTF3 was the same as our previous CTFs: we gave people a chance to solve problems they normally would only get to read about. However, in terms of infrastructure, this was by far our most complex CTF: we needed to build, run, and test arbitrary distributed systems code. In the course of the week it was live, our 7,500 participants pushed over 640,000 times, meaning we needed a scalable and robust architecture that provided isolation between users.
Participants have released a number of walkthroughs for the actual levels, so we won't be releasing official solutions here. Instead, we'll give you a tour of how we made the systems work. (If you'd prefer to see this in video form, we've just released the video from our CTF3 wrapup.)
As an aside, the architecture for CTF reflects a lot of what we've learned in building Stripe. If you're interested in this kind of thing, we're hiring engineers in San Francisco and remotely within US timezones. I also wrote a Quora post about the problems we're working on. (It turns out we do things besides just building CTFs :).)
CTF3 consisted of five levels. Most of the levels looked pretty similar from a high level: the user would push some code to us, we'd run it in a sandbox environment, and then we'd return a score. The one exception was the Gitcoin level, where we would just validate Git commits people had mined locally (or on their cloud vendor).
Code was submitted to us in the simplest possible way: you just ran git push. On the backend, we received your code and used wrappers and commit hooks to implement the CTF-specific logic.
The "wrappers and commit hooks" had a lot of moving parts, though. One important design goal was to decouple components and make it possible to horizontally scale any given piece of the system. Stateful pieces were few in number and were constrained to be low volume. In the following sections, we'll go into detail about how all the pieces worked, but here's how things roughly fit together:
Wondering what actually happened after you ran git push? The following steps were common to all levels.
stripe-ctf.com to the public IP for one of our
You connected to port 22 on your chosen gate server. An haproxy daemon load-balanced your traffic to one of our submitter boxes. We had three submitter boxes in the pool for much of the event.
As an optimization, the load balancing used IP stickiness to route you to the same submitter backend on each connection. The submitters were mostly stateless: all that they held was the code you were pushing and convenience tags for each submission. If you'd committed a large blob though, being routed to the same submitter was nice since you wouldn't have to re-upload it on each push.
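As a rough illustration of IP stickiness (not the haproxy configuration used for the event, which the post doesn't show), one common approach is to hash the client's source address and use the hash to pick a backend. The backend names below are hypothetical.

```python
import hashlib

# Hypothetical backend names; the post only says there were three submitters.
SUBMITTERS = ["submitter-01", "submitter-02", "submitter-03"]


def pick_submitter(client_ip: str, backends=SUBMITTERS) -> str:
    """Map a client IP to a backend so the same IP always lands on the same box."""
    digest = hashlib.sha1(client_ip.encode("ascii")).digest()
    # Use the first four bytes of the hash as an integer, then reduce it
    # to an index into the backend list.
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]
```

Because the choice depends only on the IP and the backend list, repeated connections from one address reach the same submitter without the balancer keeping any per-user state. The trade-off of plain modulo hashing is that changing the size of the pool remaps many clients at once.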
In previous CTFs, rather than load balancing, we'd just exposed our machine hostnames (so you'd connect directly to e.g. level0-01.stripe-ctf.com). In that case, it was hard to drop a machine out of the pool or rebalance traffic. Controlling the load balancing here gave us operational flexibility at the cost of additional constraints on our system design (e.g. haproxy knew only your IP address, so we couldn't do stickiness based on username).
The public-facing sshd on your chosen submitter received the username we'd given you in the web interface, which looked like
We'd configured our PAM stack to use LDAP. So that we could share the user database with the web interface, we put together a quick-and-dirty LDAP server implementation (called fakeuser) to grab usernames directly out of our central database. The users had empty passwords, which (given appropriate settings in sshd.conf and PAM) meant that you could log in without pasting a password or giving us your SSH key. Of course, the downside was that your username became a secret credential.
At this point the sshd ran your user's shell, which was a custom script in /usr/local/bin/login-shell. The shell was pretty simple: it set some environment variables, took out an flock on a per-user file, and then (conceptually) ran a bunch of Ruby code that did all of the level-specific work.
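To make the per-user flock concrete, here is a minimal Python sketch (the real login shell was a script that called into Ruby). The lock directory, file naming, and the idea that the lock serializes sessions for one user are assumptions; the post doesn't spell out what the lock protected.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def user_lock(username: str, lock_dir: str):
    """Hold an exclusive advisory lock on a per-user file for the duration of the block."""
    # Hypothetical layout: one lock file per user inside lock_dir.
    path = os.path.join(lock_dir, f"{username}.lock")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield path
    finally:
        # Release explicitly, then close the descriptor.
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

A second session for the same user that enters `user_lock` simply waits until the first one leaves the block, while sessions for other users are unaffected because they lock different files.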
At first, we'd actually spawn a new Ruby interpreter and load our code on each login. This turned out to be untenable. First of all, loading Bundler plus all our code took a few seconds, which was way too slow for a login session. So we split out the code intended for just the login session into a module we called CTF3NoBundler. This was painful to manage, and meant the no-bundler code couldn't use most of the libraries we were writing over in Bundler-land.
Even with this split, it still took about 100-200 milliseconds to load our code, which was effectively all CPU time. When we tested continuously running about 20 concurrent logins, the submitter box ground to a halt under the load. We effectively DOSed ourselves through the work of loading the same code over and over again.
At this point, perhaps the most obvious thing to do would be to rewrite in a faster-loading language. However, there's actually a decent amount of code involved in submission, and there was nothing wrong with the code once it was up and running. So instead, we decided to try a load-once, fork-for-each-login model. We took a look at using Zeus for this purpose. It's a cool tool, but unfortunately it's aimed at development rather than production, and doesn't have the kind of robust failure handling we'd need for something as core as this. So instead, we wrote a simpler implementation based off similar ideas, called Poseidon.
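The load-once, fork-for-each-login idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not Poseidon's actual design: the function names are hypothetical, the slow load is simulated with a sleep, and the sketch leaves out how a client hands its session to the master.

```python
import os
import time


def load_application():
    """Stand-in for the slow part: loading libraries and application code once."""
    time.sleep(0.05)  # pretend this is expensive start-up work
    return {"greeting": "hello"}


def handle_login(app, username: str) -> str:
    """Per-login work; runs in a forked child that inherits the loaded code."""
    return f"{app['greeting']}, {username}"


def serve_one(app, username: str) -> str:
    """Fork a child for one login and read its output back over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: `app` is already in memory, so nothing is reloaded.
        os.close(read_fd)
        try:
            os.write(write_fd, handle_login(app, username).encode())
            status = 0
        except Exception:
            status = 1
        # Exit immediately so the child never returns into the master's code.
        os._exit(status)
    # Parent (the long-lived master): collect the child's output and exit status.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        output = reader.read().decode()
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0:
        raise RuntimeError(f"login handler for {username} failed")
    return output
```

The master calls `load_application()` once at boot and then `serve_one(app, username)` per login, so each login pays for a fork rather than for loading the code again. A crash in one child also cannot corrupt the master's state, which matters for the kind of failure handling described above.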
Here's the point at which Gitcoin and the standard pipelines diverged. The remainder of the standard submission pipeline looked like the following:
Next, we constructed your user's level repository (that is, the actual repository that you would clone) if it didn't already exist on disk. This lazy assembly meant we didn't have to waste disk space on users until they'd actually fetched some code.
In the case of a pull, we would just run git-shell and be done with it. Pushes had a lot more going on, however.
In order to make submission as easy to test drive as possible, we wanted it to be possible to git push straight from a fresh clone. So before running git-shell, we played some branch renaming tricks.
We then invoked git-shell, which in turn invoked a post-receive hook. The hook was also implemented as a Poseidon client for fast boot.
The post-receive code in the Poseidon master then served as the coordinator of your scoring run. First, it called to a test_case_assigner service, which ran on the singleton colossus server. For this and other services which required synchronous responses, we used the Ruby Thrift abstractions we use internally at Stripe.
The test_case_assigner simply grabbed some free test case records from the database, marked them as allocated, and then returned the resulting cases. These test cases were originally created by the test_case_generator daemon (running on the testasaurus boxes; OK, we ran out of good names at some point). The generator simply ran our benchmark solution against random test cases. We stored metadata in our database, with the actual blob data stored on S3 so your client could later download it.
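The grab-free-records, mark-allocated, return pattern might look roughly like this in Python with SQLite. The table layout, column names, and function names are all hypothetical; the post doesn't describe the real schema or database.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical test_cases table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE test_cases ("
        "id INTEGER PRIMARY KEY, level INTEGER, allocated_to TEXT)"
    )
    return db


def assign_test_cases(db, level: int, username: str, count: int):
    """Claim up to `count` free test cases for `username` and return their ids."""
    with db:  # one transaction: commit on success, roll back on error
        rows = db.execute(
            "SELECT id FROM test_cases "
            "WHERE level = ? AND allocated_to IS NULL ORDER BY id LIMIT ?",
            (level, count),
        ).fetchall()
        ids = [row[0] for row in rows]
        # Mark the selected rows as taken so no later caller receives them.
        db.executemany(
            "UPDATE test_cases SET allocated_to = ? "
            "WHERE id = ? AND allocated_to IS NULL",
            [(username, case_id) for case_id in ids],
        )
    return ids
```

Running the assigner as a single service, as described above, presumably kept this simple: with only one process handing out cases, two callers cannot select the same free rows at the same moment.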
Once the post-receive hook had its test cases, it started listening on two new RabbitMQ queues: one for results and one for output to display to the user. The hook then submitted a build RPC over RabbitMQ. We used RabbitMQ as a buffer for RPCs that we expected might get backed up, or where a synchronous response wasn't needed.
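The reply-queue pattern in this step can be sketched without a real broker. The Python below uses an in-process stand-in for RabbitMQ, and every name in it (queue names, message fields, `submit_build`) is hypothetical; it only shows the ordering of "set up the reply queues, then publish the RPC".

```python
import queue
import uuid


class Broker:
    """Tiny in-process stand-in for a message broker: named FIFO queues."""

    def __init__(self):
        self._queues = {}

    def queue(self, name):
        # Create the queue on first use, like declaring it on a real broker.
        return self._queues.setdefault(name, queue.Queue())

    def publish(self, name, message):
        self.queue(name).put(message)


def submit_build(broker, username, sha1, test_cases):
    """Declare per-run reply queues, then enqueue a build RPC that names them."""
    run_id = uuid.uuid4().hex
    results_queue = f"results.{run_id}"
    output_queue = f"output.{run_id}"
    # Declare both reply queues before publishing so no reply can be lost.
    broker.queue(results_queue)
    broker.queue(output_queue)
    broker.publish("build", {
        "username": username,
        "sha1": sha1,
        "test_cases": test_cases,
        "results_queue": results_queue,
        "output_queue": output_queue,
    })
    return results_queue, output_queue
```

Whoever picks up the build message learns where to send results and user-visible output from the message itself, so the submitter and the workers never need to know about each other directly.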
At the other end of the queue was a builder daemon, running on one of our aptly-named build boxes. Upon receiving the RPC, the daemon fetched the code from the relevant submitter's git-daemon into a temporary directory.
The builder then asked a central build_cacher Thrift service if the built commit was cached. Assuming not, the builder spawned a Docker container with your code mounted at your user's home directory and ran ./build.sh in the container. We then streamed back the first few hundred KB of output.
The builder then tarred up your output directory and generated a RabbitMQ score RPC for each test case. The score RPC contained a URL to fetch the tarball from an nginx running on the build box. Finally, the builder uploaded the built tarball to S3 and informed the build_cacher about the new SHA1.
In the cached case, the builder just short-circuited this logic and sent the score RPCs right away.
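The cache check and short-circuit can be summarized in a small Python sketch. The class and function names are hypothetical, the cache is an in-memory dictionary rather than a Thrift service, and the `build` callable stands in for the container run and tarball upload.

```python
class BuildCache:
    """In-memory stand-in for a build cache: commit SHA1 -> tarball location."""

    def __init__(self):
        self._builds = {}

    def lookup(self, sha1):
        # Returns None on a cache miss.
        return self._builds.get(sha1)

    def record(self, sha1, tarball_url):
        self._builds[sha1] = tarball_url


def build_or_reuse(cache, sha1, build, test_cases):
    """Return one score RPC per test case, building only on a cache miss."""
    tarball_url = cache.lookup(sha1)
    if tarball_url is None:
        # The expensive path: run the build and remember where the result lives.
        tarball_url = build(sha1)
        cache.record(sha1, tarball_url)
    # Cached or freshly built, the fan-out to the scorers looks the same.
    return [
        {"sha1": sha1, "tarball_url": tarball_url, "test_case": test_case}
        for test_case in test_cases
    ]
```

Keying the cache on the commit SHA1 works because a commit hash identifies the exact contents being built, so pushing the same commit again can skip straight to scoring.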
Each score RPC was serviced by an executor daemon on a worker box. The executor fetched the build product and then spawned a new container with the code mounted into it. It then (finally!) ran your code, again streaming output back to you. Once complete, the executor determined the results of your trial and then sent a result RPC back to the post-receive hook.
The post-receive hook aggregated your results and from there compiled a final result. It sent a single FinalScoreRPC representing the results of the test run to RabbitMQ.
At the other end of the wire, a resulter daemon hung around on the colossus box waiting to consume the FinalScoreRPC. Upon consuming the RPC, it updated your user's high score.
Gitcoin had its own architecture. Since we didn't need to run any of your code (we just needed to validate the purported Gitcoin), we could get by with a lot less complexity.
Our mining bots
To clear the level, you just needed to mine faster than our bots. The obvious design is to spawn a new miner for each end-user. However, this would be pretty expensive, as we'd have to be mining hundreds of Gitcoins at any one time.
So instead, we had miners on a single central repository on the gitcoin box, which produced a steady stream of Gitcoins. Each submitter had a gitcoin daemon whose job was to periodically fetch from the central repository and then release at most one new commit to a machine-local Gitcoin instance.
We'd started out releasing a coin every rand(20) seconds, but after seeing how many people were struggling to mine that quickly, we slowed the release interval to a flat 90 seconds.
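The release behaviour of the per-submitter daemon (fetch, then publish at most one new coin per interval) can be sketched as a simple loop. This Python version is an illustration under assumptions: the callables `fetch_central` and `publish_local` are hypothetical stand-ins for the Git operations, and commits are represented as plain strings.

```python
import time


def run_release_loop(fetch_central, publish_local, ticks, interval=90, sleep=time.sleep):
    """Each tick, release at most one pre-mined commit, then wait `interval` seconds."""
    released = 0
    for _ in range(ticks):
        # Everything the central miners have produced so far, oldest first.
        commits = fetch_central()
        if released < len(commits):
            # Publish only the next commit, even if several are waiting.
            publish_local(commits[released])
            released += 1
        sleep(interval)
    return released
```

Capping the release at one commit per tick is what turns a bursty stream of pre-mined coins into a steady pace for players to beat: at a flat 90 seconds, a user sees at most 40 bot coins per hour no matter how fast the central miners run.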
When you pushed, we had a hook which would perform a bunch of checks to ensure it was a valid Gitcoin. Once your commit was accepted, the bots had to stop because our pre-mined Gitcoins wouldn't apply cleanly to your repository.
Gitcoin bonus round
In this round, we pitted everyone together in a master Gitcoin instance. Conveniently for us, we didn't have to run our own miners, since people provided plenty of competition against each other.
The architecture here was a single shared (created with git init --bare --shared=all) global Gitcoin repository on the gitcoin box. The submitters maintained their own clone of this repository.
On pull, you just hit the submitter repository. On push, the commit was validated by the submitter, which then pushed (via a new SSH connection) to the backend gitcoin box. If the backend push was successful, a Thrift service on the gitcoin box would synchronously push the new commit to all other submitters.
One consequence of this architecture was that submitting Gitcoins was decently slow: we weren't maintaining persistent connections to the backend gitcoin server, so there was a decent amount of overhead. We compensated for this by tuning the difficulty to ensure the time to mine a coin was large compared to the time to complete a push. By the contest's end, the difficulty was 0000000005, a full 4 (!) orders of magnitude harder than the difficulty we'd started with.
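To give a feel for what a difficulty string like 0000000005 means, here is a small Python sketch of a proof-of-work check. It assumes the rule that a commit counts as a Gitcoin when its SHA-1, written in hex, sorts lexicographically below the difficulty string; the post itself doesn't state the rule, and the real hook performed other checks as well. The nonce-in-the-message mining scheme is likewise just one hypothetical way to search.

```python
import hashlib


def git_commit_sha1(body: bytes) -> str:
    """SHA-1 that Git assigns to a commit object with the given raw body."""
    # Git hashes the object type and body length, a NUL byte, then the body.
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def meets_difficulty(sha1_hex: str, difficulty: str) -> bool:
    """Assumed rule: the hex digest must sort strictly below the difficulty."""
    return sha1_hex < difficulty


def mine(body_prefix: bytes, difficulty: str, max_tries: int = 1_000_000):
    """Append increasing nonces to the commit body until the hash qualifies."""
    for nonce in range(max_tries):
        body = body_prefix + str(nonce).encode("ascii") + b"\n"
        sha1_hex = git_commit_sha1(body)
        if meets_difficulty(sha1_hex, difficulty):
            return nonce, sha1_hex
    return None  # gave up without finding a qualifying hash
```

Under that assumed rule, a random hash beats 0000000005 with probability 5/16^10, so a miner would need on the order of 2 × 10^11 attempts per coin, which is why mining time could dwarf the overhead of a push.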
I hope you had as much fun playing CTF3 as we had building it. If you're curious about any details I didn't cover here, feel free to send me an email.