
Running three hours of Ruby tests in under three minutes

Nelson Elhage on August 13, 2015

At Stripe, we make extensive use of automated testing to help ensure the stability and reliability of our services. We have expansive test coverage for our API and other core services, we run tests on a continuous integration server over every git branch, and we never deploy without green tests.

The size and complexity of our codebase has grown over the past few years—and so has the size of the test suite. As of August 2015, we have over 1400 test files that define nearly 15,000 test cases and make over 130,000 assertions. According to our CI server, the tests would take over three hours if run sequentially.

With a large (and growing) group of engineers waiting for those tests with every change they make, the speed of running tests is critical. We’ve used a number of hosted CI solutions in the past, but as test runtimes crept past 10 minutes, we brought testing in-house to give us more control and room for experimentation.

Recently, we’ve implemented our own distributed test runner that brought the runtime of our tests to just under three minutes. While some of these tactics are specific to our codebase and systems, we hope sharing what we did to improve our test runtimes will help other engineering organizations.

Forking executor

We write tests using minitest, but we've implemented our own plugin to execute tests in parallel across multiple CPUs on multiple different servers.

In order to get maximum parallel performance out of our build servers, we run tests in separate processes, allowing each process to make maximum use of the machine's CPU and I/O capability. (We run builds on Amazon's c4.8xlarge instances, which give us 36 cores each.)

Initially, we experimented with using Ruby's threads instead of multiple processes, but discovered that using a large number of threads was significantly slower than using multiple processes. This slowdown was present even if the Ruby threads were doing nothing but monitoring subprocess children. Our current runner doesn’t use Ruby threads at all.

When tests start up, we start by loading all of our application code into a single Ruby process so we don’t have to parse and load all our Ruby code and gem dependencies multiple times. This process then calls fork a number of times to produce N different processes that’ll each have all of the code pre-loaded and ready to go.

Each of those workers then starts executing tests. As they execute tests, our custom executor forks further: Each process forks and executes a single test file’s worth of tests inside the child process. The child process writes the results to the parent over a pipe, and then exits.

This second round of forking provides a layer of isolation between tests: If a test makes changes to global state, running the test inside a throwaway process will clean everything up once that process exits. Isolating state at a per-file level also means that running individual tests on developer machines will behave similarly to the way they behave in CI, which is an important debugging affordance.
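As a rough sketch, the per-file fork-and-pipe pattern looks something like the following. The method and result handling here are illustrative, not Stripe's actual runner:

```ruby
# Sketch of the fork-per-test-file isolation pattern. The child runs one
# file's worth of tests, writes the result over a pipe, and exits; any
# global-state mutation dies with the child process.
def run_isolated
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    result = yield                      # run one test file's tests here
    writer.write(Marshal.dump(result))  # report back to the parent
    writer.close
    exit!(0)                            # throwaway child: skip at_exit hooks
  end
  writer.close
  data = reader.read                    # parent reads until the child closes
  Process.wait(pid)                     # reap the child; no zombies
  Marshal.load(data)
end

# Demonstration: the child's change to $counter never reaches the parent.
$counter = 0
run_isolated { $counter = 99; :done }
puts $counter   # still 0 in the parent
```

Because the result comes back through Marshal over a pipe, the parent only ever sees plain data, never the child's mutated interpreter state.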


The custom forking executor spawns a lot of processes and creates a number of scratch files on disk. We run all builds at Stripe inside of Docker, which means we don't need to worry about cleaning up all of these processes or the on-disk state. At the end of a build, all of that state—whether in-memory processes or files on disk—will be cleaned up by a docker stop, every time.

Managing trees of UNIX processes is notoriously difficult to do reliably, and it would be easy for a system that forks this often to leak zombie processes or stray workers (especially during development of the test framework itself). Using a containerization solution like Docker eliminates that nuisance, and eliminates the need to write a bunch of fiddly cleanup code.

Managing build workers

In order to run each build across multiple machines at once, we need a system to keep track of which servers are currently in-use and which ones are free, and to assign incoming work to available servers.

We run all our tests inside of Jenkins. Rather than writing custom code to manage worker pools, we (ab)use a Jenkins plugin called the matrix build plugin.

The matrix build plugin is designed for projects where you want a "build matrix" that tests a project in multiple environments. For example, you might want to build every release of a library against several versions of Ruby and make sure it works on each of them.

We misuse it slightly by configuring a custom build axis, called BUILD_ROLE, and telling Jenkins to build with BUILD_ROLE=leader, BUILD_ROLE=worker1, BUILD_ROLE=worker2, and so on. This causes Jenkins to run N simultaneous jobs for each build.

Combined with some other Jenkins configuration, we can ensure that each of these builds runs on its own machine. Using this, we can take advantage of Jenkins worker management, scheduling, and resource allocation to accomplish our goal of maintaining a large pool of identical workers and allocating a small number of them for each build.


Once we have a pool of workers running, we decide which tests to run on each node.

One tactic for splitting work—used by several of our previous test runners—is to split tests up statically. You decide ahead of time which workers will run which tests, and then each worker just runs those tests start-to-finish. A simple version of this strategy just hashes each test and takes the result modulo the number of workers; sophisticated versions can record how long each test took, and try to divide tests into groups of equal total runtime.
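A minimal version of the hash-modulo scheme might look like this (file names and the helper are illustrative only):

```ruby
require "zlib"

# Static allocation: each test file hashes to a fixed worker, so every run
# assigns the same files to the same workers.
def static_buckets(files, n_workers)
  files.group_by { |f| Zlib.crc32(f) % n_workers }
end

files = %w[test/api_test.rb test/charge_test.rb
           test/refund_test.rb test/webhook_test.rb]
buckets = static_buckets(files, 2)
# Every file lands in exactly one bucket, but nothing here balances
# runtimes, which is why stragglers are so common with this scheme.
```

The determinism is the appeal (no coordination needed at runtime) and also the flaw: a bucket that happens to collect the slow files stays slow on every run.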

The problem with static allocations is that they’re extremely prone to stragglers. If you guess wrong about how long tests will take, or if one server is briefly slow for whatever reason, it’s very easy for one job to finish far after all the others, which means slower, less efficient tests.

We opted for an alternate, dynamic approach, which allocates work in real-time using a work queue. We manage all coordination between workers using an nsqd instance. nsq is a super-simple queue that was developed at Bitly; we already use it in a few other places, so it was natural to adopt it here.

Using the build number provided by Jenkins, we separate distinct test runs. Each run makes use of three queues to coordinate work:

  • The node with BUILD_ROLE=leader writes each test file that needs to be run into the test.<BUILD_NUMBER>.jobs queue.
  • As workers execute tests, they write the results back to the test.<BUILD_NUMBER>.results queue, where they are collected by the leader node.
  • Once the leader has results for each test, it writes "kill" signals to the test.<BUILD_NUMBER>.shutdown queue, one for each worker machine. A thread on each worker pulls off a single event and terminates all work on that node.

Each worker machine forks off a pool of processes after loading code. Each of those processes independently reads from the jobs queue and executes tests. By relying on nsq for coordination even within a single machine, we have no need for a second, machine-local, communication mechanism, which might risk limiting our concurrency across multiple CPUs.
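To make the protocol concrete, here's an in-process sketch with a plain Ruby Queue standing in for the three nsqd topics (the real runner talks to an nsqd instance over the network; file names and worker counts are illustrative):

```ruby
build  = 42   # Jenkins' BUILD_NUMBER scopes each run's queues
topics = {
  jobs:     "test.#{build}.jobs",
  results:  "test.#{build}.results",
  shutdown: "test.#{build}.shutdown",
}

# Thread::Queue stands in for nsqd so the protocol is visible end-to-end.
jobs, results, shutdown = Queue.new, Queue.new, Queue.new
files = %w[test/a_test.rb test/b_test.rb test/c_test.rb]

# Leader: enqueue every test file that needs to run.
files.each { |f| jobs << f }

# Workers: independently pull files until the jobs queue drains,
# pushing one result per file back to the leader.
workers = 4.times.map do
  Thread.new do
    loop do
      file = begin
        jobs.pop(true)        # non-blocking pop
      rescue ThreadError
        break                 # queue drained; this worker is done
      end
      results << [file, :passed]  # a real worker executes the file here
    end
  end
end
workers.each(&:join)

# Leader: collect one result per file, then signal each worker to stop.
collected = files.map { results.pop }
workers.size.times { shutdown << :kill }
```

The key property carries over to the distributed version: workers only claim a file when they're free to run it, so load balances itself without any scheduler.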

Other than the leader node, all nodes are homogeneous; they blindly pull work off the queue and execute it, and otherwise behave identically.

Dynamic allocation has proven to be hugely effective. All of our worker processes across all of our different machines reliably finish within a few seconds of each other, which means we're making excellent use of our available resources.

Because workers only accept jobs as they go, work remains well-balanced even if things go slightly awry: Even if one of the servers starts up slightly slowly, or if there isn't enough capacity to start all four servers right at once, or if the servers happen to be on different-sized hardware, we still tend to see every worker finishing essentially at once.


Reasoning about and understanding performance of a distributed system is always a challenging task. If tests aren't finishing quickly, it's important that we can understand why so we can debug and resolve the issue.

The right visualization can often capture performance characteristics and problems in a very powerful (and visible) way, letting operators spot the problems immediately, without having to pore through reams of log files and timing data.

To this end, we've built a waterfall visualizer for our test runner. The test processes record timing data as they run, and save the result in a central file on the build leader. Some Javascript d3 code can then assemble that data into a waterfall diagram showing when each individual job started and stopped.

Waterfall diagrams of a slow test run and a fast test run.

Each group of blue bars shows tests run by a single process on a single machine. The black lines that drop down near the right show the finish times for each process. In the first visualization, you can see that the first process (and to a lesser extent, the second) took much longer to finish than all the others, meaning a single test was holding up the entire build.

By default, our test runner uses test files as the unit of parallelism, with each process running an entire file at a time. Because of stragglers like the above case, we implemented an option to split individual test files further, distributing the individual test classes in the file instead of the entire file.

If we apply that option to the slow files and re-run, all the "finished" lines collapse into one, indicating that every process on every worker finished at essentially the same time—an optimal usage of resources.

Notice also that the waterfall graphs show processes generally going from slower tests to faster ones. The test runner keeps a persistent cache recording how long each test took on previous runs, and enqueues tests starting with the slowest. This ensures that slow tests start as soon as possible and is important for ensuring an optimal work distribution.
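A sketch of that slowest-first ordering, with hypothetical file names and a plain Hash standing in for the persistent timing cache:

```ruby
# Timing cache from previous runs (seconds); nil means never seen before.
cache = {
  "test/model_test.rb"   => 95.0,
  "test/api_test.rb"     => 4.2,
  "test/helpers_test.rb" => 0.3,
  "test/new_test.rb"     => nil,
}

# Enqueue slowest-first. Unknown files sort first so a surprisingly slow
# new file can't become a straggler at the end of the build.
ordered = cache.keys.sort_by { |f| -(cache[f] || Float::INFINITY) }
# => ["test/new_test.rb", "test/model_test.rb",
#     "test/api_test.rb", "test/helpers_test.rb"]
```

Slow tests enqueued last would each extend the build by nearly their full runtime; enqueued first, they overlap with everything else.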

The decision to invest effort in our own testing infrastructure wasn't necessarily obvious: we could have continued to use a third-party solution. However, spending a comparatively small amount of effort allowed the rest of our engineering organization to move significantly faster—and with more confidence. I'm also optimistic this test runner will continue to scale with us and support our growth for several years to come.

If you end up implementing something like this (or have already), send me a note! I'd love to hear what you've done, and what's worked or hasn't for others with similar problems.


Stripe in the Nordics

Felix Huber on June 15, 2015

We’ve been in beta in the Nordics for the past year. In that time, we’ve had the chance to work closely with a growing number of companies here to better understand the challenges they face in running a business in Denmark, Finland, Norway, and Sweden.

Starting today, any company in these countries can set up a Stripe account and accept live charges in minutes. As usual, all businesses can accept payments from customers around the world in over 135 currencies.

Based on our conversations with many of our beta users, we’re also updating our pricing for the countries in the region:

  • Denmark: 1.7% + 1.8kr for Danish cards and 2.9% + 1.8kr for international cards
  • Finland: 2.0% + 20c for Finnish cards and 2.9% + 20c for international cards
  • Norway: 2.4% + 2kr for Norwegian cards and 2.9% + 2kr for international cards
  • Sweden: 1.9% + 1.8kr for Swedish cards and 2.9% + 1.8kr for international cards

Thanks to the hundreds of beta users who’ve helped to shape our product for the Nordics. We’ve been delighted to work with a wide variety of companies to bring their businesses online—from Cancerfonden (Sweden’s largest non-profit organization for cancer research) to Tictail (a platform that lets users create and run beautiful online stores within minutes).

Now that we’re fully launched, we’re looking forward to seeing all the new experiences you’ll build with Stripe.

If you have any questions or feedback (or are interested in working with us), please drop me a line.



Preview subscription changes

Peter Raboud on April 23, 2015

Upgrades! We love them. You know you’re doing something right when a user wants to pay more for your product. But upgrades can be tricky, too. If you prorate the change, your customers may be confused about how much they’ll be charged on their next bill.

We now offer a preview of upgrade charges for subscriptions before they happen. You can see how switching plans or changing quantities would impact a customer by querying the upcoming invoices endpoint. We’ll return an estimate of the user’s next invoice, including any applicable prorations. You can display this estimate to users to maximize the chance they finish their upgrade.

Here’s an example of previewing the upcoming invoice when you pass a new subscription plan your customer is considering:

curl https://api.stripe.com/v1/invoices/upcoming -G \
   -u sk_test_BQokikJOvBiI2HlWgH4olfQ2: \
   -d customer=cus_66Nqfe223Fjuy0 \
   -d subscription=sub_66Nux8KYRsiquq \
   -d subscription_plan=super_gold_plan

{
  "date": 1432252223,
  "period_start": 1429808152,
  "period_end": 1432252223,
  "next_payment_attempt": 1432255823,
  "amount_due": 1762
}

You can also estimate the charge for a subscription change at a specific point of time in the future. Just pass a proration_date when previewing the amount.

We hope this helps provide your customers with more predictable upgrades (or downgrades). If you have any questions or feedback, please let me know.


The new Connect

Brian Krausz on March 23, 2015

In 2012, we noticed that many of the most exciting Stripe users were building businesses that helped others accept money. Shopify was helping e-commerce businesses get started and Postmates was building a mobile restaurant delivery network. In October 2012, we launched Stripe Connect to allow these multi-sided platforms to connect with thousands of seller accounts on Stripe.

Plenty has changed since then. The number of these platforms has exploded, with services like Instacart, Kickstarter, Shyp, Tilt, Lyft, TaskRabbit, and Handy. Stripe has helped over half a million sellers get paid on platforms like these. And we’ve learned a lot about the subtleties of each use case.

Regular businesses just need to accept money from buyers, but running one of these platforms is much more complex. They also have to verify seller identity (to comply with know-your-customer laws and to prevent fraud), collect and verify their sellers’ banking information, track seller earnings, help sellers get paid on the right schedule, handle IRS tax reporting requirements, and more. I get tired just thinking about it.

The new Stripe Connect is the result of everything we’ve learned from powering these platforms. The changes we’ve made make setting up accounts for sellers even easier—they don’t even need to come to Stripe. You can now support sellers in more countries. And Stripe helps with everything involved in operating the platform.

Managed accounts

In addition to connecting to regular Stripe accounts (which Connect has supported since 2012), we’re now enabling platforms to spin up and administer “managed accounts”. Managed accounts allow you to customize all aspects of the experience for sellers—from what the setup flow looks like and payment schedules to who pays fees and when info is collected. These managed accounts can be set up for sellers wherever Stripe is supported (18 countries, with more coming this year). This unifies our previously-separate “Transfers API” with Connect.

Lightweight setup for sellers

With managed accounts, Stripe gets out of the way of your relationship with your sellers or contractors. You can fully customize how sellers join your platform and build very lightweight sign-up flows. In fact, you can get a seller started with just a country and an email address using the new accounts endpoint:

curl https://api.stripe.com/v1/accounts \
   -u sk_test_BQokikJOvBiI2HlWgH4olfQ2: \
   -d managed=true \
   -d country=US

{
  "id": "acct_12QkqYGSOD4VcegJ",
  "keys": {
    "secret": "sk_live_AxSI9q6ieYWjGIeRbURf6EG0",
    "publishable": "pk_live_h9xguYGf2GcfytemKs5tHrtg"
  },
  "managed": true,
  ...
}

We’ll send back the seller’s account information, which you can use to start creating charges on their behalf right away. We’ll let you know via the account.updated webhook if and when you need to collect any additional info about your sellers.

When creating charges, you can also specify the seller receiving the funds. Stripe will handle paying out the seller on the schedule you specified, and you won’t need to manually reconcile payments and transfers, making accounting and bookkeeping easier:

curl https://api.stripe.com/v1/charges \
   -u sk_test_BQokikJOvBiI2HlWgH4olfQ2: \
   -d amount=1000 \
   -d currency=EUR \
   -d customer=cus_49mpFwI9tFb1AO \
   -d destination=acct_12QkqYGSOD4VcegJ \
   -d application_fee=200

International sellers

Scaling a business internationally can be pretty hard, but it’s especially difficult for platforms. Traditionally, supporting sellers in other countries required you either to run all international transactions through the U.S. (which meant more declined cards and currency conversion fees) or to register local business entities in every supported region.

With Connect, we’ve worked to help you provide a local experience for your sellers while keeping your code manageable and scalable. For example, you might need a Spanish crowdfunding campaign’s Número de Identificación Fiscal or an Australian boutique’s ABN before they can get paid. In all these cases, we’ll help you out and let you know what info to collect via the aptly-named fields_needed array on the account.

Build apps for Stripe users

As before, you can use Connect to get secure access to Stripe data, and use that to build dashboards, invoicing integrations, feature add-ons, and more. If you like being meta, you can now also create integrations specifically for platforms, such as the one built by QuickBooks for self-employed workers.


Nearly all of Connect’s functionality is free. The cost for accepting payments remains the same. Setting up managed accounts costs just 0.5% of funds paid out—this flat rate includes ID verification, helping to generate tax documents where necessary, and even international accounts.

We’re thrilled to see diverse marketplaces and platforms being built and grown on Stripe.

Marketplaces have always been a big deal (just ask Bill Gurley), but we think that they’re likely to become more important still. As the rise of smartphones lets internet businesses make the leap from the virtual to the real world, we think that enormously successful marketplaces and platforms remain to be built. We look forward to building the most useful tools and APIs for them along the way.

Thanks very much to our beta users for their feedback which has helped us shape the product. If you have any questions, get in touch!



Unifying payment types in the API

Max Lahey on March 11, 2015

As we expand the range of payment types that we support, we’ve released a new API version to unify the interface across payments of all types.

Globally, credit cards are the source of most online payments but they’re far from ubiquitous. Local payment mechanisms (such as China’s prominent e-wallets) power commerce in markets around the world. Meanwhile, novel payment types, from digital currencies to Apple Pay, are establishing new standards for online transactions.

In accepting these new instruments, there’s a lot to be excited about: broader global reach, increased revenue, and improved user experience. We want to make supporting them as easy as possible—hence building this unified API. For essentially all purposes, a payment is a payment regardless of where it came from. Payments of all types behave identically in the API and Dashboard.

Here’s a quick overview of the new API version:

Create a charge from a source

Charge objects returned by the API now have a source property in place of the card property. This describes the source that you used for the charge, such as a card or Bitcoin receiver.

You can still create a charge using the card parameter, but it is now superseded by the source parameter.

Manage customer sources

Customer objects now have the sources and default_source properties in place of cards and default_card. Similarly, we’ve introduced the new request parameters source and default_source, although the old parameters will be supported indefinitely.

There is a new API endpoint for managing the customer’s payment sources beyond just cards: /v1/customers/{CUSTOMER_ID}/sources.

Here is an example of creating a charge in the new API version:

curl https://api.stripe.com/v1/charges \
   -u sk_test_BQokikJOvBiI2HlWgH4olfQ2: \
   -d source=aliacc_4hzzUhIjJ9sZZv \
   -d amount=1000 \
   -d currency=usd

{
  "id": "ch_15aZNk2eZvKYlo2CW82HVvQf",
  "object": "charge",
  "amount": 1000,
  "currency": "usd",
  "source": {
    "id": "aliacc_4hzzUhIjJ9sZZv",
    "object": "alipay_account",
    "created": 1424987420,
    "username": "",
    "reusable": true,
    ...
  },
  ...
}

Switching to the source property means that you don’t need additional code changes to accept the new payment types that we add over time. We’ve made this switch as easy as possible: old API versions include both the old and new properties on all responses. For instance, if you charge a card on an old API version then the charge object will have identical card and source properties. If you charge an Alipay account on an old API version then the card property will be null and the source property will be the Alipay account.

If you have questions or feedback about this change, then we’d love to hear from you!


Smarter saved cards

Michelle Bu on January 21, 2015

Outdated card details are a big problem for online businesses. If your customers get a new card from their bank (or the number or expiry date changes), they have to manually re-add it to every service, or the service stops working. It’s frustrating for customers, and it needlessly costs businesses paying customers.

We’ve rolled out support for handling new cards nicely. Now, when you save a customer with Stripe, their card will continue to work even if the physical card gets replaced by the bank. Stripe works directly with card networks so that your customers can continue using your service without interruption.

There’s no extra work required, and this feature works with most MasterCard, Discover, and Visa cards—without this improvement, over half of the cards stored with Stripe in the last year would stop working by 2016 if they weren’t updated.

{
  "id": "evt_5WmzN8V26JZQ1B",
  "type": "customer.source.updated",
  "object": "event",
  "data": {
    "object": {
      "id": "card_0ggBPvHF5HODr5",
      "object": "card",
      "last4": "3110",
      "exp_month": 11,
      "exp_year": 2017,
      "customer": "cus_8h42pwFc41m2",
      ...
    },
    "previous_attributes": {
      "exp_year": 2014
    }
  }
}

The customer.source.updated webhook will fire if your customers’ info changes.

The saved card keeps working only as long as the credit or debit card account stays open. Your customers won’t have to worry about being billed after they’ve canceled their subscription or after they’ve closed a credit card account.

We hope this makes life easier for you and your users alike. If you have any questions or feedback, get in touch!


Machine learning for fraud detection

Michael Manapat on January 14, 2015

Using data from across the Stripe network, we’ve developed a machine learning system that evaluates charges in real-time and blocks those that are almost certainly fraudulent. By analyzing hundreds of different characteristics pertaining to each payment, these algorithms have already shielded businesses on Stripe from millions of attempted fraudulent charges.

Starting today, you can help improve these models. By letting us know when we’ve missed a charge that you believe to be fraudulent, or declined one that you think is legitimate, you can help train a model that’s optimized for your business. This ensures that the protection you receive will get better over time.

In practice, it’s very simple. In the dashboard, you can now:

  • Report and refund charges you believe to be fraudulent—just follow the “Report fraudulent payment” link when viewing a charge.
  • Mark as safe any charges that Stripe blocked as suspected fraud. Charges we believe to be fraudulent will appear with a status of “Blocked” in the payments list and have a prominent message on the payment detail page. If you mark one of these charges as safe, you can retry it—we will not attempt to block it again.

There’s also equivalent functionality available in the API—check out the documentation for more details.

Over time, businesses on Stripe should not have to think about fraud and disputes. By enabling this feedback loop—between you and other businesses built on Stripe on the one hand, and our fraud detection infrastructure on the other—we’re confident that we’ll be able to deliver significant improvements in the future.

If you have any feedback or suggestions on these tools, or other thoughts about how we could better help you block fraud, please drop me a line.
