Engineering

Railyard: how we rapidly train machine learning models with Kubernetes

May 7, 2019

Rob Story Risk Intelligence

Stripe uses machine learning to respond to our users’ complex, real-world problems. Machine learning powers Radar to block fraud, and Billing to retry failed charges on the network. Our machine learning infrastructure scores hundreds of millions of predictions across many machine learning models. Over time, the volume, quality of data, and number of signals have grown enormously. Here we discuss Railyard and our lessons on building and operating machine learning infrastructure.

Engineering

Stripe’s fifth engineering hub is Remote

May 2, 2019

David Singleton CTO

Stripe has engineering hubs in San Francisco, Seattle, Dublin, and Singapore. We are establishing a fifth hub that is less traditional but no less important: Remote. We are doing this to situate product development closer to our customers, improve our ability to tap the 99.74% of talented engineers living outside the metro areas of our first four hubs, and further our mission of increasing the GDP of the internet.

Engineering

On building a new engineering hub in Dublin

March 21, 2019

Madison White Banking Integrations

Stripe builds economic infrastructure, and we’re designing for a global audience and market. In doing so, we carefully consider our technology and tools, organizational structure, and employee representation. Successful global organizations establish this mindset for different reasons. For some, it’s foundational: their mission, product, and addressable market cross time zones. Others develop an international customer base, hire remote employees, or begin to open offices abroad to extend their physical presence.

Engineering

Effectively using AWS Reserved Instances

June 26, 2018

Ryan Lopopolo Merchant Experience

Reserved instances are hard to purchase effectively. It’s easy to allocate the wrong number, and hard to predict future compute requirements over time. Deciding which and how many reserved instances to buy is a non-trivial exercise at the nexus of cloud strategy, bin packing, and capacity planning. Here's how we use AWS Reserved Instances to dynamically scale our fleet of servers and predictably forecast our cloud spend.

Engineering

Learning to operate Kubernetes reliably

December 20, 2017

Julia Evans Engineering

We built a distributed cron job scheduling system on top of Kubernetes, an exciting new platform for container orchestration. In this post, we’ll explain why we chose to build on top of Kubernetes, how we integrated Kubernetes into our existing infrastructure, our approach to building confidence in (and improving) our Kubernetes cluster’s reliability, and the abstractions we’ve built on top of Kubernetes.

Engineering

Supporting Hypothesis

September 1, 2017

Sam Ritchie Engineering

At Stripe, we regularly contribute to open-source projects and rely on open-source software for developing many different parts of our stack. Stripe supported the development of Hypothesis, an open-source testing library for Python created by David MacIver. Hypothesis provides effective tooling for testing code for machine learning, a domain in which testing and correctness are notoriously difficult.

Engineering

Official support for .NET

August 8, 2017

Andrew Nelder Technical Solutions Engineering

Today, we’re excited to add .NET to our officially supported languages (alongside Ruby, Python, PHP, Java, Node, and Go). Going forward, we’ll regularly be updating the Stripe .NET library to support our latest products and features.

Engineering

APIs as infrastructure: future-proofing Stripe with versioning

August 5, 2017

Brandur Leach API Experience

When it comes to APIs, change isn’t popular. While software developers are used to iterating quickly and often, API developers lose that flexibility as soon as even one user starts consuming their interface. Many of us are familiar with how the Unix operating system evolved. In 1994, The Unix-Haters Handbook was published containing a long list of missives about the software: everything from overly cryptic command names that were optimized for Teletype machines, to irreversible file deletion, to unintuitive programs with far too many options. Over twenty years later, an overwhelming majority of these complaints are still valid even across the dozens of modern derivatives. Unix had become so widely used that changing its behavior would have challenging implications. For better or worse, it established a contract with its users that defined how Unix interfaces behave.

Engineering

Connect: behind the front-end experience

June 19, 2017

Benjamin De Cock Design

We recently released a new and improved version of Connect, our suite of tools designed for platforms and marketplaces. Stripe’s design team works hard to create unique landing pages that tell a story for our major products. For this release, we designed Connect’s landing page to reflect its intricate, cutting-edge capabilities while keeping things light and simple on the surface.

In this blog post, we’ll describe how we used several next-generation web technologies to bring Connect to life, and walk through some of the finer technical details (and excitement!) on our front-end journey.

Engineering

Scaling your API with rate limiters

March 30, 2017

Paul Tarjan Engineering

Availability and reliability are paramount for all web applications and APIs. If you’re providing an API, chances are you’ve already experienced sudden increases in traffic that affect the quality of your service, potentially even leading to a service outage for all your users.

The first few times this happens, it’s reasonable to just add more capacity to your infrastructure to accommodate user growth. However, when you’re running a production API, not only do you have to make it robust with techniques like idempotency, you also need to build for scale and ensure that one bad actor can’t accidentally or deliberately affect its availability.
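As a rough illustration of the idea, here is a minimal sketch of a token bucket, one commonly used rate-limiting technique. The teaser above does not say which algorithm the post uses, so the choice of technique, the class name `TokenBucket`, and its parameters are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Hypothetical per-user rate limiter (illustrative names, not Stripe's API).

    The bucket holds up to `capacity` tokens and refills at `refill_rate`
    tokens per second. Each request spends tokens; when the bucket is empty,
    the request is rejected.
    """

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1.0):
        # Refill according to the time elapsed since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)

        # Spend tokens if enough are available; otherwise reject the request.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Keeping one bucket per user or API key is one way to stop a single bad actor from exhausting capacity that other users depend on.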

Engineering

Designing robust and predictable APIs with idempotency

February 22, 2017

Brandur Leach API Experience

The networks connecting our servers are, on average, more reliable than consumer-level last miles like cellular or home ISPs, but given enough information moving across the wire, they’re still going to fail in exotic ways. Outages, routing problems, and other intermittent failures may be statistically unusual on the whole, but they are still bound to be happening all the time at some ambient background rate.

To overcome this sort of inherently unreliable environment, it’s important to design APIs and clients that will be robust in the event of failures, and will predictably bring a complex integration to a consistent state despite them. Let’s take a look at a few ways to do that.
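One such approach can be sketched with idempotency keys: the client attaches a unique key to a request and reuses it on every retry, and the server returns the stored result for a key it has already seen instead of repeating the side effect. The sketch below is a minimal in-memory illustration; `ChargeServer`, `make_lossy_send`, and `charge_with_retries` are hypothetical names, not Stripe's API.

```python
import uuid


class ChargeServer:
    """Hypothetical server that remembers the result for each idempotency key."""

    def __init__(self):
        self._results = {}
        self.charges_created = 0

    def create_charge(self, idempotency_key, amount):
        # A repeated key returns the stored result; no second charge is created.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_created += 1
        result = {"id": f"charge_{self.charges_created}", "amount": amount}
        self._results[idempotency_key] = result
        return result


def make_lossy_send(lost_responses):
    """Build a transport that reaches the server but loses the first N responses."""
    remaining = [lost_responses]

    def send(server, key, amount):
        result = server.create_charge(key, amount)
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ConnectionError("response lost in transit")
        return result

    return send


def charge_with_retries(server, amount, send, max_attempts=3):
    # Generate the key once and reuse it, so every retry is safe to repeat.
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(server, key, amount)
        except ConnectionError as error:
            last_error = error
    raise last_error
```

Even when the first response is lost after the server has done the work, the retry with the same key leaves exactly one charge on the server.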

Engineering

Online migrations at scale

February 2, 2017

Jacqueline Xu Atlas

Engineering teams face a common challenge when building software: they eventually need to redesign the data models they use to support clean abstractions and more complex features. In production environments, this might mean migrating millions of active objects and refactoring thousands of lines of code.

Stripe users expect availability and consistency from our API. This means that when we do migrations, we need to be extra careful: objects stored in our systems need to have accurate values, and Stripe’s services need to remain available at all times.

In this post, we’ll explain how we safely did one large migration of our hundreds of millions of Subscriptions objects.

Engineering

Android SDK updates

December 5, 2016

Michael McDuffee Android

We just launched version 1.1.0 of our Android SDK, which fixes a few bugs and adds some new features: threading control, brand instead of type for cards, lookup of a card’s funding source, and Javadoc.

Engineering

Reproducible research: Stripe’s approach to data science

November 22, 2016

Dan Frank Machine Learning

When people talk about their data infrastructure, they tend to focus on the technologies. However, we’ve found that just as important as the technologies themselves are the principles that guide their use. Here's our experience with one such principle that we’ve found useful: reproducibility.

Engineering

Service discovery at Stripe

October 31, 2016

Julia Evans Engineering

With so many new technologies coming out every year (like Kubernetes or Habitat), it’s easy to become so entangled in our excitement about the future that we forget to pay homage to the tools that have been quietly supporting our production environments. One such tool we've been using at Stripe for several years now is Consul. Consul helps discover services (that is, it helps us navigate the thousands of servers we run with various services running on them and tells us which ones are up and available for use). This effective and practical architectural choice wasn't flashy or entirely novel, but has served us dutifully in our continued mission to provide reliable service to our users around the world.

Engineering

A primer on machine learning for fraud detection

October 27, 2016

Michael Manapat Engineering

Stripe Radar is a collection of tools to help businesses detect and prevent fraud. At Radar’s core is a machine learning engine that scans every card payment across Stripe’s 100,000+ businesses, aggregates information from those payments into behavioral signals that are predictive of fraud, and blocks payments that have a high probability of being fraudulent. Here's how we use machine learning to detect and prevent fraud.

Engineering

Introducing Veneur: high performance and global aggregation for Datadog

October 18, 2016

Cory Watson Reliability

When a company writes about their observability stack, they often focus on sweet visualizations, advanced anomaly detection or innovative data stores. Those are well and good, but today we’d like to talk about the tip of the spear when it comes to observing your systems: metrics pipelines! Metrics pipelines are how we get metrics from where they happen—our hosts and services—to storage quickly and efficiently so they can be queried, all without interrupting the host service.

Engineering

Open-Source Retreat meetup 2016

March 17, 2016

Krithika Muthukumar Product Marketing

This January, we invited three developers to come work on open-source projects full-time at Stripe. We specifically chose projects for this Open-Source Retreat that we felt would have deep impact in a variety of different areas. Over the past few months, our grantees have made significant progress on their projects.

Engineering

Open-Source Retreat 2016 grantees

December 8, 2015

Michelle Bu Payments

Like many developers, we often contribute to open-source software in bits and pieces over long periods of time. So we started the Open-Source Retreat to help open-source developers make concentrated progress on features and releases with the potential for significant impact. For 2016’s Retreat, we’re inviting three developers to work on their projects from Stripe’s office in SF.

Engineering

Open-Source Retreat 2016

September 3, 2015

Kyle Conroy Engineering

We increasingly rely on (and contribute back to!) a lot of open-source software to build Stripe, and we’d like to give back and get more people working on open-source.

Last year, we invited four developers to the Stripe office as part of our first Open-Source Retreat. Our grantees made significant progress on their projects in a relatively short time. Starting January, we’re hosting another Open-Source Retreat at Stripe.
