Software Engineer, Reliability Infrastructure
Build a reliability platform powering economic growth
Stripe’s infrastructure powers businesses all over the world. Our customers trust us with their businesses and livelihoods, and every request that Stripe handles is critical. We process billions of dollars every year for millions of users, from the largest enterprises to startups making their first sale. That is why both world-class reliability and seamless infrastructure scale are table stakes for supporting massive economic transactions for our customers.
The Reliability team at Stripe builds the core reliability infrastructure used by services across Stripe, and defines and drives the reliability best practices necessary to achieve world-class availability and latency. Our team owns reliability building blocks and frameworks ranging from rate limiters, circuit breakers, retry logic and policies, and orchestrated safe change management to fault injection and load validation. We work with teams across Stripe to make their services more resilient to failures by applying common patterns and practices, and to scale those services to keep up with ever-increasing demand.
We’re looking for an experienced distributed systems engineer with outstanding technical and leadership skills, strong collaboration skills, and a deep passion for customers to help deliver the foundation of our reliability infrastructure and work with teams across the entire stack to deliver world-class reliability solutions. In this role you’ll not only be in charge of designing, implementing, and testing your reliability infrastructure components, but you’ll also have the opportunity to play an influential role in enabling engineering teams to make their services more reliable by identifying, creating, and deploying engineering practices, processes, and solutions.
- Design, build, test, and operationalize end-to-end reliability infrastructure and solutions that will be integrated into various services.
- Debug production issues across services and several levels of the stack.
- Participate actively in and contribute to design discussions and code reviews within your team.
- Independently own and drive one of our reliability work streams, including all planning and execution as well as managing partnerships with other teams.
- Deliver value through a strong collaborative approach with multiple customers and stakeholders across Stripe.
We’re looking for someone who has:
- 5+ years of professional hands-on software development experience, with the ability to write highly optimized algorithms and in-depth experience with commonly used data structures and algorithms
- Hands-on experience designing and building large-scale distributed systems
- Strong collaboration skills, with the ability to work cross-team and cross-organization to deliver integrated reliability solutions
- Customer obsession, with the ability to articulate and represent the customer experience in various forums to drive the right outcome
- The ability to thrive with a high level of autonomy and responsibility, and an entrepreneurial mindset
At Stripe, we're looking for people with passion, grit, and integrity. You're encouraged to apply even if your experience doesn't precisely match the job description. Your skills and passion will stand out—and set you apart—especially if your career has taken some extraordinary twists and turns. At Stripe, we welcome diverse perspectives and people who think rigorously and aren't afraid to challenge assumptions. Join us.