Seattle. Remote in North America only

Full time

Merchant intelligence

Data Infrastructure Engineer, Feature Computation

Stripe’s mission is to increase the GDP of the internet. To do this, we need to fight fraud at scale and build great software products, which means assembling strong machine learning teams and equipping them with the technologies they need to be effective. Our mission on Feature Computation is to make these teams more impactful by providing reliable and flexible data infrastructure, tooling, and technical guidance.

The Feature Computation team does this by designing and engineering the underlying infrastructure that powers feature generation for Stripe’s key machine learning systems. Our flagship product, Semblance, provides an expressive and powerful interface for feature definition, solving classic feature engineering problems like time travel and online/offline discrepancies. We work closely with ML engineers, data scientists, and platform infrastructure teams to build powerful, flexible, and user-friendly systems that substantially increase ML velocity across the company.

You will work on:

  • Building powerful, flexible, and user-friendly infrastructure that powers all of ML at Stripe
  • Designing and building a low-latency, high-throughput data pipeline for our ML models, and distributing that pipeline and infrastructure across multiple regions
  • Creating services and libraries that enable ML engineers at Stripe to seamlessly transition from experimentation to production across Stripe’s data systems
  • Pairing with product teams and ML modeling engineers to develop easy-to-use data infrastructure for production ML models
  • Creating a centralized feature store that improves machine learning models across Stripe

We are looking for:

  • A strong engineering background and experience with data infrastructure and/or distributed systems. You’ll be writing production Scala code.
  • Experience optimizing the end-to-end performance of distributed systems.
  • Experience developing and maintaining distributed systems built with open source tools.
  • Experience writing and debugging ETL jobs using a distributed data framework (such as Spark, Kafka, or Flink).

Nice-to-haves:

  • Experience with Scala and Python
  • Experience with Spark or an equivalent framework
  • Experience with lambda architecture
  • Experience with model training and inference in production and at scale

It’s not expected that you’ll have deep expertise in every dimension above, but you should be interested in learning any of the areas that are less familiar.

At Stripe, we're looking for people with passion, grit, and integrity. You're encouraged to apply even if your experience doesn't precisely match the job description. Your skills and passion will stand out—and set you apart—especially if your career has taken some extraordinary twists and turns. At Stripe, we welcome diverse perspectives and people who think rigorously and aren't afraid to challenge assumptions. Join us.
