Growing up in Poland in the early 2000s, Mati Staniszewski and Piotr Dabkowski couldn’t stand the poorly dubbed movies they had to sit through. Budgets for dubbing foreign films into Polish were limited, which meant that most movies had a single speaker reciting every character’s lines with little depth, nuance, or emotion.
Those childhood experiences stuck with the friends as they made their careers in the tech industry and ultimately sparked an idea: what if they could solve the terrible dubbing problem—and many others—by creating realistic, emotional, and contextually aware synthesized voices?
That idea led to the 2022 launch of ElevenLabs, which harnessed advances in AI and deep learning to become the first company to build artificial voices capable of creating human-like speech (and even laughter). That breakthrough put the startup at the leading edge of AI audio technology. Today, ElevenLabs offers a growing product suite that includes a Text to Speech engine for audio and video content, voice design tools for character development, a translation and dubbing studio, and a toolkit for conversational AI chatbots that can interact with customers.
Monetizing those products as the company scaled required an equally advanced payment partner. ElevenLabs, based in London and New York, turned to Stripe in 2023 to launch flat-rate subscriptions for its audio AI tools. Since then, ElevenLabs has depended on the depth of Stripe’s products to expand into enterprise-level services and support its fast-evolving business model. For instance, the company has relied on Stripe as it builds out major initiatives like its marketplace, where voice actors can license their voices for commercial use.
“We started about two and a half years ago, and we are now a unicorn,” said Luke Harries, ElevenLabs’ head of growth. “We have hundreds of thousands of self-service subscribers and enterprises like Perplexity, Time magazine, and Bertelsmann using our platform. All these payments have been handled by our first engineer setting up Stripe.”
Supporting subscriptions, payouts, and agentic workflows with one billing engineer
ElevenLabs started with 11 human-like AI voices. Unlike previous robotic AI voices, ElevenLabs technology replicates the nuances of age, accent, gender, intonation, and other factors that make each human voice unique. That realism, combined with the platform’s ability to gauge emotion from textual clues, made ElevenLabs’ Text to Speech engine a hit among creators looking to voice video scripts, podcasts, news reports, audiobooks, and almost any other type of audio or video content.
ElevenLabs chose Stripe Billing to get started easily, iterate rapidly, and seamlessly scale its subscription service for text-to-speech tools aimed at content creators and publishers. The ease of working with the Stripe API and SDK made the team confident they could quickly build multiple pricing tiers with virtually no engineering time dedicated to the task. Billing’s flexibility also meant the company could scale its subscription offerings to accommodate larger customers as it rolled out enterprise-scale products such as a full-fledged audio production studio and dubbing services.
With Stripe’s global reach, ElevenLabs was able to instantly accept subscribers from all over the world, and the company used Stripe’s Optimized Checkout Suite to design a simple, effective subscription sign-up page for the global audience. For example, the company embedded the prebuilt Checkout form on its page, which made it easy to offer digital wallets and local payment methods such as Apple Pay, Google Pay, and Revolut Pay with no additional coding required. ElevenLabs also added Stripe’s accelerated checkout solution, Link, to enable customers to autofill their saved payment information anywhere across the Link network. Optimized Checkout Suite users benefit from an uplift in conversion rates, and Link’s easy, faster checkout experience now accounts for 20% of ElevenLabs’ payments.
As an AI company, ElevenLabs saw the potential for Stripe’s AI to make a significant impact on the subscriber journey. Instead of relying on rigid rules, the AI models built into the Optimized Checkout Suite dynamically determine which payment methods to display in what order for every checkout, helping ElevenLabs provide a more personalized user experience.
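As a rough illustration of the subscription sign-up flow described above, the sketch below builds the parameters for a subscription-mode Checkout Session. It is an assumption-laden example, not ElevenLabs’ actual integration: the helper name, price ID, and URLs are hypothetical, and it uses a hosted session with redirect URLs rather than the embedded form the company chose. The parameter names (mode, line_items, success_url, cancel_url) are those accepted by Stripe’s Checkout Session creation call.

```python
from typing import Any


def build_subscription_checkout_params(
    price_id: str, success_url: str, cancel_url: str
) -> dict[str, Any]:
    """Build keyword arguments for a subscription-mode Checkout Session.

    Hypothetical helper for illustration only.
    """
    if not price_id:
        raise ValueError("price_id is required")
    return {
        # "subscription" tells Checkout to start a recurring plan.
        "mode": "subscription",
        # One line item pointing at a recurring price defined in Stripe.
        "line_items": [{"price": price_id, "quantity": 1}],
        # Where the customer lands after paying or abandoning checkout.
        "success_url": success_url,
        "cancel_url": cancel_url,
        # payment_method_types is deliberately left out so that Stripe
        # decides which payment methods to show for each customer.
    }


params = build_subscription_checkout_params(
    price_id="price_creator_monthly",  # hypothetical price ID
    success_url="https://example.com/welcome",  # placeholder URL
    cancel_url="https://example.com/pricing",  # placeholder URL
)
# With the stripe package installed and an API key configured, these would
# be passed along as: stripe.checkout.Session.create(**params)
```

Leaving out an explicit list of payment methods is what lets the selection vary per checkout, which matches the dynamic behavior the article describes.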
Stripe products also enabled ElevenLabs to handle a range of billing and payment tasks efficiently, such as managing payouts and simplifying onboarding. In fact, ElevenLabs completed its Stripe integrations and manages its various billing and payment workflows with just one engineer. “If we had to do all the subscription infrastructure in-house to handle all our different geographic regions, I’d expect we’d need a full engineering team dedicated purely to payments,” said Harries.
When ElevenLabs developed sophisticated voice cloning technology, the company saw an opportunity to support the professional voice actor community and add another layer to its business model. Using Stripe Connect, ElevenLabs created a marketplace where actors can clone their voices for commercial projects, set terms, and receive payouts any time an ElevenLabs user selects their voice for a project. Connect offered ready-to-use capabilities to handle voice actor onboarding, including supporting international payouts and managing regulatory requirements such as Know Your Customer (KYC) checks. Compliance with KYC rules alone can pose a considerable hurdle for platforms throughout the onboarding process. Stripe’s features again saved development time and resources that ElevenLabs could dedicate to its core audio AI projects.
ElevenLabs saw many companies using its Text to Speech and Speech to Text models to build AI agents. Those companies often took months to get into production, and each was rebuilding the same underlying stack. So ElevenLabs launched its own platform for building conversational AI voice agents, so that customers could get to production faster and focus on the agent’s business logic rather than infrastructure. With the Stripe agent toolkit, ElevenLabs’ agent platform could enable agents to complete customer service or sales workflows. For example, a business’s AI agent could reach into its Stripe account to issue a refund or complete a transaction by sending out a checkout link. “The biggest shift in conversational AI agents is going to be from just pure question answering, to now using their own autonomy to execute certain actions,” said Harries.
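The idea of an agent executing actions, rather than only answering questions, can be sketched as a small allowlist of callable tools. Everything here is hypothetical and for illustration only: FakePaymentsBackend, make_tools, and run_tool are invented names, the backend is an in-memory stand-in, and none of this reflects the actual interface of the Stripe agent toolkit or the ElevenLabs platform.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakePaymentsBackend:
    """Hypothetical in-memory stand-in for a business's payments account."""

    refunds: list[str] = field(default_factory=list)

    def issue_refund(self, payment_id: str) -> str:
        # Record the refund instead of calling a real payments API.
        self.refunds.append(payment_id)
        return f"refund issued for {payment_id}"

    def create_checkout_link(self, product: str) -> str:
        # Return a placeholder link; a real system would create one remotely.
        return f"https://example.com/pay/{product}"


def make_tools(backend: FakePaymentsBackend) -> dict[str, Callable[[str], str]]:
    """Expose only the actions the agent is allowed to take."""
    return {
        "issue_refund": backend.issue_refund,
        "send_checkout_link": backend.create_checkout_link,
    }


def run_tool(tools: dict[str, Callable[[str], str]], name: str, argument: str) -> str:
    """Run a tool the agent asked for, refusing anything off the allowlist."""
    if name not in tools:
        return f"unknown action: {name}"
    return tools[name](argument)


backend = FakePaymentsBackend()
tools = make_tools(backend)
refund_message = run_tool(tools, "issue_refund", "pay_123")
link_message = run_tool(tools, "send_checkout_link", "voice-pack")
blocked_message = run_tool(tools, "delete_account", "acct_1")
```

Keeping the set of tools explicit means the agent’s autonomy is bounded by what the business chooses to expose, such as refunds and checkout links in this sketch.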
A partner to keep ElevenLabs ahead of the competition
Building on those first 11 voices, ElevenLabs now has more than 5,000 voices available on its platform—driven in part by its advanced marketplace. The platform has paid out more than $4 million to voice actors, with some top earners making more than $10,000 a month.
Already, users have made more than 550,000 AI agents on the platform, which is just the start, considering the number of use cases enabled by truly conversational bots and agentic workflows. ElevenLabs also continues to add more languages to its Text to Speech and dubbing capabilities, which now support 33 languages ranging from English, French, and Spanish to newer additions such as Croatian and Tamil.
Harries likens the competition in the AI audio space to Formula 1, where every company is looking for the next technology iteration or breakthrough product to power it to the front of the field. As a result, he doesn’t expect the pace of innovation to slow anytime soon for ElevenLabs. And he sees Stripe as a key partner for continuing that innovation.
“I’m excited to keep scaling up much more volume of payments through Stripe, [making] many more millions of payments to voice actors on our platform, and expanding into far more countries and payment options,” said Harries.