Machine payments and the protocols behind agentic commerce
Advancing developer craft
Agents are already economic actors. They consume tokens, call paid APIs, and transact on behalf of customers. A new type of agentic infrastructure is emerging to support them, from protocols such as MCP for tool discovery and UCP for checkout to MPP and x402 for machine payments. This session breaks down how these protocols fit together and goes deep on the mechanics of machine payments.
Speakers
Guillaume Poncin, CTO, Alchemy
Manik Surtani, Head of Open Source, Block
Steve Kaliski, Principal Software Engineer, Stripe
STEVE KALISKI: Awesome. We’re going to get going. I thought this was going to be a talk just for robots, but I’m very pleased to see some humans here. So we’ll have to adapt the presentation a little bit. So I’m Steve Kaliski. I’m a principal software engineer here at Stripe. My team and I focus on all the protocols and infrastructure that power some of the stuff we presented yesterday, like Stripe Projects, the Link CLI, and the Agentic Commerce Suite. So we’ve officially moved past the era of AI that just chats, summarizes, or codes. We’re entering the era of AI that works. So today agents are becoming economic actors, autonomous entities capable of building code, navigating real purchases, and handling your customer support queue. And for the past year, we’ve been heads down building all the infrastructure that’s going to allow agents to safely move money.
So we’re going to explore some of the unique hurdles of agentic commerce, why the current purchasing flows that are meant for humans aren’t enough, and some of the technical standards that are emerging in the industry. So in the developer keynote, which some of you may have gone to yesterday, we touched on how agents are nondeterministic by design. They’re great at sort of figuring out what to do and completing the job, but they’re not always completing the job in the same way. But that’s sort of the feature of it. So in commerce specifically, this is a superpower. You don’t know what you want or you don’t know what you need to buy or spend on in order to fulfill some outcome. You want an agent to be creative when hunting for a unique gift or picking the right software or figuring out the data that you need.
But the moment you move from browsing to buying, the rules change. Once you pick the thing you want to buy, you want the nondeterminism to end. So in commerce, creativity can be a liability. You can’t have an agent hallucinate a credit card number or shipping address or spend funds that it isn’t authorized to spend. So we want to sort of take all those constraints around making sure that we have the right seller and the right amount and the right credential and get authorization from the human in some way. So we’re going to take a quick step back and sort of journey through a checkout as it is built for humans. What would it look like if we let a nondeterministic agent loose on a checkout page? Today we force agents into these pages meant for humans. They’re scraping HTML, mimicking clicks, filling out forms, right?
And there’s this sort of guessing game that emerges. So for example, how does an agent certify that it’s in the right place or domain? How does it make sure that it’s picking the right product? How does it make sure that it’s getting the right price information out of the DOM? How do we make sure that it picks up on the tax and the other kinds of fees associated with it? And then when it comes time to actually fill the form, how’s it not going to get blocked by CAPTCHA? And then of course, we want to pay the right amount and put in credentials correctly. So to move past this, we need a sort of dedicated setup or infrastructure built for agents where we can have a structured negotiation that is built on API calls and stable data models. So instead of loading a webpage and crawling around it, we should just make a call to create a checkout.
Instead of guessing where all the data is on the page, we should just look at a JSON object with line items and tax and shipping and so on. And as things update, we make calls to update the cart, get new data back, and then ultimately pay with secure credentials.
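As a rough illustration of the structured object described above, the sketch below models a checkout with line items, tax, and shipping, and computes the total from those fields instead of scraping it from a page. The type names, field names, and amounts are assumptions made for this example; they are not taken from the UCP schema.

```typescript
// Hypothetical shapes for illustration only; not the UCP schema.
interface LineItem {
  sku: string;
  title: string;
  quantity: number;
  unitAmount: number; // minor units (cents)
}

interface Checkout {
  id: string;
  currency: string;
  lineItems: LineItem[];
  taxAmount: number; // minor units
  shippingAmount: number; // minor units
}

// The total is derived from structured fields, not parsed out of a DOM.
function checkoutTotal(checkout: Checkout): number {
  const subtotal = checkout.lineItems.reduce(
    (sum, item) => sum + item.unitAmount * item.quantity,
    0
  );
  return subtotal + checkout.taxAmount + checkout.shippingAmount;
}

const exampleCheckout: Checkout = {
  id: "checkout_123", // placeholder identifier
  currency: "usd",
  lineItems: [
    { sku: "book-a", title: "Book A", quantity: 1, unitAmount: 2000 },
    { sku: "book-b", title: "Book B", quantity: 2, unitAmount: 1500 },
  ],
  taxAmount: 400,
  shippingAmount: 500,
};

console.log(checkoutTotal(exampleCheckout)); // 5900
```

Because every amount is an explicit field, an agent has no need to guess which number on a page is the tax or the final price.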
So that foundation is exactly what we’re working on together as an industry by establishing technical standards. And we’ve designed a series of protocols and tools along the way to give agents their preferred method of interaction. And by collaborating with industry partners, we can sort of take these norms and deploy them in the real world, and hopefully we can all start buying things with agents. And while discovery is a huge part of the puzzle, we’re going to focus on two main things today, checkout and machine payments.
So first we want to jump into checkout. And the Universal Commerce Protocol, UCP, is how we define APIs around the checkout lifecycle. Imagine all the things that happen between loading a webpage, picking out the things you want, the shipping options, and then ultimately clicking buy. So instead of a webpage, again, we have this back and forth between an agent and the seller and the seller’s PSP, where we initiate a checkout with an API call and we generate the cart. All that lives in the seller’s backend with their existing commerce stack, and we sort of go back and forth, maybe rendering a UI or a chat response as the agent and the human work together to navigate a checkout. And ultimately when it comes time to pay, we can generate a Shared Payment Token, a secure credential that we can relay to the seller and their PSP to actually execute the payment. So we’re going to go ahead and just jump into a demo to look at this quickly so we can see this in action.
So up here we have Stripe Press and this site, it’s really pretty. It’s built for humans where I can sort of scroll through and navigate the books that are available. But ultimately what a robot wants is just sort of a JSON list of this, right? So to illustrate this, I have api.stripe.press running on my computer. The robot-viewable version of Stripe Press is obviously much worse looking, but it gets to the point. And for any of these books, I can click in or the agent can navigate and it can see that instead of a form, I can actually just make an API call to buy the book. So I have this demo here. On the left-hand side, we have the human readable view and on the right-hand side, the machine readable view. And what we can see is that when that checkout gets initiated, we’re making a call over UCP where the agent is relaying some information about the buyer, their name John Doe, the SKUs that they want, so these two books.
And then in response over UCP, the seller is declaring how they can process payments, the state of the cart, the line items and the amounts and the images and the total breakdown and the tax, so on and so forth. And as the human or as the agent updates shipping options, we’re making subsequent requests, right? So we can see the line items we want, the shipping method we picked, and the state of the world is coming back from the seller who is managing all this data in their existing backend, the same backend that would power the human viewable UI. And when it comes time to pay, we can generate a new secure credential. In this case, the Stripe Shared Payment Token, which we’ll look at closely next, which gets sent over to the seller and now the seller can process payments in their existing payment stack.
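The request and response loop in this demo can be sketched as a pair of seller-side functions: the agent only states what it wants (SKUs, a shipping method), and the seller recomputes and returns the full state each time. This is a minimal sketch under stated assumptions; the function names, status values, and shipping rates are hypothetical and do not come from the UCP specification.

```typescript
// Hypothetical seller-side sketch; names, statuses, and rates are assumptions.
type ShippingMethod = "standard" | "express";

interface CartLine {
  sku: string;
  quantity: number;
  unitAmount: number; // minor units (cents)
}

interface CheckoutState {
  lines: CartLine[];
  shippingMethod: ShippingMethod | null;
  shippingAmount: number;
  total: number;
  status: "needs_shipping" | "ready_for_payment";
}

const SHIPPING_RATES: Record<ShippingMethod, number> = {
  standard: 500,
  express: 1500,
};

// The seller is the source of truth: every call returns a freshly computed state.
function buildState(lines: CartLine[], shippingMethod: ShippingMethod | null): CheckoutState {
  const subtotal = lines.reduce((sum, l) => sum + l.unitAmount * l.quantity, 0);
  const shippingAmount = shippingMethod === null ? 0 : SHIPPING_RATES[shippingMethod];
  return {
    lines,
    shippingMethod,
    shippingAmount,
    total: subtotal + shippingAmount,
    status: shippingMethod === null ? "needs_shipping" : "ready_for_payment",
  };
}

// Agent initiates a checkout with the SKUs it wants.
function createCheckout(lines: CartLine[]): CheckoutState {
  return buildState(lines, null);
}

// Agent picks a shipping method; the seller replies with the updated state.
function updateShipping(state: CheckoutState, method: ShippingMethod): CheckoutState {
  return buildState(state.lines, method);
}

const createdState = createCheckout([
  { sku: "book-a", quantity: 1, unitAmount: 2000 },
  { sku: "book-b", quantity: 1, unitAmount: 1500 },
]);
const updatedState = updateShipping(createdState, "express");

console.log(createdState.status, createdState.total); // needs_shipping 3500
console.log(updatedState.status, updatedState.total); // ready_for_payment 5000
```

Keeping the computation on the seller side mirrors the point made in the talk: the same backend that powers the human-viewable UI stays in control of prices and totals.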
Okay. So with UCP, we have API-driven commerce flows, right? So we can turn that human readable view into a machine readable view. We can support all kinds of payment methods. So we demoed cards here, but we could support other methods like Link and buy now, pay laters like Affirm and Klarna. And then critically, the seller remains in control, right? So it’s still their backend, still their payment stack, nothing changes for them, right? So they are just sort of exposing everything as an API instead. Now, you may have noticed in that demo, there’s this part I moved through quickly where, well, how do we pay? We collected some payment method up front, but how do we send that over? And for that, we designed what we’re calling the Shared Payment Token. And the Shared Payment Token is a mechanism to relay any kind of payment method, whether it’s a card or otherwise, from one Stripe account to another so that they can process payments without being in PCI scope.
It also means that the limits established by the agent and their human buyer can be enforced programmatically at the time of transaction. And in addition to that, we pass over fraud signals that are collected when the token is provisioned over to the sellers so they can have a full view to integrate into their existing fraud stack. So with Stripe, agents can collect payment methods through things like Stripe Elements or other interfaces. In many cases, they may already have a payment method on file, right? Many of us probably pay some monthly fee to a popular LLM application, so they can reuse those payment methods without having to collect a new one. And then when the agent is ready to buy, it mints a Shared Payment Token that has specific usage limits on it. So it’s scoped to an individual seller, so only that seller in the Stripe network can actually process payments with it.
It declares a max amount, so how much the agent and their human are willing to pay, a currency, an expiry time, and some additional risk data. And when it comes time for payment, the seller can receive that token, they can just pipe that right into their existing Stripe PaymentIntent flow and everything downstream of the payment looks the same. Now, in the optimistic case, the seller’s going to charge the amount that was pre-agreed upon, and if it’s within that limit, everything works as expected. But in the rare case that the seller attempts to charge more than we agreed upon, Stripe actually enforces those limits and declines the charge.
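The limits described here (seller scope, max amount, currency, expiry) amount to a small set of checks run at charge time. The sketch below shows one way such checks could look. It is not Stripe’s implementation or API: the types, field names, reason strings, and the 2500-cent limit are assumptions made for illustration.

```typescript
// Hypothetical model of token limits; not Stripe's API or enforcement logic.
interface TokenLimits {
  sellerId: string; // the only seller allowed to use the token
  maxAmount: number; // minor units (cents)
  currency: string;
  expiresAt: Date;
}

interface ChargeAttempt {
  sellerId: string;
  amount: number; // minor units (cents)
  currency: string;
  attemptedAt: Date;
}

type Decision = { approved: true } | { approved: false; reason: string };

// Each limit is checked independently; the first violation declines the charge.
function checkLimits(limits: TokenLimits, attempt: ChargeAttempt): Decision {
  if (attempt.sellerId !== limits.sellerId) {
    return { approved: false, reason: "wrong_seller" };
  }
  if (attempt.currency !== limits.currency) {
    return { approved: false, reason: "wrong_currency" };
  }
  if (attempt.attemptedAt.getTime() > limits.expiresAt.getTime()) {
    return { approved: false, reason: "token_expired" };
  }
  if (attempt.amount > limits.maxAmount) {
    return { approved: false, reason: "amount_exceeds_limit" };
  }
  return { approved: true };
}

const tokenLimits: TokenLimits = {
  sellerId: "seller_test", // placeholder seller identifier
  maxAmount: 2500,
  currency: "usd",
  expiresAt: new Date("2030-01-01T00:00:00Z"),
};

const chargeTime = new Date("2029-12-01T00:00:00Z");
const withinLimit = checkLimits(tokenLimits, {
  sellerId: "seller_test", amount: 2500, currency: "usd", attemptedAt: chargeTime,
});
const overLimit = checkLimits(tokenLimits, {
  sellerId: "seller_test", amount: 5000, currency: "usd", attemptedAt: chargeTime,
});

console.log(withinLimit.approved); // true
console.log(overLimit.approved); // false
```

The point of the design is that the check runs on the payment provider’s side, so a seller that charges more than the agreed amount is declined regardless of what the agent or seller code does.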
So we’re going to jump into a demo now just to see how that works in practice. So here I have sort of not just one Stripe account, but two Stripe accounts. We have the agent’s account and the seller’s account. And what we’re going to illustrate is how an agent can take a payment method, mint a Shared Payment Token, send that over to a seller, and then process a payment. And then we’ll test out what we showed earlier as well, where we’ll attempt to charge less than the limit and more than the limit and see what happens. So in this case, we can imagine the buyer provided a card on the Visa network. We’re going to generate a new Shared Payment Token with that payment method. We’re going to apply a $25 limit, the token’s going to expire in a month, and we’re going to scope it to a particular Stripe merchant, in this case my test merchant.
Now, we’re going to send that token over something like UCP. The seller’s going to receive that token. We’re going to introspect it as well so we can see some of the brand, and last four, and expiry, and so on that the seller can use in their existing fraud and risk analysis. And then we’re going to also create a PaymentIntent as the seller. And the only line of code that changed is that instead of passing in a payment method that we had collected or the seller had collected, we’re going to pass in the granted token that the agent had relayed to it. So we’re going to go ahead and run that now.
So we can see that we issued that token. It’s going to work for up to $25. It expires in a month. The seller received that token and they could see some information about it. So it’s a Visa card expiring in April 2027, and then we were able to charge it as the seller for $25. Now, let’s go ahead and increase that to 50. We’re going to run that again. And we can see that now it failed, right? So even though we made best effort to agree upon an amount and mint that token and send it over, Stripe still enforced the limits to make sure that what the agent and their human user had agreed upon would be enforced. So we can switch over. So just to cover Shared Payment Tokens. It’s a mechanism to relay any payment method, including cards, buy now, pay laters, Link, Apple Pay, and Google Pay from where an agent collects it, so in a chat service or otherwise, to a seller programmatically, and then also make sure that Stripe enforces limits around usage and amount and time and so on.
But all this works really well for replicating a checkout that we’re all familiar with, right? So maybe buying things that we need, ecommerce goods, and so on. But what do we do if an agent wants to buy something for itself, right? I’m unaware of an agent that needs a T-shirt or anything like that, but certainly agents need things like API calls, access to data, invoking MCP servers, and so on. So that’s where the Machine Payments Protocol comes in. Now, MPP, Machine Payments Protocol, is an open protocol that basically explains how a server can declare an HTTP resource as requiring payment. So it’s built on top of the 402 Payment Required status code, and in the same flow that the agent is asking for data or asking for information, we can remit payment. So basically we’ve swapped out the API key that would typically be there with a payment credential.
And just like UCP, MPP also works for fiat and crypto, so we can use Shared Payment Tokens as well. Now we’re going to go into a demo just to see how this works.
Now on the left-hand side, I have a server running. And my business in this example sells weather data, right? So users can make requests with a ZIP code and get back sort of historical data. And typically what a robot might have to do is go on some weather website, crawl around, try to get, parse out weather information, or alternatively if the weather company had an API, it’d have to ask its human user to procure an API key for it. And then we’d have to pass that to the agent and the agent would have to know to use that and I’d have this sort of recurring subscription and so on. But to the point of nondeterminism, I don’t know if I need weather data, right? The agent may decide that it needs it as part of trying to solve my prompt. So we’re going to make a request to it, and I’m going to put in my ZIP code at home in New York and we’ll format it.
So we can see back that we actually didn’t get any weather data, right? We got this indication that I was going to have to send funds over to get it. And I got this challenge ID, which is coming from MPP that basically explains to me the nature of how I can send over payment data. Now we’re going to inspect this a little bit more closely to see exactly what’s being sent over to explain to me how to remit that payment. So we’re going to use a tool that we built called Purl, which is payments plus Curl, and we’re going to make that request one more time.
Okay. Now, sort of dissecting the header that’s in the response, we can see that it supports the MPP protocol. In this case, we’re just running on Tempo testnet, but this would work on mainnet as well. It’s going to cost one penny in USDC to get that data back, and I have more of the challenge ID that I can work with. And then critically, I have a recipient address. So this is the address on the Tempo network where I have to remit funds to get the data back. Now, we’re going to just let this run through. We’re replaying the request now, but this time we’ve included the payment method information in the request. So now the seller can verify that I actually paid them and can return that weather data to me. And then we can see that we also have a transaction that’s settled on the Tempo network.
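The request, challenge, and replay sequence from this demo can be sketched as a single handler: no payment credential means a 402 with a challenge, and a verified credential means the data comes back. This is a simplified model, not the MPP wire format. The header names (`x-sketch-challenge`, `x-sketch-payment`), the challenge fields, the recipient placeholder, and the in-memory settlement lookup are all assumptions for illustration; a real server would verify settlement against the payment network.

```typescript
// Simplified model of a 402-gated resource; header names and fields are
// placeholders, not taken from the MPP specification.
interface SketchRequest {
  zip: string;
  headers: Record<string, string>;
}

interface SketchResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

interface Challenge {
  id: string;
  amount: string; // e.g. "0.01"
  currency: string; // e.g. "USDC"
  recipient: string; // where funds must be sent
}

function handleWeatherRequest(
  req: SketchRequest,
  isSettled: (challengeId: string) => boolean
): SketchResponse {
  const credential = req.headers["x-sketch-payment"];
  if (credential === undefined || !isSettled(credential)) {
    // No valid payment yet: answer 402 and explain how to pay.
    const challenge: Challenge = {
      id: "challenge_1", // fixed for the sketch; would be unique per request
      amount: "0.01",
      currency: "USDC",
      recipient: "RECIPIENT_ADDRESS_PLACEHOLDER",
    };
    return {
      status: 402,
      headers: { "x-sketch-challenge": JSON.stringify(challenge) },
      body: "Payment Required",
    };
  }
  // Payment verified: return the resource (placeholder payload).
  return { status: 200, headers: {}, body: JSON.stringify({ zip: req.zip, data: "placeholder" }) };
}

// Stand-in for on-network settlement: a record of paid challenge IDs.
const settledChallenges: Record<string, boolean> = {};
const lookupSettlement = (id: string): boolean => settledChallenges[id] === true;

// 1. First request carries no payment and receives a challenge.
const firstTry = handleWeatherRequest({ zip: "10001", headers: {} }, lookupSettlement);
const parsedChallenge = JSON.parse(firstTry.headers["x-sketch-challenge"]) as Challenge;

// 2. The client pays the challenge (simulated), then replays the request.
settledChallenges[parsedChallenge.id] = true;
const secondTry = handleWeatherRequest(
  { zip: "10001", headers: { "x-sketch-payment": parsedChallenge.id } },
  lookupSettlement
);

console.log(firstTry.status, secondTry.status); // 402 200
```

The client never needs a pre-provisioned API key in this model; proof of payment for the specific challenge takes its place.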
So just to wrap that up, what we were able to do is take a server that already had an API that was accessible to robots with a sort of stable API key, and swap the API key part with payment over MPP, which allows for direct settlement at time of request. So when I actually need the data, we’re actually paying for it. It’s, again, flexible to payment methods, so we demonstrated that with Tempo here, but it works with Shared Payment Tokens too. And as you may have seen in the product keynote yesterday, we support new kinds of business models as well, like streaming payments.
So just to recap, UCP is great when you need a structured checkout. You already have sort of a regular commerce business and you want to expose that to robots. And in cases where you’re maybe selling an API or your primary audience is already agents, you can use something like MPP or x402 to monetize directly. So we went over the theory and the tech and demonstrated it, but how does it look in the real world? And to do that, I want to invite a couple of experts, Manik Surtani, head of open source at Block, and Guillaume Poncin, CTO at Alchemy, to talk about this more.
MANIK SURTANI: Thank you. Thank you, Steve.
STEVE KALISKI: Awesome. It’s the sitting part of the presentation. I’m with you all now. Okay. So I think what’s interesting about the three of us is we’ve all sort of had chapters working in fiat and we’ve all had chapters working in crypto. And historically for us, in the pre-AI era, our primary customers were humans, and now increasingly so our primary customers and users are agents. So I’m sort of curious to hear from both of you, how has that transition, both in the monetary transition but also in the user transition, sort of played out for you and how are you sort of unlearning previous things or sort of handling that change?
MANIK SURTANI: Yeah, great. Thank you for the intro, by the way, Steve. So I’m Manik. A bit of a change, by the way. I was head of open source at Block. I’ve recently actually resigned from Block. I now have joined the Linux Foundation. We recently launched something called the Agentic AI Foundation. We did that late last year, and I’ve joined the Linux Foundation to be the CTO of the AAIF. So when you talk about machine protocols, commerce protocols for agents, super relevant, we do a lot of stuff in that space, and it’s soon becoming my bread and butter. Awesome. But yes.
GUILLAUME PONCIN: Yeah. And quick introduction. CTO at Alchemy, for those of you who don’t really operate in the blockchain, we provide developer infrastructure for the blockchain developer. We can help you with data and APIs and wallets and so on. I think it’s been very interesting for me. The transition from fiat to crypto is starting to converge with this transition towards AI agents in the sense that the rules of the game are a little bit simpler if you operate in the crypto space, in the blockchain space. And so we get the chance to reinvent commerce for agents at the same time as we have a settlement layer that is much, much faster and instant. And so you can change some of the rules at the same time as you change the actors to get to a much, much better place.
MANIK SURTANI: So that’s actually super interesting, right? One of the big changes that we’re seeing in moving from human payments to machine payments is the fact that machines themselves need things. Like your quip, “I’ve never seen an agent that needs a T-shirt,” right? That’s absolutely true, but agents need other things which we haven’t needed in the past when it was purely a human flow. So that’s a whole new category of things. And that’s where MPP and x402 come in. And that’s where, as you said, low-price, low-latency payment mechanisms for micropayments come in. So there’s a whole new space there. Yeah.
STEVE KALISKI: It seems like every week there’s a new protocol or new thing being announced. And I’m probably to blame for that partially, but I’m kind of curious why you think that’s happening and why do you think it’s useful that these things are emerging?
MANIK SURTANI: All those three letter acronyms, right? I love it. They’re all over the place, aren’t they? Yes. You’re absolutely right. There’s a lot of them, and there’s a reason for that. It’s such a new space still. It’s still nascent, which means everyone’s trying, everyone’s experimenting, everyone’s building. That’s great. That’s exactly what you want to see, that Cambrian explosion of people trying different things to see what sticks. But also what happens at the tail end of that is you start to see consolidation. You start to see protocols starting to look very similar to one another and say, “Do we actually need two protocols for this? No, they’re the same thing. Let’s actually start to converge.” And that’s also something you start to see in the market as that happens. And shameless plug for the Agentic AI Foundation, a strong believer that these protocols should be open as well.
They may start within a company as a proprietary protocol, but the closer and closer they get to becoming a de facto standard, the earlier you should actually open source them, make sure they are public protocols, everyone can contribute to them, help them grow. And I can spend hours talking about why that’s important. And if anyone wants to hit me up later, please do, but yes.
GUILLAUME PONCIN: Yeah. I’ll complement this. I think I’m actually very excited about the emergence of many, many protocols. They all aim to solve the problem in a slightly different way. I think we’re in the innovation phase of the curve here. Actually, for Alchemy, it’s an interesting... We have a particular use case where for $1, you can get thousands of API calls, and that is not well supported in most of these protocols. And so, okay, that’s a niche where, for our use case, maybe there is a protocol that would serve that much, much better. I think everybody will have... There will be thousands of these use cases and corner cases where, okay, you want subscriptions or you want usage-based billing. All of these will need to emerge from this. And in this phase, I think it’s great to see a lot of innovation. To me, it’s actually not really a problem.
And we see this in the blockchain space. Agents are able to deal with this. Humans have trouble with five protocols. For an agent, it’s just normal. It’s not a problem to work with complexity, with many, many blockchains, with many, many protocols. So I actually don’t think we will see a convergence as fast as we’ve seen in other domains in the past.
STEVE KALISKI: One thing I’m interested in hearing from you is, it seems like every other week we’re deciding if agents want to use cash or crypto or different payment methods. And I think similar to that, they’ll use all of them, but I’m curious how you think through the pros and cons of different types of payment methods or different types of currencies in these transactions.
GUILLAUME PONCIN: It’s a good question. I think a lot of the things we sell and a lot of the things that agents will want to buy are digital: you can perform the purchase and get the product immediately with no settlement time. I think in that context, crypto is much better. It doesn’t cover all use cases, but I expect that to become sort of the dominant use case of agentic commerce very, very quickly. Now, there are many other places where you actually want the physical currency because you need to go pay for your groceries at some point. There are other use cases, but I think for what a lot of the people in this room are working on, agent commerce will likely dwarf human transactions within a few years. Total human transactions worldwide will be eclipsed by what agents are able to do. And I think a lot of it will be instant settlement of that type.
And for that, I think crypto is a natural sort of economy.
MANIK SURTANI: Well, you’re specifically talking about machine-to-machine commerce. Correct. Yes. When you say agent commerce, right? Because there’s also agentic commerce that’s human-driven, which is a different kettle of fish.
GUILLAUME PONCIN: Absolutely.
STEVE KALISKI: And I think most of us feel like the curve is starting to happen in terms of agentic commerce becoming real, but what do you think is in the way of it actually exploding in something that we’re all participating in all the time?
MANIK SURTANI: I think that there’s a whole trust piece there that I think is... I’m not going to say it’s missing. I think it’s still nascent. It’s still too young. It’s not fully developed yet. And trust is a funny thing. Even once you’ve developed the technology behind it, it still takes a while for humans to trust the trust protocols. And that’s what’s lagging in my opinion. People are still going to test it out or try it out with relatively not that important payments first and unleash an agent to go and do things, but we’re a while away still from getting an agent to go pay my taxes for me or something like that. I’d love to get there, but we’re a ways off.
GUILLAUME PONCIN: One thing I think I’ve had to learn personally, I don’t know how many of you have played with OpenClaw and how many of you have been gaslighted by your OpenClaw?
MANIK SURTANI: I don’t see any hands. There’s one over there…
GUILLAUME PONCIN: I think one thing I had to learn through playing with a lot of agents is that you actually need a verification moment. You need to close the loop where it’s not like, “Hey, go build a blockchain for me for this particular use case.” It’s, “Go build the blockchain and make sure that it passes these acceptance tests,” MetaMask works, whatever. And that closing the loop to me is a new pattern that is not necessarily how we were doing engineering before. And I think in this commerce protocol, that’s the trust bit. It’s like, does your agent actually do what you asked it to do? Is there a way it can verify that it did what you asked? Can you verify what you asked? Can the merchant verify that? All the loops have to be created. I think we’re still in the first phase of just making it work.
STEVE KALISKI: Yep. I just have one more question. You brought up an interesting point about, agents can manage the complexity here in supporting multiple protocols. And I’m curious for you, as you think about standards in general that historically we’d be writing for the human coder or the human author, having to understand it. How do you think that changes when maybe the predominant user, even implementer of it, is no longer a human?
MANIK SURTANI: That’s right. So in terms of complexity of the protocol, we’re not bounded anymore by the level of complexity that a human can understand or can deal with. So that’s great, right? Suddenly you have a lot more potential on the table, but in terms of standards, it still doesn’t mean that standards shouldn’t exist or they’re bad and you can just have too many of them. The reason why is interoperability, and that’s a really, really important aspect. If you don’t have interoperability on the standards layer, the protocol layer, what you end up with is a lot of clustering, a lot of siloing, and that’s dangerous and that’s bad. We’ve seen what happens when, let’s say, other powerful technologies end up getting siloed, social media, for example, right? Lack of open standards there means you’re locked in, means you don’t have transparency, you can’t move from one to the other easily, and all sorts of bad things happen in the world because of that.
With AI, with agentic commerce, this is like a hundred times more powerful than social media. This cannot be locked up. It’s too dangerous to be locked up. And that’s where, again, open protocols and open standards are important, even if complexity isn’t what you’re solving for.
GUILLAUME PONCIN: Yeah. 100% agree with that. I’m a big proponent of open standard, open everything. And ideally with iteration from the community, it’s not like owned by one or two companies. It becomes a living and breathing thing and we can iterate on it together.
STEVE KALISKI: Awesome. Okay. Well, on that note, I just want to say thank you guys so much for coming. It’s been really useful.