Tracking customer spend across processors
Advancing developer craft
Learn how to build systems that track spending patterns across providers and payment methods. We’ll cover how Payment Account Reference (PAR) works, how to implement it, and how to keep the customer journey intact even as accounts change.
Speakers
Liam O’Neill, Solutions Architect, Stripe
Andrew Robinson, Principal Solutions Architect, Stripe
ANDREW ROBINSON: Hello everybody and welcome. I’m Andrew Robinson, solutions architect at Stripe.
LIAM O’NEILL: And I’m Liam O’Neill, also solutions architect at Stripe.
ANDREW ROBINSON: And we’re here today to talk a little bit about tracking customer spend across multiple different processors. So let’s start out with an example. We’ve got a customer, she’s called Sarah. She’s making some purchases. The first one of these purchases that she makes is in store using a physical card and this is processed on Stripe. Next up, she remembers that she forgot something, maybe it was an HDMI cable for a TV. She now goes back into a different store and she buys that HDMI cable, this time using Apple Pay on her Apple Watch, and this is processed by a different payment processor.
Next, one evening she’s browsing on her phone and in the app for your company, there’s a membership subscription. Judging by how much stuff she’s bought from you, she thinks that it’s great value for money. She decides to sign up for that, this time using Apple Pay on her iPhone. And this is processed through a third payment processor, or PSP. So in this scenario, we’ve got three transactions processed by three separate processors on the same underlying card, but using different types of payment method: Apple Pay on the watch, Apple Pay on the phone, and the physical card.
Sarah then loses her card. She’s now got a new card that gets reissued with a new PAN, the 16-digit number on the front of that card. Now, the challenge this brings is that it breaks the lifecycle across those three separate transactions that you’ve got. Now, they’re three separate unlinked transactions. And now Sarah’s got a new card. So if she comes in and makes another purchase with that, there’s no way of tying that transaction back to those three previous ones. The chain has been broken. This problem’s made worse in a multiprocessor environment, and many companies out there will be using multiple processors.
Sometimes this might be that you have different processors for online versus in-store. Sometimes it might be through mergers and acquisitions, where a new company’s been acquired and they’re using their own payment processor. Sometimes it could be done to give improvements for auth rate or improvements for performance or improvements for cost. Generally, multiprocessor is something that’s thought out and architected, but it makes this problem of tracking spend across those different channels and across those different processors much more difficult.
So there are a few different ways you could think about solving this problem. The first of those would be to use the PAN. If you’ve got the PAN and you can track it across those three different processors, that would solve the problem. The challenge that you’ll have is that it increases PCI scope. And when that card reissuance occurs, you’ve got a new PAN. So you’ve then lost that link across those previous transactions. So PAN is going to be out. The other approach that some people think of is using tokens. Now, tokens could be a good way of doing this, but the challenge with them is that whilst they reduce PCI scope, they do change by channel, device, and processor. So a token that’s generated by Stripe is going to be different to a token generated by another PSP. So tokens are also out.
Another approach we sometimes see is card hashing. Now, again, this helps reduce PCI burden because you’re storing a hash of the card, but you’re still left with the same challenge: when that PAN changes, when a card gets reissued, you end up with a new card hash. So card hash doesn’t really solve the problem either. Another approach is using something like BIN and last four. This helps reduce PCI burden, but you can get clashes at scale, collisions where the same BIN and last four could be two separate cards. And it also still doesn’t solve the problem that when that PAN is reissued, you’re going to get a new BIN and a new last four. So none of these four approaches realistically can solve that problem at scale. What can? Well, that’s what we’re going to talk about today with the payment account reference.
LIAM O’NEILL: So payment account reference, commonly abbreviated to PAR. What is it? It’s a 29-character identifier, which comes from EMVCo, which means that it’s an industry standard. On Stripe, it gets returned with the charge object. Now there’s four things I’d like you to remember about PAR. Number one, it’s stable. It’s stable across channels. It’s stable across processors. It’s stable across devices. Number two, it’s persistent. We saw an example earlier where a card got reissued. The PAR can be persistent through events such as that. Number three, it’s nonfinancial unlike the PAN, which means it doesn’t increase your PCI compliance burden. And number four, similarly, it’s nonreversible. You can’t derive the PAN from the PAR, which means that it’s safe to store.
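As a rough illustration of the "29-character identifier" point, a shape check at the edge of your system might look like the sketch below. The talk only states the length; treating the characters as alphanumeric is an assumption, and `looks_like_par` and the sample value are made up for illustration.

```python
import re

# Assumption: 29 alphanumeric characters. The talk only states the length.
PAR_PATTERN = re.compile(r"[A-Za-z0-9]{29}")


def looks_like_par(value):
    """Return True if value has the expected shape of a PAR."""
    return isinstance(value, str) and PAR_PATTERN.fullmatch(value) is not None


# Made-up sample value: 4 + 21 + 4 = 29 characters.
sample_par = "Q1Z2" + "0" * 21 + "727G"
is_valid = looks_like_par(sample_par)
```

A check like this only guards against obviously malformed input, such as a PAN or a token landing in the PAR column. It says nothing about whether the value is a real PAR.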
Now, here’s the identity model. We have a mapping of one PAN to one PAR. Now, something could happen such as the card reissuance. So PAN A goes away, and PAN B comes onto the scene, but still the relationship persists between PAN B and the same underlying PAR. Similarly, if you add that same card to a new device, such as an Apple Watch or a phone, the relationship persists because the relationship maps back to the account. Now, I’m going to show you a demo of what a customer loyalty system built with PAR in mind would look like.
So we have a typical ecommerce website here called Clubware, where I’m going to make a purchase of a television. As you can see in the top right-hand side, I’m logged in as myself. So I’m going to pick a nice television here. I add it to my cart, proceed to checkout. You can notice that the loyalty number has been added because that got created when I created my account, which is great because I really want to get my loyalty points. So now I’m going to pay. I’m going to pay with Stripe Link for the first transaction with my real card ending in 9277. So I’m going to go ahead and make the one-click purchase. I’m expecting to see a 3D Secure authentication because I’m a European customer, which as you can see, has just popped up now. So on my phone, I’m going to authenticate that transaction.
And that’s just authenticated. And we can see the purchase going through now. Great. And I can see I’ve got my 50 loyalty points for this purchase, exactly what I want as a consumer. But I just bought a TV, but I forgot my HDMI cable. So I’m going to rush back and I’m going to make that purchase. But this time, I forgot to sign in. I’m a guest checkout in this example. So I’m going to search for HDMI, find what I’m looking for, add it to the cart. Now this site has a 50-cent minimum on transaction, so I’m actually going to purchase 10 cables because why not? So I’ve added those to cart now. I don’t have a customer loyalty number this time because I’m a guest checkout user. I want you to remember that.
So I’m going to pay this time. This time, I’m not going to pay with Stripe Link. I’m going to pay with a different wallet. I’m going to use Google Pay this time. So I’m going to add in my shipping info here. I’m just copying and pasting my shipping info in because it’s quite long and I’m going to pay. So with Google Pay, I’ve already added the card ending in 9277 to my wallet, so I’m just going to click pay now. Again, much like last time, I’m expecting a 3D Secure authentication to pop up. So I’m going to pay now. And as expected, the 3D Secure authentication is going to pop up now, which I’m going to authenticate on my phone. So as that’s happening, just remember the first time I was logged in and I was using Stripe Link. This time I’m not logged in at all. I don’t have a loyalty card added, and I’m using an entirely different wallet. So this purchase should go through now.
So as we can see, the order is confirmed, but I didn’t get a notification about loyalty points because I’m not logged in. But let me log back in now as the previous user. So I’m just adding my credentials now. Great. So I’m logged in, but look at the notification I’ve just gotten. It’s noticed that I had a guest checkout from Google Pay and it’s added those loyalty points to my account as a user, even though I wasn’t logged in. That is because this website is using a PAR-keyed system, and I’m going to show you what that looks like on the Stripe Dashboard in a second. This is something that you could build yourself.
So now I’m on the Stripe Dashboard, which lives behind the website that we just saw. And I’m going to show you where you can actually find the PAR in the Stripe Dashboard and build your own integration just like this. So I’m going to go to the transactions on the left-hand side, and I’m going to look at the last two transactions because they’re the ones I just made. You can see one there from Google Pay and another one from Stripe Link, both with a card ending in 9277. So as I mentioned earlier, on Stripe, the PAR gets returned on the charge object. So I’m going to pick one of these transactions and I’m going to try and find this PAR.
So this transaction, this is the one that I made when I was logged in. So I’m going to scroll down to the events and click on any of them, and I’m going to look for the charge event. There’s a charge.updated or charge.succeeded. You can find the PAR on either. Now I’m just scrolling down, looking for payment account reference, and there it is. So just remember the last few digits so we can match it to the other one. It’s 727G. So now I’m going to go into the other transaction. Much like last time, look through the events, find the charge object. And there it is, and it’s the same as expected. So the way I built this was an event-based system. So when that charge event fired, my system would listen for that, find the PAR, and then try and match it to an existing customer, which is exactly what you just saw when the second order had the loyalty points added to it. This is something that you can build yourself. Thank you.
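The event-based flow Liam describes (listen for the charge event, find the PAR, match it to an existing customer) could be sketched roughly as follows. This is not the demo's code. The event envelope (`type`, `data.object`) and the `charge.succeeded` / `charge.updated` event names follow Stripe's event format, but the field name `payment_account_reference` and its location on the charge are assumptions, which is why the sketch searches the payload for the key instead of hard-coding a path. `handle_event`, `par_to_customer`, and `credited` are hypothetical names.

```python
PAR_KEY = "payment_account_reference"  # assumed field name, check your payloads
CHARGE_EVENTS = {"charge.succeeded", "charge.updated"}


def find_par(node):
    """Depth-first search of a decoded JSON payload for the PAR field."""
    if isinstance(node, dict):
        value = node.get(PAR_KEY)
        if isinstance(value, str) and value:
            return value
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_par(child)
        if found:
            return found
    return None


def handle_event(event, par_to_customer, credited):
    """Attribute a charge to a known customer when its PAR matches.

    par_to_customer: dict mapping PAR -> customer ID (hypothetical store).
    credited: dict mapping customer ID -> set of charge IDs already credited.
    Returns the matched customer ID, or None if there is nothing to do.
    """
    if event.get("type") not in CHARGE_EVENTS:
        return None
    charge = event["data"]["object"]
    par = find_par(charge)
    if par is None:
        return None  # the PAR may arrive later on a charge.updated event
    customer_id = par_to_customer.get(par)
    if customer_id is not None:
        # A set keeps this idempotent if both charge events carry the PAR.
        credited.setdefault(customer_id, set()).add(charge["id"])
    return customer_id
```

In a real integration this function would sit behind a webhook endpoint that verifies the event signature before handling it.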
ANDREW ROBINSON: Thanks, Liam. So if we then look at building a system similar to what Liam talked about there with the use case of customer loyalty and being able to automatically leverage those reward points for a customer, even if they’re not logged into the website, we’re going to need to think about what data attributes we’re going to need to capture to build this with. So some examples here, of course, we’re going to need the PAR, the payment account reference, but we’re also going to then need to store some additional context in there to help us understand. Things like the processor who processed the transaction, the channel that it went through, whether it was online or in store or even agentic, a merchant ID, a transaction ID and a timestamp, and importantly, that customer ID, because it’s joining that payment account reference to your customer ID that then allows you to associate the transactions to that specific customer.
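One way to picture the attributes Andrew lists is as a single record per transaction. The sketch below is illustrative only: the class name `ParObservation` and the field names are assumptions, not a schema from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ParObservation:
    """One transaction's PAR plus the context that makes it useful."""

    par: Optional[str]          # None when the issuer did not return a PAR
    processor: str              # which PSP processed the transaction
    channel: str                # e.g. "online", "in_store", "agentic"
    merchant_id: str
    transaction_id: str
    occurred_at: datetime
    customer_id: Optional[str] = None  # filled in once the join succeeds


# Made-up example values.
observation = ParObservation(
    par="Q1Z2" + "0" * 21 + "727G",
    processor="stripe",
    channel="online",
    merchant_id="merchant_001",
    transaction_id="ch_123",
    occurred_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
    customer_id="cus_sarah",
)
```

Keeping `par` and `customer_id` optional reflects two points made later in the talk: not every issuer returns a PAR, and not every PAR resolves to a known customer.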
So if we were then looking to build a pipeline for this, the first step is capture. Make sure across all your channels, across all your processors, you’re capturing that payment account reference and you’re storing it somewhere. Next is going to be to normalize that data. Different payment processors will provide PAR in different formats and different structures. The actual PAR is the same, but where it is in the response you get from the processor, whether it’s synchronous or asynchronous, is going to differ. So make sure that that data is normalized so that it’s consistent. Then join that data. Take that payment account reference data that you have and join it to your existing customer data that you have so you know this payment account reference resolves to this customer ID that acts as that important join key for you. And finally, build your analysis on that data.
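The normalize and join steps could be sketched like this. The two payload shapes are invented to stand in for "different locations in different processors' responses"; `processor_a`, `processor_b`, and every field name here are hypothetical.

```python
def _par_from_processor_a(payload):
    # Hypothetical shape: PAR nested under a "card" object.
    return payload.get("card", {}).get("par")


def _par_from_processor_b(payload):
    # Hypothetical shape: PAR as a top-level camelCase field.
    return payload.get("paymentAccountReference")


EXTRACTORS = {
    "processor_a": _par_from_processor_a,
    "processor_b": _par_from_processor_b,
}


def normalize(processor, payload):
    """Map a processor-specific payload onto one consistent row shape."""
    raw = EXTRACTORS[processor](payload)
    par = raw.strip() if isinstance(raw, str) and raw.strip() else None
    return {
        "processor": processor,
        "transaction_id": payload["id"],
        "amount": payload["amount"],
        "par": par,
    }


def join_to_customers(rows, par_to_customer):
    """Attach customer_id to each row using the PAR as the join key."""
    return [
        dict(row, customer_id=par_to_customer.get(row["par"]) if row["par"] else None)
        for row in rows
    ]
```

Downstream systems then read only the normalized rows and never need to know which processor a payment came from.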
So if you want to use this data as a single unified dataset, you can, across your fraud systems, across your customer loyalty systems, across your analysis systems. Total customer value and customer lifetime value, all of that can be built from this one dataset. And that means that all those downstream systems don’t have to be concerned with which processor processed that payment and try to pull all of that data together. It’s one single consistent layer that you can pull it all from.
So when we do have a PAR, PAR is that primary join key. You make sure that you’re doing a deterministic match between that payment account reference to that customer identity. If you don’t have a PAR, that’s where it can get a little bit tricky. Avoid using some of the examples we talked about earlier, like raw PANs or BIN plus last four or a token ID—make the matching deterministic. If you’re not certain and can’t confidently join that payment account reference to that customer, flag it for review and use a manual process to validate that. Edge cases do exist, and we’ll talk a little bit about those later. You need to handle those gracefully and not just do a silent merge of that data.
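A minimal sketch of that matching rule, assuming a hypothetical index from PAR to the set of customer IDs it has been seen with. A PAR that resolves to exactly one customer is matched; anything ambiguous is flagged for review and never merged silently; a missing PAR is left unmatched, with no fallback to PAN, BIN plus last four, or token ID.

```python
from enum import Enum


class MatchStatus(Enum):
    MATCHED = "matched"
    UNMATCHED = "unmatched"
    NEEDS_REVIEW = "needs_review"


def resolve_customer(par, par_index):
    """Deterministically resolve a PAR to a customer.

    par_index: dict mapping PAR -> set of customer IDs (hypothetical store).
    Returns (status, customer_id or None).
    """
    if not par:
        # No PAR: do not fall back to weaker identifiers.
        return MatchStatus.UNMATCHED, None
    candidates = par_index.get(par, set())
    if len(candidates) == 1:
        return MatchStatus.MATCHED, next(iter(candidates))
    if not candidates:
        return MatchStatus.UNMATCHED, None
    # Several customers share this payment account (for example a shared card).
    return MatchStatus.NEEDS_REVIEW, None
```

The review branch is where a manual process would pick up the record.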
So if we go back to Sarah and look at that spend history that she had, so all the way back at the beginning, there was the TV that she purchased with the physical card, the HDMI cable with Apple Pay on her watch, the membership with Apple Pay on her iPhone. All of those would have had different token identifiers with them because they all came from different processors and the Apple Pay on her watch and Apple Pay on her phone have different device PANs. If that card is then reissued, she gets a new physical card and then uses that physical card to go buy a set of headphones, that’s another token ID, even if it was on the same processor. So four separate token IDs to try and match up to one customer. But if we use a PAR-keyed system, all of those four transactions, even after that card reissuance, all return the same payment account reference value. And it’s that single persistent property of PAR that makes it worth the entire implementation.
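Sarah's history makes the difference easy to see in a few lines. The token and PAR values below are placeholders invented for illustration; only the shape of the data (four token IDs, one PAR) comes from the talk.

```python
from collections import defaultdict

SARAH_PAR = "PAR_SARAH"  # placeholder, not a real 29-character PAR

transactions = [
    {"item": "TV", "method": "physical card", "token": "tok_card_old", "par": SARAH_PAR},
    {"item": "HDMI cable", "method": "Apple Pay (watch)", "token": "tok_watch", "par": SARAH_PAR},
    {"item": "Membership", "method": "Apple Pay (iPhone)", "token": "tok_phone", "par": SARAH_PAR},
    {"item": "Headphones", "method": "reissued card", "token": "tok_card_new", "par": SARAH_PAR},
]


def group_items(rows, key):
    """Group purchased items by the chosen identifier."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["item"])
    return dict(groups)


by_token = group_items(transactions, "token")  # four groups of one item each
by_par = group_items(transactions, "par")      # one group with all four items
```

Keyed on the token, the history splits into four unrelated fragments. Keyed on the PAR, it is one continuous timeline.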
PAR isn’t magic though. It’s not going to solve all the problems, and there are some things to watch out for and be aware of. The first of these is that it represents a payment account, not a person. So if you have multiple people with the same payment account or multiple people that have maybe a shared card, that means that the PAR that’s returned could be the same across those. It also is important to remember that whilst the PAR isn’t a financial reference, you can’t derive a PAN from it, you can’t use it to authorize a transaction with, it is still personally identifiable information, so you need to handle it under the same data protection regulations as other personally identifiable information. And finally, it’s important to remember that PAR isn’t supported by all card issuers. We typically see return rates between 90 and 95%, but it’s important then to remember that not all issuers return it, so your systems need to be able to handle a PAR not being there or potentially that PAR arriving late.
So let’s look then at operationalizing this and how you could take it from this theory into production. So we mentioned there that not all issuers return a PAR. It’s also important to remember that sometimes this data can arrive late as well. It’s not always going to be synchronous at the time of transaction. Sometimes it can arrive asynchronously, potentially minutes or hours after the transaction. So be able to plan for that late arriving data and be able to handle those replays and backfills so that if that data arrives at a later point in time, potentially hours after the transaction, your systems are able to ingest it. A great way of being able to do this with Stripe is using webhooks and events, because when that charge object changes, when we add the payment account reference to it from the card networks, we’ll create a charge.updated event and you’ll know that there’s then been a PAR added to that charge object.
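Handling late arrival, replays, and backfills mostly comes down to making ingestion idempotent. The sketch below assumes an in-memory dict keyed by charge ID as a stand-in for a real store; `upsert_charge` is a hypothetical helper, not a Stripe API.

```python
def upsert_charge(store, charge_id, amount=None, par=None):
    """Insert or update a charge record; safe to call repeatedly.

    Returns True only when a PAR is attached for the first time, which is
    the signal to re-run the customer join for this charge.
    """
    record = store.setdefault(charge_id, {"amount": None, "par": None})
    if amount is not None:
        record["amount"] = amount
    if par and record["par"] is None:
        record["par"] = par
        return True
    return False


store = {}
# 1. The charge succeeds, but no PAR is available yet.
first = upsert_charge(store, "ch_123", amount=5000)
# 2. Later, an update arrives carrying the PAR.
second = upsert_charge(store, "ch_123", par="PAR_PLACEHOLDER")
# 3. The same update is replayed (redelivery or a backfill job).
third = upsert_charge(store, "ch_123", par="PAR_PLACEHOLDER")
```

Because the function is keyed on the charge ID and only reports the first attachment of a PAR, a redelivered event or a backfill over old data does not trigger duplicate loyalty credits.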
And then finally, multiprocessor imports. So we mentioned different processors storing PAR in perhaps different structures. The format of the PAR is going to be the same, but it’s going to be in different locations. So being able to handle those multiprocessor imports and importing this data from multiple processors is also super important. And the last two are really about the health of the pipeline. First one of these, the return rate is incredibly important to monitor. This is the return rate for how many PARs you see as a percentage. So out of however many transactions, you see a PAR on 95% of them. And the last is the match confidence. How confident are you that you’re matching the correct PAR to the correct customer? Those two last points, they’re really the guardrails. They’re really the KPIs that you use to show the health of the overall pipeline.
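Both KPIs can be computed from the normalized rows. The return rate follows the definition given in the talk. The talk does not define how match confidence is measured, so the second function uses one possible proxy, the share of PAR-bearing transactions that resolved to a customer, and that choice is an assumption.

```python
def par_return_rate(rows):
    """Share of transactions that carry a PAR (0.0 to 1.0)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row.get("par")) / len(rows)


def match_rate(rows):
    """Assumed proxy for match confidence: among transactions with a PAR,
    the share that resolved deterministically to a customer ID."""
    with_par = [row for row in rows if row.get("par")]
    if not with_par:
        return 0.0
    return sum(1 for row in with_par if row.get("customer_id")) / len(with_par)


# Made-up sample: 20 transactions, 19 with a PAR, 18 of those matched.
sample_rows = (
    [{"par": "P", "customer_id": "cus_1"}] * 18
    + [{"par": "P", "customer_id": None}]
    + [{"par": None, "customer_id": None}]
)
return_rate = par_return_rate(sample_rows)
matched_share = match_rate(sample_rows)
```

Tracking these per processor and per channel over time, and not only in aggregate, makes it easier to spot the integration where PAR capture has quietly stopped working.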
So let’s have a look at an example analysis that we could do using PAR data. So we’ve got a quick little demo to go through here. So the first example that we’re going to look at is a couple of the data files that back this. Now these are CSV files. In the real world, these would probably be blob files in a data warehouse somewhere, but this file contains all of our transaction data. So this is what we’d be normalizing on. We’ve got in here the card network, we’ve got the expiry date, we’ve got things like the amount that it was authorized for, we’ve got a charge ID, all of those attributes that you’d expect to be associated with a transaction. And then also importantly, we have the payment account reference in here as well. So each one of these lines represents a single transaction, and each line contains a payment account reference identifying what the payment method was that was used across all of those different transactions.
We then have a mapping file. Again, this is a CSV, but in your world, it would probably be something in the data warehouse where we are mapping that payment account reference from that transaction list to a known customer ID and then all the other information we have about that customer. In your examples, you’d probably have much more than just a name and an email address, but in this example here, we’ve got a way now of mapping that payment account reference to a customer ID. Once we’ve done that, we can then start looking at building an analytics dashboard on top of that. The example that we’ve got here was built using Streamlit proxy, but other visualization tools are available. We’ve got some fairly common attributes here that you would expect to see, things like customer spend analysis, looking at how much your customers are spending and being able to then do any ranking that you need to, looking at total spending by card network or by payment method, if it was physical card, Apple Pay, Google Pay, or so on.
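A small stand-in for the two files Andrew describes, a transaction list and a PAR-to-customer mapping, joined on the PAR to get spend per customer. The column names, values, and placeholder PARs are invented for illustration and are not the demo's actual files.

```python
import csv
import io
from collections import defaultdict

# Amounts are in minor units (cents). All rows are made up.
TRANSACTIONS_CSV = """charge_id,card_network,amount_minor,payment_account_reference
ch_001,visa,120000,PAR_SARAH
ch_002,visa,1500,PAR_SARAH
ch_003,mastercard,4200,PAR_OTHER
ch_004,visa,999,
"""

MAPPING_CSV = """payment_account_reference,customer_id,name
PAR_SARAH,cus_sarah,Sarah
PAR_OTHER,cus_other,Alex
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def spend_by_customer(transactions, mapping):
    """Total spend per customer ID, joined on the PAR column."""
    par_to_customer = {
        row["payment_account_reference"]: row["customer_id"] for row in mapping
    }
    totals = defaultdict(int)
    for tx in transactions:
        customer_id = par_to_customer.get(tx["payment_account_reference"])
        if customer_id is None:
            continue  # no PAR, or a PAR not yet mapped to a customer
        totals[customer_id] += int(tx["amount_minor"])
    return dict(totals)


totals = spend_by_customer(load_rows(TRANSACTIONS_CSV), load_rows(MAPPING_CSV))
```

The same join, expressed in a warehouse query or a dataframe library, is what a dashboard like the one in the demo would sit on top of.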
The difference here is how that data is keyed and joined together. All of this is keyed and joined on that payment account reference. So that’s what we’re using to identify all those transactions that are being done by Sarah or by other customers. We then have that information in graph form, but we also have a table on here as well where we’ve got a list of all those customers by customer ID. We’re including the payment account reference, their total spend. Again, the key difference here is this is all keyed on that payment account reference. We then look at some detailed analysis for customer transactions or terminal usage. And then where it gets really powerful is when we do the deep dive into a specific customer. So we can see our customer here, Sarah, who we talked about earlier. She’s got that $3,220 of spend. That’s across seven different transactions, and importantly, across four card variants.
So you can see that if you look in that card column, you can see that there’s four different PANs for those cards. One of them is the physical card, one is Apple Pay on her watch, one is Apple Pay on her iPhone, and then the last number is that reissued card after she lost her previous card. Now, in a scenario where we were going to be doing token-keyed systems, these would look like four completely unlinked cards. You’d be able to link together the two that were on Apple Pay on the phone, the recurring subscription ones because they’ve got the same number, but for all the other ones, they would be separate transactions. The key thing that the PAR brings us here is being able to join all of those together into a continuous timeline, and you have that unique identifier to be able to bring those seven separate transactions together and attribute them to Sarah as a customer.
That means if Sarah forgot to scan her loyalty card on one of those, you can automatically add those to her account. If she forgot to log in when she made one of the purchases online, you can add that purchase automatically to her purchase history, and she can then manage that order and be able to do things like returns or changes to that order. So super, super powerful attribute to be able to use here to join all of that data together and act as that join key.
So if we then look quickly at what we can do to actually build this system, what should you do? Said it a few times, but I’ll keep saying it: start capturing PAR. Make sure that you have it for every processor, every channel, every transaction, and every customer touchpoint that you have. Make sure that you are capturing that payment account reference and storing it with context. The additional context with that payment account reference is what makes it valuable. You need to attribute the payment account reference to a customer and then all the other attributes about what was on the order, how much was it for, where was it? Was it done on a physical terminal in store? Was it done online? Was it done through an agentic process? Storing that additional context gives the value to the data. And then build all of this into that unified data layer.
So if you’ve got downstream systems that are looking at things like fraud or loyalty or even order management, use the same consistent data layer across all of those systems so that you’re not having to pull data from multiple sources. And of course, if we’ve got some do’s, we need to have some don’ts as well. Don’t default to some of those options we talked about at the beginning, like using BIN and last four or PAN or token reference. They can cause clash at scale and they can also cause challenges with increasing PCI scope. Don’t assume 100% coverage. Not every issuer supports PAR today. It’s growing. It’s getting a lot better. As I said, it’s sort of 90, 95% return rate we see, but there’s still going to be those 5 to 10% of transactions that don’t have a PAR on them, or that PAR is returned at a later point in time after that transaction.
And then don’t treat this as a one-off migration. I mentioned a few slides ago those two core KPIs that you’d want to use, PAR return rate and match confidence. They show you the health of the pipeline, so it’s incredibly important to make sure that they’re being monitored so that it’s not just being treated as a one-off migration. So to summarize, PAR or payment account reference is the processor-agnostic industry standard join key for payment account identity in an omnichannel world. Across multiple processors, across multiple channels, you can now have a single identifier to tie each one of those transactions back to a payment account and then join that payment account to your customer data so that you can now see that Sarah’s transactions all join together and you can make sure that she’s getting the right loyalty points and the right experience when she’s on your website.
Thank you.