The Match Game: Why Both Probabilistic and Deterministic Identity Resolution Matter

It is well accepted that customers today expect a seamless, personalized experience across all channels. In a Harris Poll survey commissioned by Redpoint, 63 percent of customers said that personalization is a standard service they expect, and 37 percent said they will stop doing business with any company that fails to offer a personalized experience.

To meet this expectation, brands must not only demonstrate a personal understanding of a customer, but do so across an omnichannel journey. A personalized experience, in other words, must be consistent wherever and whenever a customer chooses to engage. Being relevant at the moment of interaction requires that a brand resolve, with a high degree of confidence, a customer’s identity across many signals, some of which may be at odds with one another.

At its core, advanced identity resolution aims to build a golden record – an accurate, complete identity graph that represents a detailed collection of all connected experiences, interactions and facets of a particular customer, person, household or organization. An identity graph may not necessarily reveal an individual customer’s name per se; rather, it captures everything associated with a customer ID: all devices, IP addresses, physical addresses, emails, phone numbers, transactions, etc.

Brands formulate an identity graph by tying together distinct facets – the collections of experiences or signals that constitute an interaction. Associating two or more facets to represent one customer, household or organization ID may entail deterministic and/or probabilistic matching. The use of both provides brands with a high degree of confidence that they’re identifying the customer accurately during an interaction.

Parsing Signals to Determine Who’s Who

A common question from companies wanting to use advanced identity resolution to help deliver hyper-personalized experiences is when to use probabilistic matching vs. deterministic matching, and why both are needed.

Let’s examine a familiar scenario to see how it might play out. Imagine, for a moment, a man engages with a retailer’s website. The customer surfs various pages, clicks on an ad or two, hovers over certain images, etc. He then puts items into a shopping cart, and at checkout he provides a name, a shipping address and a credit card number. That entire online session represents one facet, consisting of various signals.

The device IP and the web behavior may be a signal telling the retailer it is the man doing the shopping, but the credit card number may “belong” to his wife, while the shipping address may be for his daughter. In that one session, there are three distinct signals, each pointing to a different person.

This is where deterministic and probabilistic matching come into play, depending on an organization’s goals and the purpose for resolving an identity. If we consider each facet as a unit, the rules for attaching a unit to an identity have to take many factors into account – which channel did the unit come from, how much information did it contain, what is the degree of confidence that the unit’s various signals represent a particular person or ID? Is the facet online or offline? Does it represent anonymous or known behavior?

A company may assign various probabilities to help it decide how to assign a certain facet. It may know, for instance, that the device in the above scenario is used by the husband more often than by his wife, and assign a 70/30 percentage split. Alternatively, if the session were on a mobile device, perhaps the company would assign a 90/10 split.
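The split described above can be pictured as a small lookup of prior probabilities per device type. The following is a minimal sketch only; the device labels, customer IDs and the `most_likely_owner` helper are hypothetical names chosen for illustration, and the numbers simply mirror the 70/30 and 90/10 examples.

```python
# Hypothetical priors: probability that a session on a given device type
# belongs to each household member (values mirror the example in the text).
DEVICE_PRIORS = {
    "shared_device": {"husband": 0.70, "wife": 0.30},
    "mobile": {"husband": 0.90, "wife": 0.10},
}


def most_likely_owner(device_type):
    """Return (customer_id, probability) for the most probable owner."""
    priors = DEVICE_PRIORS[device_type]
    owner = max(priors, key=priors.get)
    return owner, priors[owner]


if __name__ == "__main__":
    print(most_likely_owner("shared_device"))
    print(most_likely_owner("mobile"))
```

In practice a device prior like this would be only one input among several, weighed alongside the other signals in the facet.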

A Combination of Deterministic, Probabilistic & Better Outcomes

Conflicting or unclear signals are one reason to use probabilistic matching, which may produce a higher degree of confidence than deterministic rules alone. Consider an ironclad rule that says if a particular credit card number is used, the facet or unit must always be assigned to the husband’s customer ID, regardless of whether every other signal points to the wife and there is a higher probability that she is driving the customer journey.
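To make the contrast concrete, here is a rough sketch that compares a deterministic card rule with a probabilistic assignment. It assumes each signal yields a per-candidate probability and that the signals are combined by a plain average; that combining rule, the card number and all of the scores are illustrative assumptions, not a description of any particular product.

```python
def deterministic_match(facet, card_owner_by_number):
    """Ironclad rule: the credit card number alone decides the customer ID."""
    return card_owner_by_number.get(facet["card_number"])


def probabilistic_match(signal_scores):
    """Average per-signal probabilities and return the best candidate.

    signal_scores: list of dicts mapping candidate ID -> probability.
    A plain average is an assumption made for this sketch.
    """
    totals = {}
    for scores in signal_scores:
        for candidate, probability in scores.items():
            totals[candidate] = totals.get(candidate, 0.0) + probability
    averaged = {c: t / len(signal_scores) for c, t in totals.items()}
    best = max(averaged, key=averaged.get)
    return best, averaged[best]


# Hypothetical facet: the card points to the husband, everything else to the wife.
facet = {"card_number": "4111-XXXX"}
card_owners = {"4111-XXXX": "husband"}
signals = [
    {"husband": 1.0, "wife": 0.0},  # credit card
    {"husband": 0.3, "wife": 0.7},  # device
    {"husband": 0.2, "wife": 0.8},  # browsing behavior
    {"husband": 0.1, "wife": 0.9},  # logged-in email
]

if __name__ == "__main__":
    print(deterministic_match(facet, card_owners))
    print(probabilistic_match(signals))
```

With these made-up scores the deterministic rule assigns the facet to the husband, while the averaged signals favor the wife.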

Another reason for using probabilistic matching is when there is reason to mistrust some of the data in the facet. Human error is a prime example. A customer service record from a call center interaction may contain information inconsistent with the information in a master golden record: a “Jon Smith” instead of “John Smith,” for instance, or a “123 Battery Road” instead of “123 Battery Avenue.” Each piece of information may have its own probability rule attached; for example, if the probability is 80 percent or above that it belongs to the master record, use it. In this case, probabilistic rules produce better outcomes than a deterministic rule that says to discard all but 100 percent matching information.
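A threshold rule of this kind might look like the sketch below. It uses the string similarity ratio from Python’s standard `difflib` module as a stand-in for a match probability, which is a simplifying assumption; the `accept_field` helper and the way the 80 percent cutoff is applied are hypothetical.

```python
from difflib import SequenceMatcher

MATCH_THRESHOLD = 0.80  # the "80 percent or above" rule from the text


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def accept_field(incoming, master, threshold=MATCH_THRESHOLD):
    """Accept the incoming value if it is similar enough to the master value."""
    return similarity(incoming, master) >= threshold


if __name__ == "__main__":
    # A deterministic exact-match rule would reject this pair outright.
    print("Jon Smith" == "John Smith")
    print(accept_field("Jon Smith", "John Smith"))
    print(similarity("123 Battery Road", "123 Battery Avenue"))
```

A larger divergence such as “Road” versus “Avenue” scores lower than a one-letter difference in a name, which is one reason each field may warrant its own rule and threshold.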

A third reason for using probabilistic matching is to strike the right balance between false positives and false negatives, the former being when facets are tied together incorrectly and the latter when facets are not tied together when they should be.
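The trade-off can be illustrated by counting both kinds of error at different match thresholds. This is a toy sketch: the scored pairs below are invented purely for illustration, and `count_errors` is a hypothetical helper.

```python
def count_errors(scored_pairs, threshold):
    """Count (false_positives, false_negatives) at a given threshold.

    scored_pairs: iterable of (match_probability, truly_same_customer).
    """
    false_positives = sum(
        1 for p, same in scored_pairs if p >= threshold and not same
    )
    false_negatives = sum(
        1 for p, same in scored_pairs if p < threshold and same
    )
    return false_positives, false_negatives


# Invented example pairs: (match probability, whether the facets truly match).
toy_pairs = [
    (0.95, True),
    (0.85, True),
    (0.75, True),
    (0.70, False),
    (0.40, False),
]

if __name__ == "__main__":
    for threshold in (0.60, 0.90):
        print(threshold, count_errors(toy_pairs, threshold))
```

On this toy data, a looser threshold ties one pair together incorrectly, while a stricter one avoids that but leaves two true matches unlinked.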

Fine-tuning the balance depends a great deal on the purpose behind making a match. A healthcare organization that wants to send general marketing material announcing the hiring of new general practitioners will have fewer constraints than it will if it needs to send information to a patient about test results. The rules an organization uses and the probabilities it creates based on the data in a particular facet will be specific to an industry, regulations and other factors. A deterministic rule may be perfectly adequate for one situation, such as considering “Jon Smith” and “John Smith” the same customer ID when sending out a credit card application, but a non-starter for another, such as sending an overdraft notice.
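One way to picture purpose-specific rules is a table of minimum match probabilities keyed by the purpose of the communication. The purposes follow the healthcare example above; the threshold values and the `should_link` helper are hypothetical and would in practice be driven by industry, regulation and risk tolerance.

```python
# Hypothetical minimum match probability required for each purpose.
PURPOSE_THRESHOLDS = {
    "general_marketing": 0.80,  # a mistaken match is low risk
    "test_results": 0.99,       # a mistaken match is unacceptable
}


def should_link(match_probability, purpose):
    """Decide whether a facet may be tied to an ID for the given purpose."""
    return match_probability >= PURPOSE_THRESHOLDS[purpose]


if __name__ == "__main__":
    print(should_link(0.90, "general_marketing"))
    print(should_link(0.90, "test_results"))
```

The same 90 percent match clears the bar for a general mailing but not for sensitive patient information.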

Similarly, one set of rules may be used to tie a facet to a person, while different rules are used to tie it to a household, address, or organization. Allowing for rules to reflect the purpose of the interaction and the type of graph being built gives a marketer enormous power in getting CX right.

Identity resolution is about detecting patterns to determine whether different facets constitute the same customer, person or organization. By using both deterministic and probabilistic matching, an organization considerably improves the likelihood not just of engaging with the right customer, but of engaging that customer with a personalized experience that reflects a deep understanding consistent with today’s expectations.

