Redpoint Logo
June 26, 2026

Customer Identity Resolution in Snowflake: Build or Buy?

Enterprise financial services companies face a build-or-buy decision on customer identity resolution in Snowflake. Building it in-house honors the right architectural instinct but carries significant maintenance debt. A native capability running inside Snowflake compresses procurement and removes the build burden. The right approach depends on engineering depth, regulatory exposure, and what kind of long-term work the bank wants to own.

Walk into any bank’s data team in 2026 and you’ll find some combination of dbt models doing deterministic matching, custom Python services for fuzzy logic, ML notebooks experimenting with entity resolution, open-source tools like Splink or Zingg, and the occasional Cortex AI proof of concept. The data team owns it.

The real question is whether to keep building this in the warehouse, or to run a capability that’s already native to the same place.

Where customer identity resolution lives today

Customer identity resolution is the work of matching and unifying customer records across siloed or disparate product systems, channels, and time into a single, persistent identity. Its architectural location has shifted three times in the last fifteen years.

It used to live in the Customer MDM: Reltio, Informatica MDM, IBM InfoSphere, or Profisee. The mental model was straightforward. The MDM was where customer identity got resolved, and everything else was supposed to consume from it.

That frame held until banks discovered three things about commercial MDM. First, it was built for stable master attributes (name, address, household hierarchy, ownership relationships), not for behavioral data or real-time customer activity. Second, extending it to cover modern customer identity needs is slow and expensive because the data model and the licensing both push against the extension. Third, even when extended, the MDM still doesn’t sit in the data cloud where downstream tools and AI consumers now read from. The architectural conversation has moved past the MDM as the sole answer.

It then lived in custom data engineering: hand-coded SQL, fuzzy matching libraries, and internal Python services maintained by whichever engineer drew the short straw, then often left unmaintained once that engineer moved on. The data leader knew the code existed and tried not to think about it more than necessary.

In practice, neither location was ever the sole answer. Even when the MDM was supposed to be master, the CDP built its own customer profile, and the Anti-Money Laundering (AML) monitor, the Know Your Customer (KYC) platform, and the fraud engine each did their own matching. Customer identity has been silently scattered across systems all along.

The data cloud era is the first chance to consolidate. Snowflake is now the dominant location for enterprise financial services customer data. The architectural instinct is right. The question is whether to keep building the resolution logic in the warehouse yourself, or to run a native capability.

What building customer identity resolution in Snowflake takes

The technical scope is broader than most build projects budget for at the outset.

  • Deterministic and probabilistic matching, tuned to the bank’s data realities. Deterministic matching catches the easier cases. The harder ones (M&A overlap, channel drift, behavioral joining, name and address variations) require probabilistic models that need ongoing retraining as data quality, customer behavior, and the bank’s data sources evolve (especially after M&A). A dbt model with deterministic rules is achievable in a sprint. A probabilistic matcher that holds up over time is multi-quarter engineering with a permanent maintenance footprint.
  • Householding across joint accounts, beneficiaries, business owners, family wealth, and beneficial ownership. Joint accounts are straightforward, but the rest is genuinely complex and gets harder as the bank grows or acquires. In-warehouse builds typically ship the easy cases first and then accumulate the long tail as backlog that never quite gets cleared.
  • Persistent customer keys through M&A and core system replacements. Every acquisition inherits a different customer schema, and every core replacement breaks the keys. Each event triggers a multi-month reconciliation project. The bank’s customer identity work is never finished, just paused between integrations.
  • Lineage propagation to every downstream consumer. Regulation B adverse-action notices, Consumer Financial Protection Bureau (CFPB) complaint resolution, fair-lending audits, and the customer-data side of KYC and AML obligations all require defensible lineage. Custom matching code rarely builds lineage with any depth, because the engineer writing the matcher isn’t also writing the audit trail.
  • Real-time and batch in a single code path. Most in-warehouse builds end up with two implementations that drift apart over time, and customers get different resolved identities depending on which path is hit.
  • Governance and audit-grade traceability on every attribute. Usually the last capability built into a custom matcher, and usually the first one the regulator asks about.
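
As a minimal sketch of the matching and lineage items above, a matcher can try a deterministic rule first, fall back to a fuzzy score, and return the rule that decided the outcome, so the audit trail is written by the same code that makes the match. Everything named here is an assumption for illustration: the record fields (`name`, `address`, `ssn_hash`), the 0.6/0.4 weights, and the 0.85 threshold are hypothetical, and Python's standard-library `difflib` stands in for a trained probabilistic model.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    record_id: str
    name: str
    address: str
    ssn_hash: str = ""  # hypothetical hashed identifier; empty when unknown


@dataclass(frozen=True)
class MatchDecision:
    left_id: str
    right_id: str
    matched: bool
    rule: str     # which rule decided the outcome (the lineage to retain)
    score: float


def _normalize(text: str) -> str:
    # Lowercase, drop periods and commas, collapse repeated whitespace.
    return " ".join(text.lower().replace(".", "").replace(",", "").split())


def _similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, _normalize(a), _normalize(b)).ratio()


def match(a: Record, b: Record, threshold: float = 0.85) -> MatchDecision:
    # Deterministic pass: an exact match on a strong identifier wins outright.
    if a.ssn_hash and a.ssn_hash == b.ssn_hash:
        return MatchDecision(a.record_id, b.record_id, True,
                             "deterministic:ssn_hash", 1.0)
    # Fuzzy fallback: weighted string similarity (illustrative weights).
    score = (0.6 * _similarity(a.name, b.name)
             + 0.4 * _similarity(a.address, b.address))
    return MatchDecision(a.record_id, b.record_id, score >= threshold,
                         "fuzzy:name+address", round(score, 3))
```

Calling the same `match` function from both the batch job and the real-time lookup is one way to keep the two paths from drifting apart, since there is only one implementation to change.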

The accumulating cost is what defines most in-warehouse projects over time. Within the first two years of launch, the team is typically spending more time keeping the system working than improving it, and the institutional knowledge tends to leave with the engineers who built it. The bank ends up with a critical piece of customer infrastructure that fewer and fewer people understand.

How native customer identity resolution in Snowflake works

A customer identity capability that runs inside the bank’s Snowflake account inherits the security envelope that’s already approved, places the resolved customer record where every downstream tool already reads from, and adds no data movement, no new InfoSec re-evaluation, and no new residency exposure. Evaluation runs in days rather than quarters because the procurement work is largely already done.

Architecturally, that matters because the resolved customer record becomes a data layer capability rather than a separate system. The Customer 360, the AI and agent platforms, and the BI stack all consume from the same resolved record, so every customer-facing system in the bank is acting on the same person. There’s no new integration layer to maintain and no vendor sandbox to keep in sync with production.

The Redpoint Identity Studio is the product instance of this pattern. The detailed capability work (matching, householding, persistent keys, real-time and batch, governance and lineage) is the same scope as the build described above. The difference is that the team that would have been building it is freed to work on the parts of the bank’s customer data work that are proprietary to the bank.

Build or buy: what to take away

Customer identity resolution belongs in the data architecture, in your Snowflake account, at the layer between sources and consumers. The data leader’s instinct on this is correct. The honest question is whether to build it there or run something native in the same place, and both are defensible decisions in the right context.

Organizations must decide based on engineering depth, regulatory exposure, and tolerance for maintenance debt. Either way, run it where the data already lives.

FAQs

Q: What is customer identity resolution?

Customer identity resolution is the work of matching customer records across product systems, channels, and time into a single, persistent identity. The output is a resolved customer record that downstream systems (Customer 360, CDPs, AI and agent platforms, KYC engines, fraud and AML monitors, and BI) can read from consistently.
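
A minimal sketch of that output, assuming the pairwise match decisions have already been made: matched pairs are grouped with a union-find structure, and every record in a group is mapped to one identity key. The record IDs and the rule of using the smallest record ID as the key are illustrative assumptions. A key chosen this way can change when a new record joins a group, so a production system would need a key scheme that stays stable through merges and source-system changes.

```python
def resolve_identities(record_ids, matched_pairs):
    """Group records connected by matches and map each record to a group key."""
    parent = {rid: rid for rid in record_ids}

    def find(rid):
        # Walk up to the group's root, shortening the path along the way.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for left, right in matched_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # Keep the smaller ID as root so the group key is deterministic.
            keep, drop = sorted((root_left, root_right))
            parent[drop] = keep

    return {rid: find(rid) for rid in record_ids}
```

For example, if `crm-1` matches `loans-7` and `loans-7` matches `cards-3`, all three records resolve to the same key even though `crm-1` and `cards-3` were never compared directly.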

Q: Why isn’t my MDM enough for customer identity resolution?

Commercial MDM platforms like Reltio, Informatica MDM, IBM InfoSphere, and Profisee are built for stable master attributes (name, address, household hierarchy, ownership relationships) rather than for behavioral data or real-time customer activity. Extending an MDM to cover modern customer identity needs is slow and expensive because the data model and the licensing both push against the extension. Even when extended, the MDM still doesn’t sit in the data cloud where downstream tools and AI consumers now read from. For most enterprise financial services teams, the more productive question is no longer “how do we extend the MDM” but “where does customer identity work belong now.”

Q: Where should customer identity resolution live in a modern bank’s data architecture?

In the data layer, between source systems and downstream consumers. For enterprise financial services, that increasingly means in or near the data cloud where customer data already lives. Snowflake has become the dominant location for customer data in Tier-1 banks.

Q: Should we build customer identity resolution in Snowflake or buy a native capability?

Both are defensible. Build makes sense when the bank has serious in-house data engineering depth, narrow customer-identity scope, and tolerance for multi-year maintenance debt. Buying a native capability makes sense when the bank needs faster time-to-value, has regulatory complexity, or wants to free its data team to work on proprietary capabilities rather than rebuilding common customer-data infrastructure.

Q: How long does customer identity resolution take to deploy in Snowflake?

A native capability running inside the bank’s existing Snowflake account can typically be evaluated in days rather than quarters, because the security envelope and procurement work are already done. In-warehouse builds typically take multiple quarters to ship the easy cases and extend over years to handle the long tail of householding, M&A reconciliation, and governance requirements.

Kris Tomes

Vice President of Engineering at Redpoint Global
