What is Data Lineage?
Data lineage is defined as a data lifecycle that includes the data’s origins and where it moves over time. The ability to track, manage, and view data lineage helps simplify tracking errors back to the data source and it helps debugging the data flow process.
One common denominator for all successful data-driven marketing organizations is a recognition of the importance of data curation, or data lineage, to ensure that data is being used as the basis for innovative customer engagement or other purposes is the right data.
Regardless of business goals, industry regulations, or the level of sophistication an organization wishes to attain with data analysis, data lineage is a vital capability of any enterprise-grade customer data platform (CDP). There are three key use cases an organization should keep in mind when determining what they’re trying to accomplish with data lineage.
Privacy compliance is broad in scope, encompassing GDPR, CCPA, and other regulations that protect the use of consumer data, including the consumer’s right to be forgotten and governance around permissions. Different industries face different compliance rules. Banks, for instance, must comply with strict loan documentation and anti-money laundering rules. In general, all industries must document and be prepared to answer questions about how they acquire customer data, the permissions surrounding the data, how the data has been shared, and the future use of the data.
These questions can be asked by regulators, boards, customers, or even lines of business, each with a different reason for wanting the information. Data lineage ensures that marketers and data managers will have the right answers.
Know Your Data
Another reason to care about data lineage is that it encompasses data quality metrics. Measuring how data is cleansed, merged, matched, and split produces is a roadmap, if you will, that details the history of the data that is being ingested and produces visibility into the lineage. This data roadmap also allows for tracking and measuring data movement; its route from ingestion to fueling either systems of engagement or for a specific engagement.
Measurement allows marketing to understand, detect, and minimize raw data issues, common problems, and anything else that might adversely affect the quality, quantity, accuracy, and veracity of data used to create high-quality customer records that in turn form the basis of personalized customer experiences that consumers expect.
Add Value to Your Data
Finally, data lineage gives greater contextual value to data. Data aggregations provide relational information that, over time, yield greater contextual awareness of the data. How has a record, for instance, changed over time? How has the real person underlying the data changed over time? How have associations changed over time, such as a customer lifetime value (CLV), or the propensity to buy or churn?
Lineage that measures dynamic customer information provides marketers with keen insight into how populations change over time. This visibility answers questions about how a campaign itself or external factors have affected customers, for instance, in the context of all the data being collected.
Recognizing the customer in the data sheds light on what’s needed to create an even broader, bigger view of the customer. Whereas measuring data quality, quantity, accuracy, and veracity measures an intrinsic quality of the data source, data lineage in the context of the customer measures an intrinsic quality of the actual, physical world behind the customer, the product, or the engagement.
Trust Your Data
The three main functional areas of data lineage outlined above can also be significantly affected by non-functional areas, such as availability, performance, agility, and real time. From an operational standpoint, it might not be immediately clear why this matters, but if the marketers’ goal is a single point of control over data, decisions, and interactions, then both functional and non-functional aspects of the data solution will have an impact on the effectiveness of engagement. Data lineage, in other words, must not hold up marketers who are intent on moving in the same real-time cadence as a customer.
Robust data lineage is a core functionality of an enterprise-grade CDP. Depending on the objective, having data lineage capabilities is one more reason why data-driven marketers are choosing a CDP over a DMP. Putting data through its paces and attaching metrics to various measurements is important for compliance, and for creating innovative, personalized customer experiences that make a difference.