Contentsquare’s Ultimate Guide to Building and Scaling a Data Governance Program

This blog was originally published on Humans of Data by Mark Manning.

Prukalpa
13 min read · Jul 28, 2023

At a Glance

  • Contentsquare, a leading digital experience platform, sought to launch a data governance program after years of significant growth
  • Choosing Atlan, Contentsquare launched their program with a single source of context, transparency, and interaction across a diverse range of users
  • Having successfully launched data governance, Contentsquare now benefits from a fully mapped data estate, supports crucial KPIs with single owners and shared definitions, has streamlined collaboration, and has propagated knowledge and standards of data quality across its assets

Contentsquare is a company that lives and breathes data. With over 1,000 customers and over 1,600 employees, their digital experience analytics platform provides rich context and insights into behavior, feelings, and intent at each touchpoint in a customer journey.

With $1.4B in funding, and now in its Series F round, Contentsquare’s meteoric growth has been enabled by data. And at the center of making that data accessible, understandable, and trustworthy is their Data Governance team. Composed of Kenza Zanzouri, Data Governance Strategist, and Otavio Leite Bastos, Global Data Governance Lead, Contentsquare’s team led an Atlan Masterclass to share how Data Governance can accelerate critical business outcomes.

The Importance of Data Governance

Contentsquare’s standards are high. They expect data governance to act as an accelerator, not a hindrance, to their Analytics and Business Intelligence responsibilities.

“Data Governance can help us have a faster time to value,” Otavio explained. “From the time you commit to an analytics project until you deliver it, you can make it faster. If dashboards go live faster, and if analysis can be faster, then decision-making is faster. We can also promote faster onboarding for new data professionals, and speed up remediations of any data downtime that we might have.”

With ambitions as high as Contentsquare’s, their team is mindful of striking a delicate balance between being a business enabler, and building a strong analytics foundation to avoid the pitfalls that a governance function is expected to address.

Contentsquare’s team understood the potential consequences if their commitment to data governance was insufficient, or went unsponsored altogether.

“We might have two different people calculating the same metric using two different methods or formulas, leading to different decisions. We can also have data flows that are poorly known, so we don’t master our data pipelines. Being compliant with data protection guidelines and regulations would be very challenging,” Otavio shared. “And data reliability would be at risk. So you would have data consumers calculating and re-calculating metrics to compare with dashboards, not trusting the dashboards that are live.”

Data Governance at Contentsquare

Contentsquare’s data team consists of 10 people, reporting to the CIO, and segmented into three teams. Data Analytics consists of a Lead and four Analysts, Data Engineering consists of a Lead with two Engineers, and Data Governance consists of a Lead and a Strategist.

“We are the heads of KPI standards. For any formula, for any method of calculating metrics, it’s up to us to orchestrate and lead people to identify the standards for the most important metrics at the company. Second is data quality, testing data to see if it’s behaving correctly, and as expected,” Otavio shared. Also crucial for the Data Governance team is data protection, ensuring that best practices are perfectly executed, and regulations are adhered to.

Living at the center of each of the Data Governance team’s responsibilities is Atlan, advancing their charter of accessibility, understandability, and trustworthiness. “It’s our home for every KPI and dashboard that we have at the company,” Otavio explained. “Our purpose is to make data accessible. Data must be simple, understandable. And data must be trustworthy.”

Contentsquare’s Modern Data Stack

Contentsquare’s data stack includes source systems like Salesforce, Workday, and HubSpot flowing through Matillion, their integration layer. Snowflake serves as their data storage layer, with a data lake containing raw data, a data warehouse with transformed data, and data marts directly connected to Tableau and Google Data Studio for visualization and analysis. Finally, Monte Carlo serves as Contentsquare’s data quality tool.

Contextualizing and activating this data stack, and serving at the center of their data governance function, is Atlan.

“Data domain owners can both read data from Atlan and write data to Atlan. Same for KPI owners, who can read any information, but can also enrich Atlan with documentation,” Otavio explained. “We have a data catalog. We have a glossary that’s home for our most important KPIs and metrics. And we can catalog every dashboard that we have at the company, and do automatic data lineage, which is very nice.”

And to ease collaboration across a spectrum of users, Contentsquare has integrated Atlan with Slack, enabling seamless communication about their data assets.

Treating Data Output as a Data Product

Contentsquare views data governance as a way to bring clarity to a complex, difficult-to-understand landscape of assets, ownership, and context. “You might find in any company across the globe that there are dashboards, KPI owners, data fields, data health checks, data consumers, and not everything is connected. They’re flowing in an ocean, and no one understands their meaning or how they’re related,” Otavio explained.

But with a well-defined governance program, these dots are connected, expressing meaningful links between Contentsquare’s data assets, technology, and people, avoiding inefficiencies across their data operations.

The first step toward launching a governance program was to better define ownership and responsibility using a Data Product Tree, and to ruthlessly prioritize by defining the most important metrics to their business.

Defining a Data Product Tree

Underpinning the concept of data products was carefully defined domain ownership, and the recognition that treating dashboards, analysis, and reports as products required orchestration across technical functions. “The concept is quite simple. You have data products linked to use cases. Use cases are linked to some underlying data. And you must perform some data quality monitoring on each data field,” Otavio explained.
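
To make the tree concrete, here is a minimal Python sketch of the structure Otavio describes: products link to use cases, use cases link to data fields, and each field should carry at least one quality check. The class names, field names, owner, and the "not_null" check are all hypothetical, chosen for illustration; this is not how Contentsquare or Atlan models it.

```python
from dataclasses import dataclass, field


@dataclass
class DataField:
    name: str
    quality_checks: list[str] = field(default_factory=list)


@dataclass
class UseCase:
    name: str
    fields: list[DataField] = field(default_factory=list)


@dataclass
class DataProduct:
    name: str
    owner: str
    use_cases: list[UseCase] = field(default_factory=list)


def unmonitored_fields(product: DataProduct) -> list[str]:
    """Return the names of fields under a product that have no quality check."""
    missing: list[str] = []
    for use_case in product.use_cases:
        for f in use_case.fields:
            if not f.quality_checks and f.name not in missing:
                missing.append(f.name)
    return missing


# Hypothetical example: one dashboard, one use case, two fields.
amount = DataField("opportunity.amount", ["not_null"])
close_date = DataField("opportunity.close_date")  # no check defined yet
product = DataProduct(
    name="Revenue dashboard",
    owner="finance-lead",
    use_cases=[UseCase("Quarterly forecast", [amount, close_date])],
)

gaps = unmonitored_fields(product)
print(gaps)
```

Walking the tree this way surfaces the fields that back a data product but are not yet monitored, which is the gap the "data quality monitoring on each data field" rule is meant to close.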

Prioritizing Metrics

With a framework for cross-functional ownership of data products defined, it was crucial for the team to focus on rolling out products where they would have the highest impact. For Contentsquare, that meant defining the metrics that their business demanded, defining their priority, and executing from most-to-least important.

“Our top priority was our Top KPIs at the company, which measure the health of the business around the globe for any department in the company,” Otavio shared. “That’s where we focused 60–80% of our effort, until now. Our second priority was Strategic KPIs, important at a department level. Maybe these KPIs are only important to the Sales, Marketing, or Finance departments. And the last priority was team-level metrics, maybe measuring the performance of a project in a small team in the company, these are Operational KPIs.”

Executing Data Governance and Data Quality

With technology and ownership defined, and a prioritization framework identified, Contentsquare then started a 5-step process to put their data governance and quality function into practice.

Step 1: Data Prioritization

First was formalizing data prioritization, identifying the most important metrics and use cases for enrichment. Storing this information in a profile for each critical asset in Atlan, Contentsquare’s team documented descriptions, business rules, how metrics were calculated, formulas, and formal owners.

Step 2: Data Cataloging and Data Lineage

Next, with data assets properly enriched, the team utilized Atlan’s automated lineage to map how each data asset comes into being.

“We know where data is coming from, we know where data is going to. Is this data going to new tables, new dashboards, new extractions on Tableau, for example? That’s what we can know using Atlan,” Otavio shared. “And what is very nice is that Atlan will automatically sync to our databases and to our data visualization tool. For all of our mapping, we didn’t do anything manually.”

Step 3: Data Protection

With data assets properly defined and contextualized, and with their full lineage understood, the team undertook a classification exercise, defining the sensitivity of data and specifying who would have access to it.

“What’s very interesting is that Atlan will automatically propagate these classifications to the underlying data assets,” Otavio explained. “If you have a metric, and anyone in the company can access that metric, then the data behind that asset can also be accessed by anyone. It just keeps coherence between our metrics and underlying data.”
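
The idea behind propagation can be sketched in a few lines of Python. This is only an illustration of the concept, assuming a simple dependency map from each asset to the assets it is built from; the asset names and the "public" label are made up, and it does not reflect Atlan's actual implementation or API.

```python
from collections import deque


def propagate_classification(
    depends_on: dict[str, list[str]], start: str, label: str
) -> dict[str, str]:
    """Assign `label` to `start` and to every asset it transitively depends on."""
    labels: dict[str, str] = {}
    queue = deque([start])
    while queue:
        asset = queue.popleft()
        if asset in labels:
            continue  # already labeled; also guards against cycles
        labels[asset] = label
        queue.extend(depends_on.get(asset, []))
    return labels


# Hypothetical lineage: a metric built on a data mart, built on two warehouse tables.
depends_on = {
    "metric:net_revenue": ["table:mart.revenue"],
    "table:mart.revenue": ["table:dwh.invoices", "table:dwh.refunds"],
}

labels = propagate_classification(depends_on, "metric:net_revenue", "public")
print(labels)
```

A real system would also need a rule for conflicts, for example when one table feeds both an open metric and a restricted one. The sketch leaves that out.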

Step 4: Data Quality

Turning to Monte Carlo, the team built out monitoring and data quality triggers. In the event of a data issue, alerts would be sent directly to Slack, where data owners could click into incidents, triage, and correct them, as needed. Further, using Atlan’s native integration with Monte Carlo, each alert creates an announcement within all affected data asset profiles in Atlan.

Step 5: Data Policies and Processes

Finally, having defined best practices, implemented a glossary and documentation, and built a framework for data quality, the team moved to create data policies and processes, training users and iterating to ensure their hard work would result in a persistent, improved way of working.

Building Personas

“Atlan facilitated the centralization of all critical assets and business definitions, so we could then fulfill the day-to-day challenges of different people at Contentsquare,” Kenza shared. With a diverse array of potential users across different departments, with different operational needs, the team moved their focus to improving data value streams. Referencing the Atlan Value Stream Library, a collection of best practices available to Atlan customers, Kenza and her team started the process by defining personas.

First was Sego, a Business Analyst responsible for delivering reports and dashboards that help Contentsquare’s business teams understand their performance. “She might say ‘Why is it so hard to understand what I can use and who I should reach out to?’ She needs a go-to space where data cataloging, data lineage, and dashboards are just a click away,” Kenza explained.

Next was Hiroto, a Data Engineer responsible for maintaining business intelligence data pipelines and securing data flows. The team learned that Contentsquare’s Data Engineers would benefit from metadata to help them prioritize migrations, optimize data processes, and improve their data warehouse, and automated lineage to understand the downstream impacts of their work.

Jill, a business user from Contentsquare’s Customer Success Operations team, was less technically savvy, but was data-driven in her business decisions. These business users would benefit from more context to better conceptualize the data available to them as they planned the quarters and years ahead for their departments. “All of these departments look forward to a clear way of digesting the data generated by all of our sources,” Kenza shared.

The final persona was Martin, Contentsquare’s CIO, who required full visibility on their data environment, and a view of all of their data, sourced and protected in a single location.

“As you can see, all of our personas are important to consider. But most share common difficulties. It’s why it’s important to listen to data experts and data consumers, who can build a strong data governance framework with you,” Kenza explained.

Overcoming Rollout Challenges

Even with a strategy as well-defined as Contentsquare’s, the team faced early challenges, and suggested taking time to align stakeholders, ensure strong ownership, and thoughtfully address user needs to increase adoption.

Challenge 1: Alignment

The first challenge faced by the Contentsquare team was one of alignment, referring to a single, shared sense of truth around each of their KPIs and BI deliverables, available in a single, simple-to-access location. “We went from three definitions and formulas for the same KPI, which was quite a lot, to a single definition and formula. We wanted to make sure that everyone was aligned, on the same page, on Atlan,” Kenza shared.

Challenge 2: Ownership

Second was ensuring that a single owner would be identified for each data asset and business-critical KPI. “Identifying the right person reduced stress, of course, but also time. It would ease adoption, and address correctly, in a timely manner, all data-related issues,” Kenza explained.

Where Contentsquare’s assets once had multiple owners, causing confusion among end users and extra effort for domain experts, there’s now a single, empowered owner for each data asset and KPI.

Challenge 3: User Engagement

Finally, the team took a thoughtful approach to user engagement, maintaining strong individual relationships and a high standard of service, while also scaling their reach through a community approach.

“It was clear that training and building a strong community around data cataloging helped us go from low adoption to growing adoption,” Kenza explained. To ensure that the users they needed to engage understood the importance of participating, Kenza and Otavio also enlisted the support of Contentsquare’s leadership team.

Contentsquare’s Governance Strategy in Action

With the right technology in place, a data governance and quality strategy successfully launched, and personas that ensured solutioning was valuable to their end users, Otavio shared an example of what’s possible.

Using the integration between Atlan and Monte Carlo, Contentsquare’s team successfully launched a use case to communicate data quality alerts to an array of downstream stakeholders. When an incident is identified on Monte Carlo, information is pushed into Atlan, where an alert appears in each affected data asset’s profile.

“This script will automatically detect issues, and put an announcement in Atlan telling the user to contact the data owner,” Otavio explained. “Anyone navigating to Atlan and trying to find the right dataset will know if they can use a dataset. Here, we have a complete propagation and link between the business world and the data world. We’re sending alerts to datasets on Atlan, but we’re also sending alerts to any metrics that are using a dataset as a reference to be calculated.”
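
The fan-out Otavio describes, from an affected dataset to every metric and dashboard that relies on it, can be sketched as a walk over downstream lineage. The code below is a hedged illustration in plain Python: the function, asset names, owner, and message wording are hypothetical, and it calls neither the Atlan nor the Monte Carlo API. It is not the script Contentsquare runs.

```python
def announce_incident(
    used_by: dict[str, list[str]],
    owners: dict[str, str],
    dataset: str,
    incident: str,
) -> dict[str, str]:
    """Build one announcement per asset: the dataset itself plus everything downstream."""
    owner = owners.get(dataset, "the data governance team")
    message = (
        f"Data quality incident on {dataset}: {incident}. "
        f"Please contact {owner} before using this asset."
    )
    announcements: dict[str, str] = {}
    stack = [dataset]
    while stack:
        asset = stack.pop()
        if asset in announcements:
            continue  # reached through another path already
        announcements[asset] = message
        stack.extend(used_by.get(asset, []))
    return announcements


# Hypothetical downstream lineage and ownership.
used_by = {
    "table:dwh.invoices": ["table:mart.revenue"],
    "table:mart.revenue": ["metric:net_revenue", "dashboard:revenue_overview"],
}
owners = {"table:dwh.invoices": "finance-data-owner"}

announcements = announce_incident(
    used_by, owners, "table:dwh.invoices", "row count dropped unexpectedly"
)
for asset in sorted(announcements):
    print(asset, "->", announcements[asset])
```

Every asset receives the same message naming the source dataset and its owner, so a business user landing on the metric sees who to contact without tracing lineage themselves.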

For more technical users, like data owners or engineers, a simple search for a data asset in Atlan will yield a link directly into Monte Carlo, where further analysis can be conducted, like viewing past incidents, freshness, volume evolution, lineage, and schema changes.

When to Start Data Governance

Crucial to Contentsquare’s success was starting Data Governance at an opportune time. Otavio began by considering, then measuring, the size of his team and the velocity of their output, establishing a baseline.

With a small team of one to two data analysts, Contentsquare found that it would take one month to build a production dashboard. Over time, their data team grew to support the growing needs of their business.

“More people means fewer parallel tasks. So you can deliver value faster. You recruit more people, and instead of spending one month building a dashboard, maybe you spend two weeks and the dashboard will be live.”

The obvious assumption for data teams is that as headcount and resources grow, time-to-value will continue to diminish, but Contentsquare found that this was not the case. As the number of analysts grew, their solutions increased in technical depth, and a growing number of technical assets like scripts would go without updates. Their data lake grew, making it more difficult to understand what data assets were available. Processes changed at a high pace, and with new teams across the globe, new ways of working, including how metrics were calculated or data quality was measured, became misaligned.

Despite a growing headcount, and better technology, the pace of delivery slowed.

“So when is the good time to start data governance? It’s when you start realizing that you’re recruiting more data analysts and maybe even more data engineers, and it’s not solving your problems. You’re actually spending more and more time to deliver a data project, because you have so many blocks, so many pitfalls, and so many traps along the way.”

To identify the perfect time to begin a data governance program, the Contentsquare team emphasized the importance of measurement. Using a project management tool like Jira, data teams should measure the time it takes for projects to complete, from defining specifications until delivery. When time to value slows, the opportunity for data governance to accelerate outcomes is at its highest.
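
As a rough sketch of that measurement, the Python below computes the median number of days from specification to delivery for each quarter. It assumes the dates have already been exported from a project tracker as pairs. The sample dates are invented for illustration and are not Contentsquare's figures, and no Jira API is used.

```python
from datetime import date
from statistics import median


def median_lead_time_by_quarter(
    projects: list[tuple[date, date]],
) -> dict[str, float]:
    """Median days from specification to delivery, grouped by delivery quarter."""
    buckets: dict[str, list[int]] = {}
    for spec_date, delivered in projects:
        quarter = f"{delivered.year}-Q{(delivered.month - 1) // 3 + 1}"
        buckets.setdefault(quarter, []).append((delivered - spec_date).days)
    return {q: float(median(days)) for q, days in sorted(buckets.items())}


# Invented sample: (specification date, delivery date) per project.
projects = [
    (date(2022, 1, 10), date(2022, 2, 9)),
    (date(2022, 2, 1), date(2022, 2, 15)),
    (date(2022, 4, 4), date(2022, 5, 16)),
    (date(2022, 5, 2), date(2022, 6, 21)),
]

lead_times = median_lead_time_by_quarter(projects)
print(lead_times)
```

If the median keeps rising from quarter to quarter while the team is growing, that is the signal the Contentsquare team describes as the moment to start.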

Key Takeaways

Looking back at the successful launch of Data Governance at Contentsquare, Kenza considers four key pieces of advice for data leaders considering a similar journey.

1: Centralize Less, Orchestrate More

“Feedback is very, very important. You need to invest time and effort into a successful framework without taking full responsibility and action. Think about engaging different stakeholders, centralizing less and orchestrating more. Define what good looks like for different personas, people, data teams, and business consumers, so we can all leverage collaboration and embrace change.”

2: Treat Deliverables as Data Products

“Reach out to different stakeholders, collect all the logical links to your data assets, and you’ll definitely optimize costs and generate value.”

3: Ensure Clear Ownership

“Establishing key stakeholders in the business is a key takeaway. They can help you build a strong, and long-term data quality and governance function.”

4: Break Down your Data Governance Program

“Data governance is new for a lot of people. You must take your time. You have to break it up into different trainings. You need to ensure what you put in place is good for everyone.”

Subscribe to Metadata Weekly to stay updated about upcoming stories, case studies, and best practices from leading data teams.

Prukalpa

Co-founder of Atlan (atlan.com), the active metadata platform for modern data teams | Weekly newsletter for data leaders: metadataweekly.substack.com