Welcome to my blog post, where we'll explore the importance of a strong data strategy and how it can help organizations navigate towards their goals. In today's data-driven world, a solid strategy is essential for businesses to stay ahead of the competition and make informed decisions. That's where the Data Strategy Navigator: The Pyramid Mission Framework comes in. It serves as a guiding light to help organizations develop a comprehensive data strategy that is relevant, complete, and cohesive.
In this post, we'll be taking a closer look at the Pyramid Mission Framework and how it can be used to shape your organization's data strategy. We'll also explore the key components of the framework and how they can be applied to create a strong foundation for your data initiatives. So, whether you're a data strategy pro or just starting out, read on to discover how the Data Strategy Navigator can help your organization chart a course towards success.
The Data Strategy Navigator is a pyramid-shaped framework that starts with an organization's mission and vision. This drives a set of use cases, which are then supported by capabilities across People-Process-Technology at the bottom of the pyramid.
In this post, we'll cover the following topics:
I. Data Strategy
II. Use Cases
III. People, Process and Technology - Foundational Capabilities and Operating Model
IV. Data Platform Architecture
V. Framework for an Operating Model
Let's get started!
Definition
A data strategy is a comprehensive plan that outlines an organization's approach to managing and utilizing its data assets over the long term. It involves defining the technology, processes, people, and rules required to effectively collect, store, analyze, share, and use data within an organization. With the increasing amount of data collected by businesses today, a data strategy is essential to enable organizations to leverage data effectively to make informed decisions and gain a competitive advantage. The strategy should align with the organization's overall business objectives and be designed to support growth, innovation, and operational efficiency.
A data strategy is crucial for businesses to remain competitive, innovative, and relevant in a constantly changing landscape. It involves collecting, organizing, and utilizing data to achieve business goals, such as operational efficiency, process optimization, faster decision-making, increased revenue streams, and improved customer satisfaction. A data strategy aligns data management with business strategy and governance, enabling companies to make better data architecture decisions and manage data consistently across the organization. By answering key questions about data appropriateness, approved operations, purpose, governance policy, and insights, companies can unlock new value from their data and gain a competitive advantage.
Implementing a data strategy helps you:
Solve data management challenges
Improve customer experience
Attain analytical maturity
Create an organization-wide data culture
Achieve regulatory compliance
As a CDO, it is crucial to translate the enterprise's overall mission and vision into a specific data strategy. This involves identifying which data capabilities are necessary to achieve the goals outlined in the enterprise strategy.
However, it is not feasible to address every data-related opportunity at once. As a CDO, it is vital to prioritize and focus on a few key objectives, typically no more than 5. With too few objectives, each one tends to be too broad to be useful, while too many can lead to a lack of focus and motivation among team members.
These objectives should be both qualitative and quantifiable, allowing for measurable progress to be made. For example, an objective of "Ensuring that all critical business processes are provisioned with high-quality data from the right source" could be supported by performance metrics such as the "number of business processes certified using high-quality data," "number of trusted sources," and "percentage of completeness and accuracy of critical data."
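To make a metric like "percentage of completeness of critical data" concrete, here is a minimal Python sketch of how it could be computed. The record layout, the field names, and the `completeness_pct` helper are assumptions for illustration and are not part of the framework itself.

```python
from typing import Iterable, Mapping, Optional, Sequence


def completeness_pct(
    records: Iterable[Mapping[str, Optional[str]]],
    critical_fields: Sequence[str],
) -> float:
    """Return the share of critical field values that are populated, as a percentage."""
    total = 0
    filled = 0
    for record in records:
        for field in critical_fields:
            total += 1
            value = record.get(field)
            # Treat missing keys, None, and blank strings as incomplete.
            if value is not None and str(value).strip() != "":
                filled += 1
    return 100.0 * filled / total if total else 0.0


# Hypothetical customer records; the field names are illustrative only.
customers = [
    {"customer_id": "C1", "email": "a@example.com", "country": "NL"},
    {"customer_id": "C2", "email": "", "country": "DE"},
    {"customer_id": "C3", "email": "c@example.com", "country": None},
    {"customer_id": "C4", "email": "d@example.com"},
]

score = completeness_pct(customers, ["customer_id", "email", "country"])
print(f"Completeness of critical fields: {score:.1f}%")
```

Tracking a number like this over time is what turns a qualitative objective into one you can report progress against.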
As a CDO, developing a data strategy involves aligning the organization's mission and vision with specific data capabilities, prioritizing key objectives, and establishing measurable performance metrics to ensure progress is being made.
As a Chief Data Officer, creating a data strategy is critical to ensuring that your organization is successful in achieving its goals. A data strategy should begin by identifying the overall mission and objectives of the organization, department, or business unit. This will provide a sense of purpose and help inform future decision-making.
Once you have established the high-level mission statement, you can then create a specific data strategy. This will involve identifying what data capabilities are needed to ensure that the overall strategy is successful. Keep in mind that you cannot do everything at once, so it's best to identify a small number of key objectives that can be quantified and measured over time.
One component of the data strategy that is often neglected is the identification of specific data-driven use cases. Use cases are the specific ways in which the data strategy is implemented and are crucial to demonstrating the added value of data initiatives. Use cases can vary across domains in the organization, and it's important to identify the use cases that are most important for your organization.
To ensure that your data initiatives are successful, it's crucial to identify the data that is required to power the use cases. This allows you to map use cases against potential data assets and identify the data domains that will be at the heart of your transformation roadmap. Typically, there are a few potential "data assets" that can power the majority of the prioritized use cases.
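The mapping of use cases against data assets can be sketched in a few lines of Python. The use cases, asset names, and the `rank_assets` helper below are hypothetical; the point is only to show how counting dependencies surfaces the few assets that power most use cases.

```python
from collections import Counter

# Hypothetical prioritized use cases and the data assets each one needs.
use_case_assets = {
    "personalized_marketing": {"customer", "transactions", "web_events"},
    "churn_prediction": {"customer", "transactions", "support_tickets"},
    "inventory_forecasting": {"product", "transactions"},
    "regulatory_reporting": {"customer", "transactions", "product"},
}


def rank_assets(mapping):
    """Count how many use cases depend on each data asset, most-used first."""
    counts = Counter(asset for assets in mapping.values() for asset in assets)
    # Sort by descending count, then by name so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_assets(use_case_assets)
for asset, n in ranking:
    print(f"{asset}: powers {n} of {len(use_case_assets)} use cases")
```

In this made-up example, the assets at the top of the ranking would be natural candidates for the first data domains on the roadmap.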
A data strategy is critical for any organization that wants to leverage data as a strategic asset. By identifying the overall mission and objectives of the organization, creating specific data objectives, and identifying the data-driven use cases, you can ensure that your data initiatives are adding value and driving success.
Example of an objective and a use case
Objective: Increase customer engagement and retention.
Use case: Analyze customer behavior and preferences to personalize marketing messages and offers. By collecting and analyzing customer data, the data platform can identify patterns and trends in customer behavior, allowing businesses to tailor their marketing messages and offers to individual customers. This can lead to increased engagement and retention, as customers are more likely to respond positively to personalized messages and offers that are relevant to their interests and needs.
As a Chief Data Officer, building a successful data strategy requires careful consideration of several key components. We've already discussed the importance of identifying high-level objectives and specific data-driven use cases. Now, we turn our attention to the foundational capabilities necessary to enable these use cases, and the operating model required to orchestrate these capabilities effectively.
At the heart of any capability framework is the People-Process-Technology (PPT) concept. While many thought-leading organizations have created their own capability frameworks, the PPT concept remains a trusted and effective tool. It is crucial to emphasize that the People, Process, and Technology components are not stand-alone entities. Instead, they should be viewed as closely linked and interdependent. As a Chief Data Officer, fostering cooperation and coherence among business, technology, and other departments across the PPT spectrum is a critical goal.
For instance, constructing a strategic data platform that serves business users requires careful consideration of all three PPT framework elements. If business users are to choose, ingest, and link datasets by themselves (Process), they may need a low-code or no-code platform (Technology).
On the other hand, if the platform is built on IaaS (Technology), a central tech team may have to provide infrastructure services (Process), and business users will require some degree of technical expertise to engage with the platform (People). To deliver value in practice, it is essential to consider all three PPT components as a unified whole.
In addition to foundational capabilities, the operating model is critical to success. The operating model involves decisions around which capabilities to prioritize and centralize, and which to allow to exist in a federated structure. A framework to inform the operating model is covered later in this post.
Ultimately, a successful data strategy requires careful consideration of all components. Identifying high-level objectives, specific data-driven use cases, foundational capabilities, and operating models are all critical components of building a comprehensive and effective data strategy.
The people component is crucial to the success of your data strategy. It doesn't matter how much machinery you have if there are no skilled operators. You need to consider several components such as roles and responsibilities, skills and expertise, data literacy and culture, and talent/recruiting strategy.
Roles and Responsibilities: Identify the key roles related to data in your organization and their high-level responsibilities. Common roles include data owner, process owner, data steward, data custodian, data scientist, business analyst, system/app owner, data quality analyst, and data modeler. Pro Tip: Adopt a RASCI chart (Responsible, Accountable, Supporting, Consulted, Informed) to make these assignments explicit.
Skills and Expertise: Determine the required skills and expertise for your people through a simple analysis. With this information, you can build a gap analysis to identify missing skills. This analysis can be done in collaboration with your HR organization.
Data Literacy and Culture: Drive data literacy and awareness in your organization through specific transformation programs and at the enterprise level. Everyone in the organization needs to understand data's importance, ownership, and their respective roles.
Talent and Recruiting Strategy: Develop a talent strategy to identify the skills and expertise that need to be grown through training and recruiting. For non-critical skills, outsourcing to another organization may be an option, allowing your organization to focus on developing the critical skills needed for success.
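A RASCI chart for the roles above can be kept as simple structured data and checked automatically. The sketch below is illustrative only: the activities and assignments are made up, and the two rules it enforces (exactly one Accountable and at least one Responsible per activity) are a common convention that I am assuming here, not something the framework prescribes.

```python
# Hypothetical RASCI chart: activity -> {role: letter}.
# R = Responsible, A = Accountable, S = Supporting, C = Consulted, I = Informed.
rasci = {
    "Define data quality rules": {
        "data owner": "A",
        "data steward": "R",
        "data quality analyst": "S",
        "business analyst": "C",
        "system/app owner": "I",
    },
    "Maintain data model": {
        "data modeler": "R",
        "data custodian": "S",
        "data owner": "C",
    },
}

VALID_LETTERS = set("RASCI")


def check_rasci(chart):
    """Return a list of human-readable problems found in the chart."""
    problems = []
    for activity, assignments in chart.items():
        letters = list(assignments.values())
        for role, letter in assignments.items():
            if letter not in VALID_LETTERS:
                problems.append(f"{activity}: {role} has unknown letter {letter!r}")
        if letters.count("A") != 1:
            problems.append(f"{activity}: expected exactly one Accountable role")
        if letters.count("R") < 1:
            problems.append(f"{activity}: expected at least one Responsible role")
    return problems


issues = check_rasci(rasci)
for issue in issues:
    print(issue)
```

Here the second activity is flagged because nobody is accountable for it, which is exactly the kind of gap a RASCI review is meant to expose.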
The process layer of the data strategy framework includes several key processes to enable successful data management within the organization. These processes include:
Innovation: Encouraging idea generation and facilitating analysis and prioritization of new initiatives.
Demand management and funding: Prioritizing and funding data initiatives using a defined process and a "Book of Work". A tool such as Jira can be used to track the resulting issues and tasks.
Transformation: Ensuring data considerations are integrated into the organization's transformation methodology.
Stakeholder management: Managing stakeholders through periodic updates, touchpoints, and documentation of key insights.
Knowledge management: Identifying critical content and channels of dissemination, including training schedules and access to strategic materials.
For the technology layer, the objective is not to compile an exhaustive catalog of technologies but to identify and oversee the capabilities and processes for each relevant data technology, so that the right technical capabilities are in place.
Architecture and Technology Strategy: The most crucial aspect is to have an enterprise-level reference architecture that outlines architectural guidelines applicable to everyone in the enterprise. For instance, it may specify preferred cloud service providers or data visualization tools. Interoperability guidelines are also essential. This is complemented by a technology strategy, including choices on infrastructure, platform, or software as a service.
Access Management and Self-Service Enablement: A process and capability must be in place to provide access to business and other users, maximizing self-service functionalities where appropriate.
Security, Reliability, and Continuity: Minimum controls must be in place to protect the data and ensure its continuity in case of issues and disasters. This is typically not owned by the data organization, but there should be a strong connection. Mission-critical data assets or those containing PII require specific data protection and continuity requirements.
Sandboxing, Research, and Experimentation: It is crucial to provision a user-friendly, secure environment for experimentation with different sets of data.
Operations and Maintenance: Different technologies and environments within the data tech landscape must be monitored, and fixes and updates need to be applied where necessary.
Vendor Management: A vendor management process for data capabilities (e.g., dashboarding, data quality, data catalog, lineage capture) can help promote a cohesive, rationalized set of tools by driving a consistent vendor analysis, negotiation, and contracting process.
Internal IT Teams: These teams consist of architects and data engineering leads who manage IT technology to support the business. They help ensure the data infrastructure and applications meet business requirements.
Business Units: Professionals in this role contribute to the data strategy and align it with corporate strategy. They help identify use cases and prioritize capabilities and features.
Data Consumers: These stakeholders provide insights into how data is used within the business, which helps to identify the data needs of various business units.
Project Management: Individuals in this position coordinate the cross-functional team to ensure deliverables and timelines are met. They ensure the project stays on track and within budget.
Executive Sponsorship: The Chief Data Officer often plays this pivotal role in overseeing the entire data strategy operation. They provide direction, prioritize objectives, and ensure the data strategy aligns with the overall enterprise strategy.
Finance: This stakeholder plays an important role in developing a data strategy by providing a clear understanding of the financial value created by data applications and identifying data consumers who can benefit from them.
The initial discovery sessions aim to gather information about the current state of data assets, data platform technologies, and existing data use cases. The goal is to identify any gaps or challenges and create a prioritized list of potential future use cases. Discovery is carried out through interviews and documentation reviews with each stakeholder group, and relevant findings are documented. The documentation covers areas such as architecture, roadmaps, process flows, business requirements, business plans, and org charts. Additional interviews may be necessary if new information is revealed during the discovery process.
Questions for Non-IT Stakeholders:
Questions for IT Stakeholders:
Towards the end of the Discovery phase, it's crucial to summarize all identified use cases, gaps, and priorities. This summary will aid in identifying a primary use case.
Primary Use Case
To drive the value of your data strategy, it is essential to identify a primary use case that aligns with your business's top priority and goals and has the potential to leverage data. The primary use case serves two main objectives:
It exercises and requires implementation of sufficient platform capabilities, accelerating subsequent use cases.
It is impactful enough to showcase to executive leadership and demonstrate the value of the platform, facilitating approval of subsequent use cases.
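One way to compare candidates against these two objectives is a simple score that blends capability coverage with expected impact. This is only a sketch under my own assumptions: the capability names, the 1-5 impact rating, the equal weighting, and the candidate use cases are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    capabilities: frozenset  # platform capabilities the use case exercises
    impact: int  # hypothetical 1-5 rating agreed with executive stakeholders


# Hypothetical set of capabilities the platform should eventually provide.
PLATFORM_CAPABILITIES = frozenset(
    {"ingestion", "warehouse", "data_quality", "reporting", "ml"}
)


def score(use_case, coverage_weight=0.5):
    """Blend capability coverage (0-1) and normalized impact (0-1) into one score."""
    exercised = use_case.capabilities & PLATFORM_CAPABILITIES
    coverage = len(exercised) / len(PLATFORM_CAPABILITIES)
    impact = use_case.impact / 5
    return coverage_weight * coverage + (1 - coverage_weight) * impact


candidates = [
    UseCase("sales dashboard", frozenset({"ingestion", "warehouse", "reporting"}), 3),
    UseCase(
        "personalized offers",
        frozenset({"ingestion", "warehouse", "data_quality", "ml"}),
        5,
    ),
    UseCase("ad hoc export", frozenset({"warehouse"}), 2),
]

primary = max(candidates, key=score)
print(f"Suggested primary use case: {primary.name} (score {score(primary):.2f})")
```

A score like this should start the conversation rather than end it; the weighting between coverage and impact is a judgment call for your organization.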
Consider the following questions while identifying your primary use case:
Creating an initial architecture based on the information gathered during the discovery phase breathes life into the platform. Although it may be incomplete, it provides a visual representation for subsequent discussions. This representation helps to frame people's thinking and communicates the story of the platform. Conversations without a visual representation often lead to repeating discussions, causing confusion and slowing down progress.
The architecture is structured around capabilities, which represent the essential logical components required to fulfill a requirement. For instance, most data platforms need a data warehouse capability. It allows for efficient storage and querying of data for business intelligence, advanced analytics, and machine learning.
Work toward these three goals when creating architecture diagrams for a data platform:
Goal 1: The initial architecture diagram must present a sequential arrangement of platform capabilities, depicting the data flow from ingestion to consumption. This diagram aims to showcase specific capabilities and data needs, in a manner akin to a compelling narrative. Its purpose is to obtain agreement on a capability-centric approach to the data platform.
Goal 2: The second architecture diagram adds another layer of detail by identifying the specific technologies that support each capability. The technology choices are based on reasoning and justification, which are derived from the discovery phase interviews and documented in the Technology Assessment. The deployment and configuration of technologies can vary in several ways, and it is crucial to have a shared understanding of how the technology will be operated and utilized. To achieve this, a more detailed technical representation of the technology is often necessary.
Goal 3: The final architecture diagram serves as a comprehensive technical reference for the operational use of the platform. It provides a more detailed view of the platform and is primarily intended for technology domain experts. It outlines the scope and scale of the deployment, allowing for accurate cost estimation. As discussions with business and technology stakeholders progress, the architecture should be updated to reflect changes and new information. The iterative process continues until there is general agreement on the final architecture. Detailed diagrams are more informative than high-level diagrams as they accurately represent the effort and configuration required for the platform.
I recommend the C4 model, a visual modeling approach for describing and communicating the architecture of software systems. It provides a set of hierarchical diagrams that describe the system at four levels of detail, from a high-level system context diagram down to code-level diagrams such as class and interface diagrams.
Level 1: System Context
Data Platform (System)
Level 2: Containers
Data Ingestion Service (Container)
Data Processing Service (Container)
Data Warehousing Service (Container)
Analytics and Reporting Service (Container)
Level 3: Components
Data Ingestion API (Component)
Data Streaming Service (Component)
Data Quality Service (Component)
ETL Processing Service (Component)
Data Warehouse API (Component)
Querying and Reporting API (Component)
Business Intelligence Dashboard (Component)
Level 4: Code
Data Ingestion Microservices (Code)
Data Processing Jobs (Code)
Data Warehouse Schema (Code)
Reporting Queries (Code)
As the capability architecture diagram takes shape, an evaluation of the technology stack for the platform will be conducted. This involves creating a document that outlines the available technology options, the selection criteria, and the relevant business factors that influence the final technology selection.
For instance, a company may have to decide whether to stick with Hadoop for their data warehouse or switch to Snowflake.
The evaluation may reveal the need for a Proof of Concept (POC) to gain a better understanding of how the proposed technologies compare to the selection criteria. This is usually mentioned in the proposal and could represent a phase 0 implementation scope for solidifying the technology selection.
Selection Criteria:
Decisions regarding technology are critical to the success of a data platform. The decision impacts everything from costs to the recruitment of team members for the platform.
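A technology assessment often ends up as a weighted scoring matrix. The sketch below shows the arithmetic only: the criteria, weights, and ratings are placeholders I made up, and the two options are deliberately unnamed so the numbers are not read as an evaluation of any real product.

```python
# Hypothetical criteria and weights; weights must sum to 1.0.
criteria_weights = {
    "cost": 0.3,
    "scalability": 0.3,
    "team_skills": 0.2,
    "ecosystem_fit": 0.2,
}

# Placeholder 1-5 ratings for two unnamed options, not real product evaluations.
ratings = {
    "option_a": {"cost": 4, "scalability": 3, "team_skills": 5, "ecosystem_fit": 3},
    "option_b": {"cost": 3, "scalability": 5, "team_skills": 3, "ecosystem_fit": 4},
}


def weighted_score(option_ratings, weights):
    """Sum each criterion rating multiplied by its weight."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criteria weights must sum to 1.0")
    missing = set(weights) - set(option_ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(option_ratings[name] * weight for name, weight in weights.items())


scores = {name: weighted_score(r, criteria_weights) for name, r in ratings.items()}
for name, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.2f}")
```

When two options score this close together, that is usually a signal that a Proof of Concept is worth running before committing.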
Define the key capabilities needed: Start by identifying the key capabilities that are required to execute the data strategy. These capabilities should align with the strategic objectives and prioritized use cases.
Determine the level of centralization: Once you have identified the key capabilities, determine which of these capabilities should be centralized and which should be allowed to exist in a federated structure. This decision should be based on factors such as the criticality of the capability, the level of standardization required, and the level of control needed.
Define the governance structure: Establish a clear governance structure that outlines the roles, responsibilities, and decision-making processes for each capability. This structure should also specify the decision rights and escalation paths.
Identify the necessary processes: Determine the necessary processes that are required to support the identified capabilities. This should include the workflows, procedures, and policies needed to ensure the effective execution of each capability.
Define the technology requirements: Determine the technology requirements needed to enable each capability. This includes the hardware, software, and infrastructure required to support the capabilities.
Establish the talent requirements: Identify the necessary talent needed to execute the capabilities. This includes the skills, knowledge, and experience required to effectively deliver each capability.
Develop a roadmap: Develop a roadmap that outlines the implementation plan for the Operating Model. This roadmap should consider factors such as dependencies, timelines, and resource requirements.
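The centralization step in the list above can be illustrated with a simple rule of thumb. This is a sketch under stated assumptions: the 1-5 ratings for criticality, standardization, and control, the averaging rule, the 3.5 threshold, and the example ratings are all hypothetical and should be replaced by whatever your governance body agrees on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    criticality: int      # 1 (low) to 5 (high)
    standardization: int  # how much uniformity across units is required
    control: int          # how much central oversight is needed


def placement(capability, threshold=3.5):
    """Suggest 'centralized' when the average rating meets the threshold."""
    average = (
        capability.criticality + capability.standardization + capability.control
    ) / 3
    return "centralized" if average >= threshold else "federated"


# Hypothetical ratings for illustration only.
capabilities = [
    Capability("access management", criticality=5, standardization=5, control=5),
    Capability("reference architecture", criticality=4, standardization=5, control=3),
    Capability("sandboxing and experimentation", criticality=2, standardization=2, control=3),
]

operating_model = {c.name: placement(c) for c in capabilities}
for name, decision in operating_model.items():
    print(f"{name}: {decision}")
```

The output is a starting point for the governance discussion, since borderline capabilities often end up in a hybrid arrangement that a single threshold cannot capture.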
Applying the framework presented in this blog can help organizations build a successful data platform. By starting with a clear strategy, identifying primary use cases, and mapping out the required capabilities across people, process, and technology, organizations can ensure alignment between their data platform and business objectives. This approach also allows for iterative refinement as discussions with stakeholders progress and provides a clear visual representation of the platform, which can help drive buy-in and support from executive leadership. Additionally, conducting thorough technology assessments and selecting the right technology stack is critical to a successful data platform. By following this framework, organizations can maximize their technology investments and build a solid foundation for their data platform to support their business goals.