Build a Data Pipeline for Reporting and Analytics

Data architecture best practices to prepare data for reporting and analytics.

Contributors

  • Iryna Roy, Consultant, Architecture and Standards, Gevity Consulting Inc.
  • Nicholas Yee, Chief Strategy Officer, Hubio Technology
  • Shantanu Raje, Director, IT Software Solutions
  • Biplab Sarker, Data Architect, Sun Life
  • Mike Lapenna, Enterprise Information Architect, Manulife
  • Guy Kayembe, Big Data Architect, Pilgrim Data Services

Your Challenge

  • Continuous, disruptive database design updates while trying to make one design pattern fit all use cases.
  • Sub-par performance when loading, retrieving, and querying data.
  • Long time-to-market for projects aimed at data delivery and consumption.
  • Unnecessarily complicated database design that limits the usability of the data and requires knowledge of specific data structures to use it effectively.

Our Advice

Critical Insight

  • Evolve your data architecture. The data pipeline is an evolutionary break from the enterprise data warehouse methodology.
  • Avoid endless data projects. Building centralized all-in-one enterprise data warehouses takes forever to deliver a positive ROI.
  • Facilitate data self-service. Use-case optimized data delivery repositories facilitate data self-service.

Impact and Result

  • Understand your high-level business capabilities and interactions across them – your data repositories and flows should be just a digital reflection thereof.
  • Divide your data world into logical verticals overlaid with data progression lanes of various speeds (that is, build your data pipeline), and conquer it one segment at a time.
  • Use the most appropriate database design pattern for a given phase/component in your data pipeline progression.

Research & Tools

Start here – read the Executive Brief

Build your data pipeline using the most appropriate data design patterns.

1. Understand data progression

Identify major business capabilities, business processes running inside and across them, and datasets produced or used by these business processes and activities performed thereupon.
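As a rough illustration of this step, the relationships between capabilities, processes, and datasets can be captured in a small data structure. The sketch below is only one possible way to record them. The `Process` class, the rule that a dataset is owned by the capability whose process first produces it, and the sample names in the usage note are all assumptions made for illustration and are not part of the blueprint.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """A business process running inside one capability (hypothetical model)."""
    name: str
    capability: str
    produces: set[str] = field(default_factory=set)
    uses: set[str] = field(default_factory=set)


def dataset_owners(processes: list[Process]) -> dict[str, str]:
    """Map each dataset to the capability of the first process that produces it.

    Assumption: ownership follows the producing capability.
    """
    owners: dict[str, str] = {}
    for process in processes:
        for dataset in process.produces:
            owners.setdefault(dataset, process.capability)
    return owners


def data_flows(processes: list[Process]) -> set[tuple[str, str, str]]:
    """Return (source capability, dataset, target capability) triples.

    Only flows that cross a capability boundary are reported.
    """
    owners = dataset_owners(processes)
    flows: set[tuple[str, str, str]] = set()
    for process in processes:
        for dataset in process.uses:
            source = owners.get(dataset)
            if source is not None and source != process.capability:
                flows.add((source, dataset, process.capability))
    return flows
```

For example, if a hypothetical "Take order" process in a Sales capability produces an `orders` dataset that a "Ship order" process in Fulfillment uses, `data_flows` reports one flow from Sales to Fulfillment carrying `orders`.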

2. Identify data pipeline components

Identify data pipeline vertical zones: data creation, accumulation, augmentation, and consumption, as well as horizontal lanes: fast, medium, and slow speed.
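The zones and lanes named above form a grid of pipeline segments. A minimal sketch of that grid follows, assuming each dataset is assigned to exactly one zone and one lane. The `Segment` class and the `place` helper are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Zone(Enum):
    """Vertical zones of the data pipeline."""
    CREATION = "creation"
    ACCUMULATION = "accumulation"
    AUGMENTATION = "augmentation"
    CONSUMPTION = "consumption"


class Lane(Enum):
    """Horizontal speed lanes of data progression."""
    FAST = "fast"
    MEDIUM = "medium"
    SLOW = "slow"


@dataclass(frozen=True)
class Segment:
    """One cell of the zone-by-lane grid (hypothetical helper type)."""
    zone: Zone
    lane: Lane


def all_segments() -> list[Segment]:
    """Enumerate every zone and lane combination."""
    return [Segment(zone, lane) for zone, lane in product(Zone, Lane)]


def place(datasets: dict[str, Segment]) -> dict[Segment, list[str]]:
    """Group dataset names by the segment they were assigned to."""
    grid: dict[Segment, list[str]] = {segment: [] for segment in all_segments()}
    for name, segment in datasets.items():
        grid[segment].append(name)
    return grid
```

Four zones and three lanes give twelve segments, which is the unit the pipeline can be built and delivered in, one segment at a time.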

3. Select data design patterns

Select the right data design patterns for the data pipeline components, as well as an applicable data model industry standard (if available).
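Pattern selection can be thought of as a lookup keyed by zone, with the speed lane adjusting the choice where needed. The sketch below shows that shape only. The pattern names are common industry examples chosen for illustration and are assumptions, not the blueprint's recommendations. The actual selection should come out of the review of design patterns for your own pipeline.

```python
# Hypothetical zone-level defaults; illustrative only, not from the blueprint.
ZONE_DEFAULTS = {
    "creation": "normalized (3NF) operational model",
    "accumulation": "data vault or raw landing tables",
    "augmentation": "wide denormalized tables",
    "consumption": "star schema (dimensional model)",
}

# Hypothetical overrides where the speed lane changes the choice.
LANE_OVERRIDES = {
    ("accumulation", "fast"): "append-only event log",
    ("consumption", "fast"): "pre-aggregated key-value views",
}

LANES = {"fast", "medium", "slow"}


def select_pattern(zone: str, lane: str) -> str:
    """Return the design pattern for a pipeline component.

    A lane-specific override wins; otherwise the zone default applies.
    """
    if zone not in ZONE_DEFAULTS:
        raise ValueError(f"unknown zone: {zone!r}")
    if lane not in LANES:
        raise ValueError(f"unknown lane: {lane!r}")
    return LANE_OVERRIDES.get((zone, lane), ZONE_DEFAULTS[zone])
```

Keeping the defaults and overrides in separate tables makes the calibration on data progression speed explicit: most components inherit the zone's pattern, and only the exceptions are listed.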

Guided Implementations

This guided implementation is a six-call advisory process.

Guided Implementation #1 - Understand data progression

Call #1 - Review and discuss typical pitfalls (and their causes) of major data management initiatives. Discuss the main business capabilities of the organization and how they interact.
Call #2 - Discuss the business processes running inside and across business capabilities and the datasets involved.

Guided Implementation #2 - Identify data pipeline components

Call #1 - Review and discuss the concept of a Data Pipeline in general, as well as the vertical zones: data creation, accumulation, augmentation, and consumption. Identify these zones in the enterprise business model.
Call #2 - Review and discuss multi-lane data progression and identify different speed lanes in the enterprise business model.

Guided Implementation #3 - Select data design patterns

Call #1 - Review and discuss various data design patterns.
Call #2 - Discuss the selection of data design patterns for the data pipeline components. Discuss the applicability of an industry-standard data model (if available).

Onsite Workshop

Onsite workshops offer an easy way to accelerate your project. If you are unable to do the project yourself, and a Guided Implementation isn't enough, we offer low-cost onsite delivery of our project workshops. We take you through every phase of your project and ensure that you have a roadmap in place to complete your project successfully.

Module 1: Understand Data Progression

The Purpose

Identify major business capabilities, business processes running inside and across them, and datasets produced or used by these business processes and activities performed thereupon.

Key Benefits Achieved

Indicates the ownership of datasets and the high-level data flows across the organization.

Activities and Outputs

1.1 Review and discuss typical pitfalls (and their causes) of major data management initiatives.
  • Output: Understanding of typical pitfalls (and their causes) of major data management initiatives

1.2 Discuss the main business capabilities of the organization and how they interact.
  • Output: Business capabilities map

1.3 Discuss the business processes running inside and across business capabilities and the datasets involved.
  • Output: Business processes map

1.4 Create the Enterprise Business Process Model (EBPM).
  • Output: Enterprise Business Process Model (EBPM)

Module 2: Identify Data Pipeline Components

The Purpose

Identify data pipeline vertical zones: data creation, accumulation, augmentation, and consumption, as well as horizontal lanes: fast, medium, and slow speed.

Key Benefits Achieved

Design the high-level data progression pipeline.

Activities and Outputs

2.1 Review and discuss the concept of a data pipeline in general, as well as the vertical zones: data creation, accumulation, augmentation, and consumption.
  • Output: Understanding of a data pipeline design, including its zones

2.2 Identify these zones in the enterprise business model.
  • Output: EBPM mapping to Data Pipeline Zones

2.3 Review and discuss multi-lane data progression.
  • Output: Understanding of multi-lane data progression

2.4 Identify different speed lanes in the enterprise business model.
  • Output: EBPM mapping to Multi-Speed Data Progression Lanes

Module 3: Select Data Design Patterns

The Purpose

Select the right data design patterns for the data pipeline components, as well as an applicable data model industry standard (if available).

Key Benefits Achieved

Use of the appropriate data design pattern for each zone, calibrated to the data progression speed.

Activities and Outputs

3.1 Review and discuss various data design patterns.
  • Output: Understanding of various data design patterns

3.2 Discuss and select data design patterns for the data pipeline components.
  • Output: Data Design Patterns mapping to the data pipeline

3.3 Discuss applicability of data model industry standards (if available).
  • Output: Selection of an applicable data model from available industry standards

Search Code: 94863
Published: November 2, 2020
Last Revised: November 2, 2020
