- A relational database management system performs well in many scenarios, but it has its limitations:
- Volume issues typically arise when large databases must be indexed.
- In a multi-source environment, data collisions occur, and resolving them can be expensive and time-consuming.
- Velocity problems arise when high volumes of read/write transactions become expensive to process.
Our Advice
Critical Insight
- Begin your big data implementation with a baseline Hadoop pilot. This pilot will build your knowledge of big data, show how the Hadoop framework satisfies your use cases, and reveal how it operates in your environment. Each component in this baseline stack is well understood in the industry, and documentation is readily available.
Impact and Result
- Provide a step-by-step starting point for rolling out big data development based on your business and technical requirements.
- Highlight the challenges, impacts, potential, and mitigations in big data development.
- Identify the key metrics, benchmarks, and instrumentation points to measure the success of your big data rollout.
Workshop: Choose the Right Tools for Big Data Development
Workshops offer an easy way to accelerate your project. If you are unable to do the project yourself and a Guided Implementation isn't enough, we offer low-cost delivery of our project workshops. We take you through every phase of your project and ensure that you have a roadmap in place to complete your project successfully.
Module 1: Assess fit for big data
The Purpose
- Understand the current big data landscape.
- Identify the project team.
- Assess the current data analytics stack.
Key Benefits Achieved
- Understand the organization’s readiness for big data development.
Activities
- Document and assess the development process
- Assess the data analytics stack
- Address the gaps in the stack

Outputs
- Data analytics stack
- Development process flow
- Big data project team
Module 2: Draw the big data flow
The Purpose
- Map the requirements to big data.
- Draw the big data flow.
Key Benefits Achieved
- Ensure business requirements are mapped to each component of the big data flow.
Activities
- Document the business requirements and use cases
- Draw the top-down and bottom-up big data flows

Outputs
- List of business requirements
- Big data flow target state
Module 3: Build the Hadoop stack
The Purpose
- Choose the appropriate installation approach.
- Import data into Hadoop.
- Develop the MapReduce program.
- Select big data analytics tools.
- Conduct end-to-end testing.
Key Benefits Achieved
- Create a baseline Hadoop stack that fits the organization’s needs.
- Understand the challenges of installing and managing the Hadoop stack.
Activities
- Select the installation approach
- Classify the imported data
- Select the data collection tools
- Design the relational schema
- Test and validate the dataflow
- Choose analytics tools
- Perform end-to-end testing

Outputs
- Complete pilot Hadoop stack fitted for the organization
- Test and validate points in the Hadoop stack
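The "Develop the MapReduce program" activity in this module usually starts with a word-count-style pilot job. The sketch below simulates the MapReduce programming model locally in plain Python so the map, shuffle, and reduce phases can be understood before touching a cluster; the function names are illustrative, not part of the Hadoop API.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group intermediate values by key, as Hadoop does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: sum the counts emitted for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

# Local simulation of the classic word-count pilot job.
lines = ["big data big plans", "data flows"]
counts = reduce_phase(shuffle(map_phase(lines)))
# counts == {"big": 2, "data": 2, "plans": 1, "flows": 1}
```

In a real pilot, the same map and reduce logic would be packaged as a Hadoop job (for example via Hadoop Streaming), with HDFS providing the input lines and the framework performing the shuffle.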
Module 4: Roll out Hadoop in the organization
The Purpose
- Prepare the Hadoop stack for deployment.
- Gain institutional learning.
- Create an organizational rollout plan.
Key Benefits Achieved
Activities
- Establish instrumentation points
- Optimize the Hadoop stack
- Develop an organizational rollout plan
- Establish a stakeholder communication plan

Outputs
- Big data instrumentation points to measure value
- List of tools to improve performance of Hadoop
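An instrumentation point, as used in the activities above, is simply a place in the data flow where elapsed time and record counts are captured so the rollout's value can be measured. A minimal sketch in Python, assuming a stage is any callable over a batch of records (the `instrumented` wrapper and stage name are hypothetical, not part of any Hadoop tooling):

```python
import time
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("pipeline")

def instrumented(stage_name, func, records):
    """Run one pipeline stage and log elapsed time and record throughput."""
    start = time.perf_counter()
    result = func(records)
    elapsed = time.perf_counter() - start
    log.info("%s: %d records in %.3fs", stage_name, len(records), elapsed)
    return result

# Example: instrument a hypothetical "cleanse" stage that trims whitespace.
processed = instrumented("cleanse", lambda rs: [r.strip() for r in rs], [" a ", " b "])
```

Logging these measurements at each stage gives the baseline numbers needed to compare the Hadoop stack before and after optimization.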