Choose the Right Tools for Big Data Development

Leverage Hadoop as your pilot project to gain organizational buy-in and build institutional learning.

  • A relational database management system works great under many scenarios, but it has its limitations:
    • Volume issues typically arise when there is a need to index large databases.
    • In a multi-source environment, data collisions occur, and resolving them can be expensive and time-consuming.
    • Velocity problems arise when a high volume of read/write transactions becomes expensive to process.

Our Advice

Critical Insight

  • Begin your big data implementation with a baseline Hadoop pilot. This pilot will help build your knowledge of big data, how the Hadoop framework satisfies your use cases, and how it operates in your system. Each component in this baseline stack is well understood in the industry and documentation is readily available.

Impact and Result

  • Provide a step-by-step starting point to begin the rollout of big data development based on your business and technical requirements.
  • Highlight the challenges, impacts, potential, and mitigations in big data development.
  • Identify the key metrics, benchmarks, and instrumentation points to measure the success of your big data rollout.

Choose the Right Tools for Big Data Development Research & Tools

1. Assess fit and readiness for big data

Minimize the process and technology impacts of introducing big data to the organization.

2. Build the project team

Identify the roles and responsibilities of the big data project team.

3. Roll out the Hadoop pilot

Create a Hadoop stack based on business requirements and the data that needs to be mined and analyzed.

4. Roll out Hadoop in the organization

Customize the Hadoop pilot for fit in other areas of the organization based on instrumentation and pilot experiences.


Workshop: Choose the Right Tools for Big Data Development

Workshops offer an easy way to accelerate your project. If you are unable to do the project yourself, and a Guided Implementation isn't enough, we offer low-cost delivery of our project workshops. We take you through every phase of your project and ensure that you have a roadmap in place to complete your project successfully.

Module 1: Assess fit for big data

The Purpose

  • Understand the current big data landscape.
  • Identify the project team.
  • Assess the current data analytics stack.

Key Benefits Achieved

  • Understand the organization’s readiness for big data development.

Activities

  • 1.1 Document and assess the development process
  • 1.2 Assess the data analytics stack
  • 1.3 Address the gaps in the stack

Outputs

  • Data analytics stack
  • Development process flow
  • Big data project team

Module 2: Draw the big data flow

The Purpose

  • Map the requirements to big data.
  • Draw the big data flow.

Key Benefits Achieved

  • Ensure business requirements are mapped to each component of the big data flow.

Activities

  • 2.1 Document the business requirements and use cases
  • 2.2 Draw the top-down and bottom-up big data flows

Outputs

  • List business requirements
  • Big data flow target state

Module 3: Build the Hadoop stack

The Purpose

  • Choose the appropriate installation approach.
  • Import data into Hadoop.
  • Develop the MapReduce program.
  • Select big data analytics tools.
  • Conduct end-to-end testing.

Key Benefits Achieved

  • Create a baseline Hadoop stack that fits the organization’s needs.
  • Understand the challenges of installing and managing the Hadoop stack.

Activities

  • 3.1 Select the installation approach
  • 3.2 Classify the imported data
  • 3.3 Select the data collection tools
  • 3.4 Design the relational schema
  • 3.5 Test and validate the dataflow
  • 3.6 Choose analytics tools
  • 3.7 Perform end-to-end testing

Outputs

  • Complete pilot Hadoop stack fitted for the organization
  • Test and validate points in the Hadoop stack

Module 4: Roll out Hadoop in the organization

The Purpose

  • Prepare the Hadoop stack for deployment.
  • Gain institutional learning.
  • Create an organizational rollout plan.

Key Benefits Achieved

Activities

  • 4.1 Establish instrumentation points
  • 4.2 Optimize the Hadoop stack
  • 4.3 Develop an organization rollout plan
  • 4.4 Establish a stakeholder communication plan

Outputs

  • Big data instrumentation points to measure value
  • List of tools to improve performance of Hadoop

About Info-Tech

Info-Tech Research Group is the world’s fastest-growing information technology research and advisory company, proudly serving over 30,000 IT professionals.

We produce unbiased and highly relevant research to help CIOs and IT leaders make strategic, timely, and well-informed decisions. We partner closely with IT teams to provide everything they need, from actionable tools to analyst guidance, ensuring they deliver measurable results for their organizations.

What Is a Blueprint?

A blueprint is designed to be a roadmap, containing a methodology and the tools and templates you need to solve your IT problems.

Each blueprint can be accompanied by a Guided Implementation that provides you access to our world-class analysts to help you get through the project.

Need Extra Help?
Speak With An Analyst

Get the help you need in this 1-phase advisory process. You'll receive 3 touchpoints with our researchers, all included in your membership.

  • Call 1: Assess fit and readiness for big data

    Get off to a productive start: Assess your data analytics stacks to determine your readiness for big data. Info-Tech analysts will help you identify your gaps and create a list of tasks to fill these gaps.

  • Call 2: Prepare and roll out the Hadoop pilot

    Build your Hadoop pilot stack: Review the roles and responsibilities for your Hadoop pilot, document your requirements, choose an installation approach, design and build your MapReduce program, select your analytics toolset, and test and validate your Hadoop data flow. Info-Tech analysts will discuss the fit of your Hadoop stack to your business requirements and assist in planning for the stack implementation.

  • Call 3: Roll out Hadoop in the organization

    Monitor and optimize your Hadoop stack for deployment: Identify your instrumentation points and metrics, tweak your Hadoop stack based on your instrumentation to fit other areas of your organization, and apply lessons learned to other development projects. Info-Tech analysts will discuss the success of your big data rollout and help you optimize your big data stack.

Authors

Andrew Kum-Seun

Altaz Valani

Contributors

Individuals who conducted expert interviews with us for this project:

  • Martin Parrest, Foxnet Solutions
  • Mehdi Bahrami, University of California, Merced
  • Michael Hausenblas, MapR Technologies
  • Michael Davison, Davis+Henderson

Vendors who conducted expert interviews with us for this project:

  • Pentaho
  • Informatica



Last Revised: May 19, 2014
