Design Your Cloud Operations

It’s “day two” in the cloud. Now what?

  • Traditional IT capabilities, activities, organizational structures, and culture need to adjust to leverage the value of cloud, optimize spend, and manage risk.
  • Different stakeholders across previously separate teams rely on one another more than ever, but rules of engagement do not yet exist.

Our Advice

Critical Insight

Define your target cloud operations state first, then plan how to get there. If you begin by trying to reconstruct on-prem operations in the cloud, you will build an operations model that is the worst of both worlds.

Impact and Result

  • Assess your key workflows’ maturity for life in the cloud and evaluate your readiness and need for new ways of working
  • Identify the work that must be done to deliver value in cloud services
  • Design your cloud operations framework and communicate it clearly and succinctly to secure buy-in

Design Your Cloud Operations Research & Tools

1. Design Your Cloud Operations Deck – A step-by-step storyboard to help guide you through the activities and tools in this project.

This storyboard will help you assess your cloud maturity, understand relevant ways of working, and create a meaningful design of your cloud operations that helps align team members and stakeholders.

2. Planning and design tools.

Use these templates and tools to assess your current state, design the cloud operations organizing framework, and create a roadmap.

3. Communication tools.

Use these templates and tools to plan how you will communicate changes to key stakeholders and communicate the new cloud operations organizing framework in an executive presentation.


Workshop: Design Your Cloud Operations

Workshops offer an easy way to accelerate your project. If you are unable to do the project yourself, and a Guided Implementation isn't enough, we offer low-cost delivery of our project workshops. We take you through every phase of your project and ensure that you have a roadmap in place to complete your project successfully.

Module 1: Day 1

The Purpose

Establish Context

Key Benefits Achieved

Alignment on target state

Activities

Outputs

1.1

Assess current cloud maturity and areas in need of improvement

  • Cloud maturity assessment
1.2

Identify the drivers for organizational redesign

  • Project drivers
1.3

Review cloud objectives and obstacles

  • Cloud challenges and objectives
1.4

Develop organization design principles

  • Organization design principles

Module 2: Day 2

The Purpose

Establish Context

Key Benefits Achieved

Understanding of cloud workstreams

Activities

Outputs

2.1

Evaluate new ways of working

2.2

Develop a workstream target statement

  • Workstream target statement
2.3

Identify cloud work

  • Cloud operations workflow diagrams

Module 3: Day 3

The Purpose

Design the Organization

Key Benefits Achieved

Visualization of the cloud operations future state

Activities

Outputs

3.1

Design a future-state cloud operations diagram

  • Future-state cloud operations diagram
3.2

Create a current-state cloud operations diagram

  • Current-state cloud operations diagram
3.3

Define success indicators

  • Success indicators

Module 4: Day 4

The Purpose

Communicate the Changes

Key Benefits Achieved

Alignment and buy-in from stakeholders

Activities

Outputs

4.1

Create a roadmap

  • Roadmap
4.2

Create a communication plan

  • Communication plan

Design Your Cloud Operations

It’s “day two” in the cloud. Now what?

EXECUTIVE BRIEF

Analysts’ Perspective

Andrew Sharp

Research Director

Infrastructure & Operations Practice

It’s “day two” in the cloud. Now what?

Just because you’re in the cloud doesn’t mean everyone is on the same page about how cloud operations work – or should work.

You have an opportunity to implement new ways of working. But if people can’t see the bigger picture – the organizing framework of your cloud operations – it will be harder to get buy-in to realize value from your cloud services.

Use Info-Tech’s methodology to build out and visualize a cloud operations organizing framework that defines cloud work and aligns it to the right areas.

Nabeel Sherif

Principal Research Director

Infrastructure & Operations Practice

Emily Sugerman

Research Analyst

Infrastructure & Operations Practice

Scott Young

Principal Research Director

Infrastructure & Operations Practice

Executive Summary

Your Challenge

Common Obstacles

Info-Tech’s Approach

Widespread cloud adoption has created new opportunities and challenges:

  • Traditional IT capabilities, activities, organizational structures, and culture need to adjust to leverage the value of cloud, optimize spend, and manage risk.
  • Different stakeholders across previously separate teams rely on one another more than ever, but rules of engagement do not yet exist, leading to a lack of direction, employee frustration, missed work, inefficiency, and unacceptable risk.
  • Many organizations have bought their way into a SaaS portfolio. Now, as key applications leave their network, I&O leaders still have accountability for these apps, but little visibility and control over them.
  • Few organizations are, or will ever be, cloud only. Your operations will be both on-prem and in-cloud for the foreseeable future and you must be able to accommodate both.
  • Traditional infrastructure siloes no longer work for cloud operations, but key stakeholders are wary of significant change.

Clearly communicate the need for operations changes:

  • Identify current challenges with cloud operations. Assess your readiness and fit for new ways of working involved in cloud operations: DevOps, SRE, Platform Engineering, and more.
  • Use Info-Tech’s templates to design a cloud operations organizing framework. Define cloud work, and align work to the right work areas.
  • Communicate the design. Gain buy-in from your key stakeholders for the considerable organizational change management required to achieve durable change.

Info-Tech Insight

Define your target cloud operations state first, then plan how to get there. If you begin by trying to reconstruct on-prem operations in the cloud, you will build an operations model that is the worst of both worlds.

Your Challenge

Traditional IT capabilities, activities, organizational structures, and culture need to adjust to leverage the value of cloud, optimize spend, and manage risk.

  • As key applications leave for the cloud, I&O teams are still expected to manage access, spend, and security but may have little or no visibility or control over the applications themselves.
  • The automation and self-service capabilities of cloud aren’t delivering the speed the business expected because teams don’t work together effectively.
  • Business leaders purchase their own cloud solutions because, from their point of view, IT’s processes are cumbersome and ineffective.
  • Accounting practices and governance mechanisms haven’t adjusted to enable new development practices and technologies.
  • Security and cost management requirements may not be accounted for by teams acquiring or developing solutions.
  • All of this contributes to frustration, missed work, wasteful spending, and unacceptable risk.

Obstacles, by the numbers:

85% of respondents reported security in the cloud was a serious concern.

73% reported balancing responsibilities between a central cloud team and business units was a top concern.

The average organization spent 13% more than they’d budgeted on cloud – even when budgets were expected to increase by 29% in the next year.

32% of all cloud spend was estimated to be wasted spend.

56% of operations professionals said their primary focus is cloud services.

81% of security professionals thought it was difficult to get developers to prioritize bug fixes.

42% of security professionals felt bugs were being caught too late in the development process.

Sources: 1. Flexera 2022 State of the Cloud Report; 2. GitLab DevSecOps 2021 Survey

Cloud operations are different, but IT departments struggle to change

  • There’s no sense of urgency in the organization that change is needed, particularly from teams that aren’t directly involved in operations. It can be challenging to make the case that change is needed.
  • Beware “analysis paralysis”! With so many options, philosophies, approaches, and methodologies, it’s easy to be overwhelmed by choice and fail to make needed changes.
  • The solution to the problem requires organizational changes beyond the operations team, but you don’t have the authority to make those changes directly. Operations can influence the solution, but they likely can’t direct it.
  • Behavior, culture, and organizations take time and work to change. Progress is usually evolutionary – but this can also mean it feels like it’s happening too slowly.
  • It’s not just cloud, and it probably never will be. You’ll need to account for operating both on-premises and cloud technologies for the foreseeable future.

Follow Info-Tech’s Methodology

1. Ensure alignment with the risks and drivers of the business and understand your organization’s strengths and gaps for a cloud operations world.

2. Understand the balance of different types of deliveries you’re responsible for in the cloud.

3. Reduce risk by reinforcing the key operational pillars of cloud operations to your workstreams.

4. Identify “work areas,” decide which area is responsible for what tasks and how work areas should interact in order to best facilitate desired business outcomes.

The image contains a screenshot of a diagram demonstrating Info-Tech's Methodology, as described in the text above.

Info-Tech Insight

Start by designing operations around the main workflow you have for cloud services. For example, if you mostly build or host in the cloud, build the diagram to maximize value for that workflow.

Operating Framework Elements

Proper design of roles and responsibilities for each cloud workflow category will help reduce risk by reinforcing the key operational pillars of cloud operations.

We base this on a composite of the well-architected frameworks established by the top global cloud providers today.

Workflow Categories

  • Build
  • Host
  • Consume

Key Pillars

  • Performance
  • Reliability
  • Cost Effectiveness
  • Security
  • Operational Excellence

Risks to Mitigate

  • Changes to Support Model
  • Changes to Security & Governance
  • Changes to Skills & Roles
  • Replicating Old Habits
  • Misaligned Stakeholders

Cloud Operations Design

Info-Tech’s Methodology

Assess Maturity and Ways of Working

Assess your key workflows’ maturity for “life in the cloud,” related to Key Operational Pillars. Evaluate your readiness and need for new ways of working.

Define Cloud Work

Identify the work that must be done to deliver value in cloud services.

Design Cloud Operations

Define key cloud work areas, the work they do, and how they should share information and interact.

Communicate and Secure Buy-in

Outline the change you recommend to a range of stakeholders. Gain buy-in for the plan.

Blueprint deliverables

Each step of this blueprint is accompanied by supporting deliverables to help you accomplish your goals.

Cloud Maturity Assessment

Assess the intensity and cloud maturity of your IT operations for each of the key cloud workstreams: Consume, Host, and Build

The image contains screenshots of the Cloud Maturity Assessment.

Communication Plan

Identify stakeholders, what’s in it for them, what the impact will be, and how you will communicate over the course of the change.

The image contains a screenshot of the Communication Plan.

Cloud Operations Design Sketchbook

Capture the diagram as you build it.

The image contains a screenshot of the Cloud Operations Design Sketchbook.

Roadmap Tool

Build a roadmap to put the design into action.

The image contains a screenshot of the Roadmap Tool.

Key deliverable:

Cloud Operations Organizing Framework

The Cloud Operations Organizing Framework is a communication tool that introduces the cloud operations diagram and establishes its context and justification.

The image contains a screenshot of the Cloud Operations Organizing Framework.

Project Outline

Phase 1: Establish Context

1.1: Identify challenges, opportunities, and cloud maturity

1.2: Evaluate new ways of working

1.3: Define cloud work

Phase 2: Design the organization and communicate changes

2.1: Design a draft cloud operations diagram

2.2: Communicate changes

Outputs

Cloud Services Objectives and Obstacles

Cloud Operations Workflow Diagrams

Cloud Maturity Assessment

Draft Cloud Operations Diagram

Communication Plan

Roadmap Tool

Cloud Operations Organizing Framework

Project benefits

Benefits for IT

  • Define the work required to deliver cloud services that produce business value.
  • Define key roles for operating cloud services.
  • Outline an operations diagram that visually communicates what key work areas do and how they interact.
  • Communicate needed changes to key stakeholders.

Benefits for the business

  • Receive more value from cloud services when the organization is structured to deliver value, including:
    • Avoiding cost overruns
    • Securing services
    • Providing faster, more effective delivery
    • Increasing predictability
    • Reducing error rates

Calculate the value of Info-Tech’s Methodology

The value of the project is the delivery of organizational change that improves the way you manage cloud services

Example Goal

How this blueprint can help

How you might measure success/value

Streamline Responsibilities

The operations team is spending too much time fighting applications fires, which is distracting it from needed platform improvements.

  • Identify shared and separate responsibilities for development and platform operations teams.
  • Focus the operations team on securing and automating cloud platform(s).
  • Reduce time wasted on back and forth between development and operations teams (20 hrs. per employee per year x 50 staff = 1000 hrs.).
  • Deliver automation features that reduce development lead time by one hour per sprint (40 devs x 20 sprints per yr. x 1 hr. = 800 hrs.).
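
The two estimates above are simple multiplications, and it can help to keep them in a small script so the working group can swap in its own figures. The following is a minimal sketch using the example numbers from this slide; the function and variable names are illustrative assumptions and are not part of any Info-Tech tool.

```python
# Sketch of the "Streamline Responsibilities" value estimate.
# All inputs are the example figures from the slide; names are hypothetical.

def handoff_hours_recovered(hours_per_employee: int, staff: int) -> int:
    """Annual hours recovered by reducing dev/ops back and forth."""
    return hours_per_employee * staff


def automation_hours_saved(developers: int, sprints_per_year: int,
                           hours_per_sprint: int) -> int:
    """Annual development hours saved by new automation features."""
    return developers * sprints_per_year * hours_per_sprint


handoff_hours = handoff_hours_recovered(hours_per_employee=20, staff=50)
automation_hours = automation_hours_saved(developers=40, sprints_per_year=20,
                                          hours_per_sprint=1)
total_hours = handoff_hours + automation_hours

print(f"Hours recovered from handoffs: {handoff_hours}")    # 1000
print(f"Hours saved by automation:     {automation_hours}")  # 800
print(f"Total hours per year:          {total_hours}")       # 1800
```

Replace the inputs with your own staffing and sprint figures; the total is only as reliable as those estimates.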

Improve Cost Visibility

The teams responsible for cost management today don’t have the authority, visibility, or time to effectively find wasted spend.

  • Ensure operations contributes to visibility and execution of cost governance.
  • $1,000,000 annual spend on cloud services.
  • Of this, assume 32% is wasted spend ($320k).¹
  • New cost management function has a target to cut waste by half next year saving ~$160k.
  • Cost visibility and capture metrics (e.g. accurate tagging metrics, right-sizing execution).

¹ Average wasted cloud spend across all organizations, from the 2022 Flexera State of the Cloud Report
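
The cost example follows the same pattern: annual spend, an assumed waste rate, and a reduction target. Here is a minimal sketch of that arithmetic using the slide's figures. The names are illustrative assumptions, and the 32% rate is the survey average cited above, not a measurement of your environment. Percentages are kept as whole numbers so the results stay exact.

```python
# Sketch of the "Improve Cost Visibility" savings estimate.
# Figures come from the example on this slide; names are hypothetical.

def wasted_spend(annual_spend: int, waste_pct: int) -> int:
    """Estimated wasted spend, given a waste rate in whole percent."""
    return annual_spend * waste_pct // 100


def target_savings(waste: int, reduction_pct: int) -> int:
    """Savings if the stated share of waste is eliminated."""
    return waste * reduction_pct // 100


annual_spend = 1_000_000                                  # annual cloud spend in dollars
waste = wasted_spend(annual_spend, waste_pct=32)          # survey-average waste rate
savings = target_savings(waste, reduction_pct=50)         # "cut waste by half"

print(f"Estimated wasted spend: ${waste:,}")    # $320,000
print(f"Target annual savings:  ${savings:,}")  # $160,000
```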

Understand your cloud vision and strategy before you redesign operations

Guide your operations redesign with an overarching cloud vision and strategy that aligns to and enables the business’s goals.

Cloud Vision

The image contains a screenshot of the Define Your Cloud Vision.

Cloud Strategy

It is difficult to get or maintain buy-in for changes to operations without everyone on the same page about the basic value proposition cloud offers your organization.

Do the workload and risk analysis to create a defensible cloud vision statement that boils down into a single statement: “This is how we want to use the cloud.”

Once you have your basic cloud vision, take the next step by documenting a cloud strategy.

Establish your steering committee with stakeholders from IT, business, and leadership to work through the essential decisions around vision and alignment, people, governance, and technology.

Your cloud operations design should align to a cloud strategy document that provides guidelines on establishing a cloud council, preparing staff for changing skills, mitigating risks through proper governance, and setting a direction for migration, provisioning, and monitoring decisions.

Key Insights

Focus on the future, not the present

Define your target cloud operations state first, then plan how to get there. If you begin by trying to reconstruct on-prem operations in the cloud, you will build an operations model that is the worst of both worlds.

Responsibilities change in the cloud

Cloud is a different way of consuming IT resources and applications and it requires a different operational approach than traditional IT.

In most cases, cloud operations involves less direct execution and more service validation and monitoring.

Understand what you mean by cloud work

Work that is invisible to the customer can still be essential to delivering customer value. A lot of operations work is invisible to your organization’s customers but is required to deliver stability, security, efficiency, and more.

Cloud work is not just applications that have been approved by IT. Consider how unsanctioned software purchased by the business will be integrated and managed.

Focus where it matters

Start by designing operations around the main workflow you have for cloud services. If you mostly build or host in the cloud, build the diagram to maximize value for that workflow.

Design principles will often change over time as the organization’s strategy evolves.

Identify skills requirements and gaps as early as possible to avoid shortfalls later. Whether you plan to acquire skills via training or cross-training, hiring, contracting, or outsourcing, effectively building skills takes time.

Info-Tech offers various levels of support to best suit your needs

DIY Toolkit

“Our team has already made this critical project a priority, and we have the time and capability, but some guidance along the way would be helpful.”

Guided Implementation

“Our team knows that we need to fix a process, but we need assistance to determine where to focus. Some check-ins along the way would help keep us on track.”

Workshop

“We need to hit the ground running and get this project kicked off immediately. Our team has the ability to take this over once we get a framework and strategy in place.”

Consulting

“Our team does not have the time or the knowledge to take this project on. We need assistance through the entirety of this project.”

Diagnostics and consistent frameworks used throughout all four options

Guided Implementation

What does a typical GI on this topic look like?

Phase 1

Phase 2

Call #1: Scope requirements, objectives, and your specific challenges

Calls #2&3: Assess cloud maturity and drivers for org. redesign

Call #4: Review cloud objectives and obstacles

Call #5: Evaluate new ways of working and identify cloud work

Calls #6&7: Create your Cloud Operations diagram

Call #8: Create your communication plan and build roadmap

A Guided Implementation (GI) is a series of calls with an Info-Tech analyst to help implement our best practices in your organization.

Workshop Overview

Contact your account representative for more information.
workshops@infotech.com 1-888-670-8889

Day 1

Day 2

Day 3

Day 4

Day 5

Establish Context

Design the Organization and Communicate Changes

Next Steps and Wrap-Up (offsite)

Activities

1.1 Assess current cloud maturity and areas in need of improvement

1.2 Identify the drivers for organizational redesign

1.3 Review cloud objectives and obstacles

1.4 Develop organization design principles

2.1 Evaluate new ways of working

2.2 Develop a workstream target statement

2.3 Identify cloud work

3.1 Design a future-state cloud operations diagram

3.2 Create a current-state cloud operations diagram

3.3 Define success indicators

4.1 Create a roadmap

4.2 Create a communication plan

5.1 Complete in-progress deliverables from previous four days.

5.2 Set up review time for workshop deliverables and to discuss next steps.

Deliverables

Day 1

  1. Cloud Maturity Assessment
  2. Cloud Challenges and Objectives

Day 2

  1. Workstream target statement
  2. Cloud Operations Workflow Diagrams

Day 3

  1. Future and current state cloud operations diagrams

Day 4

  1. Roadmap
  2. Communication Plan

Day 5

  1. Cloud Operations Organizing Framework

Phase 1:

Establish context

Phase 1

Phase 2

1.1 Establish operating model design principles by identifying goals & challenges, workstreams, and cloud maturity

1.2 Evaluate new ways of working

1.3 Identify cloud work

2.1 Draft an operating model

2.2 Communicate proposed changes

Phase Outcomes:

Define current maturity and which workstreams are important to your organization.

Understand new operating approaches and which apply to your workstream balance.

Identify a new target state for IT operations.

Before you get started

Set yourself up for success with these three steps:

  • This methodology and the related slides are intended to be executed via intensive, collaborative working sessions using the rest of this slide deck.
  • Ensure the working sessions are successful by working through these steps before you start work on defining your cloud operations.

1. Identify an operations design working group

This should be a group with insight into current cloud challenges, and with the authority to drive change. This group is the main audience for the activities in this blueprint.

2. Review cloud vision and strategy

Review your established planning work and documentation.

3. Create a working folder

Create a repository to house your notes and any work in progress.

Create a working folder

15 minutes

Create a central repository to support transparency and collaboration. It’s an obvious step, but one that’s often forgotten.

  1. Download all the documents associated with this blueprint to a shared repository accessible to all participants. Keep separate folders for templates and work-in-progress.
  2. Share the link to the repository with all attendees. Include links to the repository in any meeting invites you set up as working sessions for the project.
  3. Use the repository for all the work you do in the activities listed in this blueprint.

Step 1.1: Identify goals and challenges, workstreams, and cloud maturity

Participants

  • Operations Design Working Group, which may include:
    • Cloud owners
    • Platform/Applications Team leads
    • Infra & Ops managers

Outcomes

  • Identify your current cloud maturity and areas in need of improvement.
  • Define the advantages you expect to realize from cloud services and any obstacles you have to overcome to meet those objectives.
  • Identify the reasons why redesigning cloud operations is necessary.
  • Develop organization design principles.

“Start small: Begin with a couple services. Then, based on the feedback you receive from Operations and the business, modify your approach and keep increasing your footprint.” – Nenad Begovic

Cloud changes operational activities, tactics, and goals

As you adopt cloud services, the core mission of operations remains . . .

  • IT operations are expected to deliver stable, efficient, and secure IT services.

. . . but operational activities are evolving.

  • Core IT operational processes remain relevant, such as incident or capacity management, but opportunities to automate or outsource operations tasks will change how that work is done.
  • As you rely more on automation and outsourcing, the team may see less direct execution in its day-to-day work and more solution design and validation.
  • Outsourcing frees the team from operational toil but reduces the direct control over your end-to-end solution and increases your reliance on your vendor.
  • Pay-as-you-go pricing models present opportunities for streamlined delivery and cost rationalization but require you to rethink how you do cost and asset management.
  • It’s very easy for the business to buy a SaaS solution without consulting IT, which can lead to duplicated functionality, integration challenges, security threats, and more.

Design a model for cloud operations that helps you achieve value from your cloud environment.

“As operating models shift to the cloud, you still need the same people and processes. However, the shift is focused on a higher level of operations. If your people no longer focus on server uptime, then their success metrics will change. When security is no longer protected by the four walls of a datacenter, your threat profile changes.”

(Microsoft, “Understand Cloud Operating Models,” 2022)

Operational responsibilities are shared with a range of stakeholders

When using a vendor-operated public cloud, IT exists in a shared responsibility model with the cloud service provider, one that is further differentiated by the type of cloud service model in use: broadly, software-as-a-service (SaaS), platform-as-a-service (PaaS), or infrastructure-as-a-service (IaaS).

Your IT operations organization may still reflect a structure where IT retains control over the entire infrastructure stack from facilities to application and defines their operational roles and processes accordingly.

If the organization chooses a co-location facility, they outsource facility responsibility to a third-party provider, but much of the rest of the traditional IT operating model remains the same. The operations model that worked for an entirely premises-based environment is very different from one that is made up of, for instance, a portfolio of SaaS applications, where your control is limited to the top of the infrastructure stack at the application layer.

Once an organization migrates workloads to the cloud, IT gives up an increasing amount of control to the vendor, and its traditional operational roles & responsibilities necessarily change.

The image contains a screenshot that demonstrates what the cloud service models are.

Align operations with customer value

  • Decisions about operational design should be made with customer value in mind. Remember that cloud adoption should be an enabler of adaptability in the face of changing business needs!
  • Think about how the operations team is indispensable to the value received by your customer. Think about the types of changes that can add to the value your customers receive.
  • A focus on value will help you establish and explain the rationale and urgency required to deliver on needed changes. If you can’t explain how the changes you propose will help deliver value, your proposal will come across as change for the sake of change.
The image contains a screenshot of a diagram to demonstrate how operational design decisions need to be made with customer value in mind.

Info-Tech Insight

Work that is invisible to the customer can still be essential to delivering customer value. A lot of operations work is invisible to your organization’s customers but required to deliver stability, security, efficiency, and more.

A new consumption model means a different mix of activities

Evolving to cloud-optimal operations also means reassessing and adapting how your team approaches cloud maturity, especially how it uses automation and standardization to optimize cloud services.

Traditional IT: Design, Execute, Validate, Support, Monitor
Cloud: Design, Execute, Validate, Support, Monitor

Info-Tech Insight

Cloud is a different way of consuming IT resources and applications and requires a different operational approach than traditional IT.

In most cases, cloud operations involves less direct execution and more service validation and monitoring.

The Service Models in cloud correspond to the way your organization delivers IT

Service Model: Software-as-a-Service (SaaS)

  • Examples: Salesforce.com, Office 365, Workday
  • Function: Consume

Service Model: Platform-as-a-Service (PaaS)

  • Examples: Azure Stack, AWS SageMaker, WordPress
  • Function: Build

Service Model: Infrastructure-as-a-Service (IaaS)

  • Examples: Microsoft Azure, Amazon EC2, Google Cloud Platform
  • Function: Host

Define how you plan to use cloud services

Your cloud operations will include different tasks, teams, and workflows, depending on whether you consume cloud services, build them, or host on them.

Function: Consume

  • Business Need: “I need a commodity, off-the-shelf service that we can configure to our organization’s needs.”
  • Service Model: Software-as-a-Service (SaaS)
  • Example Tasks: Onboard and add users to a new SaaS offering. Vendor management of SaaS providers. Configure/integrate the SaaS offering to meet business needs.

Function: Build

  • Business Need: “I need to create significantly customized or net-new products and services.”
  • Service Model: Platform-as-a-Service (PaaS) & Infrastructure-as-a-Service (IaaS)
  • Example Tasks: Create custom applications. Build and maintain a container platform. Manage CI/CD pipelines and tools. Share infrastructure and applications patterns.

Function: Host

  • Business Need: “I need compute, storage, and networking components that reflect key cloud characteristics (on-demand self-service, metered usage, etc.).”
  • Service Model: Infrastructure-as-a-Service (IaaS)
  • Example Tasks: Stand up compute, networking, and storage resources to host a COTS application. Plan to increase storage capacity to support future demand.
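
When cataloguing services during the working sessions, it can be useful to record this function-to-service-model mapping as data so each service can be tagged consistently. The sketch below encodes only the mapping described above; the enum, dictionary, and helper names are illustrative assumptions and are not part of the blueprint's tools.

```python
from enum import Enum
from typing import Dict, List


class Function(Enum):
    """The three ways an organization uses cloud, per the table above."""
    CONSUME = "Consume"
    BUILD = "Build"
    HOST = "Host"


# Service models associated with each function in the table above.
SERVICE_MODELS: Dict[Function, List[str]] = {
    Function.CONSUME: ["SaaS"],
    Function.BUILD: ["PaaS", "IaaS"],
    Function.HOST: ["IaaS"],
}


def functions_for(service_model: str) -> List[Function]:
    """Return the functions a given service model can support."""
    return [fn for fn, models in SERVICE_MODELS.items()
            if service_model in models]


# IaaS appears under both Build and Host, so work on an IaaS platform
# needs a further question: are we building on it or hosting on it?
for model in ("SaaS", "PaaS", "IaaS"):
    names = ", ".join(fn.value for fn in functions_for(model))
    print(f"{model}: {names}")
```

The point of the lookup is the ambiguity it exposes: a service model alone does not tell you which workflow you are designing for.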

Align to the well-architected framework

  • Each cloud provider has defined a well-architected framework (WAF) that defines effective deployment and operations for their services.
  • WAFs embody a set of best practices and design principles to leverage the cloud in a more efficient, secure, and cost-effective manner.
  • While each vendor’s WAF has its own definitions and nuances, they collectively share a set of key principles, or “pillars,” that define the desired outcome of any cloud deployment.
  • These pillars address the key areas of risk when migrating to a public cloud platform.

“In order to accelerate public cloud adoption, you need to focus on infrastructure-as-code and script everything you can. Unlike traditional operations, CloudOps focuses on creating scripts: a script for task A, a script for task B, etc.”

– Nenad Begovic

Pillars

  • Reliability
  • Security
  • Cost Optimization
  • Operational Excellence
  • Performance Efficiency

General Best Practice Capability Areas

  • Host
  • Network
  • Data
  • Identity Management
  • Cost/Subscription Management

Assess cloud maturity

2 hours

  1. Download a copy of the Cloud Maturity Assessment Tool.
  2. As a group, work through:
    • The balance of your operations activities from a Host/Build/Consume perspective. What are you responsible for delivering now? How do you expect things will change in the future?
    • Which workstreams to focus on. Are there activity categories that are critical or non-critical or that don’t represent a significant portion of overall work? Conversely, are there workstreams that you feel are subject to particular risk when moving to cloud?
  3. Fill out the Maturity Quiz tab in the Cloud Maturity Assessment Tool for the workstreams you have chosen to focus on.
Input

  • Insight into and experience with your current cloud environment.

Output

  • Maturity scoring for key workload streams as they align to the pillars of a general well-architected cloud framework

Materials

  • Whiteboard/Flip chart
  • Operating model template

Participants

  • Cloud platform SMEs

Download the Cloud Maturity Assessment Tool
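
To make the output of this activity concrete, the sketch below shows one way maturity scores per workstream and pillar might be summarized. It is not the scoring method of the Cloud Maturity Assessment Tool: the 1-to-5 scale, the sample scores, the simple average, and all names are assumptions made for illustration. Only the pillar and workstream names come from this blueprint.

```python
from statistics import mean
from typing import Dict

# Pillars as listed in the well-architected framework slide.
PILLARS = [
    "Reliability",
    "Security",
    "Cost Optimization",
    "Operational Excellence",
    "Performance Efficiency",
]

# Hypothetical self-assessed scores (assumed scale: 1 = ad hoc, 5 = optimized).
scores: Dict[str, Dict[str, int]] = {
    "Consume": {"Reliability": 3, "Security": 2, "Cost Optimization": 2,
                "Operational Excellence": 3, "Performance Efficiency": 4},
    "Host": {"Reliability": 4, "Security": 3, "Cost Optimization": 2,
             "Operational Excellence": 3, "Performance Efficiency": 3},
}


def average_maturity(workstream: Dict[str, int]) -> float:
    """Unweighted average across all pillars (an assumed summary measure)."""
    return mean(workstream[p] for p in PILLARS)


def weakest_pillar(workstream: Dict[str, int]) -> str:
    """Lowest-scoring pillar; ties resolve to the first pillar in PILLARS."""
    return min(PILLARS, key=lambda p: workstream[p])


for name, workstream in scores.items():
    print(f"{name}: average {average_maturity(workstream):.1f}, "
          f"weakest pillar: {weakest_pillar(workstream)}")
```

A summary like this is only a conversation starter for the working group; the discussion of why a pillar scored low matters more than the number.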

Identify the drivers for organizational redesign

Whiteboard Activity

An absolute must-have in any successful redesign is a shared understanding and commitment to changing the status quo.

Without a clear and urgent call to action, the design changes will be seen as change for the sake of change and therefore entirely safe to ignore.

Take up the following questions as a group:

  1. What kind of organizational change is needed?
  2. Why do we think the need for this change is urgent?
  3. What do we think will happen if no change occurs? What’s the worst-case scenario?

Record your answers so you can reference and use them in the communication materials you’ll create in Phase 2.

Input
  • Cloud maturity assessment
  • Objectives and obstacles
Output
  • Insight into existing challenges stemming from organizational design
  • A list of reasons that form a compelling argument for organizational change
Materials
  • Whiteboard/Flip chart
Participants
  • Cloud Operations Design Working Group

“We know, for example, that 70 percent of change programs fail to achieve their goals, largely due to employee resistance and lack of management support. We also know that when people are truly invested in change it is 30 percent more likely to stick.”

– Ewenstein, Smith, Sologar

McKinsey (2015)

Consider the value of change from advantage and obstacle perspectives

Consider what you intend to achieve and the obstacles to overcome to help identify the changes required to achieve your desired future state.

Advantage Perspective

What advantages do cloud services offer us as an organization?

For example:

  • Enhance service features.
  • Enhance user experience.
  • Provide ubiquitous access.
  • Scale to align with demand.
  • Automate or outsource routine tasks.

Ideas for Change

Obstacle Perspective

What obstacles prevent us from realizing value in cloud services?

For example:

  • Inadequate stability and reliability
  • Difficult to observe or monitor workloads
  • Challenges ensuring cloud security
  • Insufficient access to relevant skills
Review risks and challenges

Changes to Support Model

  • Have we identified who is on the cloud ops team?
  • Do we know where we are procuring skills (internal IT vs. third party) and for how long?
  • Do we know where we are in the migration process?

Changes to security & governance

  • Have we identified how our attack surface changes in the cloud?
  • Do we have guardrails in place to govern self-provisioning users?
  • Are we managing cost overage risks?
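
As one illustration of managing cost overage risk, the sketch below projects month-end spend from spend to date and flags accounts likely to exceed their budget. It assumes a simple linear projection and uses made-up account names and figures; in practice the inputs would come from your cloud provider's billing exports, which are not modeled here.

```python
import calendar
from datetime import date


def projected_month_end_spend(spend_to_date, today):
    """Linear projection: assume daily spend continues at the current rate."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def overage_alerts(accounts, today, threshold=1.0):
    """Return names of accounts projected to exceed threshold * budget.

    `accounts` maps a (hypothetical) account name to a tuple of
    (spend_to_date, monthly_budget).
    """
    alerts = []
    for name, (spend_to_date, monthly_budget) in accounts.items():
        projected = projected_month_end_spend(spend_to_date, today)
        if projected > threshold * monthly_budget:
            alerts.append(name)
    return sorted(alerts)


if __name__ == "__main__":
    # Illustrative figures only.
    accounts = {
        "dev-sandbox": (500.0, 1000.0),
        "prod": (300.0, 1200.0),
    }
    print(overage_alerts(accounts, date(2024, 4, 10)))
```

Setting `threshold` below 1.0 (for example, 0.8) raises the alert earlier, which may suit self-provisioning teams where spend can change quickly.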

Replicating old habits

  • Have we made concrete plans to leverage cloud capabilities to standardize and automate outputs?
  • Are we simply reproducing existing systems in the cloud?

Changes to Skills & Roles

  • Is our staff excited to learn new skills and technologies? Are our specialists prepared to acquire generalist skills to support cloud services?
  • Do we have training plans created and aligned to our technology roadmap?
  • Do we know what head count we need?

Misaligned stakeholders

  • Have we identified our key stakeholders and teams? Have we considered what changes will impact them and how?
  • Are we meeting regularly and collaborating effectively with our peers, or are we siloed?

Review cloud objectives and obstacles

Whiteboard Activity

1 hour

  1. With your working group, review why you’re using cloud in the first place. What advantages do you expect to realize by adopting cloud services? If we achieve what we’ve set out to do, what should that look and feel like to us, our organization, and our organization’s customers?
    • You should have identified cloud drivers and objectives in your cloud vision and strategy – leverage and validate what you already have!
  2. Next, identify obstacles that are preventing you from fully realizing the value of cloud services.
  3. Finally, brainstorm initial ideas for change. What could we start doing that could help us better use cloud in the future? Are there changes to how we need to organize ourselves to collaborate more effectively?
Input
  • Insight into and experience with your current cloud environment
Output
  • Identified key business outcomes you expect to realize by adopting cloud services
  • Identified challenges and obstacles that are preventing you from realizing key outcomes
Materials
  • Whiteboard/Flip chart
Participants
  • Cloud Operations Design Working Group

Commonly cited advantages and obstacles

Cloud Advantages/Objectives

  • Deliver faster on commitments to the business by removing infrastructure provisioning as a bottleneck.
  • Simplify capacity management on flexible cloud-based infrastructure.
  • Reduce capital spending on IT infrastructure.
  • Create sandboxes/innovation practices to experiment with and develop new functionality on cloud platforms.
  • Easily enable ubiquitous access to key corporate services.
  • Minimize the expense and effort required to maintain a data center – power & cooling, cabling, or physical hardware.
  • Leverage existing automation tools from cloud vendors to speed up integration and deployment.
  • Direct costs for specific services can improve transparency and cost allocation, allowing IT to directly “show-back” or charge-back cloud costs to specific cost centers.

Obstacles

  • Need to speed up provisioning of PaaS/IaaS/data resources to development and project teams.
  • No time to develop and improve platform services and standards due to other responsibilities.
  • We constantly run up unexpected cloud costs.
  • Not enough time for continuous learning and development.
  • The business will buy SaaS apps and only let us know after they’ve been purchased, leading to overlapping functionality; gaps in compliance, security, or data protection requirements; integration challenges; cost inefficiencies; and more.
  • Role descriptions haven’t kept up with tech changes.
  • Obvious opportunities to rationalize costs aren’t surfaced (e.g. failing to make use of existing volume licensing agreements).
  • Skills needed to properly operate cloud solutions aren’t identified until breakdowns happen.

Establish organization design principles

You’ve established a need for organizational change. What will that change look like?

Design principles are concise, direct statements that describe how you will design your organization to achieve key objectives and address key challenges.

This is a critically important step for several reasons:

  • A set of clear, concise statements that describe what the design should achieve provides parameters that will help you create and evaluate different design options.
  • A focused, facilitated discussion to create those statements will help uncover conflicting assumptions between key stakeholders.
  • A comprehensive description of the various ways the organization should change makes it easier to identify misaligned or incompatible objectives.
  • A description of what your organization should look like in the future will help you identify where changes will be required.

Examples of design principles:

  1. We will create a path to review and publish effective application/platform patterns.
  2. A single governing body should have oversight into all cloud costs.
  3. Development must happen only on approved cloud platforms.
  4. Application teams must address operational issues that derive from the applications they’ve created.
  5. Security practices should be embedded into approved cloud platforms and be automatically applied wherever possible.
  6. Focus is on improving developer experience on cloud platforms.

Info-Tech Insight

Design principles will often change as the organization’s strategy evolves.

Align design principles to your objectives

Developing design principles starts with your key objectives. What do we absolutely have to get right to deliver value through cloud services?

Once you have your direction set, work through the points in the star model to establish how you will meet your objectives and deliver value. Each point in the star is an important element in your design – taken together, they paint a holistic picture of your future-state organization.

The changes you choose to implement that affect capabilities, structure, processes, rewards, and people should be self-reinforcing. Each point in the star is connected to, and should support, the other points.

“There is no one-size-fits-all organization design that all companies – regardless of their particular strategy needs – should subscribe to.”

– Jay Galbraith, “The Star Model”

The image contains a screenshot of a modified version of Jay Galbraith's Star Model of Organizational Design.

Establish design principles

Track your findings in the table on the next slide.

  1. Review the cloud objectives and challenges from the previous activity. As a group, decide from that list: what are the key objectives you are trying to achieve? What are the things you absolutely must get right to get value from cloud services?
  2. Work through the following questions as a group:
    • What capabilities or technologies do we need to adopt or leverage differently?
    • How must our structure change? How will power shift in the new structure?
    • Will our new structure require changes to processes or information sharing?
    • How must we change how we motivate or reward employees?
    • What new skills or knowledge is required? How will we acquire those skills or knowledge?
Input
  • Cloud objectives and challenges
  • Different viewpoints into how your organization must change to realize objectives and overcome challenges
Output
  • Organizational design principles for cloud operations
Materials
  • Whiteboard/Flip charts
Participants
  • Cloud operations design working group

Design principles (example)

What is our key objective?

  • Rapidly develop innovative cloud services aligned to business value.

What capabilities or technologies do we need to adopt or leverage differently?

  • We will adopt more agile development techniques to make smaller changes, faster.
  • We will standardize and automate tasks that are routine and repeatable.

How must our structure change? How will power shift in the new structure?

  • Embed development teams within business units to better align to business unit needs.
  • Create a focused cloud platform team to develop infrastructure services.

Will our new structure require changes to processes or information sharing?

  • Development teams will take on responsibility for application support.
  • Platform teams will be deeply embedded with development teams on new projects to build new infrastructure functionality.

How must we change how we motivate or reward employees?

  • We will highlight innovative work across the company.
  • We will encourage experimentation and risk-taking.

What new skills or knowledge is required, and how will we acquire it?

  • We will focus on acquiring skills most closely aligned to our technology roadmap.
  • We will ensure budget is available for training employees who ask for it.
  • We will contract to find skills we cannot develop in-house and use engagements as an opportunity to learn internally.

Step 1.2: Evaluate new ways of working

Participants

Cloud Operations Design Working Group

Outcomes

Shared understanding of the horizon of work possibilities:

  • Ways to work
  • Ways to govern and learn

Consider the different approaches on the following slides, how they change operational work, and decide which approaches are the right fit for you.

Evaluate new ways of working

Cut through the hype

  • There are new approaches/ways of working that deal head-on with the persistent breakdowns and headaches that come with operations management – work thrown over the wall from development, manual and repetitive work, siloed teams, and more.
  • Many of these approaches emphasize an operations-aware approach to solutions development and apply techniques traditionally associated with AppDev to Operations.
  • Cloud services present opportunities to outsource/automate away routine tasks.

“DevOps is a set of practices, tools, and a cultural philosophy that automates and integrates the processes between software development and IT teams. It emphasizes team empowerment, cross-team communication and collaboration, and technology automation.”

– Atlassian, “DevOps”

“ITIL 4 brings ITIL up to date by…embracing new ways of working, such as Lean, Agile, and DevOps.”

– ITIL Foundation: ITIL 4 Edition

“Over time, left to their own devices, the SRE team should end up with very little operational load and almost entirely engage in development tasks, because the service basically runs and repairs itself.”

– Ben Treynor Sloss, “Site Reliability Engineering”

The more things change, the more they stay the same:

  • Core processes remain, but they may be done differently, and new technologies and services create new challenges.
  • Not all approaches are right for all organizations, and what’s right for you depends on how you use cloud services.
  • The best solution draws from these management ideas to build an approach to operations that is right for you.

Leverage patterns to think about new ways of approaching operations work

Patterns are strategies, approaches, and philosophies that can help you imagine new ways of working in your own organization.

  • The following slides provide an overview of organizing patterns that are applicable to cloud operations.
  • These are strategies that have been applied successfully elsewhere. Review what they can and cannot do and decide whether they are something you can use in your own organizational design.
  • Not every pattern will apply to every organization. For example, an organization that typically consumes SaaS applications will likely have very little need for SRE approaches and techniques.

Ways to work

  • What work do we do? What skills do we need?
  • How do we create and support systems?

Ways to govern and learn

  • How do we set and enforce rules?
  • How do we create and share knowledge?

Explore Applicable Patterns

Ways to work

  1. DevOps
  2. Site Reliability Engineering
  3. Platform Engineering

Ways to govern and learn

  4. Cloud Center of Excellence
  5. Cloud Community of Practice

What is DevOps?

“Look for obstacles constantly and treat them as opportunities to experiment and learn.” – Jez Humble, et al. Lean Enterprise: How High Performance Organizations Innovate at Scale

What it is NOT

  • Another word for automation or CI/CD tools.
  • A specific role.
  • A fix-all to address friction between existing siloed application and development teams.
  • An approach that will be successful without getting the basics right first.
  • The right fit for every IT organization or every team.

What it IS

An operational philosophy that seeks to:

  • Converge accountability for development and operations to align all teams to the goal of delivering customer value.
  • Improve the relationship between Development and Operations teams.
  • Increase the rate of deployment of valuable functionality into production.
  • “A cultural shift giving development teams more control over shipping code to production.” 1

Why Use It

  • You’re doing a lot of custom development.
  • There are opportunities for operations and development teams to work more closely.
  • You want to improve coding quality and throughput.
  • You want to shift the culture of the team to focus on customer value rather than exclusively uptime or new features.
1 DevOps, SRE, and Platform Engineering

What is Site Reliability Engineering (SRE)?

“Hope is not a strategy” – Benjamin Treynor Sloss, Site Reliability Engineering: How Google Runs Production Systems

What it is NOT

  • Deeply focused on a specific technical domain; SRE work “does not discriminate between infrastructure, software, networking, or platforms.” 2
  • A different name for a team of sysadmins.
  • A programming framework or a specific set of technologies.
  • A way to manage COTS software. SRE is less useful when you’re using applications out-of-the-box with minimal customization, integration, or development.

What it IS

  • An application of skills and approaches from software engineering to improve system reliability.
  • A team responsible for “availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.” 3
  • A team responsible for building systems that become “a platform and workflow that encompasses monitoring, incident management, eliminating single points of failure, [and] failure mitigation.” 1

Why Use It

  • You are building services and providing them at scale.
  • You want to improve reliability and reduce “the frequency and impact of failures that can impact the overall reliability of a cloud application.” 1
  • You need to define related service metrics and SLOs.
  • You want to increase the use of automation in operations to avoid mistakes and minimize toil. 3
1. SRE vs Platform Engineering
2. Lakhani, Usman. “Site Reliability Engineering: What Is It? Why Is It Important for Online Businesses?,” 2020.
3. Sloss, “Introduction,” 2017
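
One common way SRE teams make an SLO actionable is an error budget: the amount of unavailability the SLO permits over a measurement window. The following is a minimal sketch of that arithmetic, assuming a simple availability SLO expressed as a fraction and a 30-day window. The function names and figures are illustrative, not taken from the sources cited above.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Minutes of allowed unavailability for an availability SLO.

    `slo_target` is a fraction, e.g. 0.999 for "three nines".
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1 - slo_target) * total_minutes


def budget_remaining(slo_target, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 21.6 minutes of downtime, about half the budget remains.
    print(round(budget_remaining(0.999, 21.6), 2))
```

A team might use the remaining budget as a shared signal: when most of it is spent, reliability work takes priority over new releases.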

What is Platform Engineering?

“Platform engineers can act as a shield between developers and the infrastructure”

– Carlos Schults, “What is Platform Engineering? The Concept Behind the Term”

What it is NOT

  • A team that manages every aspect of each application on a particular platform.
  • Focused solely on platform reliability and availability.
  • A different name for a team of sysadmins.
  • Needed for all cloud service deployments. Platform engineers are most useful when you’re building extensively on a particular platform (e.g. AWS, Azure, or your internal cloud).

What it IS

  • Platform engineers design, build, and manage the infrastructure that supports and hosts work done by developers.
  • The work done by platform engineering allows developers to avoid the repetitive work of setting everything up anew each time.
  • Requires engineers with a deep understanding of cloud services and other platform technologies (e.g. Kubernetes).

Why Use It

  • The big public cloud platforms are built for everyone. You need platform engineering when you need to extensively adapt or manage standard cloud services to support your own requirements.
  • Platform engineers are responsible for creating a secure, stable, maintainable environment that enables developers to do their work faster and without having to manage the underlying technology infrastructure.
1 DevOps, SRE, and Platform Engineering

What is a Cloud Center of Excellence?

You need a strong core to grow a cloud culture.

What it is NOT

  • A project management office (PMO) for cloud services.
  • An easy, quick, or temporary fix to cloud governance problems. The CCoE requires champions who provide ongoing support to realize value over time.
  • An approach that’s only for enterprise-sized IT organizations.
  • A standing meeting – members of the CCoE may meet regularly to review progress on their mandate, but work and collaboration need to happen outside of meetings.

What it IS

  • A cross-functional team responsible for oversight of all cloud initiatives, including architectural, technical, security, financial, contractual, and operational aspects of planned and deployed solutions.
  • The CCoE’s responsibilities typically include governance and continuous improvement; alignment between technical and accounting practices; documentation, training, best practices and standards development; and vendor management.
  • CCoE duties are often part of an existing role rather than a full-time responsibility.

Why Use It

  • You want to enable a core group of cloud experts to promote collaboration and accelerate adoption of cloud services, including members from infrastructure, applications, and security.
  • You need to manage cloud risks, set guidelines and policies, and govern costs across cloud environments.
  • There is an unmet need for training, knowledge sharing, and best practice development across the organization.

What is a Cloud Community of Practice?

“We have to stop optimizing for programmers and start optimizing for users”

– Jeff Atwood

What it is NOT

  • A replacement for effective oversight and governance practices, though it may help users navigate and understand governance requirements.
  • A way to advertise cloud to potential new practitioners – engaged members of a CoP are typically already using a particular service.
  • Always exclusively composed of internal staff; in certain cases, a CoP could have external members as well.

What it IS

  • A network of engaged users and experts who share knowledge and best practices for related technologies, crowdsource solutions to problems, and suggest improvements.
  • Often supported by communication and collaboration tools (e.g. chat channels, knowledge base, forums). May use a range of techniques (e.g. drop-ins, vendor-led training, lunch and learns).
  • Communities of practice may be deliberately created by the organization or develop organically.

Why Use It

  • Communities of practice are an effective way for practitioners to support one another and share ideas and solutions.
  • A CoP can help “shift left” work and help practitioners help themselves.
  • An engaged CoP can help IT to identify improvement opportunities and can also be a channel to communicate updates or changes to practitioners.

Reinforce what we mean by patterns

Patterns are . . .

Ways of Working

  • Sets of habits, processes, and methodologies you want to adopt as part of your operational guidelines and commonly agreed upon definitions.

Patterns are also . . .

Ways to Govern and Learn

  • The formal and informal practices and groups that focus on enabling governance, risk management, and adoption.

Review the implications of each pattern for organizational design

Ways of Working

DevOps

Development teams take on operational work to support the services they create after they are launched to production.

Some DevOps teams may be aligned around a particular function or product rather than a technology, so individuals with skills in a number of technologies are part of the same team.

Site Reliability Engineering (SRE)

In the beginning, you can start to adopt SRE practices within existing teams. As demand grows for SRE skills and services, you may decide to create focused SRE roles or teams.

SRE teams may work across applications or be aligned to just infrastructure services or a particular application, or they may focus on tools that help developers manage reliability. SREs may also be embedded long-term with other teams or take on internal consulting roles with multiple teams.1

Platform Engineering

Platform engineering will often, though not always, be the responsibility of a dedicated team. This team must work very closely with, and be tuned into the needs of, its internal customers. There is a constant need to find ways to add value that aren’t already part and parcel of the platform – or its external roadmap.

This team will take on responsibility for the platform, in terms of feature development, automation, availability and reliability, security, and more. They may also be internal consultants or advisors on the platform to developers.

1. Gustavo Franco and Matt Brown, “How SRE teams are organized and how to get started.”

Review the implications of each pattern for organizational design

Ways to Govern and Learn

Cloud Center of Excellence

  • A CCoE is a cross-functional group with technical experts from security, infrastructure, applications, and more.
  • There should, ideally, be someone focused on leading the CCoE full-time – often someone with an architecture background. Team members may work on the CCoE part-time alongside their main role, and dedicate more of their time to the CCoE as needed.
  • As the CCoE is a governance function, it will typically bridge and sit above teams working on cloud services, reporting to the CIO, CTO, or to an architecture function.

Cloud Community of Practice

  • Participation in a community of practice is often above and beyond a core role – it’s a leadership activity taken on by technologically adept experts with a drive to help others.
  • Some organizations will create a role to foster community collaboration, run events, raise opportunities and issues identified by the community with product or technology teams, manage collaboration tools, and more.

Evolve your organization to meet the needs of increased adoption

Your operating model should evolve as you increase adoption of cloud services.

Least Adoption to Greatest Adoption

Initial Adoption

Early Centralization

Scaling Up

Full Steam Ahead

  • One or more small agile teams design, build, manage, and operate individual solutions on cloud resources. Solutions provide early value, and identify new opportunities using small, safe-to-fail experiments.
  • Governance is likely done locally to each team. Knowledge sharing, guidelines, and standards are likely informal.
  • Early experience with cloud services helps the organization identify where to invest in cloud services to best meet business demands.
  • Accountability and governance over the platform are more clearly defined, possibly still separate from core IT governance processes. Best practices may be shared across teams through a Community of Practice.
  • Operations may be centralized, where valuable, to support monitoring and incident response.
  • Additional product/service-aligned development teams are created to keep up with demand.
  • There is a focused effort to consolidate best practices and platform knowledge, which can be supported through a culture of learning, effective automation, and appropriate tools.
  • The CCoE takes on additional roles in cloud governance, security, operations, and administration.
  • The organization has reached a relatively steady state for cloud adoption. Innovation and new service development take place on a stable platform.
  • A Cloud Center of Excellence is accountable for cloud governance across the organization.
Adapted from Microsoft, “Get Started: Align your organization,” 2021

Choose new ways of working that make sense for your team

1 hour

Consider if, and how, the approaches to management and governance you’ve just reviewed can offer value to your organization.

  1. List the organizing/managing ideas from the previous slides in the table below.
  2. Define why it’s for you. What benefits do you expect to realize? What challenges do you expect this will help you overcome? How does this align with your key benefits and drivers for moving to cloud?
  3. List risks or challenges to adoption. Why will it be hard to do? What could get in the way of adoption? Why might it not be a good fit?
  4. Identify next steps to adopt proposed practices.

Pattern | Why it’s for us (drivers) | Risks or challenges to adoption | Next steps to build/adopt it
CCoE | | |
DevOps | | |

Input
  • Related Info-Tech slides on new ways of working.
  • Opportunities and challenges in your own cloud deployment that may be addressed through new ways of working.
Output
  • Identify new ways of working aligned to your goals.
Materials
  • Whiteboard/Flip chart
Participants
  • Cloud Operations Design Working Group

Step 1.3: Identify cloud work

Participants

  • Operations Design Working Group

Outcomes

  • Identify core work required to deliver value in key cloud workstreams.

“At first, for many people, the cloud seems vast. But what you actually do is carve out space.”

– DevOps Manager

Identify work

Before you can identify roles and responsibilities, you have to confirm what work you do as an organization and how that work enables you to meet your goals.

  • A comprehensive approach that connects the work you do to your organizational goals will help you identify work that’s falling through the cracks.
  • Identifying work is an opportunity to look at the tasks you regularly execute and ensure they actually drive value.
  • Working through the exercise as a group will help you develop a common language around the work you do.
  • To state the obvious: you can’t decide who should be responsible for something if you don’t know about it in the first place.

Defining work can be a lot of … work! We recommend you start by identifying work for the workstream you do most – Build, Consume, or Host – to focus your efforts. You can repeat the exercise as needed.

Map work in workstream diagrams

The image contains a screenshot of the map work in workstream diagrams.

The five Well-Architected Framework pillars. These are principles/directions/guideposts that should inform all cloud work.

The work being done to achieve the workstream target. These are roughly aligned with the three streams on the right.

Workstream Target: A concise statement of the value you aim to achieve through this workstream. All work should help deliver value (directly or indirectly).

Define the scope of the exercise

Whiteboard Activity

20 minutes

Over the next few exercises, you’ll do a deep dive into the work you do in one specific workstream. In this exercise, we’ll decide on a workstream to focus on first.

  1. Are you primarily building, hosting on, or consuming cloud services? Start with the workstream where you’re doing the most work.
  2. If this isn’t sufficient to narrow your focus, look at the workstream that is most closely tied to mission critical applications, or that is most in need of review in terms of what work is done and who does it.
  3. You can narrow the scope further if there’s a very specific sub-area that differs from the rest (e.g. managing your O365 environment vs. managing all SaaS applications).
Input
  • Insight into and experience with your current cloud environment.
  • Your completed cloud maturity assessment.
Output
  • Identify one workstream where you’ll define work first.
Materials
  • None
Participants
  • Cloud Operations Design Working Group

Create a workstream target statement

Whiteboard Activity

30 minutes

In this activity, come up with a short sentence to describe what all this work you do is building toward. The target statement helps align participants on why work is being done and helps focus the activity on work that is most important to achieving the target statement.

Start with this common workstream target statement:

“Deliver valuable, secure, available, reliable, and efficient cloud services.”

Now, review and adjust the target statement by working through the questions below:

  1. Return to the earlier exercises in Step 1.1 where you reviewed your key objectives for cloud services. Does the target statement align with what you’d identified previously?
  2. Who is the customer for the work you do? Would they see the target differently than you’ve described it?
  3. Can you be more specific? Are there value drivers that are more specific to your industry, organization, business functions, or products that are key to the value your customers receive from this workstream?
Input
  • Previous exercises.
Output
  • Workstream target statement.
Materials
  • Whiteboard/Flip chart
  • Cloud Operations Design Sketchbook
Participants
  • Cloud Operations Design Working Group

Identify cloud work

1-2 hours

  1. Use the workstream diagram template in the Cloud Operations Design Sketchbook, or draw the template out on a whiteboard and use sticky notes to identify work.
  2. Identify the workstream at the top of the slide. Update the template value statement on the right with the value statement you created in the previous exercise.
  3. Review one or more of the examples in the Cloud Operations Design Sketchbook to get a sense of the level of detail required for this exercise.

Activity instructions continue on the next slide.

Some notes to the facilitator:

  • Working directly from the Cloud Operations Design Sketchbook will save you time with transcription. Sharing the document with participants (e.g. via OneDrive) will allow you to collaborate and edit the document together in real-time.
  • Don’t worry about being too tidy for the moment, just get the information written down and you can clean up the diagram later.
Input
  • Previously identified design principles.
  • An understanding of the work done, and that needs to be done, in your cloud environment.
Output
  • Identify the work that needs to be done to support your key cloud services workstream in the future.
Materials
  • Cloud Operations Design Sketchbook
  • Whiteboard and sticky notes (optional)
Participants
  • Cloud Operations Design Working Group

Identify cloud work (cont’d)

4. Work together to identify work, documenting one work item per box. Focus on the future state: record work whether or not it is actually done today. Space on the sheet is limited, so focus on work that is indispensable to delivering the value statement. Use the lists below as a reminder of key IT practice areas.

5. As much as possible, align the work items to the appropriate row (Govern & Align, Design & Execute, or Validate, Support & Monitor). You can overlap boxes between rows if needed.

Have you captured work related to:

ITIL practices, such as:

  • Request management
  • Incident & problem management
  • Service catalog
  • Service level management
  • Configuration management

Security-aligned practices, such as:

  • Identity & access management
  • Vulnerability management
  • Security incident management

Financial practices, such as:

  • IT asset management
  • Cost management & budgeting
  • Vendor management
  • Portfolio management

Data-aligned practices, such as:

  • Data integrations
  • Data governance

Technology-specific tasks, such as:

  • Network, Server & Storage
  • Structured/unstructured DBs
  • Composite services
  • IDEs and compilers

Other key practices:

  • Monitoring & observability
  • Continuous improvement
  • Testing & quality assurance
  • Training and knowledge management
  • Manage shadow IT

Info-Tech Insight

Cloud work is not just applications that have been approved by IT. Consider how unsanctioned software purchased by the business will be integrated and managed.

Identify cloud work (cont’d)

6. If you have decided to adopt any of the new ways of working outlined in Step 1.2 (e.g. DevOps, SRE), review the next slide for examples of the type of work that frequently needs to be done in each of those work models. Add any additional work items as needed.

7. Consolidate boxes and clean up the diagram (e.g. remove duplicate work items, align boxes, clarify language).

8. Do a final review. Is all the work in the diagram truly aligned with the value statement? Is the work identified aligned with the design principles from Step 1.1?

If you used a whiteboard for this exercise, transcribe the output to a copy of the Cloud Operations Design Sketchbook, and repeat the exercise for other key workstreams. You will use this diagram in Phase 2.

Examples of work

Examples of work in the "Host" workstream:

  • Bulk patch servers
  • Add a server
  • Add capacity
  • Develop a new server template
  • Incident management

Examples of work in the "Build" workstream:

  • Provision a production server
  • Provision a test environment
  • Test recovery procedures
  • Add capacity for a service
  • Publish a new pattern
  • Manage capacity/performance for a service
  • Identify wasted spend across services
  • Identify performance bottlenecks
  • Review and shut down idle/unneeded services

Examples of work in the "Consume" workstream:

  • Conduct vendor risk assessments
  • Develop a standard evaluation matrix to compare solutions to existing or potential in-house offerings
  • Onboard a solution
  • Offboard a solution
  • Conduct a renewal
  • Review and negotiate a contract
  • Rationalize software titles

Phase 2: Design the organization and communicate changes

Phase 1

  • 1.1 Establish operating model design principles by identifying goals & challenges, workstreams, and cloud maturity
  • 1.2 Evaluate new ways of working
  • 1.3 Identify cloud work

Phase 2

  • 2.1 Draft an operating model
  • 2.2 Communicate proposed changes

Phase Outcomes:

Draft your cloud operations diagram, identify key messages and impacts to communicate to your stakeholders, and build out the Cloud Operations Organizing Framework communication deck.

Step 2.1: Identify groups and responsibilities

Participants

  • Operations Design Working Group

Outcomes

  • Cloud Operations Diagram
  • Success Indicators
  • Roadmap

“No-one ever solved a problem by restructuring.”

– Anonymous

Visualize your cloud operations

Create a visual to help you abstract, analyze, and clarify your vision for the future state of your organization in order to align and instruct stakeholders.

Create a visual, high-level view of your organization to help you answer questions such as:

  • “What work do we do? What are the roles and responsibilities of different teams?”
  • “How do we interact between work areas?”
  • “How has our organization changed already, and what additional changes may be needed?”
  • “How do we make technology decisions?”
  • “How do we provide services?”
  • “How might this change be received by people on the ground?”
The image contains a screenshot of the Cloud Operations Diagram Example.

Decide whether to centralize or decentralize

Specialization & Focus: A group or work unit developing a focused concentration of skills, expertise, and activities aligned with an area of focus (such as the examples listed below).

Decentralization: Operational teams that report to a decentralized IT or business function, either directly or via a “dotted line” relationship.

Decentralization and Specialization can:

  • Duplicate work.
  • Localize decision-making authority, which can increase agility and responsiveness.
  • Transfer authority and accountability to local and typically smaller teams, clarifying responsibilities and encouraging staff to take ownership for service delivery.
  • Enable the team to focus on complex and rapidly changing technologies or processes.
  • Create islands of expertise, which can get in the way of collaboration, innovation, and decision making across groups and work units and make oversight difficult.
  • Complicate the transfer of resources and knowledge between groups.

Examples: Areas of Focus

Business unit

  • Manufacturing
  • R&D
  • Sales & Marketing

Region

  • Americas
  • EMEA
  • APAC

Service

  • ERP
  • Commercial website

Technology

  • On-premises servers/storage
  • Network
  • Cloud services

Operational process focus

  • Capacity management & planning
  • Incident management
  • Problem management

“The concept of organization design is simple in theory but highly complex in practice. Like any strategic decision, it involves making multiple trade-offs before choosing what is best suited to a business context.”

– Nitin Razdan & Arvind Pandit

Identify key work areas

Balance specialization with effective collaboration

  • Much is said about breaking down organizational silos. But at some level, silos are inevitable – any company with more than one employee will have to divide work up somehow.
  • Dividing up work is a delicate balancing act: individuals and groups need work that is related, meaningful, and autonomous, while still collaborating effectively with the other groups they must work with to achieve business goals.

Why “work areas”?

Why don’t we just use teams, groups, squads, or departments, or some other more common term for groups of people working together?

  • We are not yet at the point of deciding who in the organization should be aligned to which areas in the design.
  • Describing work areas as teams can shift the conversation to the organizational chart – to who does the work, rather than what needs to be done.

That’s not the goal of this exercise. If the conversation gets stuck on what you do today, it can get in the way of thinking about what you need to do in the future.

Create a future-state cloud operations diagram

1-3 hours

  1. Review the example cloud operations diagram in your copy of the Cloud Operations Design Sketchbook.
  2. Identify key work areas (e.g. applications, infrastructure, platform engineering, DevOps, security). Add the name of each work area in one of the larger boxes.
    • Go back to your design principles. Did you define any work areas in your design principles that should be represented here?
    • If you have several groups or teams with similar responsibilities, consider lumping them together in one box (e.g. applications teams, 3x DevOps teams).
  3. Copy the tasks from any workstream diagrams you’ve created to the same slide as the organization design diagram. Keep the workstream diagram intact, as you’ll want to be able to refer back to it later.

Input
  • Insight into and experience with your current cloud environment.
Output
  • Cloud Operations Diagram
Materials
  • Whiteboard/Flip charts
  • Cloud Operations
Participants
  • Cloud Operations Design Working Group

Cloud operations diagram (cont’d)

1-3 hours

4. As a group, move the work boxes from the workstream diagram into the appropriate work area.

  • Don’t worry about being too tidy for the moment – clean up the diagram when the exercise is done.
  • Make adjustments to the wording of the work boxes if needed.

5. Use the space between work areas to describe how work areas must interact to achieve organizational goals. For example:

  • What information should be shared between groups?
  • What information sharing channels may be used?
  • What processes will be handed off between groups and how?
  • How often will teams interact?
  • Will interactions be formal or informal?

Create a current-state operations diagram

1-2 hours

This exercise can be done by one person, then reviewed with the working group at a later time.

This current state diagram helps clarify the changes that may need to happen to get to your future state.

  1. Color code the work boxes for each work area. For example, if you have a “DevOps” work area, make all the work boxes assigned to “DevOps” the same color.
  2. On a separate slide, sketch your existing organization indicating your current teams.
  3. Copy the tasks from the future-state diagram to this current-state chart. Align the tasks to the appropriate groups.
  4. Review the chart with the working group. Discuss: Is any work done by one team today assigned to a different team in the future state? Are there groups that may merge into one team? What types of changes may be required?
Input
  • Future-state cloud operations diagram
Output
  • Current-state cloud operations diagram
Materials
  • Cloud Operations Design Sketchbook
  • Projector/screen/virtual meeting
Participants
  • Project lead
  • Cloud Operations Design Working Group

Check for biases to make better choices

Use the strategies below to spot and address flaws in your team’s thinking about your future-state design.

Bias: Is the team making mistakes due to self-interest, love of a single idea, or groupthink?

  • What’s the risk? Important information may be ignored or left unspoken.
  • Mitigation strategies: Rigorously check for the other biases, below. Tactfully seek dissenting opinions.

Bias: Do recommendations use unreasonable analogies to other successes or failures?

  • What’s the risk? Opportunities or challenges in the current situation may not be sufficiently understood.
  • Mitigation strategies: Ask for other examples, and check whether the analogies are still valid.

Bias: Is the team blinkered by the weight of past decisions?

  • What’s the risk? Doubling down on bad decisions (sunk costs) or ignoring new opportunities.
  • Mitigation strategies: Ask yourself what you'd do if you were new to the position or organization.

Bias: Does the data support the recommendations?

  • What’s the risk? Data used to make the case isn't a good fit for the challenge, is based on faulty assumptions, or is incomplete.
  • Mitigation strategies: If you had a year to make the decision, what data would you want? How much can you get?

Bias: Are there realistic alternative recommendations?

  • What’s the risk? Alternatives don't exist or are "strawman" options.
  • Mitigation strategies: Ask for additional options.

Bias: Is the recommendation too risk averse or cautious?

  • What’s the risk? Recommendations that may be too risky are ignored, leading to missed opportunities.
  • Mitigation strategies: Review options to accept, transfer, distribute, or mitigate the risk of the decision.

Framework above adapted from Kahneman, Lovallo, and Sibony (2011)

Be specific with metrics

Thinking of ways you could measure success can help uncover what success actually means to you.

Work collectively to generate success indicators for each key cloud initiative. Success indicators are metrics, with targets, aligned to goals, and if you are able to measure them accurately, they should help you report your progress toward your objectives.

For example, if your driver is “faster access to resources” you might consider indicators like developer satisfaction, project completion time, average time to provision, etc.

There are several reasons you may not publicize these metrics. They may be difficult to calculate or misconstrued as targets, warping behavior in unexpected ways. But managed properly, they have value in measuring operational success!

Examples:

Operations redesign project metrics

  • Key stakeholder satisfaction scores
  • IT staff engagement scores

Support Delivery of New Functionality

  • Double the number of accepted releases per cycle
  • 80% of key cloud initiatives completed on time, on budget, and in scope

Improve Operational Effectiveness

  • <1% of servers are more than two major versions out of date
  • No more than one capacity-related incident per quarter
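
To make an indicator such as "<1% of servers more than two major versions out of date" measurable, it helps to write down exactly how it would be calculated. The sketch below is one possible calculation, not a prescribed method. It assumes a hypothetical inventory export in which each server record carries its installed and latest available major version; the `Server` class, its field names, and both function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    installed_major: int  # major version currently installed (hypothetical field)
    latest_major: int     # latest major version available (hypothetical field)


def share_out_of_date(servers: list[Server], max_lag: int = 2) -> float:
    """Return the fraction of servers more than `max_lag` major versions behind."""
    if not servers:
        return 0.0
    lagging = sum(1 for s in servers if s.latest_major - s.installed_major > max_lag)
    return lagging / len(servers)


def meets_target(servers: list[Server], target: float = 0.01) -> bool:
    """True when the lagging share is strictly below the target (default: under 1%)."""
    return share_out_of_date(servers) < target
```

For example, an inventory of 200 servers in which one server is three major versions behind gives a share of 0.005, which meets the target. Writing the rule out this way also surfaces definitional questions early, such as whether a server exactly two versions behind counts as out of date (here it does not).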

Define success indicators

Whiteboard Activity

45 minutes

  1. On a whiteboard, draw a table with key objectives for the design across the top.
    • What cloud objectives should the redesign help you achieve? Refer back to the design principles from Phase 1.
    • Think about the redesign itself. How will you measure whether the project itself is proceeding according to plan? Consider metrics such as employee engagement scores and satisfaction scores from key stakeholders.
  2. Consider whether the metrics are feasible to track. Record your decisions in your copy of the Cloud Operations Organizing Framework deck.
Input
  • Key design goals
Output
  • Success indicators for your design
Materials
  • Whiteboard
  • Markers
Participants
  • Cloud Operations Design Working Group

Populate a roadmap

Tool Activity

45 minutes

  1. In the Roadmap Tool, populate the data entry tab with the initiatives you will take to support changes toward the new cloud operations organizing framework.
  2. Input each of the tasks in the data entry tab and provide a description and rationale behind the task (as needed).
  3. Assign an effort, priority, and cost level to each task (high, medium, low).
  4. Assign an owner to each task – someone who can take point and shepherd the task to completion.
  5. Identify the timeline for each task based on the priority, effort, and cost (short, medium, and long term).
  6. Highlight risk for each task if it will be deferred.
  7. Track the progress of each task with the status column.
Input
  • Cloud Operations Organizing Framework
Output
  • Roadmap/implementation plan
Materials
  • Roadmap Tool
Participants
  • Cloud Operations Design Working Group
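
The roadmap steps above ask you to derive a timeline from each task's priority, effort, and cost. The sketch below shows one way such a rule could be written down so that the working group applies it consistently. It does not describe how the Roadmap Tool itself works: the `Task` fields, the scoring weights, and the thresholds are all assumptions chosen for illustration, and your group should substitute its own rule.

```python
from dataclasses import dataclass

# Numeric values for the high/medium/low ratings used in the roadmap steps.
LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Task:
    name: str
    owner: str
    priority: str  # "high", "medium", or "low"
    effort: str    # "high", "medium", or "low"
    cost: str      # "high", "medium", or "low"
    status: str = "not started"


def suggest_timeline(task: Task) -> str:
    """Illustrative rule only: urgency rises with priority and falls with effort and cost."""
    score = 2 * LEVELS[task.priority] - LEVELS[task.effort] - LEVELS[task.cost]
    if score >= 2:
        return "short term"
    if score >= 0:
        return "medium term"
    return "long term"
```

Under these assumed weights, a high-priority task with low effort and medium cost lands in the short term, while a low-priority task with high effort and high cost lands in the long term. Treat the output as a starting suggestion to discuss, not a final decision.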

Download the Roadmap Tool

Step 2.2: Communicate changes

Participants

  • Operations Design Working Group

Outcomes

  • Build a communication plan for key stakeholders
  • Complete the communication deck Cloud Operations Organizing Framework
  • Build a roadmap

“Words, words, words.”

– Shakespeare

Communicate changes

Which stakeholders will be affected by the changes?

Decision makers: Who do you ultimately need to convince to proceed with any changes you’ve outlined?

Peers: How will managers of other areas be affected by the changes you’re proposing? If you are suggesting changes to the way that they, or their teams, do their work, you will have to present a compelling case that there’s value in it for them.

Staff: Are you dictating changes or looking for feedback on the path forward?

Figure: The Five Elements of Change, displayed as a cycle: What is the change? Why are we doing it? How are we going to go about it? How long will it take us? What is the role of each team and individual?

Source: The Qualities of Leadership: Leading Change

Follow these guidelines for good communication

Be relevant

  • Talk about what matters to each stakeholder group.
  • Talk about what matters to the initiative.
  • IT thinks in processes but stakeholders only care about results: talk in terms of results.
  • IT wants to be “understood” but this does not matter to stakeholders; think “what’s in it for them?”
  • Communicate truthfully; do not make false promises or hide bad news.
  • If you expect objections, create a plan to handle them.

Be clear

  • Lead with the point you’re trying to make.
  • Don’t use jargon.
  • Avoid idiomatic language and clichés.
  • Have a third party review draft communications and ask them to tell you the key messages in their own words. If they’re missing the main points, there’s a good chance the draft isn’t clear.

Be consistent

  • Ensure the core message is consistent regardless of audience, channel, or medium.
  • Changing the core message from one group to another can be interpreted as incompetence or an attempt at deception. This will damage your credibility and can lead to a loss of trust.

Be concise

  • Get to the point.
  • Minimize word count wherever possible.

“We tend to use a lot of jargon in our discussions, and that is a sure fire way to turn people away. We realized the message wasn’t getting out because the audience wasn’t speaking the same language. You have to take it down to the next level and help them understand where the needs are.”

– Jeremy Clement, Director of Finance, College of Charleston

Create a communication plan

1 hour

Fill out the table below.

Stakeholder group: Identify key stakeholders who may be impacted by changes to the operations team. This might include IT leadership, management, and staff.

Benefits: What’s in it for them?

Impact: What are we asking in return?

How: What mechanisms or channels will you use to communicate?

When: When (and how often) will you get the message out?

IT Mgrs.

  • Benefits
    • Improve agility, stability
    • Deliver faster against business goals
    • Respond to identified needs
    • Improve confidence in IT
  • Impact
    • Must support the process
    • Change and engagement issues during restructuring may affect staff engagement and productivity
    • Training budget required
  • How
    • Present at leadership meeting
    • Kick-off email
  • When
    • Sept. leadership meeting
    • Weekly touchpoints
    • Informally throughout project

Ops Staff

  • Benefits
    • Clearer direction and clear priorities (Operations mission statement and RACI)
    • Higher-value work: address problems, contribute to plans
    • New skills and training
  • Impact
    • More personal accountability
    • Push toward process consistency
    • Must make time and plan for training during work hours
  • How
    • Present at operations team’s offsite meeting
    • AMA channel on Slack
    • 1:1 meetings
    • Add RACI, org. sketch to shared folder
  • When
    • Operations offsite
    • Sept. all-hands meeting
    • Ongoing coaching and informal conversations
Input
  • Discussion
Output
  • Communication Plan
Materials
  • Whiteboard/Flip Chart
Participants
  • Cloud Operations Design Working Group

Download the Communication Plan Template

Support the transition with a plan to acquire skills

Identify the preferred way to acquire needed skill sets: contracting, outsourcing, training, or hiring.

  • Some cloud projects will change the demand for some skills in the organization, and not all skills should be cultivated internally. Uncertainty about future skills and jobs will cause anxiety for your team and can lead to employee exit.
  • Use Info-Tech’s research to conduct a demand analysis to identify which new and critical skills should be acquired via training or hiring (rather than outsourcing or contracting).
  • Create a roadmap to clarify when training needs to be completed, a budget plan that accounts for training costs, and role descriptions that paint a picture of future work.
  • Within the confines of a collective agreement, managers may be required to retrain staff into new roles before those staff are required to do work in their new jobs. In these settings, failing to plan ahead can have more serious consequences.
  • Remember that in the cloud, a wealth of automation opportunities also offers a great option for offloading tasks!
The image contains a screenshot of a multiple bar graph titled: Failing to leverage and develop skills leads to employee exit.

Info-Tech Insight

Identify skills requirements and gaps as early as possible to avoid skills gaps later. Whether you plan to acquire skills via training or cross-training, hiring, contracting, or outsourcing, effectively building skills takes time. Use Info-Tech’s methodology to address skills gaps in a prioritized and rational way.

Involve HR for implementation

Your HR team should help you work through:

  • Which staff and managers will move to which roles, and any headcount changes.
  • Job descriptions, performance metrics, career paths, compensation, and succession planning.
  • Organizational change management and implementation plans.

When do you need to involve HR?

Role changes will result in job description changes.

  • New or changed job descriptions need to be evaluated for impact on pay, title, exempt status, career pathing, and more.
  • This is especially true in more traditional or unionized organizations that require specific and granular job descriptions of responsibilities.
  • Changed jobs will likely require union review and approval.

You anticipate changes to the reporting structure.

  • Work with HR to develop a transition plan including communications, training for new managers, and support for new teams.

You anticipate redundancies.

  • Your HR department can prepare you for difficult discussions, help you navigate labor laws, and support the offboarding process.

You anticipate new positions.

  • Recruitment and hiring take time. Give HR advance notice to support recruitment, hiring, and onboarding to ensure you hire the right people, with the right skills, at the right time.

Training and development budget is required.

  • If training is a critical part of the onboarding process, don’t just assume funding is available. Work with HR to build your case.

Related Info-Tech Research

Define Your Cloud Vision

Define your cloud vision before it defines you.

Document Your Cloud Strategy

Drive consensus by outlining how your organization will use the cloud.

Map Technical Skills for a Changing Infrastructure & Operations Organization

Be practical and proactive – identify needed technical skills for your future-state environment and the most efficient way to acquire them.

Bibliography

“2021 GitLab DevSecOps Survey.” Gitlab, 2021.
“2022 State of the Cloud Report.” Flexera, 2022.
“DevOps.” Atlassian, ND. Web. 21 July 2022.
Atwood, Jeff. “The 2030 Self-Driving Car Bet.” Coding Horror, 4 Mar 2022. Web. 5 Aug 2022.
Campbell, Andrew. “What is an operating model?” Operational Excellence Society, 12 May 2016. Web. 13 July 2022.
Ewenstein, Boris, Wesley Smith, and Ashvin Sologar. “Changing change management.” McKinsey, 1 July 2015. Web. 8 April 2022.
Franco, Gustavo and Matt Brown. “How SRE teams are organized, and how to get started.” Google Cloud Blog, 26 June 2019. Web. 13 July 2022.
“Get started: Build a cloud operations team.” Microsoft, 10 May 2021.
ITIL Foundation: ITIL 4 Edition. Axelos, 2019.
Humble, Jez, Joanne Molesky, and Barry O’Reilly. Lean Enterprise: How High Performance Organizations Innovate at Scale. O’Reilly Media, 2015.
Galbraith, Jay. “The Star Model.” ND. Web. 21 July 2022.
Kahneman, Daniel, Dan Lovallo, and Olivier Sibony. “Before you make that big decision.” Harv Bus Rev. 2011 Jun; 89(6): 50-60, 137. PMID: 21714386.
Kesler, Greg. “Star Model of Organizational Design.” YouTube, 1 Oct 2018. Web Video. 21 Jul 2022.
Lakhani, Usman. “Site Reliability Engineering: What Is It? Why Is It Important for Online Businesses?” Info-Tech. Web. 25 May 2020.
Mansour, Sherif. “Product Management: The role and best practices for beginners.” Atlassian Agile Coach, n.d.
Murphy, Annie, Jamie Kirwin, Khalid Abdul Razak. “Operating Models: Delivering on strategy and optimizing processes.” EY, 2016.
Shults, Carlos. “What is Platform Engineering? The Concept Behind the Term.” liatrio, 3 Aug 2021. Web. 5 Aug 2022.
Sloss, Benjamin Treynor. Site Reliability Engineering Part I: Introduction. O’Reilly Media, 2017.
“SRE vs. Platform Engineering.” Ambassador Labs, 8 Feb 2021.
“The Qualities of Leadership: Leading Change.” Cornelius & Associates, n.d. Web.
“Understand cloud operating models.” Microsoft, 02 Sept. 2022.
Velichko, Ivan. “DevOps, SRE, and Platform Engineering.” 15 Mar 2022.

Research Contributors and Experts

Nenad Begovic

Executive Director, Head of IT Operations

MUFG Investor Services

Desmond Durham

Manager, ICT Planning & Infrastructure

Trinidad & Tobago Unit Trust Corporation

Virginia Roberts

Director, Enterprise IT

Denver Water

Denis Sharp

IT/LEAN Consultant

Three anonymous contributors

About Info-Tech

Info-Tech Research Group is the world’s fastest-growing information technology research and advisory company, proudly serving over 30,000 IT professionals.

We produce unbiased and highly relevant research to help CIOs and IT leaders make strategic, timely, and well-informed decisions. We partner closely with IT teams to provide everything they need, from actionable tools to analyst guidance, ensuring they deliver measurable results for their organizations.

What Is a Blueprint?

A blueprint is designed to be a roadmap, containing a methodology and the tools and templates you need to solve your IT problems.

Each blueprint can be accompanied by a Guided Implementation that provides you access to our world-class analysts to help you get through the project.


Get the help you need in this 2-phase advisory process. You'll receive 6 touchpoints with our researchers, all included in your membership.

Guided Implementation #1 - Establish Context
  • Call #1 - Scope requirements, objectives, and your specific challenges
  • Call #2 - Assess cloud maturity and drivers for org. redesign
  • Call #3 - Review cloud objectives and obstacles
  • Call #4 - Evaluate new ways of working and identify cloud work

Guided Implementation #2 - Design the Organization and Communicate Changes
  • Call #1 - Create your cloud operations diagram
  • Call #2 - Create your communication plan and build roadmap

Authors

Nabeel Sherif

Scott Young

Andrew Sharp
