Build a Minimum Viable Product for Data Classification With Microsoft 365

Kick-start your governance with data classification users will actually use!

  • Resources are the primary obstacle to getting a foothold in O365 governance, whether that means funding or FTE resources.
  • Data is segmented and is difficult to analyze when you can’t see it or manage the relationships between sources.
  • Organizations expect results early and quickly. A common obstacle is that building a proper data classification framework can take more than two years, and the business can't wait that long.

Our Advice

Critical Insight

  • Data classification is the lynchpin to ANY effective governance of O/M365 and your objective is to navigate through this easily and effectively and build a robust, secure, and viable governance model.
  • Start your journey by identifying what and where your data is and how much data you have. You need to understand what sensitive data you have and where it is stored before you can protect it or govern that data.
  • Ensure there is a high-level leader who is the champion of the governance objective.

Impact and Result

  • Using the least complex sensitivity labels in your classification gives you the building blocks for compliance and security in your data management schema; they are your foundational steps.

Build a Minimum Viable Product for Data Classification With Microsoft 365 Research & Tools

1. Build a Data Classification MVP for M365 Deck – A guide for how to build a minimum-viable product for data classification that end users will actually use.

Discover where your data resides, what governance helps you do, and what types of data you're classifying. Then build your data and security protection baselines for your retention policy, sensitivity labels, workload containers, and both forced and unforced policies.


Executive Summary

Info-Tech Insight

  • Creating an MVP gets you started in data governance
    Information protection and governance are not one-time tasks. They are a continuous process in which you start with the basics (a minimum-viable product, or MVP) and enhance your schema over time. The objective of the MVP is to reduce obstacles to establishing an initial governance position, and then to enable rapid development of the solution to address a variety of real risks, including data loss prevention (DLP), data retention, legal holds, and data labeling.
  • Define your information and protection strategy
    The initial strategy is to look across your organization and identify your customer data, regulatory data, and sensitive information. A successful data protection strategy includes lifecycle management, risk management, data protection policies, and DLP. Keep all key stakeholders in the loop, keep track of all available data, and conduct a risk analysis early. Remember, data is your highest valued intangible asset.
  • Planning and resourcing are central to getting started on MVP
    A governance plan and governance decisions are your initial focus. Create a team of stakeholders that includes IT and business leaders (including Legal, Finance, HR, and Risk), and ensure there is a top-level leader who champions the governance objective: keeping your data safe, secure, and not prone to leakage or theft, and maintaining confidentiality where it is warranted.

Your Challenge
  • Today, the amount of data companies are gathering is growing at an explosive rate. New tools are enabling unforeseen channels and ways of collaborating.
  • Combined with increased regulatory oversight and reporting obligations, this makes the discovery and management of data a massive undertaking. IT can’t find and protect the data when the business has difficulty defining its data.
  • The challenge is to build a framework that can easily categorize and classify data yet allows for sufficient regulatory compliance and granularity to be useful, and to do it now because tomorrow is too late.
Common Obstacles

Data governance has several obstacles that impact a successful launch, especially if governing M365 is not a planned strategy. Below are some of the more common obstacles:

  • Resources are the primary obstacle to starting O365 governance, whether it is funding or people.
  • Data is segmented and is difficult to analyze when you can’t see it or manage the relationships between sources.
  • Organizations expect results early and quickly. A common obstacle is that building a “proper data classification framework” is a 2+ year project, and the business can't wait that long.
Info-Tech’s Approach
  • Start with the basics: build a minimum-viable product (MVP) to get started on the path to sustainable governance.
  • Identify what and where your data resides, how much data you have, and understand what sensitive data needs to be protected.
  • Create your team of stakeholders, including Legal, records managers, and privacy officers. Remember, they own the data and should manage it.
  • Categorization comes before classification, and discovery comes before categorization. Use easy-to-understand terms like high, medium, or low risk.

Info-Tech Insight

Data classification is the lynchpin to any effective governance of O/M365, and your objective is to navigate through this easily and effectively and build a robust, secure, and viable governance model. Start your journey by identifying what and where your data is and how much data you have. You need to understand what sensitive data you have and where it is stored before you can protect or govern it. Ensure there is a high-level leader who is the champion of the governance objectives. Data classification fulfills the governance objectives of risk mitigation, governance and compliance, efficiency and optimization, and analytics.

Questions you need to ask

Four key questions to kick off your MVP.

1

Know Your Data

Do you know where your critical and sensitive data resides and what is being done with it?

Trying to understand where your information is can be a significant project.

2

Protect Your Data

Do you have control of your data as it traverses across the organization and externally to partners?

You want to protect information wherever it goes through encryption, etc.

3

Prevent Data Loss

Are you able to detect unsafe activities and prevent sharing of sensitive information?

Data loss prevention (DLP) is the practice of detecting and preventing data breaches, exfiltration, or unwanted destruction of sensitive data.

4

Govern Your Data

Are you using multiple solutions (or any) to classify, label, and protect sensitive data?

Many organizations use more than one solution to protect and govern their data, making it difficult to determine if there are any coverage gaps.

Classification tiers

Build your schema.

Pyramid visualization for classification tiers. The top represents 'Simplicity', and the bottom 'Complexity' with the length of the sides at each level representing the '# of policies' and '# of labels'. At the top level is 'MVP (Minimum-Viable Product) - Confidential, Internal (Subcategory: Personal), Public'. At the middle level is 'Regulated - Highly Confidential, Confidential, Sensitive, General, Internal, Restricted, Personal, Sub-Private, Public'. And at the bottom level is 'Government (DOD) - Top Secret (TS), Secret, Confidential, Restricted, Official, Unclassified, Clearance'.

Info-Tech Insight

Deciding on how granular you go into data classification will chiefly be governed by what industry you are in and your regulatory obligations – the more highly regulated your industry, the more classification levels you will be mandated to enforce. The more complexity you introduce into your organization, the more operational overhead both in cost and resources you will have to endure and build.

Microsoft MIP Topology

Microsoft Information Protection (MIP), which is Microsoft’s Data Classification Services, is the key to achieving your governance goals. Without an MVP, data classification will be overwhelming; simplifying is the first step in achieving governance.

A diagram of multiple offerings all connected to 'MIP Data Classification Service'. Circled is 'Sensitivity Labels' with an arrow pointing back to 'MIP' at the center.
(Source: Microsoft, “Microsoft Purview compliance portal”)

Info-Tech Insight

Using the least complex sensitivity labels in your classification gives you the building blocks for compliance and security in your data management schema; they are your foundational steps.

MVP RACI Chart

Data governance is a "takes a whole village" kind of effort.

Clarify who is expected to do what with a RACI chart.

End User M365 Administrator Security/Compliance Data Owner
Define classification divisions R A
Apply classification label to data – at point of creation A R
Apply classification label to data – legacy items R A
Map classification divisions to relevant policies R A
Define governance objectives R A
Backup R A
Retention R A
Establish minimum baseline A R

What and where your data resides

Data types that require classification.

Logos for 'Microsoft', 'Office 365', and icons for each program included in that package.
M365 Workload Containers
Icon for MS Exchange. Icon for MS SharePoint. Icon for MS Teams. Icon for MS OneDrive. Icon for MS Project Online.
Email
  • Attachments
Site Collections, Sites Sites Project Databases
Contacts Teams and Group Site Collections, Sites Libraries and Lists Sites
Metadata Libraries and Lists Documents
  • Versions
Libraries and Lists
Teams Conversations Documents
  • Versions
Metadata Documents
  • Versions
Teams Chats Metadata Permissions
  • Internal Sharing
  • External Sharing
Metadata
Permissions
  • Internal Sharing
  • External Sharing
Files Shared via Teams Chats Permissions
  • Internal Sharing
  • External Sharing

Info-Tech Insight

Knowing where your data resides will ensure you do not miss any applicable data that needs to be classified. These are examples of the workload containers; you may have others.

Discover and classify on-premises files using AIP

AIP helps you manage sensitive data prior to migrating to Office 365:
  • Use discover mode to identify and report on files containing sensitive data.
  • Use enforce mode to automatically classify, label, and protect files with sensitive data.
Can be configured to scan:
  • SMB files
  • SharePoint Server 2016, 2013
Stock image of a laptop uploading to the cloud with a padlock and key in front of it.
  • Map your network and find over-exposed file shares.
  • Protect files using MIP encryption.
  • Inspect the content in file repositories and discover sensitive information.
  • Classify and label files per MIP policy.
Azure Information Protection scanner helps discover, classify, label, and protect sensitive information in on-premises file servers. You can run the scanner and get immediate insight into risks with on-premises data. Discover mode helps you identify and report on files containing sensitive data (Microsoft Inside Track and CIAOPS, 2022). Enforce mode automatically classifies, labels, and protects files with sensitive data.
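
The difference between the two scanner modes can be illustrated with a small Python sketch. This is not the AIP scanner or any Microsoft API; it is a toy model, assuming a single hypothetical detection rule (a US Social Security number pattern) and an assumed `Confidential` label, that shows discover mode only reporting matches while enforce mode also applies a label.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical detection rule: a US Social Security number written as NNN-NN-NNNN.
# Real scanners rely on far richer sensitive information types.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class ScannedFile:
    path: str
    text: str
    label: Optional[str] = None  # None means the file is unlabeled


def scan(files: List[ScannedFile], enforce: bool = False) -> List[str]:
    """Return the paths of files that contain sensitive data.

    Discover mode (enforce=False) only reports matching files.
    Enforce mode (enforce=True) also labels them.
    """
    findings = []
    for f in files:
        if SSN_PATTERN.search(f.text):
            findings.append(f.path)
            if enforce:
                f.label = "Confidential"  # assumed label name for illustration
    return findings


files = [
    ScannedFile("hr/offer.txt", "SSN on file: 123-45-6789"),
    ScannedFile("pr/release.txt", "New product launches in May."),
]

report = scan(files)                    # discover mode: report only
print(report, files[0].label)           # ['hr/offer.txt'] None

scan(files, enforce=True)               # enforce mode: report and label
print(files[0].label, files[1].label)   # Confidential None
```

Running discover mode first, as the text suggests, gives a report to review with data owners before any labels are applied.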

Info-Tech Insight

Any asset deployed to the cloud must have approved data classification. Enforcing this policy is a must to control your data.

Understanding governance

Microsoft Information Governance

Information Governance
  • Retention policies for workloads
  • Inactive and archive mailboxes

Arrow pointing down-right

Records Management
  • Retention labels for items
  • Disposition review

Arrow pointing down-left

Retention and Deletion

‹——— Connectors for Third-Party Data ———›

Information governance manages your content lifecycle using solutions to import, store, and classify business-critical data so you can keep what you need and delete what you do not. Backup should not be used as a retention methodology since information governance is managed as a “living entity” and backup is a stored information block that is “suspended in time.” Records management uses intelligent classification to automate and simplify the retention schedule for regulatory, legal, and business-critical records in your organization. It is for that discrete set of content that needs to be immutable.
(Source: Microsoft, “Microsoft Purview compliance portal”)

Retention and backup policy decision

Retention is not backup.

Info-Tech Insight

Retention is not backup. Retention means something different: “the content must be available for discovery and legal document production while being able to defend its provenance, chain of custody, and its deletion or destruction” (AvePoint Blog, 2021).

Microsoft Responsibility (Microsoft Protection): Weeks to Months
  • Loss of service due to natural disaster or data center outage
  • Loss of service due to hardware or infrastructure failure
  • Short-term (30 days) user error with recycle bin/version history (including OneDrive “File Restore”)
  • Short-term (14 days) administrative error with soft-delete for groups, mailboxes, or service-led rollback

Customer Responsibility (DLP, Backup, Retention Policy): Months to Years
  • Loss of data due to departing employees or deactivated accounts
  • Loss of data due to malicious insiders or hackers deleting content
  • Loss of data due to malware or ransomware
  • Recovery from prolonged outages
  • Long-term accidental deletion coverage with selective rollback

Understand retention policy

What are retention policies used for? Why do you need them as part of your MVP?

Do not confuse retention labels and policies with backup.

Remember: “retention [policies are] auto-applied whereas retention label policies are only applied if the content is tagged with the associated retention label” (AvePoint Blog, 2021).

E-discovery tool retention policies are not turned on automatically.

Retention policies are not a backup tool – when you activate this feature you are unable to delete anyone.

“Data retention policy tools enable a business to:

  • “Decide proactively whether to retain content, delete content, or retain and then delete the content when needed.
  • “Apply a policy to all content or just content meeting certain conditions, such as items with specific keywords or specific types of sensitive information.
  • “Apply a single policy to the entire organization or specific locations or users.
  • “Maintain discoverability of content for lawyers and auditors, while protecting it from change or access by other users. […] ‘Retention Policies’ are different than ‘Retention Label Policies’ – they do the same thing – but a retention policy is auto-applied, whereas retention label policies are only applied if the content is tagged with the associated retention label.

“It is also important to remember that ‘Retention Label Policies’ do not move a copy of the content to the ‘Preservation Holds’ folder until the content under policy is changed next.” (Source: AvePoint Blog, 2021)
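
The three choices in the first quoted bullet (retain, delete, or retain and then delete) can be sketched as a small decision function. This illustrates the decision logic only, not how Microsoft 365 evaluates retention; the `RetentionPolicy` fields and the `action_for` helper are hypothetical names, and age is measured in whole days from creation for simplicity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    # Hypothetical fields for illustration only.
    retain_days: Optional[int] = None        # keep content at least this long
    delete_after_days: Optional[int] = None  # delete content once it is this old


def action_for(created: date, today: date, policy: RetentionPolicy) -> str:
    """Return 'retain', 'delete', or 'no action' for a single item."""
    age = (today - created).days
    if policy.retain_days is not None and age < policy.retain_days:
        return "retain"      # still inside the retention period
    if policy.delete_after_days is not None and age >= policy.delete_after_days:
        return "delete"      # past the deletion threshold
    return "no action"


retain_only = RetentionPolicy(retain_days=365)
delete_only = RetentionPolicy(delete_after_days=30)
retain_then_delete = RetentionPolicy(retain_days=365, delete_after_days=365)

today = date(2024, 1, 1)
print(action_for(date(2023, 6, 1), today, retain_only))         # retain
print(action_for(date(2023, 6, 1), today, delete_only))         # delete
print(action_for(date(2022, 6, 1), today, retain_then_delete))  # delete
```

Note that none of these outcomes produces a restorable copy of lost content, which is why the text stresses that retention is not backup.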

Definitions

Data classification is a focused term used in the fields of cybersecurity and information governance to describe the process of identifying, categorizing, and protecting content according to its sensitivity or impact level. In its most basic form, data classification is a means of protecting your data from unauthorized disclosure, alteration, or destruction based on how sensitive or impactful it is.

Once data is classified, you can then create policies; sensitive data types, trainable classifiers, and sensitivity labels function as inputs to policies. Policies define behaviors, like if there will be a default label, if labeling is mandatory, what locations the label will be applied to, and under what conditions. A policy is created when you configure Microsoft 365 to publish or automatically apply sensitive information types, trainable classifiers, or labels.

Sensitivity label policies show one or more labels to Office apps (like Outlook and Word), SharePoint sites, and Office 365 groups. Once published, users can apply the labels to protect their content.
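
As a rough illustration of how a policy turns labels into behavior, the sketch below models two of the settings mentioned above: a default label and mandatory labeling. The `LabelPolicy` class and `resolve_label` function are hypothetical and do not correspond to any Microsoft 365 API; they only mirror the behavior described in the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LabelPolicy:
    # Hypothetical model of a published sensitivity label policy.
    labels: Tuple[str, ...]               # labels shown to users
    default_label: Optional[str] = None   # applied when the user picks nothing
    mandatory: bool = False               # must every item carry a label?


def resolve_label(policy: LabelPolicy, chosen: Optional[str]) -> Optional[str]:
    """Return the label an item ends up with, or raise if the policy is violated."""
    if chosen is not None:
        if chosen not in policy.labels:
            raise ValueError(f"'{chosen}' is not published by this policy")
        return chosen
    if policy.default_label is not None:
        return policy.default_label       # fall back to the default label
    if policy.mandatory:
        raise ValueError("a label is required but none was chosen")
    return None                           # unlabeled content is allowed


policy = LabelPolicy(
    labels=("Public", "Internal", "Confidential"),
    default_label="Internal",
    mandatory=True,
)
print(resolve_label(policy, None))            # Internal
print(resolve_label(policy, "Confidential"))  # Confidential
```

A default label combined with mandatory labeling means users never face an unlabeled document, which keeps the MVP low-friction for end users.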

Data loss prevention (DLP) policies help identify and protect your organization's sensitive info (Microsoft Docs, April 2022). For example, you can set up policies to help make sure information in email and documents is not shared with the wrong people. DLP policies can use sensitive information types and retention labels to identify content containing information that might need protection.

Retention policies and retention label policies help you keep what you want and get rid of what you do not. They also play a significant role in records management.

Data examples for MVP classification

  • Examples of the type of data you consider to be Confidential, Internal, or Public.
  • This will help you determine what to classify and where it is.
Internal Personal, Employment, and Job Performance Data
  • Social Security Number
  • Date of birth
  • Marital status
  • Job application data
  • Mailing address
  • Resume
  • Background checks
  • Interview notes
  • Employment contract
  • Pay rate
  • Bonuses
  • Benefits
  • Performance reviews
  • Disciplinary notes or warnings
Confidential Information
  • Business and marketing plans
  • Company initiatives
  • Customer information and lists
  • Information relating to intellectual property
  • Invention or patent
  • Research data
  • Passwords and IT-related information
  • Information received from third parties
  • Company financial account information
  • Social Security Number
  • Payroll and personnel records
  • Health information
  • Self-restricted personal data
  • Credit card information
Internal Data
  • Sales data
  • Website data
  • Customer information
  • Job application data
  • Financial data
  • Marketing data
  • Resource data
Public Data
  • Press releases
  • Job descriptions
  • Marketing material intended for general public
  • Research publications
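
The example lists above can double as a lookup table during discovery. The sketch below is one possible way to encode a few of them in Python. Some items, such as Social Security Number, appear under more than one heading; the code resolves that by taking the most restrictive level, which is an assumption of this sketch rather than something the lists state. The ordering Public < Internal < Confidential and all names in the code are likewise illustrative.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumed ordering from least to most restrictive.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# A few entries copied from the lists above; each maps to every heading it appears under.
EXAMPLES = {
    "press releases": [Level.PUBLIC],
    "job descriptions": [Level.PUBLIC],
    "sales data": [Level.INTERNAL],
    "job application data": [Level.INTERNAL],
    "customer information": [Level.CONFIDENTIAL, Level.INTERNAL],
    "social security number": [Level.INTERNAL, Level.CONFIDENTIAL],
    "credit card information": [Level.CONFIDENTIAL],
}


def classify(data_type: str) -> Level:
    """Return the most restrictive level listed for a data type."""
    levels = EXAMPLES.get(data_type.strip().lower())
    if levels is None:
        raise KeyError(f"no example on file for '{data_type}'")
    return max(levels)


print(classify("Press releases").name)          # PUBLIC
print(classify("Social Security Number").name)  # CONFIDENTIAL
```

Unknown data types raise an error instead of defaulting silently, so gaps in the example lists surface during discovery rather than after classification.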

New container sensitivity labels (MIP)

Privacy
Public:
  1. Membership to group is open; anyone can join
  2. “Everyone except external guest” ACL on site; content available in search to all tenants
Private:
  1. Only owner can add members
  2. No access beyond the group membership until someone shares it or changes permissions
External guest policy
Allowed:
  1. Membership to group is open; anyone can join
  2. “Everyone except external guest” ACL on site; content available in search to all tenants
Not Allowed:
  1. Only owner can add members
  2. No access beyond the group membership until someone shares it or changes permissions

What users will see when they create or label a Team/Group/Site

Table of what users will see when they create or label a team/group/site highlighting 'External guest policy' and 'Privacy policy options' as referenced above.
(Source: Microsoft, “Microsoft Purview compliance portal”)

Info-Tech Insights

Why you need sensitivity container labels:
  • Manage privacy of Teams Sites and M365 Groups
  • Manage external user access to SPO sites and teams
  • Manage external sharing from SPO sites
  • Manage access from unmanaged devices

Data protection and security baselines

Data Protection Baseline

“Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline” (Microsoft Docs, June 2022). This baseline assessment has a set of controls for key regulations and standards for data protection and general data governance. This baseline draws elements primarily from NIST CSF (National Institute of Standards and Technology Cybersecurity Framework) and ISO (International Organization for Standardization) as well as from FedRAMP (Federal Risk and Authorization Management Program) and GDPR (General Data Protection Regulation of the European Union).

Security Baseline

The final stage in M365 governance is security. You need to implement a governance policy that clearly defines storage locations for certain types of data and who has permission to access it. You need to record and track who accesses content and how they share it externally. “Part of your process should involve monitoring unusual external sharing to ensure staff only share documents that they are allowed to” (Rencore, 2021).

Info-Tech Insights

  • Controls are already in place to set data protection policy. This assists in the MVP activities.
  • Finally, you need to set your security baseline to ensure proper permissions are in place.

Prerequisite baseline

Icon of crosshairs.
Security

MFA or SSO to access from anywhere, any device

Banned password list

BYOD sync with corporate network

Icon of a group.
Users

Sign out inactive users automatically

Enable guest users

External sharing

Block client forwarding rules

Icon of a database.
Resources

Account lockout threshold

OneDrive

SharePoint

Icon of gears.
Controls

Sensitivity labels, retention labels and policies, DLP

Mobile application management policy

Building baselines

Sensitivity Profiles: Public, Internal, Confidential; Subcategory: Highly Confidential

Microsoft 365 Collaboration Protection Profiles

Sensitivity: Public, External Collaboration, Internal, Highly Confidential

Public
Description: Data that is specifically prepared for public consumption
Label details:
  • No content marking
  • No encryption
  • Public site
  • External collaboration allowed
  • Unmanaged devices: allow full access

External Collaboration
Description: Not approved for public consumption, but OK for external collaboration
Label details:
  • No content marking
  • No encryption
  • Private site
  • External collaboration allowed
  • Unmanaged devices: allow full access

Internal
Description: External collaboration highly discouraged and must be justified
Label details:
  • Content marking
  • Encryption
  • Private site
  • External collaboration allowed but monitored
  • Unmanaged devices: limited web access

Highly Confidential
Description: Data of the highest sensitivity: avoid oversharing, internal collaboration only
Label details:
  • Content marking
  • Encryption
  • Private site
  • External collaboration disabled
  • Unmanaged devices: block access

Teams or Site details: Public Team or Site open discovery, guests are allowed; Private Team or Site members are invited, guests are allowed; Private Team or Site members are invited, guests are not allowed
DLP: None; Warn; Block

Please Note: Global/Compliance Admins go to the 365 Groups platform, the compliance center (Purview), and Teams services (Source: Microsoft Documentation, “Microsoft Purview compliance documentation”)
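
The label details in the table above lend themselves to a declarative representation that stakeholders can review before anything is configured in the tenant. The following Python sketch encodes the four profiles as plain data; the `Profile` class and its field names are hypothetical, and the values are transcribed from the table rather than from any Microsoft configuration schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Profile:
    # Hypothetical field names; values transcribed from the table above.
    content_marking: bool
    encryption: bool
    site_privacy: str            # "Public" or "Private"
    external_collaboration: str  # "allowed", "allowed but monitored", "disabled"
    unmanaged_devices: str       # "full access", "limited web access", "block access"


PROFILES = {
    "Public": Profile(False, False, "Public", "allowed", "full access"),
    "External Collaboration": Profile(False, False, "Private", "allowed", "full access"),
    "Internal": Profile(True, True, "Private", "allowed but monitored", "limited web access"),
    "Highly Confidential": Profile(True, True, "Private", "disabled", "block access"),
}


def encrypted_profiles() -> List[str]:
    """Return the names of profiles whose label applies encryption."""
    return [name for name, p in PROFILES.items() if p.encryption]


print(encrypted_profiles())  # ['Internal', 'Highly Confidential']
```

Keeping the baseline as reviewable data makes it easier for Legal, Risk, and IT to agree on the profiles before they are built as labels.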

Info-Tech Insights

  • Building baseline profiles will be a part of your MVP. You will understand what type of information you are addressing and label it accordingly.
  • Sensitivity labels are a way to classify your organization's data in a way that specifies how sensitive the data is. This helps you decrease risks in sharing information that shouldn't be accessible to anyone outside your organization or department. Applying sensitivity labels allows you to protect all your data easily.

MVP activities

PRIMARY
ACTIVITIES
Define Your Governance
The objective of the MVP is reducing barriers to establishing an initial governance position, and then enabling rapid progression of the solution to address a variety of tangible risks, including DLP, data retention, legal holds, and labeling.
Decide on your classification labels early.

CATEGORIZATION





CLASSIFICATION

MVP
Data Discovery and Management
AIP (Azure Information Protection) scanner helps discover, classify, label, and protect sensitive information in on-premises file servers. You can run the scanner and get immediate insight into risks with on-premises data.
Baseline Setup
Building baseline profiles will be a part of your MVP. You will understand what type of information you are addressing and label it accordingly. Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline.
Default M365 settings
Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline. This baseline assessment has a set of controls for key regulations and standards for data protection and general data governance.
SUPPORT
ACTIVITIES
Retention Policy
Retention policy is auto-applied. Decide whether to retain content, delete content, or retain and then delete the content.
Sensitivity Labels
Automatically enforce policies on groups through labels; classify groups.
Workload Containers
M365: SharePoint, Teams, OneDrive, and Exchange, where your data is stored for labels and policies.
Unforced Policies
Written policies that are not enforceable by controls in Compliance Manager such as acceptable use policy.
Forced Policies
Restrict sharing controls to outside organizations. Enforce prefix or suffix to group or team names.

ACME Company MVP for M/O365

PRIMARY
ACTIVITIES
Define Your Governance


Focus on ability to use legal hold and GDPR compliance.

CATEGORIZATION





CLASSIFICATION

MVP
Data Discovery and Management


Three classification levels (public, internal, confidential), which are applied by the user when data is created. Same three levels are used for AIP to scan legacy sources.

Baseline Setup


All data must at least be classified before it is uploaded to an M/O365 cloud service.

Default M365 settings


Turn on templates 1 8 the letter q and the number z

SUPPORT
ACTIVITIES
Retention Policy


Retention policy is auto-applied. Decide whether to retain content, delete content, or retain and then delete the content.

Sensitivity Labels


Automatically enforce policies on groups through labels; classify groups.

Workload Containers


M365: SharePoint, Teams, OneDrive, and Exchange, where your data is stored for labels and policies.

Unforced Policies


Written policies that are not enforceable by controls in Compliance Manager such as acceptable use policy.

Forced Policies


Restrict sharing controls to outside organizations. Enforce prefix or suffix to group or team names.

Authors

John Donovan

John Annand

Contributors

  • Björn Erkens, Product Owner, Rencore governance