Build an Effective Data Retention Program

Treat the data risks that will derail your retention schedule

Data is created every day, and the sheer volume of it means many organizations struggle to manage their data, leading to:

  • Unstructured data sprawl
  • Difficulties maintaining compliance obligations
  • Retaining sensitive data unnecessarily

Our Advice

Critical Insight

Focus your efforts on the data where the highest risk levels are hiding, and work towards implementing an automated process. Manual efforts will always carry the most risk.

Impact and Result

The best way to resolve the difficulties with building an effective data retention program is to:

  • Identify your retention requirements
  • Develop a retention schedule and risk profile for your data processes and types
  • Use the above outputs to determine where the greatest risks lie and plan to reduce them as much as possible.

By focusing on the high-risk areas, you won't lose precious time managing data retention.


Build an Effective Data Retention Program Research & Tools

1. Build an Effective Data Retention Program – A brief deck that helps organizations treat the data risks that will derail their retention schedule.

This project will help you identify your retention requirements, develop a retention schedule and risk profile for your data processes and types, and manage your data retention processes more efficiently by focusing on the high-risk areas.

2. Data Retention Schedule and Risk Identification Tool – A tool to establish a data retention schedule and generate a dashboard.

This tool helps you draft your data retention schedule and identify the level of retention-related risk with each of your data types.

3. Data Retention Policy Template – A template to set the foundational requirements of data retention within the organization.

This policy template is designed to outline data retention requirements in alignment with laws and regulations, and to eliminate data that no longer requires storage, security, and resources.

4. Data Retention RACI Tool – A tool to allocate ownership and responsibility for implementing an existing or new data retention program.

This tool will help you allocate ownership and responsibility for implementing an existing or new data retention program. Each task is assigned to specific individuals, who are listed as responsible, accountable, consulted, or informed.


Build an Effective Data Retention Program

Treat the data risks that will derail your retention schedule

Analyst Perspective

Overcome retention challenges by identifying and treating high-risk data types.

Data retention is a challenge for many organizations. Ideally, we would be able to fully automate our retention and deletion of records and never think about it again. But even organizations that do data retention well are often forced to use some semi-automated or manual processes to adhere to their retention schedules. In other words, there is no perfect solution to resolve data-retention challenges once and for all.

However, by identifying the data types and processes which are most prone to failure (i.e. those that cannot be fully automated), we can explore options to reduce that risk as much as possible and make those remaining manual and semi-automated processes more manageable. By prioritizing our high-risk data flows, we will be able to work more efficiently and determine which data repositories need the most oversight.

Logan Rohde
Senior Research Analyst, Security
Info-Tech Research Group

Alan Tang
Principal Research Director, Security
Info-Tech Research Group

Isabelle Hertanto
Principal Research Director, Security
Info-Tech Research Group

EXECUTIVE BRIEF

Executive Summary

Your Challenge

Data is created every day, and the sheer volume of it means many organizations struggle to manage their data, leading to:

  • Unstructured data sprawl.
  • Difficulties maintaining compliance obligations.
  • Retaining sensitive data unnecessarily.

Because the problem continues to grow over time, many organizations struggle even to identify where the greatest data risks lie.

Common Obstacles

Data retention is a full-time job that usually receives less than part-time hours, meaning that organizations:

  • Do not have the resources to validate whether the data retention schedule is being followed.
  • Do not have the ability to automate the process.
  • Are directed to retain everything.

Taken together, these factors prevent many organizations from following their own retention schedule.

Info-Tech’s Approach

The best way to resolve the difficulties with building an effective data retention program is to:

  • Identify your retention requirements.
  • Develop a retention schedule and risk profile for your data processes and types.
  • Use the above outputs to determine where the greatest risks lie, and plan to reduce them as much as possible.

By focusing on the high-risk areas, you will manage your data retention processes more efficiently.

Info-Tech Insight

Focus your efforts on data with the highest risk levels, and work towards implementing an automated process. Manual efforts will always carry the most risk.

Your challenge

This research is designed to help organizations that are looking to:

  • Identify data retention requirements
  • Develop a data retention schedule
  • Reduce retention-related risks
  • Choose data repositories for automation
  • Manage manual deletion processes

Prioritizing and treating data processes and types that are most prone to retention issues will allow you to simultaneously lower risk and establish a smoother retention-deletion process.

Cost of a data breach in 2021

$4.24 Million

($161 per record)

(Source: IBM & Ponemon)

Common obstacles

These barriers make this challenge difficult to address for many organizations:

  • Lack of visibility into data flows
  • No established system to classify data by type and sensitivity
  • Need to satisfy multiple (conflicting) regulations
  • No way to determine the retention-related risk of data repositories
  • Administrative overhead in managing a manual data-deletion process

Organizations must manage impossibly high volumes of data dispersed across multiple internal and external repositories, making it difficult to maintain visibility and control over data flows and retention obligations.

42.5% — Annual increase in data volume for organizations.

70% — Percentage of data distributed across edge and cloud repositories. (Source: Seagate, 2020)

Info-Tech’s methodology to build an effective data retention program

Phase 1: Set your governance requirements

Phase steps

1.1 Identify data retention laws and regulations
1.2 Identify sensitive data and data types
1.3 Identify existing data repositories
1.4 Develop a retention policy
1.5 Determine roles and responsibilities

Phase outcomes

  • List of compliance obligations
  • Identified data types and associated repositories
  • Data retention policy
  • RACI Chart

Phase 2: Complete data retention schedule and risk assessment

Phase steps

2.1 Develop a data retention schedule
2.2 Plan to address risky data types

Phase outcomes

  • Data retention schedule
  • List of high-risk repositories
  • Prioritized list of actions for risk treatment

Phase 3: Manage manual data retention processes

Phase steps

3.1 Identify cases where manual retention is necessary

Phase outcomes

  • Manual process tracker

The greatest risks will be in data repositories without automation

Prioritize the riskiest data

Focus your efforts on data with the highest risk levels, and work towards implementing an automated process; manual efforts will always carry the most risk.

Link retention to governance

Successful data retention is closely linked with security governance, compliance, and data classification. Without these guardrails, most organizations struggle to establish a reliable data retention schedule.

Retention schedules do not delete data on their own

A retention schedule is necessary, but having one won’t ensure retention-related risks are managed effectively. Rather, the key lies in identifying risky data processes, types, and repositories, and finding solutions to lower those risks.

Not everything is automatable

Some manual deletion should be expected. Very few retention programs run on automation alone. Manual deletion is manageable provided we have a plan to deal with it.

Two kinds of regulation

Identify conflicts in your obligations. Some regulations and laws dictate that data must be retained, while others demand that data be deleted once it is no longer in active use. You may have to make a judgment call regarding the most appropriate retention period. This should be done in conjunction with your legal counsel.

Find your data’s first instance

Establish a single source of truth for your data. This will allow you to go to the source and delete the first instance of the data (as per your retention schedule), and then plan to purge the secondary, tertiary, etc. instances on a regular basis.

Key deliverable:
Data Retention Schedule and Risk Identification Tool

Sample of Info-Tech's key deliverable, the Data Retention Schedule and Risk Identification Tool.

Blueprint deliverables

Each step of this blueprint is accompanied by supporting deliverables to help you accomplish your goals:

Data Retention Policy Template

Sample of Info-Tech's Data Retention Policy Template deliverable.

Data Retention RACI

Sample of Info-Tech's Data Retention RACI deliverable.

Blueprint benefits

IT Benefits

  • Improved data security.
  • Increased insight into risky data processes, types, and repositories.
  • Enhanced or established data governance practices.
  • Strategically managed compliance obligations.

Business Benefits

  • Reduced physical storage costs by balancing retention requirements with the ability to purge data that is no longer needed.
  • Reduced litigation risks, regulatory fines, and penalties by keeping information for only the length of time that is legally required.
  • Improved security by protecting business-critical information, minimizing data leakage, and ensuring the availability of information when needed.

Measure the value of this blueprint

This blueprint helps organizations to:

  • Reduce physical storage costs by balancing retention requirements with the ability to purge data that is no longer needed.
  • Reduce litigation risks, regulatory fines, and penalties by keeping information for only the length of time that is legally required.
  • Improve security posture by protecting business-critical information, minimizing data leakage, and ensuring the availability of information when needed.

Key exercises:

1.1 Identify and document data retention laws and regulations

1.2 Identify data types and sensitive data

1.3 Identify data repositories

1.4 Develop a data retention policy

1.5 Determine roles and responsibilities

2.1 Complete data retention schedule

2.2 Plan to address risky data types

3.1 Identify cases where manual retention is necessary

Info-Tech Project Value

Average total time to complete the following data retention related projects:

  • Identify and document data retention laws and regulations
  • Identify data types and sensitive data
  • Identify data repositories
  • Develop a data retention policy
  • Determine roles and responsibilities
  • Complete data retention schedule
  • Plan to address risky data types
  • Identify cases where manual retention is necessary

$45.00 (average hourly wage of a privacy and compliance officer)
x 760 hours
= $34,200

Using this blueprint:

10 hours (GI calls)
+ 40 hours (GI activities)
= 50 hours
x $45.00 (avg. wage)
= $2,250

Estimated cost and time savings from this blueprint:

$34,200 - $2,250 = $31,950
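
The savings estimate above is simple arithmetic, and a short Python sketch makes it easy to substitute your own figures. The wage and hour values are taken from the slide. The constant and function names are illustrative only and are not part of any Info-Tech tool.

```python
# Figures from the project value slide; replace them with your own estimates.
HOURLY_WAGE = 45.00            # average hourly wage, privacy and compliance officer
HOURS_WITHOUT_BLUEPRINT = 760  # average total time for the listed projects
GI_CALL_HOURS = 10             # Guided Implementation calls
GI_ACTIVITY_HOURS = 40         # Guided Implementation activities


def project_cost(hours: float, wage: float = HOURLY_WAGE) -> float:
    """Labour cost of a project: hours worked multiplied by the hourly wage."""
    return hours * wage


cost_without = project_cost(HOURS_WITHOUT_BLUEPRINT)
cost_with = project_cost(GI_CALL_HOURS + GI_ACTIVITY_HOURS)
savings = cost_without - cost_with

print(f"Without blueprint: ${cost_without:,.0f}")
print(f"With blueprint:    ${cost_with:,.0f}")
print(f"Estimated savings: ${savings:,.0f}")
```

With the slide's figures, the script reproduces the $34,200, $2,250, and $31,950 values shown above.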

Info-Tech offers various levels of support to best suit your needs

DIY Toolkit

"Our team has already made this critical project a priority, and we have the time and capability, but some guidance along the way would be helpful."

Guided Implementation

"Our team knows that we need to fix a process, but we need assistance to determine where to focus. Some check-ins along the way would help keep us on track."

Workshop

"We need to hit the ground running and get this project kicked off immediately. Our team has the ability to take this over once we get a framework and strategy in place."

Consulting

"Our team does not have the time or the knowledge to take this project on. We need assistance through the entirety of this project."

Diagnostics and consistent frameworks used throughout all four options

Guided Implementation

A Guided Implementation (GI) is a series of calls with an Info-Tech analyst to help implement our best practices in your organization.

A typical GI is 4 to 8 calls over the course of 1 to 3 months.

What does a typical GI on this topic look like?

Phase 1

Call #1: Gather data retention requirements

Phase 2

Call #2: Draft data retention schedule

Call #3: Identify risky data

Call #4: Prioritize data repositories for risk treatment

Phase 3

Call #5: Determine where manual process is necessary

Build an Effective Data Retention Program

Phase 1

Set your governance requirements

This phase will walk you through the following activities:

  • Identify and document data retention laws and regulations
  • Identify data types and sensitive data
  • Identify data repositories
  • Develop a data retention policy
  • Determine roles and responsibilities

Outcomes of this phase

  • Awareness of applicable laws and regulations
  • Understanding of the relevant data classifications and data types
  • Knowledge of all data repositories and locations
  • Formalized data retention policy
  • Consensus on individual accountabilities and responsibilities for data retention across the organization

1.1 Identify and document data retention laws and regulations

1 hour

Input: Ask participants to identify and document applicable data retention laws and regulations. Identify relevant retention requirements from the laws and regulations.

Output: Documented list of data retention laws, regulations and relevant requirements

Materials: Whiteboard/flip charts, Sticky notes, Pen/marker

Participants: IT representative, Security officer, Privacy officer, Legal counsel, Senior management team (optional), Business representative (optional)

  1. Bring together relevant stakeholders from the organization. This can include those mentioned in the participants list.
  2. Identify applicable laws and regulations.
  3. Identify articles that set forth data retention requirements.
  4. Identify data types that are regulated by the laws.
  5. Identify the specific data retention requirements.
  6. Document all the information above in the table below.
Law/Regulation | Article | Data Type | Retention Requirement

Data retention laws and regulations

  • Determine which laws or regulations you are currently subject to or will be obligated to comply with in the future. If you are not subject to anything now, align your target-state compliance objectives with the most restrictive regulation currently in place. This will set you up to handle any new laws passed in your jurisdiction.
  • Consider planned expansion into new markets and how data sovereignty or data residency laws influence data retention rules.
  • Review your privacy program to identify the laws or regulations that dictate how data containing personally identifiable information (PII) should be retained or deleted.

Info-Tech Insight

Beware of conflicts in your obligations. Some regulations and laws dictate that data must be retained, while others demand that data be deleted once it is no longer in active use. You may have to make a judgment call regarding what the right retention period is. This should be done in conjunction with your legal counsel.
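
One way to surface such conflicts early is to record each obligation as either a minimum retention period or a maximum retention period and compare them per data type. The Python sketch below illustrates the idea under those assumptions. The `Obligation` class, its fields, and the sample rules are hypothetical and do not represent any real law. A flagged conflict is a prompt for a conversation with legal counsel, not a decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obligation:
    """One retention rule for one data type (hypothetical structure)."""
    source: str                     # law, regulation, or contract name
    data_type: str
    min_days: Optional[int] = None  # data must be retained at least this long
    max_days: Optional[int] = None  # data must be deleted no later than this


def find_conflicts(obligations):
    """Return (data_type, retain_source, delete_source) where the longest
    required retention exceeds the shortest deletion deadline."""
    by_type = {}
    for ob in obligations:
        by_type.setdefault(ob.data_type, []).append(ob)

    conflicts = []
    for data_type, obs in by_type.items():
        retain_rules = [o for o in obs if o.min_days is not None]
        delete_rules = [o for o in obs if o.max_days is not None]
        if not retain_rules or not delete_rules:
            continue
        longest_min = max(retain_rules, key=lambda o: o.min_days)
        shortest_max = min(delete_rules, key=lambda o: o.max_days)
        if longest_min.min_days > shortest_max.max_days:
            conflicts.append((data_type, longest_min.source, shortest_max.source))
    return conflicts


# Illustrative rules only; the names and periods are made up.
rules = [
    Obligation("Rule A", "invoices", min_days=2555),
    Obligation("Rule B", "invoices", max_days=365),
    Obligation("Rule C", "access logs", min_days=30),
    Obligation("Rule D", "access logs", max_days=90),
]
for data_type, keep, delete in find_conflicts(rules):
    print(f"{data_type}: {keep} (retain) conflicts with {delete} (delete)")
```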

Examples of laws that set forth requirements for data retention

An organization must not retain personal information for a period longer than necessary to fulfil the purposes described in the notice or to comply with applicable laws.

EU - GDPR

GDPR Recital 39, Article 5(1)(e), and Article 17 stipulate that personal data should not be stored for longer than is necessary for the purposes for which the personal data are processed; personal data may be stored for longer periods insofar as the personal data will be processed solely for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes.

Canada - The Privacy Act

Personal information concerning an individual that has been used by a government institution for an administrative purpose shall be retained by the institution:
a) for at least two years following the last time the personal information was used for an administrative purpose unless the individual consents to its disposal; and,
b) where a request for access to the information has been received, until such time as the individual has had the opportunity to exercise all his rights under the Act.

US - HIPAA security rules

Maintain 6 years of policies, procedures, and records.

Germany - Federal Data Protection Act (BDSG)

According to the German Federal Data Protection Act (BDSG), personal data shall be erased if they are processed for own purposes, as soon as knowledge of them is no longer needed to carry out the purpose for which they were recorded.

Norway - Regulation 1107/2018

Norway Regulation 1107/2018 on camera surveillance in the workplace (“the Camera Surveillance Regulation”) stipulates that camera recordings must be deleted no later than 7 days after the recordings have been made and may only be stored for up to 30 days if it is likely that the recordings will be handed to law enforcement agencies in connection with the investigation of criminal offences.
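
A fixed-period rule such as the Norwegian camera surveillance example translates directly into a deletion deadline. The Python sketch below encodes only the two periods quoted above (7 days, or 30 days when a handover to law enforcement is likely). The function names and the way days are counted are assumptions made for illustration, not a legal interpretation.

```python
from datetime import date, timedelta

# Periods as quoted in the example above.
STANDARD_LIMIT_DAYS = 7
EXTENDED_LIMIT_DAYS = 30


def deletion_deadline(recorded_on: date, handover_likely: bool = False) -> date:
    """Latest date by which a recording should be deleted (illustrative)."""
    limit = EXTENDED_LIMIT_DAYS if handover_likely else STANDARD_LIMIT_DAYS
    return recorded_on + timedelta(days=limit)


def is_overdue(recorded_on: date, today: date, handover_likely: bool = False) -> bool:
    """True when the recording has been kept past its deadline."""
    return today > deletion_deadline(recorded_on, handover_likely)


recorded = date(2024, 1, 1)
print(deletion_deadline(recorded))                    # standard deadline
print(deletion_deadline(recorded, True))              # extended deadline
print(is_overdue(recorded, today=date(2024, 1, 9)))   # past the standard limit
```

The same pattern extends to other data types by replacing the two constants with the period recorded in your retention schedule.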

1.2 Identify data types and sensitive data

1 hour

Input: Ask participants to identify data types within the organization. Ask participants to identify personal data. Ask participants to document data sensitivity and classification level (optional).

Output: Documented data types, personal data, data sensitivity and classification level (optional)

Materials: Whiteboard/flip charts, Sticky notes, Pen/marker

Participants: IT representative, Security officer, Privacy officer, Legal counsel, Senior management team (optional), Business representative (optional)

  1. Bring together relevant stakeholders from the organization. This can include those mentioned in the participants list.
  2. Identify and document data types within the organization.
  3. Identify and document types of personal data.
  4. Identify and document data sensitivity and classification level (optional).
  5. Document all the information above in the Data Retention Schedule and Risk Identification Tool.

Download the Data Retention Schedule and Risk Identification Tool

Data types and sensitive data

  • Identifying data types will help you organize your data in groups for which general retention periods can be determined, allowing you to deal with larger chunks of data rather than individual records. These groups should be formed based on similarity of content and policy requirements (e.g. the data must be held for the minimum period).
  • Data sensitivity will factor into decisions around how long a given data type should be retained, as well as the level of protection it will need while it is retained. Remember, certain types of data, like intellectual property, will need to be retained indefinitely. But these data types are also highly sensitive and will always require a higher degree of protection.
  • Personal data retention is an integral part of the overall data retention program.
  • Draw from your data classification scheme and use your predefined record types and data sensitivity levels to set retention requirements for the retention schedule.
  • Draw from your information security and privacy programs to help you identify PII and other risky data more quickly.

Info-Tech Insight

Mistakes do happen. Be sure to review the records within each data type to ensure no important legal or regulatory stipulations will interfere with your plans to treat all this data the same.

Personal data retention is part of the overall data retention program

Examples of personal data include:

Traditional PII: Personally identifiable information

  • Full name (if not common)
  • Home address
  • Date of birth
  • Social Security number
  • Banking information
  • Passport number
  • Etc.

Personal data: Any information relating to an identified or identifiable person

  • First, middle, and last names
  • IP address
  • Email address or another online identifier
  • Social media post
  • Location data
  • Photograph
  • Etc.

Sensitive personal data: Special categories of personal data (some regulations, like GDPR, expand their scope to include these)

  • Biometrics data: retinal scans, voice signatures, or facial geometry
  • Health information: patient identification number or health records
  • Political opinions
  • Trade union membership
  • Sexual orientation and/or gender identity
  • Religious and/or philosophical beliefs
  • Ethnic origin and/or race

Use data classification to simplify data retention

Data retention is easier when you have a classification scheme and can match your sensitivity levels to different data types.

Both data type and sensitivity level need to be considered when determining your data retention schedule:

  • Identifying data types will help you organize your data in groups for which general retention periods can be determined, allowing you to deal with larger chunks of data rather than individual records. These groups should be formed based on similarity of content and policy requirements (e.g. the data must be held for the minimum period).
  • Each data type will have a set of security requirements based on its classification level. Thus, when deciding on your data types, you will need to strike a balance between granularity and practicality. Each record within a given data type will be treated the same way – so you need to make sure your data types are granular enough to accommodate differing security and privacy requirements, but not so granular that records cannot be grouped at all.
  • Organizational size and maturity, as well as the number of records the organization has, may affect how data types are determined. Consider the examples on the following slides to get a sense of how data can be bundled together or separated. Identifying data types will also help you to set data retention requirements, which will be influenced by data type (more on this later).

Info-Tech Insight

Personal data is high risk and should be treated as such. Some business leaders will perceive indefinite retention as a benefit for business intelligence reasons (there is always another potential use for data). However useful it may be, unnecessary personal data will cause additional headaches in the event of a breach. If retention is deemed worth the risk, such data should be anonymized or obfuscated and then archived.

Examples of data types in a typical organization

The list below provides some example types of records/data that your organization may be responsible for. This list is not comprehensive and not applicable to every organization. Use this list as a starting point for documenting your organization's data and corresponding retention requirements.

Example table of data types. Column headers are 'Example Department' and 'Example Record/Data Type'. Example departments are 'Accounting', 'Human Resources & Payroll', 'Marketing', and 'Corporate/Legal'. Example types include 'AR/AP Ledger', 'Complaints', 'Marketing Strategy', and 'Contracts'.

Examples of data types in a typical organization (continued)

Example table of data types. Column headers are 'Example Department' and 'Example Record/Data Type'. Example departments are 'Shipping & Receiving', 'IT/Security', 'Purchasing & Sales', and 'Facilities'. Example types include 'Freight Bills', 'Incident Reports', 'Requisitions', and 'Construction Records'.

1.3 Identify data repositories

1 hour

Input: Ask participants to identify existing data inventories. Ask participants to identify data repositories from the existing inventories. Ask participants to identify other data repositories.

Output: Documented data repositories

Materials: Whiteboard/flip charts, Sticky notes, Pen/marker

Participants: IT representative, Security officer, Privacy officer, Legal counsel, Senior management team (optional), Business representative (optional)

  1. Bring together relevant stakeholders from the organization. This can include those mentioned in the participants list.
  2. Identify and document existing data inventories.
  3. Identify and document data repositories from the existing inventories.
  4. Identify and document other data repositories.
  5. Document all the information above in the Data Retention Schedule and Risk Identification Tool as shown below.

Download the Data Retention Schedule and Risk Identification Tool

Data repositories

  • Data tends to sprawl from one location to another as different stakeholders use it for different purposes. However, regardless of its location, we must adhere to the data retention schedule. Knowing where to look is half the battle.
  • Use your data-flow maps to identify storage locations for every instance of your data.
  • If your organization has a personal data inventory that was built and used for personal data protection purposes, you can obtain the relevant information about data repositories from the inventory.
    • A personal data inventory normally includes data elements such as personal data types, lawful basis, processing purposes, where data resides, etc.
  • If your organization has a data inventory that was built and used for backup and disaster recovery purposes, you can obtain the relevant information about data repositories from the inventory.
    • A data backup and disaster recovery inventory includes data elements such as repository ID, repository name, repository description, size of the repository, repository owner, etc.

Info-Tech Insight

Establish a single source of truth for your data. This will allow you to go to the source and delete the first instance of the data (as per your retention schedule), and then plan to purge the secondary, tertiary, etc. instances regularly.
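
The ordering described in this insight can be sketched in a few lines of Python: delete the source-of-truth instance first, then work through the remaining copies. The `DataInstance` class and the repository assignments are hypothetical. In practice this information would come from your data-flow maps or data inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataInstance:
    """One stored copy of a record (hypothetical structure)."""
    repository: str
    is_source_of_truth: bool = False


def deletion_order(instances):
    """Source-of-truth instance first, then the secondary copies.

    sorted() is stable, so secondary copies keep their original order.
    """
    return sorted(instances, key=lambda inst: not inst.is_source_of_truth)


copies = [
    DataInstance("Dropbox Business"),
    DataInstance("SharePoint", is_source_of_truth=True),
    DataInstance("Network Drive"),
]
for step, inst in enumerate(deletion_order(copies), start=1):
    print(step, inst.repository)
```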

Leverage existing data inventories to identify data repositories

A personal data mapping inventory and a data backup and disaster recovery inventory are two common sources of information about data repositories.

Personal data mapping inventory (column and what to enter):

  • Name of business process: Enter the name and description of the business process.
  • Purpose of processing: Specify the purpose of the processing activity in the drop-down menu.
  • Lawful basis: Specify the basis for lawfulness of processing.
  • Functional personal data categories: Enter the categories of data processing. If applicable, enter multiple data processing categories.
  • Data subject categories: Indicate the data subject categories. If applicable, enter multiple data subject categories.
  • Data sensitivity categorization: Leverage outcome of risk map activity to input the sensitivity level of the personal data collected within the specific business process.
  • Which system the data resides in: Indicate the system in which the data resides.

Data backup and disaster recovery inventory (example):

Repository ID | Repository name | Description | Size of repository (GB) | Owner
1 | Network Drive | Used for file sharing across local network. Located on the NAS. Primarily used by the sales and marketing teams for internal documents. | 1350 | John Smith
2 | SharePoint | On-premises document management and collaboration tool. Stores proprietary IP and will remain on-premises for the next three years. Primarily used by the R&D team. | 300 | John Smith
3 | Dropbox Business | Personal cloud storage. Dropbox capabilities are given to users on an as-needed basis. Primarily used by marketing and sales department for mobile use. | 800 | John Smith
4 | EMC Isilon X Series | Scale-out NAS storage. Responsible for file shares, HPC, backup, Hadoop analytics, mobile, and cloud apps. Used by pricing team to set dynamic product pricing through the day. | 20000 | John Smith

Identify your data repository by building data flows

  • Think about the major business processes that make up your operations, then refine them by the common set of data types within subprocesses.
    • What business processes support your operations?
    • What is the purpose of these business processes?
    • What data is collected?
    • Where does the data reside?
    • Is the data shared with third parties?
  • Determine the appropriate level of granularity with your processing activities.
  • Knowing is half the battle. Ensure that all high-level gaps identified via this method are assessed for risk.
Example: Data Flow Diagram – eCommerce Workflow Processing (Use)
Example data flow diagram of eCommerce Workflow Processing (Use).

1.4 Develop a data retention policy

1-2 hours

Input: Regulatory and legal requirements, Business stakeholders’ understanding of data elements and retention requirements

Output: Formalized data retention policy

Materials: Data Retention Policy Template

Participants: Privacy officer, Core privacy team, InfoSec representative (optional), IT representative (optional)

Be mindful that data should be retained for as short a time period as possible. Once retained for the requisite time period, processes should exist to anonymize or erase the data.

  1. Download Info-Tech’s Data Retention Policy Template and customize it to define your:
    • Definitions
    • Policy statements
    • Exceptions
    • Governing laws and regulations
    • Non-compliance actions

Download Info-Tech’s Data Retention Policy Template

Sample of Info-Tech's Data Retention Policy Template.

Data retention policy

A typical data retention policy includes core components such as purpose, scope, policy statements, etc. Examples are shown below.

Purpose

The data retention policy is designed to outline the data retention requirements, in alignment with laws and regulations. Retention of certain data may be required by law or permitted for designated purposes. However, defining and managing the length of time that data is retained allows for organizations to eliminate data that no longer requires storage, security, and resources.

Scope

The data retention policy is applicable to all data users within the organization. Data users include all employees, including full-time and part-time staff, contractors, interns, volunteers, and any user with access to organizational data and systems. The policy applies to all data processed by the organization.

Policy Statements

  1. The organization employs the principle of data minimization by only processing personal data for purposes that are adequate, relevant, and limited to what is necessary.
    1. The purpose of processing each data type is documented formally in the Record of Processing.
  2. Data is stored for the shortest possible time, taking into consideration any legal obligations to retain data for an extended period of time.
  3. Personal data may be stored for longer periods if it will be processed solely for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes.
    1. In the event of prolonged storage, the organization must document the purpose and must implement and maintain technical and organizational measures to protect that data. Technical measures may include anonymization, encryption, and other controls.
  4. The organization has established a retention schedule to document all data processed, taking into account any reasons why data must be retained. For each data type, the schedule also documents the following:
  5. The retention period defined in the retention schedule dictates when the data will be erased/sufficiently anonymized or when the storage requirement will be reviewed.
  6. When the data meets the end of the permitted retention period, it may be erased or sufficiently anonymized. Anonymization may include practices such as the following:
    1. Deleting specific data element(s) or unique identifier(s) that would otherwise identify the data subject.
    2. Separating personal data from non-identifying data (e.g. separating order number from name/address).
    3. Aggregating personal data of enough individuals so that specific data cannot be attributed to a specific individual.
  7. Data backups will be executed in accordance with the backup and recovery schedule and executed by IT.
  8. The organization ensures that data stored is accurate and kept up-to-date through the means of auditing, review processes, and the implementation of security controls (e.g. integrity monitoring).
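The three anonymization practices listed under policy statement 6 can be illustrated with a minimal Python sketch. The record layout, the field names (`name`, `address`, `order_id`, `region`), and the minimum group size of 3 are assumptions made for illustration; they are not taken from the policy template, and whether a given result is "sufficiently anonymized" remains a judgment for the privacy team.

```python
from collections import Counter

# Assumption: which fields identify a data subject. Adjust per data type.
IDENTIFYING_FIELDS = {"name", "address"}


def drop_identifiers(record, identifying=IDENTIFYING_FIELDS):
    """Practice 1: delete the data elements that identify the data subject."""
    return {k: v for k, v in record.items() if k not in identifying}


def split_record(record, identifying=IDENTIFYING_FIELDS):
    """Practice 2: separate personal data from non-identifying data.

    Returns (personal, non_personal). The personal part can then be
    erased on its own schedule while the order data is kept.
    """
    personal = {k: v for k, v in record.items() if k in identifying}
    non_personal = {k: v for k, v in record.items() if k not in identifying}
    return personal, non_personal


def aggregate_by(records, group_field, min_group_size=3):
    """Practice 3: aggregate so values cannot be tied to one individual.

    Groups smaller than min_group_size (an assumed threshold) are
    suppressed rather than reported.
    """
    counts = Counter(r[group_field] for r in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}
```

Suppressing small groups in `aggregate_by` is one common design choice; a group containing a single person would otherwise still point to that individual.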

1.5 Determine roles and responsibilities

Complete a RACI matrix using the Data Retention RACI Tool to help you assign high-level accountability and responsibility.

Info-Tech Insight

Data retention is often relegated to IT Operations or IT Security to carry out. However, it requires partnership with data owners throughout the business. Using the RACI can help to understand who needs to take care of which jobs.
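Once a RACI matrix is drafted, a quick consistency check can help confirm that no task has been left without an owner. The sketch below assumes the common RACI convention of exactly one Accountable role and at least one Responsible role per task; the task and role names are hypothetical and are not taken from the Data Retention RACI Tool.

```python
def raci_gaps(matrix):
    """Return a list of problems found in a RACI matrix.

    matrix maps each task to a dict of {role: code}, where code is one
    of "R", "A", "C", or "I".
    """
    problems = []
    for task, assignments in matrix.items():
        codes = list(assignments.values())
        accountable = codes.count("A")
        if accountable != 1:
            problems.append(
                f"{task}: expected exactly one Accountable, found {accountable}"
            )
        if "R" not in codes:
            problems.append(f"{task}: no Responsible role assigned")
    return problems


# Hypothetical example matrix.
example = {
    "Approve retention schedule": {"Privacy officer": "A", "Data owner": "R"},
    "Delete expired data": {"IT operations": "C", "Data owner": "I"},
}
for problem in raci_gaps(example):
    print(problem)
```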

Download the Data Retention RACI Tool

Sample RACI matrix from the Data Retention RACI Tool.

Build an Effective Data Retention Program

Phase 2

Complete data retention schedule

Phase 1

1.1 Identify and document data retention laws and regulations

1.2 Identify data types and sensitive data

1.3 Identify data repositories

1.4 Develop a data retention policy

1.5 Determine roles and responsibilities

Phase 2

2.1 Complete data retention schedule

2.2 Plan to address risky data types

Phase 3

3.1 Identify cases where manual retention is necessary

This phase will walk you through the following activities:

  • Complete data retention schedule
  • Plan to address risky data types

Outcomes of this phase

  • A clear retention schedule
  • Risk treatment plan

2.1 Develop a data retention schedule

1-2 hours

Input: Regulatory and legal requirements, Business stakeholders’ understanding of data elements and retention requirements

Output: Formalized data retention schedule

Materials: Data Retention Schedule and Risk Identification Tool

Participants: Privacy officer, Core privacy team, InfoSec representative (optional), IT representative (optional)

Use this tool to document the retention requirements for each data type and to identify data types with retention-related risk.

  1. Download the tool using the link below and review the sample data on each tab to get a sense of what the tool’s outcome will be.
  2. Customize the information on Tab 1 to match the conventions of your organization (e.g. department names, data classification tiers, data repositories used, and purposes of processing).
  3. Complete Tab 2 by working left to right and inputting the requested details for each data type.
  4. Progress to Tab 4 to review the dashboard and identify high-risk data types.

Download Info-Tech's Data Retention Schedule and Risk Identification Tool

2.1.1 Set up the Data Retention Schedule and Risk Identification Tool

1) Use this section to list all of the in-scope departments for your data retention initiative.

2) Indicate the data classification labels used by your organization (optional)

3) List all repositories where your in-scope data resides.

Sample tables from the Data Retention Schedule and Risk Identification Tool.

4) Customize the list for purposes of processing. These will help you justify the retention decisions you have made.

Info-Tech Insight

Data is complex and changes over time. Be sure to review your data retention decisions and recordkeeping at least annually.

2.1.2 Define your data types and their repositories

Sample data retention schedule with two sections outlined. Column headers in section 1 are 'Process', 'Data Types', 'Record Contains Personal Data (Y/N)', and 'Data Owner (Name/Role)', and in section 2 are 'Retention', 'Auditability', 'Repository 1', and 'Automation Level'.
1) Record the data type and the process that collects it as well as whether it contains PII or PHI and who owns the data.
2) Indicate whether retention has been applied for the data types and if it can be audited, and list up to three repositories for the data and the extent to which they can be automated for data deletion.

Info-Tech Insight

More repositories mean more risk. If a data type resides in more than three repositories, you should consider your data flow to help reduce risk and simplify your retention-deletion process.
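For teams that also keep the schedule outside the spreadsheet, one row of sections 1 and 2 could be modelled as a small record type. This is a sketch only: the field names paraphrase the column headers described above, the class and method names are hypothetical, and the threshold of three repositories comes from the insight above.

```python
from dataclasses import dataclass, field

MAX_REPOSITORIES = 3  # from the insight above: more than three suggests sprawl


@dataclass
class ScheduleEntry:
    """One row of the retention schedule (sections 1 and 2 only)."""
    process: str
    data_type: str
    contains_personal_data: bool
    data_owner: str
    retention_applied: bool
    auditable: bool
    repositories: list[str] = field(default_factory=list)

    def repository_sprawl(self) -> bool:
        """True when the data type lives in more than three repositories."""
        return len(self.repositories) > MAX_REPOSITORIES


# Hypothetical entry for illustration.
entry = ScheduleEntry(
    process="Recruiting",
    data_type="Job applications",
    contains_personal_data=True,
    data_owner="HR Director",
    retention_applied=True,
    auditable=False,
    repositories=["ATS", "Email", "Shared drive", "Backups"],
)
if entry.repository_sprawl():
    print(f"Review the data flow for {entry.data_type}")
```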

2.1.3 Determine the value, retention time, and disposition for your data types

Sample data retention schedule continued with two different sections outlined. Column headers in section 3 are 'Opportunity', 'Trigger', 'Minimum Retention Time', and 'Perpetual Retention Justification', and in section 4 are 'Disposition', 'Lawful Basis of Processing Data', 'Purpose of Processing', and 'Data Sensitivity'. One of the 'Minimum Retention Time' fields is highlighted in red; its contents are 'Perpetual'.
3) Determine the level of opportunity for the insights the data may carry (i.e. business value), state the trigger (action or event) that causes the data type to move from active use to retention, and record the retention time (if perpetual, the cell will turn red to note the risk). If perpetual retention is used, provide a justification in the adjacent column.
4) Describe the disposition for the data type (deletion instructions), note the lawful basis for retaining that data type, indicate the purpose of processing (why data is retained), and data sensitivity.
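The relationship between the trigger, the retention time, and the disposition date can be shown with a short calculation. This sketch assumes retention is expressed in whole years and that `None` stands for perpetual retention; both are illustrative conventions, not the tool's own.

```python
from datetime import date
from typing import Optional


def earliest_disposition_date(trigger_date: date,
                              retention_years: Optional[int]) -> Optional[date]:
    """Date on which the data first becomes eligible for disposition.

    Returns None for perpetual retention (retention_years is None).
    """
    if retention_years is None:
        return None
    target_year = trigger_date.year + retention_years
    try:
        return trigger_date.replace(year=target_year)
    except ValueError:
        # Trigger fell on 29 February and the target year is not a leap
        # year: roll forward to 1 March.
        return date(target_year, 3, 1)


def missing_justification(retention_years: Optional[int],
                          justification: str) -> bool:
    """True when perpetual retention has no documented justification."""
    return retention_years is None and not justification.strip()


# Hypothetical example: a contract ends on 30 June 2021, kept for 7 years.
print(earliest_disposition_date(date(2021, 6, 30), 7))
```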

2.1.4 Record the sensitivity and classification of your data types

Sample data retention schedule continued with one different section outlined. Column headers in section 5 are 'Classification Level' and 'Other Notes'.
5) Indicate the sensitivity of the data type from the dropdown list and indicate the classification level used by your organization.

Info-Tech Insight

Retention programs without data classification tend to struggle. If you do not have an established data classification scheme, you can still proceed based on your determination of data sensitivity. However, data classification will help you to prioritize risky data types and repositories for improvements. Thus, plan on establishing a data classification scheme if one is not already in place.

2.1.5 Monitor your data retention program

Tab 4 will provide you with the following details based on your inputs on Tab 2. The dashboard will help you monitor your data retention program at a glance. Also, you will be able to use the filters provided for “Process,” “Record Types,” “Risk Rating,” and “Classification” to help you make a prioritized list of data types and repositories that present the greatest level of risk.

The risk rating has been calculated based on:

  • Whether the data type contains PII
  • The data’s overall sensitivity
  • Whether or not retention has been applied
  • The extent to which it can be audited
  • The repository’s capacity for automated retention/deletion
  • Whether or not perpetual retention is used.
Output dashboards on Tab 4 of the Data Retention Schedule and Risk Identification Tool
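To make the six factors above concrete, the following sketch shows one way such a rating could be computed. The weights, scales, and thresholds are assumptions chosen purely for illustration; the tool's actual formula is not described here, and auditability is simplified to a yes/no value.

```python
from dataclasses import dataclass


@dataclass
class RiskInputs:
    contains_pii: bool
    sensitivity: int        # assumed scale: 1 (low) to 3 (high)
    retention_applied: bool
    auditable: bool         # simplification of "extent to which it can be audited"
    automation_level: int   # assumed scale: 0 manual, 1 semi-automated, 2 automated
    perpetual_retention: bool


def risk_score(r: RiskInputs) -> int:
    """Sum illustrative weights for each factor (range 1 to 12)."""
    score = 0
    score += 2 if r.contains_pii else 0
    score += r.sensitivity
    score += 0 if r.retention_applied else 2
    score += 0 if r.auditable else 1
    score += 2 - r.automation_level
    score += 2 if r.perpetual_retention else 0
    return score


def risk_rating(score: int) -> str:
    """Map a score to a label using assumed thresholds."""
    if score >= 8:
        return "High"
    if score >= 5:
        return "Medium"
    return "Low"
```

An additive score keeps each factor's contribution easy to explain to stakeholders; an organization may prefer different weights or a rule that forces a "High" rating whenever perpetual retention is used.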

2.2 Plan to address risky data types

1 hour

Input: Regulatory and legal requirements, Business stakeholders’ understanding of data elements and retention requirements

Output: Prioritized risk-treatment strategy, List of manual processes

Materials: Data Retention Schedule and Risk Identification Tool, Sticky notes

Participants: Privacy officer, Core privacy team, InfoSec representative (optional), IT representative (optional)

Use the outputs on Tab 4 of the Data Retention Schedule and Risk Identification Tool to help you plan your risk treatment strategy.

  1. Review the status of your data retention program using the dashboard on Tab 4 of the tool.
  2. Using the filters on the Data Risk Summary, determine which data types pose the greatest risks to your data retention program, write each data type on a sticky note, and group them by risk rating (see next slide for an example).
  3. Rearrange your sticky notes as necessary to create a prioritized list of data types in need of risk treatment and your proposed solution (e.g. move it, secure it, automate its deletion, etc.)
    1. In spite of your best efforts, some processes may have to be managed manually. Be sure to set these cases aside, as we’ll return to them in Phase 3.
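The sticky-note exercise above amounts to grouping data types by risk rating and then ordering them for treatment. A minimal sketch, assuming the ratings are labelled "High", "Medium", and "Low" and that each item carries a proposed solution (all names and sample values are hypothetical):

```python
from collections import defaultdict

RATING_ORDER = {"High": 0, "Medium": 1, "Low": 2}  # assumed labels


def group_by_rating(items):
    """Group data types by risk rating, like sticky notes on a wall."""
    groups = defaultdict(list)
    for item in items:
        groups[item["risk_rating"]].append(item["data_type"])
    return dict(groups)


def prioritized(items):
    """Order items from highest to lowest risk (stable within a rating)."""
    return sorted(items, key=lambda item: RATING_ORDER[item["risk_rating"]])


# Hypothetical data types and proposed treatments.
items = [
    {"data_type": "Invoices", "risk_rating": "Low", "solution": "none"},
    {"data_type": "Job applications", "risk_rating": "High", "solution": "automate deletion"},
    {"data_type": "Support tickets", "risk_rating": "Medium", "solution": "move repository"},
    {"data_type": "Payroll records", "risk_rating": "High", "solution": "manual process"},
]
for item in prioritized(items):
    print(item["risk_rating"], item["data_type"], "->", item["solution"])
```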

Download Info-Tech's Data Retention Schedule and Risk Identification Tool

2.2.1 Assign a risk rating to your data types

Decide which category you want to filter by, and then click the icon as shown below. Then, check or uncheck as necessary to filter the results the table shows.

Info-Tech Insight

It is unlikely that we will be able to reach a perfect solution for every risk identified. In these cases, the best tactic is to alert senior leadership of the risk and provide recommended actions for risk mitigation (e.g. move data to an automatable repository). Nevertheless, we should plan on having to use some manual processes to ensure the data retention schedule is followed.

Dropdown menu from the Risk Rating column of the Data Risk Summary table below.
Sample of the Data Risk Summary table with columns 'Process', 'Data Types', 'Risk Rating', and 'Classification'. The dropdown arrow on 'Risk Rating' is circled.

Build an Effective Data Retention Program

Phase 3

Manage manual data retention processes

This phase will walk you through the following activities:

  • Identify cases where manual retention is necessary

Outcomes of this phase

  • An awareness of which data types require manual processing and where they are located.
  • The ability to track retention status for all data types.
  • Processes for manual data retention are documented and repeatable.

3.1 Identify cases where manual retention is necessary

Input: List of data types that cannot be automated, Data types which cannot be semi-automated

Output: List of data types for manual retention and their repositories

Materials: Data Retention Schedule and Risk Identification Tool

Participants: Data owners, Data stewards, Data custodians, IT operations

Many organizations struggle to automate their data retention processes. When deletion cannot be done automatically, you will likely need to determine a manual process to verify that data is getting deleted at regular intervals.

  1. Using the output from the Data Retention Schedule and Risk Identification Tool (Tab 4), determine which of your data types cannot be automated or semi-automated, and select “Yes” from the dropdown menu in the column labeled “Manual Retention Required?” (see next slide for an example).
  2. Enter the deletion date in the provided space and use the Manual Process Tracker on Tab 4 to help you manage these manual processes. Use the data repository information on Tab 2 to ensure you’ve deleted the data in all locations.

Download the Data Retention Schedule and Risk Identification Tool

3.1.1 Identify and track any manual processes for data retention-deletion

Sample of the extended Data Risk Summary table with columns 'Process', 'Data Types', 'Risk Rating', 'Classification', 'Opportunity', and 'Manual Process Required? (Y/N)'.
1) After you have determined that a given data process/type must be managed with a manual retention-deletion process, select “Yes” from the drop-down menu in Column F. Do not use this column for any data processes that will not be managed manually.
2) Once you have selected “Yes” in Column F, the process, data type, risk rating, and classification will be transferred to the Manual Process Tracker automatically. Use this section to help manage your data deletion schedule.
Sample of the Manual Process Tracker table with columns 'Process', 'Data Type', 'Risk Rating', 'Classification', 'Opportunity', and 'Deletion Date'.

Info-Tech Insight

Consider keeping a separate record of your data deletions to be able to respond to audit requests. While automated processes often produce a log to demonstrate compliance with your retention schedule, manual processes typically do not.
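A separate deletion record can be as simple as an append-only CSV file. The sketch below is one possible shape for such a log; the column names and the sample values are assumptions, and the log should record only what was deleted and where, never the deleted data itself.

```python
import csv
import io
from datetime import date

LOG_FIELDS = ["deleted_on", "data_type", "repository", "deleted_by"]


def append_deletion(log_file, deleted_on, data_type, repository, deleted_by):
    """Append one deletion event to an open, append-mode text file.

    Writes the header row first if the file is still empty.
    """
    writer = csv.DictWriter(log_file, fieldnames=LOG_FIELDS)
    if log_file.tell() == 0:
        writer.writeheader()
    writer.writerow({
        "deleted_on": deleted_on.isoformat(),
        "data_type": data_type,
        "repository": repository,
        "deleted_by": deleted_by,
    })


# Demonstration with an in-memory buffer; in practice, open a real file
# with open(path, "a", newline="").
buffer = io.StringIO()
append_deletion(buffer, date(2024, 1, 31), "Job applications", "HR shared drive", "j.doe")
print(buffer.getvalue())
```

Writing one row per repository makes it easier to show an auditor that the data was removed from every location listed in the schedule.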

3) The Deletion Date (Column W) is a blank field. Use this section to set the date the data is scheduled for deletion. Make sure to update this field as necessary to account for future deletions.
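If the tracker is exported, a small check can list which manual deletions have come due. This sketch assumes each tracker row is a dictionary with a `deletion_date` that is either a date or `None` for a blank cell; the row layout is hypothetical and not the tool's export format.

```python
from datetime import date


def deletions_due(tracker, today):
    """Return tracker rows whose deletion date has been reached, oldest first.

    Rows with a blank deletion date (None) are skipped.
    """
    due = [
        row for row in tracker
        if row["deletion_date"] is not None and row["deletion_date"] <= today
    ]
    return sorted(due, key=lambda row: row["deletion_date"])


# Hypothetical tracker rows.
tracker = [
    {"data_type": "Visitor logs", "deletion_date": date(2024, 3, 1)},
    {"data_type": "Paper contracts", "deletion_date": None},
    {"data_type": "Exit interviews", "deletion_date": date(2024, 1, 15)},
    {"data_type": "Training records", "deletion_date": date(2025, 6, 30)},
]
for row in deletions_due(tracker, today=date(2024, 3, 1)):
    print(row["deletion_date"], row["data_type"])
```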

Summary of Accomplishment

Problem Solved

You have now successfully:

  • Determined your data retention requirements
  • Drafted your retention policy and schedule
  • Identified high-risk data processes and types
  • Evaluated risk-treatment options
  • Planned for manual process management

You have implemented the core elements of your data retention program. Using your requirements, your policy, and a data retention schedule, you will be able to effectively govern your data retention program. The Data Retention Schedule and Risk Identification Tool will help you to carry out the required actions to execute, manage, and improve your data retention tasks.

If you would like additional support, have our analysts guide you through other phases as part of an Info-Tech workshop.

Contact your account representative for more information.

workshops@infotech.com 1-888-670-8889

Research Contributors and Experts

Steve Ferrigni
Executive Director, Cyber Security, Risk and Enterprise Architecture
WSIB

Fritz Jean Louis
Director, Information Security
The Globe and Mail

Amy Meger
Information and Cyber Governance Manager
Platte River Power Authority

Frank Sargent
Executive Director – Risk Program
Optiv

2 anonymous contributors

Need Extra Help?
Speak With An Analyst

Get the help you need in this 3-phase advisory process. You'll receive 5 touchpoints with our researchers, all included in your membership.

Guided Implementation #1 - Set your governance requirements
  • Call #1 - Gather data retention requirements.

Guided Implementation #2 - Complete data retention schedule
  • Call #1 - Draft data retention schedule.
  • Call #2 - Identify risky data.

Guided Implementation #3 - Manage manual data retention processes
  • Call #1 - Prioritize data repositories for risk treatment.
  • Call #2 - Determine where manual process is necessary.

Authors

Logan Rohde

Alan Tang

Isabelle Hertanto
