Carnegie Mellon University

Guidelines for Data Classification

Purpose

The purpose of this guideline is to establish a framework for classifying institutional data based on its level of sensitivity, value, and criticality to the university as required by the university's Information Security Policy. Classification of data will aid in determining baseline security controls for the protection of data.

Applies To

This guideline applies to all faculty, staff, students, and third-party agents of the university and any other university affiliate authorized to access institutional data. In particular, this guideline applies to those who are responsible for classifying and protecting institutional data, as defined by Information Security Roles and Responsibilities.

Note: This Guideline applies to all operational and research data.

Definitions

The definitions below are for use within the Guidelines for Data Classification.

An affiliate is anyone associated with the university, including students, staff, faculty, emeritus faculty, and any sponsored guests. Most individuals affiliated with the university have an Andrew userID.

Confidential data is a generalized term typically representing data classified as restricted according to the data classification scheme defined in this guideline. This term is often used interchangeably with sensitive data.

A data steward is a senior-level employee of the university who oversees the lifecycle of one or more sets of institutional data. See the Information Security Roles and Responsibilities for more information.

Institutional data is defined as all data owned or licensed by the university. 

Non-public information is defined as any information that is classified as private or restricted information according to the data classification scheme defined in this guideline.

Sensitive data is a generalized term typically representing data classified as restricted according to the data classification scheme defined in this guideline. This term is often used interchangeably with confidential data.

Data Classification

Data classification, in the context of information security, is the classification of data based on its level of sensitivity and the impact to the university should that data be disclosed, altered, or destroyed without authorization. Data classification helps determine what baseline security controls are appropriate for safeguarding that data. All institutional data should be classified into one of four sensitivity levels or classifications:

Restricted-Specific
  • Definition: Data that is classified as restricted but also has additional requirements for protection based on sponsors, contracts, regulations, and/or data use agreements.
  • Example: Health or credit card information

Restricted
  • Definition: Data should be classified as restricted when the unauthorized disclosure, alteration, or destruction of that data could cause a significant level of risk to the University or its affiliates. Examples of restricted data include data protected by state or federal privacy regulations and data protected by confidentiality agreements. The highest level of security controls should be applied to restricted data.
  • Example: Social security numbers

Private
  • Definition: Data should be classified as private when the unauthorized disclosure, alteration, or destruction of that data could result in a moderate level of risk to the university or its affiliates. By default, all institutional data that is not explicitly classified as restricted or public should be treated as private. A reasonable level of security controls should be applied to private data.
  • Example: Home addresses

Public
  • Definition: Data should be classified as public when the unauthorized disclosure, alteration, or destruction of that data would result in little or no risk to the university and its affiliates. Examples of public data include press releases, course information, and research publications. While little or no controls are required to protect the confidentiality of public data, some control is required to prevent unauthorized modification or destruction of public data.
  • Example: Course schedule

Classification of data should be performed by an appropriate data steward. Data stewards are senior-level university employees who govern the lifecycle of one or more sets of institutional data. See Information Security Roles and Responsibilities for more information on the data steward role and associated responsibilities.

See the Data Classification Workflow for the process used to classify data.

Data Collections

Data stewards may wish to assign a single classification to a collection of data that is common in purpose or function. When classifying a data collection, the most restrictive classification of any of the individual data elements should be used. For example, if a data collection consists of a student's name, CMU email address, and social security number, the data collection should be classified as restricted even though the student's name and CMU email address may be considered public information.
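The "most restrictive element wins" rule can be sketched in a few lines of Python. This is an illustration only, not a university tool: the `Classification` enum, its numeric ordering (including the assumption that Restricted-Specific ranks above Restricted), and the `classify_collection` helper are all hypothetical names. Treating an unclassified element as private follows the default stated in the classification definitions.

```python
from enum import IntEnum


class Classification(IntEnum):
    # Hypothetical ordering: a higher value means a more restrictive level.
    PUBLIC = 1
    PRIVATE = 2
    RESTRICTED = 3
    RESTRICTED_SPECIFIC = 4  # assumed to rank above RESTRICTED


def classify_collection(elements):
    """Return the most restrictive classification among the elements.

    `elements` maps an element name to a Classification, or to None when
    the element has not been explicitly classified.
    """
    # Unclassified elements are treated as private by default.
    levels = [
        level if level is not None else Classification.PRIVATE
        for level in elements.values()
    ]
    # Assumption: an empty collection also falls back to the private default.
    if not levels:
        return Classification.PRIVATE
    return max(levels)


student_record = {
    "name": Classification.PUBLIC,
    "cmu_email": Classification.PUBLIC,
    "social_security_number": Classification.RESTRICTED,
}
print(classify_collection(student_record).name)  # RESTRICTED
```

An ordered enum keeps the comparison explicit: adding a level means adding one member rather than rewriting the rule.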

Reclassification

Periodically, it is important to reevaluate the classification of institutional data to ensure the assigned classification is still appropriate based on changes to legal and contractual obligations as well as changes in the use of the data or its value to the university. This evaluation should be conducted by the appropriate data steward. Conducting an evaluation on an annual basis is encouraged; however, the data steward should determine what frequency is most appropriate based on available resources. If a data steward determines that the classification of a certain data set has changed, an analysis of security controls should be performed to determine whether existing controls are consistent with the new classification. If gaps are found in existing security controls, they should be corrected in a timely manner, commensurate with the level of risk presented by the gaps.

Calculating Classification

The goal of information security, as stated in the university's Information Security Policy, is to protect the confidentiality, integrity, and availability of institutional data. Data classification reflects the level of impact to the university if confidentiality, integrity, or availability is compromised.

Unfortunately, there is no perfect quantitative system for calculating the classification of a particular data element. In some situations, the appropriate classification may be more obvious, such as when federal laws require the university to protect certain types of data (e.g., personally identifiable information). If the appropriate classification is not inherently obvious, consider each security objective using the following table as a guide. It is an excerpt from Federal Information Processing Standards (FIPS) publication 199, published by the National Institute of Standards and Technology, which discusses the categorization of information and information systems.

Potential impact by security objective

Confidentiality
Preserving authorized restrictions on information access and disclosure, including means for protecting personal privacy and proprietary information.
  • LOW: The unauthorized disclosure of information could be expected to have a limited adverse effect on organizational operations, organizational assets, or individuals.
  • MODERATE: The unauthorized disclosure of information could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals.
  • HIGH: The unauthorized disclosure of information could be expected to have a severe or catastrophic adverse effect on organizational operations, organizational assets, or individuals.

Integrity
Guarding against improper information modification or destruction, including ensuring information non-repudiation and authenticity.
  • LOW: The unauthorized modification or destruction of information could be expected to have a limited adverse effect on organizational operations, organizational assets, or individuals.
  • MODERATE: The unauthorized modification or destruction of information could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals.
  • HIGH: The unauthorized modification or destruction of information could be expected to have a severe or catastrophic adverse effect on organizational operations, organizational assets, or individuals.

Availability
Ensuring timely and reliable access to and use of information.
  • LOW: The disruption of access to or use of information or an information system could be expected to have a limited adverse effect on organizational operations, organizational assets, or individuals.
  • MODERATE: The disruption of access to or use of information or an information system could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals.
  • HIGH: The disruption of access to or use of information or an information system could be expected to have a severe or catastrophic adverse effect on organizational operations, organizational assets, or individuals.

As the total potential impact on the university increases from low to high, data classification should become more restrictive, moving from public to restricted. If an appropriate classification is still unclear after considering these points, contact the Information Security Office for assistance.
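One way to turn the impact ratings into a starting point is a "high-water mark": rate each security objective and take the highest rating. The sketch below is an illustration under stated assumptions, not part of the guideline. The one-to-one mapping from impact level to classification (low to public, moderate to private, high to restricted) and the names `Impact` and `suggest_classification` are hypothetical, and the result is only a suggestion for the data steward to confirm.

```python
from enum import IntEnum


class Impact(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Assumed mapping. The guideline states only that classification should
# become more restrictive as potential impact rises from low to high.
SUGGESTED = {
    Impact.LOW: "public",
    Impact.MODERATE: "private",
    Impact.HIGH: "restricted",
}


def suggest_classification(confidentiality, integrity, availability):
    """Suggest a classification from the highest of the three impact ratings."""
    overall = max(confidentiality, integrity, availability)
    return SUGGESTED[overall]


print(suggest_classification(Impact.LOW, Impact.MODERATE, Impact.LOW))  # private
```

Taking the maximum means a single high rating is enough to suggest restricted, which errs on the side of stronger controls when the ratings disagree.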

The Information Security Office and the Office of General Counsel have defined several types of Restricted data based on state and federal regulatory requirements. This list does not encompass all types of restricted data. Predefined types of restricted information are defined as follows:


1. Authentication Verifier
An Authentication Verifier is a piece of information that is held in confidence by an individual and used to prove that the person is who they say they are. In some instances, an Authentication Verifier may be shared amongst a small group of individuals. An Authentication Verifier may also be used to prove the identity of a system or service. Examples include, but are not limited to:
  • Passwords
  • Shared secrets
  • Cryptographic private keys

Classification: Restricted
2. Covered Financial Information
See the University's Gramm-Leach-Bliley Information Security Program.
Classification: Restricted-Specific
3. Electronic Protected Health Information (EPHI)
EPHI is defined as any Protected Health Information (PHI) that is stored in or transmitted by electronic media. For the purpose of this definition, electronic media includes:
  • Electronic storage media, including computer hard drives and any removable and/or transportable digital memory medium, such as magnetic tape or disk, optical disk, or digital memory card.
  • Transmission media used to exchange information already in electronic storage media. Transmission media includes, for example, the Internet, an extranet (using Internet technology to link a business with information accessible only to collaborating parties), leased lines, dial-up lines, private networks, and the physical movement of removable and/or transportable electronic storage media. Certain transmissions, including of paper, via facsimile, and of voice, via telephone, are not considered to be transmissions via electronic media because the information being exchanged did not exist in electronic form before the transmission.
Classification: Restricted-Specific
4. Export Controlled Materials

Export Controlled Materials are defined as any information or materials that are subject to the United States export control regulations, including, but not limited to, the Export Administration Regulations (EAR) published by the US Department of Commerce and the International Traffic in Arms Regulations (ITAR) published by the US Department of State. See the Office of Research Integrity and Compliance's FAQ on Export Control for more information.

Classification: Restricted-Specific

5. Federal Tax Information (FTI)
FTI is defined as any return, return information, or taxpayer return information that is entrusted to the University by the Internal Revenue Service. See Internal Revenue Service Publication 1075 Exhibit 2 for more information.
Classification: Restricted
6. Payment Card Information

Payment card information is defined as a credit card number (also referred to as a primary account number or PAN) in combination with one or more of the following data elements:

  • Cardholder name
  • Service code
  • Expiration date
  • CVC2, CVV2, or CID value
  • PIN or PIN block
  • Contents of a credit card’s magnetic stripe

Payment Card Information is also governed by the University's PCI DSS Policy and Guidelines (login required).

Classification: Restricted-Specific

7. Personally Identifiable Education Records
Personally Identifiable Education Records are defined as any Education Records that contain one or more of the following personal identifiers:
  • Name of the student
  • Name of the student’s parent(s) or other family member(s)
  • Social security number
  • Student number
  • A list of personal characteristics that would make the student’s identity easily traceable
  • Any other information or identifier that would make the student’s identity easily traceable

See Carnegie Mellon's Policy on Student Privacy Rights for more information on what constitutes an Education Record.

Classification: Restricted-Specific
8. Personally Identifiable Information
For the purpose of meeting security breach notification requirements, PII is defined as a person’s first name or first initial and last name in combination with one or more of the following data elements:
  • Social security number
  • State-issued driver’s license number
  • State-issued identification card number
  • Financial account number in combination with a security code, access code, or password that would permit access to the account
  • Medical and/or health insurance information

Classification: Restricted
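The "name in combination with one or more data elements" test can be expressed as a small predicate. The sketch below is illustrative only and is not a legal determination or a university tool: the field names (`first_name`, `ssn`, and so on) and the function `is_breach_notification_pii` are hypothetical, chosen to mirror the list above.

```python
# Hypothetical field names mirroring the data elements listed above.
STANDALONE_ELEMENTS = {
    "ssn",
    "drivers_license_number",
    "state_id_number",
    "medical_or_health_insurance_info",
}
# A financial account number counts only when paired with one of these.
ACCOUNT_ACCESS_FIELDS = {"security_code", "access_code", "password"}


def is_breach_notification_pii(fields):
    """Return True if the given field names meet the PII definition above."""
    fields = set(fields)
    # First name or first initial, together with last name.
    has_name = "last_name" in fields and bool(fields & {"first_name", "first_initial"})
    has_account = (
        "financial_account_number" in fields
        and bool(fields & ACCOUNT_ACCESS_FIELDS)
    )
    has_element = bool(fields & STANDALONE_ELEMENTS) or has_account
    return has_name and has_element


print(is_breach_notification_pii({"first_initial", "last_name", "ssn"}))  # True
print(is_breach_notification_pii({"first_name", "last_name"}))            # False
```

A record that fails this check is not necessarily unrestricted; other categories in this list, such as education records or PHI, may still apply.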
9. Protected Health Information (PHI)
PHI is defined as individually identifiable health information transmitted by electronic media, maintained in electronic media, or transmitted or maintained in any other form or medium by a Covered Component, as defined in Carnegie Mellon’s HIPAA Policy. PHI is considered individually identifiable if it contains one or more of the following identifiers:
  • Name
  • Address (all geographic subdivisions smaller than state, including street address, city, county, precinct, or zip code)
  • All elements of dates (except year) related to an individual (including birth date, admission date, discharge date, date of death, and exact age if over 89)
  • Telephone numbers
  • Fax numbers
  • Electronic mail addresses
  • Social security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate/license numbers
  • Vehicle identifiers and serial numbers, including license plate number
  • Device identifiers and serial numbers
  • Universal Resource Locators (URLs)
  • Internet protocol (IP) addresses
  • Biometric identifiers, including finger and voice prints
  • Full-face photographic images and any comparable images
  • Any other unique identifying number, characteristic, or code that could identify an individual

Per Carnegie Mellon's HIPAA Policy, PHI does not include education records or treatment records covered by the Family Educational Rights and Privacy Act or employment records held by the University in its role as an employer.

Classification: Restricted-Specific
10. Controlled Technical Information (CTI)
Controlled Technical Information means technical information with military or space applications that is subject to controls on the access, use, reproduction, modification, performance, display, release, disclosure, or dissemination per DFARS 252.204-7012.
Classification: Restricted-Specific
11. For Official Use Only (FOUO)
Documents and data labeled or marked For Official Use Only are a precursor of Controlled Unclassified Information (CUI) as defined by the National Archives (NARA).
Classification: Restricted-Specific
12. Personal Data from European Union (EU)

The EU’s General Data Protection Regulation (GDPR) defines personal data as any information that can identify a natural person, directly or indirectly, by reference to an identifier, including:

  • Name
  • An identification number
  • Location data
  • An online identifier
  • One or more factors specific to the physical, physiological, genetic, mental, economic, cultural, or social identity of that natural person

Any personal data that is collected from individuals in European Economic Area (EEA) countries is subject to GDPR. For questions, send an email to gdpr-info@andrew.cmu.edu.

Classification: Restricted

13. Controlled Unclassified Information (CUI)

Controlled Unclassified Information (CUI), as defined by the National Archives (NARA), is a designation from the US government for information that must be protected according to specific requirements (see NIST SP 800-171).

CUI is an umbrella term for multiple other data types, such as Controlled Technical Information (CTI), For Official Use Only (FOUO), and Export Controlled information. Personally Identifiable Information can also be CUI when given to the University as part of a Federal government contract or sub-contract.

Classification: Restricted-Specific

Revision History

  • Version 1.0 (published 11/16/22): Guideline moved from the ISO site.
  • Version 2.0 (published 4/14/23): Guideline was updated and approved by the Data Stewardship Council.