

# SEC 7  How do you classify your data?
<a name="sec-07"></a>

Classification provides a way to categorize data, based on criticality and sensitivity in order to help you determine appropriate protection and retention controls.

**Topics**
+ [SEC07-BP01 Identify the data within your workload](sec_data_classification_identify_data.md)
+ [SEC07-BP02 Define data protection controls](sec_data_classification_define_protection.md)
+ [SEC07-BP03 Automate identification and classification](sec_data_classification_auto_classification.md)
+ [SEC07-BP04 Define data lifecycle management](sec_data_classification_lifecycle_management.md)

# SEC07-BP01 Identify the data within your workload
<a name="sec_data_classification_identify_data"></a>

 You need to understand the type and classification of data your workload is processing, the associated business processes, data owner, applicable legal and compliance requirements, where it’s stored, and the resulting controls that are needed to be enforced. This may include classifications to indicate if the data is intended to be publicly available, if the data is internal use only such as customer personally identifiable information (PII), or if the data is for more restricted access such as intellectual property, legally privileged or marked sensitive, and more. By carefully managing an appropriate data classification system, along with each workload’s level of protection requirements, you can map the controls and level of access or protection appropriate for the data. For example, public content is available for anyone to access, but important content is encrypted and stored in a protected manner that requires authorized access to a key for decrypting the content. 

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
<a name="implementation-guidance"></a>
+  Consider discovering data using Amazon Macie: Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property. 
  +  [Amazon Macie](https://aws.amazon.com/macie/) 

## Resources
<a name="resources"></a>

 **Related documents:** 
+  [Amazon Macie](https://aws.amazon.com/macie/) 
+  [Data Classification Whitepaper](https://docs.aws.amazon.com/whitepapers/latest/data-classification/data-classification.html) 
+  [Getting started with Amazon Macie](https://docs.aws.amazon.com/macie/latest/user/getting-started.html) 

 **Related videos:** 
+  [Introducing the New Amazon Macie](https://youtu.be/I-ewoQekdXE) 

# SEC07-BP02 Define data protection controls
<a name="sec_data_classification_define_protection"></a>

 Protect data according to its classification level. For example, secure data classified as public by using relevant recommendations while protecting sensitive data with additional controls. 

By using resource tags, separate AWS accounts per sensitivity (and potentially also for each caveat, enclave, or community of interest), IAM policies, AWS Organizations SCPs, AWS Key Management Service (AWS KMS), and AWS CloudHSM, you can define and implement your policies for data classification and protection with encryption. For example, if you have a project with S3 buckets that contain highly critical data or Amazon Elastic Compute Cloud (Amazon EC2) instances that process confidential data, they can be tagged with a `Project=ABC` tag. Only your immediate team knows what the project code means, and it provides a way to use attribute-based access control. You can define levels of access to the AWS KMS encryption keys through key policies and grants to ensure that only appropriate services have access to the sensitive content through a secure mechanism. If you are making authorization decisions based on tags you should make sure that the permissions on the tags are defined appropriately using tag policies in AWS Organizations.

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
<a name="implementation-guidance"></a>
+  Define your data identification and classification schema: Identification and classification of your data is performed to assess the potential impact and type of data you store, and who can access it. 
  +  [AWS Documentation](https://docs.aws.amazon.com/) 
+  Discover available AWS controls: For the AWS services you are or plan to use, discover the security controls. Many services have a security section in their documentation. 
  +  [AWS Documentation](https://docs.aws.amazon.com/) 
+  Identify AWS compliance resources: Identify resources that AWS has available to assist. 
  +  [https://aws.amazon.com/compliance/](https://aws.amazon.com/compliance/?ref=wellarchitected) 

## Resources
<a name="resources"></a>

 **Related documents:** 
+  [AWS Documentation](https://docs.aws.amazon.com/) 
+  [Data Classification whitepaper](https://docs.aws.amazon.com/whitepapers/latest/data-classification/data-classification.html) 
+  [Getting started with Amazon Macie](https://docs.aws.amazon.com/macie/latest/user/getting-started.html) 
+  [AWS Compliance](https://aws.amazon.com/compliance/) 

 **Related videos:** 
+  [Introducing the New Amazon Macie](https://youtu.be/I-ewoQekdXE) 

# SEC07-BP03 Automate identification and classification
<a name="sec_data_classification_auto_classification"></a>

 Automating the identification and classification of data can help you implement the correct controls. Using automation for this instead of direct access from a person reduces the risk of human error and exposure. You should evaluate using a tool, such as [Amazon Macie](https://aws.amazon.com/macie/), that uses machine learning to automatically discover, classify, and protect sensitive data in AWS. Amazon Macie recognizes sensitive data, such as personally identifiable information (PII) or intellectual property, and provides you with dashboards and alerts that give visibility into how this data is being accessed or moved. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
<a name="implementation-guidance"></a>
+  Use Amazon Simple Storage Service (Amazon S3) Inventory: Amazon S3 inventory is one of the tools you can use to audit and report on the replication and encryption status of your objects. 
  +  [Amazon S3 Inventory](https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-inventory.html) 
+  Consider Amazon Macie: Amazon Macie uses machine learning to automatically discover and classify data stored in Amazon S3.
  +  [Amazon Macie](https://aws.amazon.com/macie/) 

## Resources
<a name="resources"></a>

 **Related documents:** 
+  [Amazon Macie](https://aws.amazon.com/macie/) 
+  [Amazon S3 Inventory](https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-inventory.html) 
+  [Data Classification Whitepaper](https://docs.aws.amazon.com/whitepapers/latest/data-classification/data-classification.html) 
+  [Getting started with Amazon Macie](https://docs.aws.amazon.com/macie/latest/user/getting-started.html) 

 **Related videos:** 
+  [Introducing the New Amazon Macie](https://youtu.be/I-ewoQekdXE) 

# SEC07-BP04 Define data lifecycle management
<a name="sec_data_classification_lifecycle_management"></a>

 Your defined lifecycle strategy should be based on sensitivity level as well as legal and organization requirements. Aspects including the duration for which you retain data, data destruction processes, data access management, data transformation, and data sharing should be considered. When choosing a data classification methodology, balance usability versus access. You should also accommodate the multiple levels of access and nuances for implementing a secure, but still usable, approach for each level. Always use a defense in depth approach and reduce human access to data and mechanisms for transforming, deleting, or copying data. For example, require users to strongly authenticate to an application, and give the application, rather than the users, the requisite access permission to perform action at a distance. In addition, ensure that users come from a trusted network path and require access to the decryption keys. Use tools, such as dashboards and automated reporting, to give users information from the data rather than giving them direct access to the data. 

 **Level of risk exposed if this best practice is not established:** Low 

## Implementation guidance
<a name="implementation-guidance"></a>
+  Identify data types: Identify the types of data that you are storing or processing in your workload. That data could be text, images, binary databases, and so forth. 

## Resources
<a name="resources"></a>

 **Related documents:** 
+  [Data Classification Whitepaper](https://docs.aws.amazon.com/whitepapers/latest/data-classification/data-classification.html) 
+  [Getting started with Amazon Macie](https://docs.aws.amazon.com/macie/latest/user/getting-started.html) 

 **Related videos:** 
+  [Introducing the New Amazon Macie](https://youtu.be/I-ewoQekdXE) 