

# Data Validation and Demand Pattern Analysis


Data Validation and Demand Pattern Analysis tools help you evaluate the quality of your data and identify the key patterns that influence your demand forecasts.

**Topics**
+ [Data Validation](data-validation.md)
+ [Demand Pattern and Recommendation](demand-patterns.md)

# Data Validation


Data Validation is a crucial early step in the forecast creation process that ensures your input data meets the quality standards required for forecasting. This feature runs a series of checks on your data and surfaces errors that must be fixed before forecast creation can proceed, helping you identify and resolve issues early.

The data validation step is preceded by a set of preprocessing activities that prepare the data based on the plan settings or definition. These include the following:
+ *Aggregation to align with forecast granularity.* For example:
  + If your forecast granularity is set to weekly, daily demand history data will be aggregated to weekly totals.
  + If your demand history contains product, site, customer, and channel dimensions, but your forecast granularity is set to product-site level, the system will aggregate sales across all customers and channels for each product-site combination.
+ *Data transformations from Demand Plan settings.* These transformations are based on your Demand Planning configuration settings. For example, if you have configured the system to ignore negative values, these will be handled accordingly.
+ *Product lineage consideration*. The system takes into account product relationships, such as predecessor-successor pairs or product alternatives, as defined in your configuration.
+ *Supplementary time series transformation*. The system transforms supplementary time series data into demand drivers that can influence the forecast generation. These transformed demand drivers provide additional context to the items above. 
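The aggregation step above can be sketched in plain Python. This is a minimal illustration, not the service's implementation: the record values are made up, and `week_start` assumes weekly buckets begin on Monday, which is an assumption for the example. Column names follow the documented outbound order line schema.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily demand records at product-site-channel grain.
records = [
    {"product_id": "P1", "site_id": "S1", "channel_id": "web",
     "order_date": date(2024, 1, 1), "final_quantity_requested": 10},
    {"product_id": "P1", "site_id": "S1", "channel_id": "store",
     "order_date": date(2024, 1, 2), "final_quantity_requested": -2},
    {"product_id": "P1", "site_id": "S1", "channel_id": "web",
     "order_date": date(2024, 1, 9), "final_quantity_requested": 5},
]

# Plan setting in this sketch: ignore negative demand values.
records = [r for r in records if r["final_quantity_requested"] >= 0]

def week_start(d):
    """Roll a date back to the Monday of its week (weekly bucket)."""
    return d - timedelta(days=d.weekday())

# Forecast granularity is product-site, weekly: aggregate across channels
# and roll daily dates up to weekly buckets.
weekly = defaultdict(int)
for r in records:
    key = (r["product_id"], r["site_id"], week_start(r["order_date"]))
    weekly[key] += r["final_quantity_requested"]
```

After this pass, each key is one product-site-week combination, mirroring how daily, channel-level history collapses to the configured forecast granularity.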

**Topics**
+ [Data Validation Process](data-validation-process.md)
+ [Data Validation Report Access](data-validation-report-access.md)
+ [Data Validation Error Export](data-validation-error-export.md)
+ [Data Validation Rules](data-validation-rules.md)

# Data Validation Process


After the preprocessing described above completes, the data validation process begins. Data validation consists of three steps:

1. Data Structure Validation - This step includes checks to ensure all required tables and columns exist and contain data before any transformation begins (see [Demand Planning](required_entities.md)). This stage confirms your data tables are properly set up.

1. Data Quality Validation - This step ensures that data content is complete and error-free. It checks for:
   + Missing values in essential fields
   + Valid data formats and dates
   + Data completeness required for building forecast input

   This ensures all necessary data is present and valid before proceeding with transformations.

1. Forecasting Eligibility Validation - This step ensures that sufficient data is provided to create a forecast, including:
   + Minimum historical data requirements
   + Time series length limitations
   + Other algorithm-specific constraints

   This stage ensures that your data is suitable for generating forecasts.

Even a single validation failure will stop the forecast creation process. You must work with your data administrator to correct the underlying data issues, then choose **Retry** to try forecast creation again.
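The fail-fast, staged behavior described above can be sketched as follows. The stage names come from this page; the check functions, data shape, and error messages are illustrative stand-ins, not the actual service internals.

```python
def check_required_columns(data):
    """Data Structure stage: required columns must exist (names from the
    documented outbound order line schema)."""
    required = {"product_id", "order_date", "final_quantity_requested"}
    missing = required - set(data["outbound_order_line"][0])
    return [f"missing column: {c}" for c in sorted(missing)]

def check_no_null_dates(data):
    """Data Quality stage: order_date must not be null/empty."""
    return ["null order_date" for row in data["outbound_order_line"]
            if row.get("order_date") is None]

STAGES = [
    ("Data Structure Validation", [check_required_columns]),
    ("Data Quality Validation", [check_no_null_dates]),
]

def validate(data):
    """Run stages in order; a single failure stops the process, so later
    stages never run until earlier ones pass."""
    for stage, checks in STAGES:
        errors = [e for check in checks for e in check(data)]
        if errors:
            return {"passed": False, "stage": stage, "errors": errors}
    return {"passed": True, "stage": None, "errors": []}
```

In this sketch, a structural failure surfaces before any content check runs, matching the banner-first behavior described in the report access section.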

# Data Validation Report Access


When creating a forecast for the first time, navigate to the **Demand Planning** module in AWS Supply Chain and choose **Create a Plan**. The system guides you through three steps: Data Ingestion, Plan Configuration, and finally, Forecast Generation. After completing data ingestion and plan configuration, choose **Generate Forecast** to initiate data validation. Each new forecast generation creates a fresh validation report based on the current state of your data.

 Data Structure validation failures (such as missing tables or columns) appear as banner messages at the top of your screen. These fundamental issues must be resolved before proceeding. After data structure validation passes, the system proceeds with Data Quality and Forecasting Eligibility validations. Any failures in these stages are detailed in the validation report, accessible by choosing **Data Validations**.

## Subsequent Forecast Creation


For subsequent forecasts, choose **Generate Forecast**. You will see a banner displaying three steps, with data validation as the first step. The same validation behavior applies. Structural issues appear as banners, while other validation failures are available in the detailed report.

## Report Content


The Data Validation Issues report provides a comprehensive view of Data Quality and Forecasting Eligibility validation failures that need to be addressed. The report displays the following:
+ Dataset: Identifies the specific dataset where the issue occurs
+ Rule: Describes the type of validation that failed
+ Error Date/Time: Shows when the error was detected
+ Status Message: Provides detailed information about the records affected and recommended actions

To help navigate and resolve these issues, you can do the following:
+ Use the search box to find specific types of errors
+ Filter by dataset using the drop-down menu
+ Download a detailed report containing all validation failures
+ View **Records affected** for each validation to understand the scope of the issue

# Data Validation Error Export


You can export error records by choosing **Download** on the **Data Validation** report page when the validation checks individual data points.

**Note**  
The export option is not available when the validation is checking structural, systemic, or aggregate-level requirements. 

Export is available for the following:
+ Validation checks for content or quality of existing data
+ Validations that involve checking for missing or invalid values in existing fields
+ Data Quality Validations (such as null checks and date range validations)

**Note**  
 The system limits error record downloads to a maximum of 10,000 rows. If the total error count exceeds this limit, a notification will appear on the screen. Work with your data administrator to review and resolve all errors in the source table. 

 Export is not available for the following:
+ Validation checks for structural elements (such as table existence or column presence)
+ Validations that involve system-level constraints (such as size limits, counts, and thresholds)
+ Forecasting eligibility checks (such as time series limits or active product counts)
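The export rules above can be summarized in a small sketch. The function name and return shape are hypothetical; the rule-type gate and the 10,000-row cap come from this page.

```python
MAX_EXPORT_ROWS = 10_000  # documented download limit

def export_error_records(rule_type, error_rows):
    """Illustrative gate: only record-level data quality failures are
    exportable; structural and eligibility checks have no record export.
    Downloads are truncated to the documented cap."""
    if rule_type != "Data Quality Validation":
        return None
    truncated = len(error_rows) > MAX_EXPORT_ROWS
    return {"rows": error_rows[:MAX_EXPORT_ROWS], "truncated": truncated}
```

When `truncated` is true, the real system notifies you on screen; all errors still need to be resolved in the source table, not just the exported subset.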

# Data Validation Rules


The validations performed prior to forecast creation are listed below. For more information, see [Demand Planning](required_entities.md).



| Rule Type | Rule | Datasets | Description | Export error records? | 
| --- | --- | --- | --- | --- | 
| Data Structure Validation | Mandatory columns existence validation | Product, Outbound order line, Supplementary time series |  Verifies the presence of critical columns in required datasets. Outbound order line: product\_id, order\_date, final\_quantity\_requested. Product: id, description. Also verifies the presence of critical columns in recommended datasets, if provided. Supplementary Time Series: id, order\_date, time\_series\_name, time\_series\_value  | No | 
| Data Structure Validation | Granularity columns existence validation | Product, Outbound order line |  Verifies the presence of columns set as forecast granularity, if set in the demand plan settings. Outbound order line: product\_id, ship\_from\_site\_id, ship\_to\_site\_id, ship\_to\_site\_address\_city, ship\_to\_address\_state, ship\_to\_address\_country, channel\_id, customer\_tpartner\_id. Product: id, product\_group\_id, product\_type, brand\_name, color, display\_desc, parent\_product\_id  | No | 
| Data Structure Validation | Active product's history validation | Product, Outbound order line, Product Alternate | Verifies that there is at least one active product that has history on its own or through product lineage | No | 
| Data Quality Validation | Missing values in mandatory columns validation | Product, Outbound order line, Supplementary time series | Checks for null/empty values in the mandatory columns specified in the Mandatory columns existence validation | Yes | 
| Data Quality Validation | Missing values in granularity columns validation | Product, Outbound order line | Checks for null/empty values in the granularity columns specified in the Granularity columns existence validation | Yes | 
| Data Quality Validation | Date Range validation | OutboundOrderLine, SupplementaryTimeSeries | The order\_date column in the dataset must contain dates in a valid range: from 01/01/1900 00:00:00 to 12/31/2050 00:00:00.  | Yes | 
| Forecasting Eligibility Validation | Timeseries per Predictor validation | OutboundOrderLine |  The timeseries per predictor must not exceed 5,000,000. "Timeseries per predictor" is calculated by taking the count of unique values for the product\_id column and for each of the forecast granularity columns, and then taking the product of those counts.  | No | 
| Forecasting Eligibility Validation | Count of active products validation | Product | The number of active products with records in the OOL dataset must not exceed 800,000. | No | 
| Forecasting Eligibility Validation | Historical data sufficiency validation | Outbound order line |  Verifies that at least one product in the dataset has sufficient historical demand data to generate reliable forecasts. The forecast horizon must be no greater than 1/3 of the time range in the dataset (when training a new auto predictor) or 1/4 of the time range (when retraining an existing auto predictor). There is also a global maximum forecast horizon of 500.  | No | 
| Forecasting Eligibility Validation | Row Count validation | Partitioned OutboundOrderLine | The number of records in the partitioned OOL dataset must not exceed 3,000,000,000. Certain forecast models have smaller limits, which are also checked here when those models are used. | No | 
| Forecasting Eligibility Validation | Maximum Timeseries validation | Partitioned OutboundOrderLine |  The number of distinct timeseries must not exceed the model's limit, if there is one. "Distinct timeseries" is defined as the number of distinct rows in the dataset when product\_id plus all forecast granularity columns are considered.  | No | 
| Forecasting Eligibility Validation |  Data Density validation  | Partitioned OutboundOrderLine |  The data density of the dataset must be at least 5. Data density is defined as (total number of rows in the dataset) / (number of distinct products in the dataset); in other words, the average number of rows per product. This rule applies only when Prophet is selected as the forecasting algorithm.  | No | 
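Two of the eligibility metrics defined in the table can be computed directly from their descriptions. This is a minimal sketch over toy rows; the function names are illustrative, and the thresholds are taken from the table above.

```python
TIMESERIES_PER_PREDICTOR_LIMIT = 5_000_000
MIN_DATA_DENSITY = 5  # applies only when Prophet is the algorithm

def timeseries_per_predictor(rows, granularity_cols):
    """Product of the unique-value counts for product_id and each
    forecast granularity column."""
    count = len({r["product_id"] for r in rows})
    for col in granularity_cols:
        count *= len({r[col] for r in rows})
    return count

def data_density(rows):
    """Average rows per product: total rows / distinct products."""
    products = {r["product_id"] for r in rows}
    return len(rows) / len(products)
```

For example, a dataset with 2 distinct products and 3 distinct sites (site as the only granularity column) yields a timeseries-per-predictor value of 6, well under the limit.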

# Demand Pattern and Recommendation


Demand Pattern and Recommendation examines the transformed historical demand input at each configured forecast granularity level (for example, product, location, or channel) to uncover underlying patterns and characteristics in your demand data. Its primary purpose is to identify the distribution of key demand patterns: smooth, intermittent, erratic, and lumpy. It also provides statistical insights about the length of history and trailing 12-month demand.

The analysis is triggered automatically after successful data validation during the forecast generation process, and it runs in parallel with forecast creation without blocking or delaying it. Because the analysis is part of the same workflow as data validation, any data validation failure prevents both the analysis from being generated and the forecast from being created.

By providing this analytical overview, the system helps users understand the patterns in the dataset to improve forecast accuracy. 

# Demand Patterns Components


Demand Patterns analysis happens on three dimensions:
+ Demand Patterns (based on how demand changes over time and in quantity)
+ Annual Demand (total quantity demanded over a 12-month period)
+ History Length (the time period for which historical demand data is available)

The analysis categorizes your demand patterns into four distinct types: smooth, intermittent, erratic, and lumpy. Each is determined by analyzing the frequency and variability of demand. Eligible in-scope products with no historical data are grouped under the **Zero Forecast Demand** section. For more information, see [Demand pattern](https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/overview_dp.html#demand-pattern).

The distribution of demand patterns across your products provides valuable insights into expected forecast reliability. Products with smooth demand patterns (showing consistent order volumes and frequencies) typically yield the most reliable forecasts, because their behavior is more predictable. In contrast, erratic or lumpy patterns, characterized by irregular spikes and varying order frequencies, generally result in lower forecast reliability due to their unpredictable nature. By understanding this distribution, demand planners can set appropriate expectations and take proactive measures.
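A common way to formalize "frequency and variability of demand" is the Syntetos-Boylan classification, which uses the average inter-demand interval (ADI) and the squared coefficient of variation (CV²). The sketch below uses the conventional cutoffs (ADI 1.32, CV² 0.49); these thresholds are an assumption for illustration, not the service's documented values.

```python
from statistics import mean, pstdev

def classify_demand(series):
    """Classify a per-period demand series as smooth, intermittent,
    erratic, or lumpy using illustrative Syntetos-Boylan cutoffs."""
    nonzero = [q for q in series if q > 0]
    if not nonzero:
        return "zero"  # no history: zero-forecast-demand bucket
    adi = len(series) / len(nonzero)              # average inter-demand interval
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2  # squared coeff. of variation
    if adi < 1.32:
        return "smooth" if cv2 < 0.49 else "erratic"
    return "intermittent" if cv2 < 0.49 else "lumpy"
```

Under this scheme, steady order volumes every period classify as smooth (the most forecastable case), while sparse and highly variable quantities classify as lumpy (the least forecastable).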

The system also analyzes your trailing 12-month demand (subject to trimming configuration), also known as Annual Demand, immediately preceding your forecast start date. For example, assume the forecast start date is January 15, 2024 (a Monday) and the planning bucket is weekly. The system considers the trailing 12-month analysis period to be from January 16, 2023 to January 14, 2024. The trailing 12-month demand analysis helps demand planners distinguish between active and inactive products, while identifying products transitioning between these states, patterns that directly impact forecast reliability. By focusing on recent history rather than older data patterns, you can make more informed decisions about which products need special attention or alternative forecasting approaches, particularly for cases like seasonal items, discontinued products, or items in phase-out. For more information, see [Forecast Algorithms](https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/forecast-algorithims.html).
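One reading of the dated example above is a 52-week (364-day) window ending the day before the forecast start date. The sketch below reproduces that example; whether the service always uses exactly 364 days, or adjusts for the planning bucket, is an assumption here.

```python
from datetime import date, timedelta

def trailing_12m_window(forecast_start):
    """Illustrative reconstruction of the documented example: a 364-day
    window ending the day before the forecast start date."""
    end = forecast_start - timedelta(days=1)
    start = forecast_start - timedelta(days=364)
    return start, end
```

Applied to the documented example (forecast start January 15, 2024), this yields the stated window of January 16, 2023 through January 14, 2024.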

The history length in years is calculated for each forecast granularity (for example, product-location combination) based on the earliest and latest dates available in your preprocessed historical demand data, after adjusting the dates to the default start of the period. This analysis helps determine if products have accumulated enough historical data to generate reliable forecasts, with a minimum of two years typically needed to capture seasonal patterns and long-term trends. 
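The history-length check above reduces to a span between the earliest and latest demand dates. This simplified sketch uses 365-day years and made-up dates, and omits the period-start adjustment mentioned above.

```python
from datetime import date

def history_length_years(dates):
    """History length from the earliest to the latest demand date,
    in 365-day years (period-start adjustment omitted)."""
    return (max(dates) - min(dates)).days / 365

def has_sufficient_history(dates, minimum_years=2):
    """Roughly two years of history is typically needed to capture
    seasonal patterns and long-term trends."""
    return history_length_years(dates) >= minimum_years
```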

![Raw demand history](http://docs.aws.amazon.com/aws-supply-chain/latest/userguide/images/raw-demand-history.png)


# Demand Patterns Recommendations


The system provides targeted recommendations based on identified demand patterns to help improve forecast accuracy. For products displaying erratic demand, characterized by irregular spikes in order volume, the system suggests incorporating potential external influences, such as promotions or price changes. In such cases, you can significantly improve forecast accuracy by collaborating with your data administrator to upload relevant demand driver data to the [demand drivers](https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/demand_drivers.html) table in the data lake. This additional context helps the forecasting models better understand and predict demand fluctuations. 

For products with insufficient history (less than 2 years) or no history at all, the system recommends leveraging alternate product mapping. This approach allows you to utilize the demand patterns of similar, established products to enhance forecast reliability. Work with your data administrator to upload these product relationships to the [product lineage](https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/product_lineage.html) table in the data lake. This is particularly important because accurate seasonality and long-term trend detection requires at least 2 full years of historical data. By mapping to alternate products with sufficient history, you can establish a more reliable forecast baseline for newer or limited-history products.

# Demand Pattern and Recommendation Report Access




## First time forecast creation


When creating a forecast for the first time, under the **Demand Planning** module in AWS Supply Chain, choose **Create a Plan**. The system guides you through three steps: Data Ingestion, Plan Configuration, and finally, Forecast Generation. After completing data ingestion and plan configuration, choose **Generate Forecast** to initiate data validation. Upon successful validation, the system performs demand pattern analysis, and you see a hyperlink to access this analysis while your forecast generates. 

## Subsequent forecast creation


For subsequent forecasts, choose **Generate Forecast**. You see a banner displaying three steps: data validation, demand pattern analysis & recommendation, and forecast creation. After data validation is successful and the demand pattern analysis is complete, access the report by choosing its hyperlink in the banner. 

## Report content


The Demand Pattern and Recommendations report provides a summary view of exploratory data analysis at your configured forecast level for a given plan. At the top of the screen, you see five key pattern cards that show how your products are distributed: Smooth patterns, Intermittent patterns, Erratic patterns, Lumpy patterns, and Products with Zero Historical Demand.

Below this summary, you can find a detailed table breaking down patterns by the highest configured level in the product hierarchy in the Demand Plan Settings. For example, if your product hierarchy is configured as product id, then product group id, you will see the summary at the product group id level. For each category, you can see the following:
+ \# Forecasts, indicating the number of unique time series eligible for forecasting and their percentage of the total
+ The annual demand volume and its percentage of total
+ A visual breakdown of demand pattern within that category
+ A visual breakdown of the length of history available within that category

To help you navigate this information, you can do the following:
+ Use the search box to find specific product categories
+ Download a detailed report containing the analysis for each individual forecast at your configured granularity level
+ Sort by product category, \# Forecasts, or Annual Demand to focus on specific metrics. For product categories containing alphanumeric formats or blank values, using the search function may be more effective.

## Ongoing access


After each successful forecast creation, you can revisit this analysis on the **Demand Pattern** tab in the forecast review pages. In this view, the analysis responds to any filters you apply in the forecast review. The downloaded report contains analysis specific to your filtered selection.