

 On October 7, 2026, AWS will discontinue support for Amazon Lookout for Equipment. After October 7, 2026, you will no longer be able to access the Lookout for Equipment console or resources. For more information, [see the following](https://aws.amazon.com/blogs/machine-learning/preserve-access-and-explore-alternatives-for-amazon-lookout-for-equipment/). 

# Reviewing data ingestion
<a name="understanding-ingestion-validation"></a>

Lookout for Equipment has [ingested your data](ingest-dataset.md#ingesting-data). Now it's time to make sure everything went according to plan.

**Note**  
After ingestion, a red or green status bar will appear at the top of the console screen. Although a green status bar indicates success, there may still be issues with specific files or sensors. It is still necessary to review the data validation summary.

**Topics**
+ [Reviewing the job](when-ingestion-jobs-fail.md)
+ [Checking the files](when-files-dont-get-ingested.md)
+ [Evaluating sensor grades](reading-details-by-sensor.md)

**Next steps:**
+ If your entire job did not succeed, then a red bar has appeared at the top of the **Ingest dataset** page. In that case, it's time to [review the job](when-ingestion-jobs-fail.md).
+ If the job itself succeeded, but not every file was ingested, then you'll find yourself on the details page for your dataset, with an error message indicating that there was a problem ingesting certain files. In that case, it's time to [check the files](when-files-dont-get-ingested.md).
+ If you did not receive any error messages about the ingestion job as a whole or about specific files, then it's time to look at your data's [details by sensor](reading-details-by-sensor.md).
+ If you want to make changes to your dataset based on what you've learned so far, and then re-ingest it, skip to [replacing your dataset](replacing-your-dataset.md).

# Reviewing the job
<a name="when-ingestion-jobs-fail"></a>

Few datasets are perfectly formed. Missing or incorrectly formatted values are common. Therefore, it's not feasible to fail an ingestion job because of a single error.

Lookout for Equipment operates with a bias toward complete ingestion. In other words, when it encounters a problem in the ingested data, Lookout for Equipment attempts to fix that problem automatically. Then it alerts you to whatever issues it encountered, and lets you know what fixes it implemented.

If your entire job fails, consider the following possibilities:

1. The files are not .csv files, or they are corrupted, or they are unreadable for some other reason.

1. The files were not named or organized as explained under Adding your data.

1. The files contain no data, or 100% of the data they contain is not formatted in a way that Lookout for Equipment recognizes.

If your ingestion job fails, check the issues above and make the appropriate adjustments. When you’re ready to try again, go back to [Adding your dataset](ingest-dataset.md).
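
Before retrying, it can help to run a quick local check against the first and third possibilities above. The following is a minimal sketch, not part of Lookout for Equipment: the function name `preflight` is hypothetical, and it assumes UTF-8 encoded files with a header row.

```python
import csv
from pathlib import Path


def preflight(path):
    """Return a list of problems that would likely stop a file from being read."""
    path = Path(path)
    problems = []
    if path.suffix.lower() != ".csv":
        problems.append("not a .csv file")
    try:
        # Assumption: files are UTF-8 encoded and start with a header row.
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            first_row = next(reader, None)
    except (OSError, UnicodeDecodeError, csv.Error) as exc:
        problems.append(f"unreadable: {exc}")
        return problems
    if header is None:
        problems.append("file is empty")
    elif first_row is None:
        problems.append("no data rows after the header")
    return problems
```

An empty list means the file passed these basic checks. It does not guarantee that the values inside are formatted in a way that Lookout for Equipment recognizes.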

**Important**  
This page is about troubleshooting the ingestion of *an entire job*. You can also read about [why some specific files don't get ingested](when-files-dont-get-ingested.md), and about [evaluating the data from specific sensors](reading-details-by-sensor.md).

## Checking the logs
<a name="job-fails-check-logs"></a>

If you enabled CloudWatch Logs, then the logs may help you troubleshoot ingestion issues. The published logs may include the following error codes:
+ COMPLETE_SENSOR_DATA_MISSING: A sensor has no valid data associated with it. The log contains the sensor name and the associated component name.
+ DATA_MISSING_IN_COLUMN: Data associated with a sensor is invalid at a particular timestamp. Along with the sensor name and associated component name, the log contains details about the timestamp and the associated file path.
+ UNSUPPORTED_DATE_FORMATS: A value in the timestamp column is invalid. The log contains details about the timestamp string, the path of the file, and the associated component name.
+ INSUFFICIENT_SENSOR_DATA: A sensor is associated with less than [14 days](formatting-data.md#understanding-date-range) of data. The log contains the sensor name, the component name, and the date range of data (in days) associated with the sensor.
+ DUPLICATE_TIMESTAMPS: A value in the timestamp column of the data is a duplicate entry. The log contains the timestamp in question and the associated file path.
+ FILES_NOT_INGESTED: A file was not ingested during the ingestion workflow. The log contains details about the file's path.
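
If you export the log messages as plain text, a short script can show which error codes dominate. The sketch below assumes each message contains the error code spelled with underscores (for example, `DUPLICATE_TIMESTAMPS`). The exact log message layout is not described here, so the function only searches for the code names, and `tally_error_codes` is a hypothetical helper name.

```python
from collections import Counter

# Error codes listed above, spelled with underscores.
ERROR_CODES = (
    "COMPLETE_SENSOR_DATA_MISSING",
    "DATA_MISSING_IN_COLUMN",
    "UNSUPPORTED_DATE_FORMATS",
    "INSUFFICIENT_SENSOR_DATA",
    "DUPLICATE_TIMESTAMPS",
    "FILES_NOT_INGESTED",
)


def tally_error_codes(messages):
    """Count how many log messages mention each known error code."""
    counts = Counter()
    for message in messages:
        for code in ERROR_CODES:
            if code in message:
                counts[code] += 1
    return counts
```

Calling `tally_error_codes(messages).most_common()` on a list of message strings returns the codes ordered from most to least frequent, which can suggest where to start troubleshooting.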

# Checking the files
<a name="when-files-dont-get-ingested"></a>

If Lookout for Equipment fails to ingest a particular file, consider the following possibilities:
+ None of the sensors listed in the file have any data that can be ingested.
+ The file is not a .csv file, or the file is corrupted, or the file cannot be read for some other reason.

To troubleshoot files that were not ingested:

1. From the **Job details** tab of the main console page for your dataset, note the names of any files that failed the ingestion process.

1. To address issues with file formatting, see [Formatting your data](formatting-data.md#formatting-data.title).

1. To address issues with individual sensors, see [Understanding sensor quality](reading-details-by-sensor.md#reading-details-by-sensor.title).

1. When you’re ready to try again, see [Replacing your dataset](replacing-your-dataset.md#replacing-your-dataset.title).

**Important**  
This page is about troubleshooting the ingestion of *specific files*. You can also read about [why the ingestion of an entire job can fail](when-ingestion-jobs-fail.md), and about [evaluating the data from specific sensors](reading-details-by-sensor.md).

## Anticipating schema detection problems
<a name="anticipating-schema-issues"></a>

The following circumstances will lead to the failure of an entire ingestion job:
+ One or more column headers contain one or more invalid characters. 

  A single invalid character in a single column in a single file is enough to fail an entire job involving multiple files.
+ In a job consisting of a single file, that file has a formatting issue that prevents ingestion.
+ In a job consisting of multiple files, every single file has a formatting issue that prevents ingestion.

The easiest way to prevent problems with file ingestion is to take the following precautions:
+ Make sure your headers don't include any invalid characters, such as spaces.

  Valid characters are: 0-9, a-z, A-Z, and \$1 \$1 . \$1 - (hyphen) \$1 (underscore)
+ Make sure that the timestamp column is the one furthest to the left in your CSV file.
+ Make sure that you don't have any duplicated column headers.
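
The three precautions above can be checked locally before you upload a file. The following is a sketch under stated assumptions: it uses a conservative character set (letters, digits, period, hyphen, and underscore), so the service may accept symbols that this check rejects. It also assumes the timestamp column is named `Timestamp`, and `check_headers` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: a conservative allowed set (letters, digits, period, hyphen, underscore).
VALID_HEADER = re.compile(r"[0-9A-Za-z._-]+")


def check_headers(headers, timestamp_name="Timestamp"):
    """Return a list of header problems; an empty list means none were found."""
    problems = []
    for name in headers:
        # fullmatch rejects empty headers and any character outside the set.
        if not VALID_HEADER.fullmatch(name):
            problems.append(f"invalid characters in header {name!r}")
    if not headers or headers[0] != timestamp_name:
        problems.append("timestamp column is not the leftmost column")
    for name, count in Counter(headers).items():
        if count > 1:
            problems.append(f"duplicated header {name!r}")
    return problems
```

For example, `check_headers(["Timestamp", "Sensor 1"])` reports the space in `Sensor 1` as an invalid character.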

# Evaluating sensor grades
<a name="reading-details-by-sensor"></a>

This is where you can dive deep and troubleshoot exactly why you’re getting the error codes, and make some decisions about whether you want to remove some sensors from your dataset.

Even if your ingestion job succeeds as a whole, and all your individual files also ingest successfully, you may decide not to use all the data from your sensors.

For each sensor, Lookout for Equipment tallies up the number of issues that arise. Based on how many issues occur for each sensor, Lookout for Equipment issues that sensor a grade.

**Important**  
This page is about evaluating the quality of the data coming from *specific sensors*. You can also read about [why the ingestion of an entire job can fail](when-ingestion-jobs-fail.md), and about [why the ingestion of a particular file can fail](when-files-dont-get-ingested.md).

**Sensor grades**
+ **High**

  No validation errors were detected in the data during ingestion. Data from sensors in this category is considered the most reliable for model training and evaluation.
+ **Medium**

  One or more potentially harmful validation errors were detected in the data during ingestion. Data from sensors in this category is considered less reliable for model training and evaluation.
+ **Low**

  One or more significant validation errors were detected in the data during ingestion. There's a high probability that training a model on data from sensors in this category will result in poor model performance. 


**Individual sensor errors**  

| Error | Explanation | Data quality | Action taken by Lookout for Equipment | Action recommended for customer | 
| --- | --- | --- | --- | --- | 
| No data found | No data is present for this sensor. | Low | Cannot use data from this sensor | Do not use this sensor. | 
| Insufficient data | Less than [14 days](formatting-data.md#understanding-date-range) of data provided. | Low | Lookout for Equipment cannot use data from this sensor. | This sensor cannot be used. | 
| Monotonic values detected | Data only goes up, only goes down, or remains virtually static. | Low | Lookout for Equipment can use this sensor, but there is a risk of a high number of false positive alerts. | Review this sensor and update the sensor if necessary. We recommend that you do not use monotonic sensors. | 
| Large data gaps detected | Data has at least one gap longer than 30 days. | Medium | Lookout for Equipment will forward fill all the missing values. | Review missing values and update sensor if necessary. The data gaps may cause false alerts. | 
| Multiple operating modes detected | Data shifts between ranges. | Medium | Lookout for Equipment can use this sensor, but there is a risk of a high number of false positive alerts. | Multiple operating modes add variability. Ensure all normal modes of operation are present in both the training dataset and the evaluation dataset. | 
| Missing values detected | Total number of missing values is above 10%. | Medium | If used, the missing values will be forward filled. | Review the missing values and update the sensor if necessary. | 
| Categorical values detected | This sensor has 10 or fewer distinct values. | Medium | Lookout for Equipment can use this sensor, but there is a risk of a high number of false positive alerts. | Review the categorical values and update the sensor if necessary. Categorical values may lead to a higher number of false positive alerts. | 
| Constant values detected | The value does not change over time. | Medium |  | This sensor can be used, but it is not likely to add value. | 
| Non-numerical values detected | Non-numerical data is present in this sensor. | Medium | The unsupported data will be removed and treated as missing values, then forward filled. | Review the non-numerical data and update the sensor if necessary. | 
| Duplicate timestamps detected | There are two or more rows that have the exact same timestamp.  | Medium | The last encountered data point will be ingested, and the remaining duplicates will be omitted. | Review the duplicate timestamps and update the sensor if necessary. | 
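
To anticipate how a sensor might be graded, you can approximate some of these checks on your own data. The sketch below is not the service's implementation. It covers only the checks with explicit thresholds in the table (14 days, 30 days, 10%, 10 distinct values), and it skips the monotonic, operating mode, and non-numerical checks. It assumes that a sensor with a single distinct value is reported as constant instead of categorical, and that the grade is the most severe data quality level among the detected issues. `find_issues` and `grade` are hypothetical names.

```python
from datetime import datetime, timedelta

# Data quality level for each issue, copied from the table above.
SEVERITY = {
    "No data found": "Low",
    "Insufficient data": "Low",
    "Large data gaps detected": "Medium",
    "Missing values detected": "Medium",
    "Categorical values detected": "Medium",
    "Constant values detected": "Medium",
    "Duplicate timestamps detected": "Medium",
}


def find_issues(readings):
    """readings: list of (datetime, float or None) pairs for one sensor."""
    values = [value for _, value in readings if value is not None]
    if not values:
        return ["No data found"]
    issues = []
    times = sorted(timestamp for timestamp, _ in readings)
    if times[-1] - times[0] < timedelta(days=14):
        issues.append("Insufficient data")
    if any(b - a > timedelta(days=30) for a, b in zip(times, times[1:])):
        issues.append("Large data gaps detected")
    if (len(readings) - len(values)) / len(readings) > 0.10:
        issues.append("Missing values detected")
    distinct = len(set(values))
    if distinct == 1:
        issues.append("Constant values detected")
    elif distinct <= 10:
        issues.append("Categorical values detected")
    if len(set(times)) < len(times):
        issues.append("Duplicate timestamps detected")
    return issues


def grade(issues):
    """Assumption: the grade is the worst severity among the detected issues."""
    severities = {SEVERITY[issue] for issue in issues}
    if "Low" in severities:
        return "Low"
    return "Medium" if severities else "High"
```

A sensor with no detected issues receives `High` from `grade`, which matches the description of the High grade above. The grades that Lookout for Equipment reports remain the authoritative result.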

## Choosing the best sensors for your project
<a name="choosing-sensors"></a>

Use this information to decide which sensors are right for your project.

A **high-grade sensor**, from the point of view of Lookout for Equipment, is a sensor that did not trigger any errors in the table above. However, just because it's eligible to contribute doesn't mean it should. For example, suppose that the sensor is not actually attached to the asset that you're trying to monitor.

Suppose that the sensor is attached, instead, to the leg of the table that the asset sits on. The sensor might collect data related to vibration or heat, and the data it collects may not trigger any of the errors in the table above. But that doesn't mean that the data is actually useful. The data the sensor is collecting may not be relevant to the operation of your asset. Even if the data is relevant, another sensor, nearby but better positioned, may already be collecting the most useful data for that part of the asset. Just because the data from a particular sensor doesn't trigger any of the errors above doesn't mean that it ought to be selected for your model.

A **medium-grade sensor** collects data that triggers at least one error from the table above. But that doesn't necessarily mean that you shouldn't use that sensor in your model. For example, your sensor may have been labeled as medium-grade because it duplicated a timestamp once over the course of 14 days. 

Based on your knowledge of the asset and how the data was collected, you may decide that Lookout for Equipment's method of remediation (keeping a single record for each duplicated timestamp) is appropriate and productive. On the other hand, after receiving the alert, you may review the data, find many duplicate timestamps, and decide that the duplications indicate a problem with how the data was collected. You may then decide not to use data from that sensor in your model.
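
To judge whether that remediation suits your data, you can preview its effect locally. The sketch below follows the table above, which says that the last encountered data point is ingested for a duplicated timestamp and that missing values are forward filled. It is an approximation under those assumptions, not the service's code, and both function names are hypothetical.

```python
def keep_last_duplicates(readings):
    """Keep the last encountered value for each timestamp, in time order."""
    latest = {}
    for timestamp, value in readings:
        latest[timestamp] = value  # later rows overwrite earlier ones
    return sorted(latest.items())


def forward_fill(readings):
    """Replace each missing value (None) with the most recent earlier value."""
    filled = []
    previous = None
    for timestamp, value in readings:
        if value is None:
            value = previous  # stays None if nothing precedes it
        filled.append((timestamp, value))
        previous = value
    return filled
```

Comparing the output of `forward_fill(keep_last_duplicates(readings))` with your raw readings shows how much of the sensor's data would be replaced, which can inform the decision to keep or drop the sensor.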

Data from a **low-grade sensor** contains a problem that may interfere with the accuracy of your model. We recommend that you do not include sensors with low-grade data when building your model. However, you may still choose to do so. 

**Next Steps:**
+ If you've just chosen **Create model**, then it's time to [Train your model](create-model.md).
+ If you've changed your mind and decided to start the data ingestion process over, choose [Replace your dataset](replacing-your-dataset.md).
+ If this isn't the first time you've ingested a dataset with Lookout for Equipment, you may want to [View your ingestion history](viewing-ingestion-history.md).