

# Insights


Amazon Comprehend can analyze a document or set of documents to gather insights about the content. Some of the insights that Amazon Comprehend develops about a document include:
+ [Entities](how-entities.md) – Amazon Comprehend returns a list of entities, such as people, places, and commercial items, identified in a document.
+ [Events](how-events.md) – Amazon Comprehend detects specific types of events and related details.
+ [Key phrases](how-key-phrases.md) – Amazon Comprehend extracts key phrases that appear in a document. For example, a document about a basketball game might return the names of the teams, the name of the venue, and the final score. 
+ [Personally identifiable information (PII)](pii.md) – Amazon Comprehend analyzes documents to detect personal data that identify an individual, such as an address, bank account number, or phone number. 
+ [Dominant language](how-languages.md) – Amazon Comprehend identifies the dominant language in a document. Amazon Comprehend can identify 100 languages.
+ [Sentiment](how-sentiment.md) – Amazon Comprehend determines the dominant sentiment of a document. Sentiment can be positive, neutral, negative, or mixed. 
+ [Targeted Sentiment](how-targeted-sentiment.md) – Amazon Comprehend determines the sentiment of specific entities mentioned in a document. The sentiment of each mention can be positive, neutral, negative, or mixed. 
+ [Syntax analysis](how-syntax.md) – Amazon Comprehend parses each word in your document and determines the part of speech for the word. For example, in the sentence "It is raining today in Seattle," "it" is identified as a pronoun, "raining" is identified as a verb, and "Seattle" is identified as a proper noun. 

# Entities


An *entity* is a textual reference to the unique name of a real-world object such as people, places, and commercial items, and to precise references to measures such as dates and quantities.

For example, in the text "John moved to 1313 Mockingbird Lane in 2012," "John" might be recognized as a `PERSON`, "1313 Mockingbird Lane" might be recognized as a `LOCATION`, and "2012" might be recognized as a `DATE`.

Each entity also has a score that indicates the level of confidence that Amazon Comprehend has that it correctly detected the entity type. You can filter out the entities with lower scores to reduce the risk of using incorrect detections.

The following table lists the entity types. 


| Type | Description | 
| --- | --- | 
|  COMMERCIAL_ITEM  | A branded product | 
|  DATE  | A full date (for example, 11/25/2017), day (Tuesday), month (May), or time (8:30 a.m.) | 
|  EVENT  | An event, such as a festival, concert, election, etc. | 
|  LOCATION  | A specific location, such as a country, city, lake, building, etc. | 
|  ORGANIZATION  | Large organizations, such as a government, company, religion, sports team, etc. | 
|  OTHER  | Entities that don't fit into any of the other entity categories | 
|  PERSON  | Individuals, groups of people, nicknames, fictional characters | 
|  QUANTITY  | A quantified amount, such as currency, percentages, numbers, bytes, etc. | 
|  TITLE  | An official name given to any creation or creative work, such as movies, books, songs, etc. | 

You can perform entity detection operations in any of the primary languages supported by Amazon Comprehend. These operations detect only the predefined (non-custom) entity types. All documents must be in the same language.

You can use any of the following API operations to detect entities in a document or set of documents.
+ [DetectEntities](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DetectEntities.html)
+ [BatchDetectEntities](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_BatchDetectEntities.html)
+ [StartEntitiesDetectionJob](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_StartEntitiesDetectionJob.html)

The operations return a list of [API Entity](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_Entity.html) objects, one for each entity in the document. The `BatchDetectEntities` operation returns a list of `Entity` objects, one list for each document in the batch. The `StartEntitiesDetectionJob` operation starts an asynchronous job that produces a file containing a list of `Entity` objects for each document in the job.

The following example is the response from the `DetectEntities` operation.

```
{
    "Entities": [
        {
            "Text": "today",
            "Score": 0.97,
            "Type": "DATE",
            "BeginOffset": 14,
            "EndOffset": 19
        },
        {
            "Text": "Seattle",
            "Score": 0.95,
            "Type": "LOCATION",
            "BeginOffset": 23,
            "EndOffset": 30
        }
    ],
    "LanguageCode": "en"
}
```
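
As a sketch of one way to apply the confidence score (this helper is not part of the Amazon Comprehend API), the following Python example filters a `DetectEntities` response like the one above by score. The 0.90 threshold is an arbitrary value that you should tune for your application.

```python
# Filter detected entities by confidence score. The response structure
# matches the DetectEntities example above; the 0.90 threshold is an
# arbitrary value to tune for your use case.
response = {
    "Entities": [
        {"Text": "today", "Score": 0.97, "Type": "DATE",
         "BeginOffset": 14, "EndOffset": 19},
        {"Text": "Seattle", "Score": 0.95, "Type": "LOCATION",
         "BeginOffset": 23, "EndOffset": 30},
    ],
    "LanguageCode": "en",
}

def filter_entities(response, min_score=0.90):
    """Return only the entities whose confidence meets the threshold."""
    return [e for e in response["Entities"] if e["Score"] >= min_score]

confident = filter_entities(response)
print([e["Text"] for e in confident])  # ['today', 'Seattle']
```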

# Events

**Note**  
Amazon Comprehend topic modeling, event detection, and prompt safety classification features will no longer be available to new customers, effective April 30, 2026. If you would like to use these features with new accounts, please do so before this date. No action is required for accounts that have used these features within the last 12 months. For more information, see [Amazon Comprehend feature availability change](comprehend-availability-change.md).

Use *event detection* to analyze text documents for specific types of events and their related entities. Amazon Comprehend supports event detection across large collections of documents using asynchronous analysis jobs. For more information about events, including example event analysis jobs, see [Announcing the launch of Amazon Comprehend Events](https://aws.amazon.com/blogs/machine-learning/announcing-the-launch-of-amazon-comprehend-events/).

## Entities

From the input text, Amazon Comprehend extracts a list of entities that are related to the detected event. An *entity* can be a real-world object, such as a person, place, or location; an entity can also be a concept, such as a measurement, date, or quantity. Each occurrence of an entity is identified by a *mention*, which is a textual reference to the entity in the input text. For each unique entity, all mentions are grouped into a list. This list provides details for each location in the input text where the entity occurs. Amazon Comprehend detects only the entities associated with supported event types.

Each entity associated with a supported event type returns with the following related details:
+ **Mentions**: Details for each occurrence of the same entity in the input text.
  + **BeginOffset**: A character offset in the input text that shows where the mention begins (the first character is at position 0). 
  + **EndOffset**: A character offset in the input text that shows where the mention ends.
  + **Score**: The level of confidence that Amazon Comprehend has in the accuracy of the entity's type.
  + **GroupScore**: The level of confidence from Amazon Comprehend that the mention is correctly grouped with other mentions of the same entity.
  + **Text**: The text of the entity.
  + **Type**: The entity's type. For all supported entity types, see [Entity types](#events-entity-types).

## Events

Amazon Comprehend returns the list of events (of supported event types) that it detects in the input text. Each event returns with the following related details:
+ **Type**: The event's type. For all supported event types, see [Event types](#events-types).
+ **Arguments**: A list of arguments that are related to the detected event. An *argument* consists of an entity that is related to the detected event. The argument's role describes the relationship, such as *who* did *what*, *where*, and *when*.
  + **EntityIndex**: An index value that identifies an entity from the list of entities that Amazon Comprehend returned for this analysis.
  + **Role**: The argument type, which describes how the entity for this argument is related to the event. For all supported argument types, see [Argument types](#events-argument-types).
  + **Score**: The level of confidence that Amazon Comprehend has in the accuracy of the role detection.
+ **Triggers**: A list of triggers for the detected event. A *trigger* is a single word or phrase that indicates the occurrence of the event.
  + **BeginOffset**: A character offset in the input text that shows where the trigger begins (the first character is at position 0).
  + **EndOffset**: A character offset in the input text that shows where the trigger ends.
  + **Score**: The level of confidence that Amazon Comprehend has in the accuracy of the detection.
  + **Text**: The text of the trigger.
  + **GroupScore**: The level of confidence from Amazon Comprehend that the trigger is correctly grouped with other triggers for the same event.
  + **Type**: The type of event that this trigger indicates.

## Detect events results format

When your event detection job completes, Amazon Comprehend writes the analysis results to the Amazon S3 output location that you specified when you started the job.

For each detected event, the output provides details in the following format:

```
{
   "Entities": [
     {
       "Mentions": [
         {
           "BeginOffset": number,
           "EndOffset": number,
           "Score": number,
           "GroupScore": number,
           "Text": "string",
           "Type": "string"
         }, ...
       ]    
     }, ...
   ],
   "Events": [
     {
       "Type": "string",
       "Arguments": [
         {                   
           "EntityIndex": number,   
           "Role": "string",
           "Score": number
         }, ...
       ],
       "Triggers": [
         {
           "BeginOffset": number,
           "EndOffset": number,
           "Score": number,
           "Text": "string",
           "GroupScore": number,
           "Type": "string"
         }, ...
       ]
     }, ...
   ]
 }
```
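
To illustrate how `EntityIndex` ties an event's arguments back to the **Entities** list, the following Python sketch builds a role-to-text summary from a sample result. All of the sample values (the company name, offsets, and scores) are invented for illustration; real results follow the output format shown above.

```python
# Resolve each event argument's EntityIndex back to the first mention
# text of the corresponding entity. Sample values are invented.
result = {
    "Entities": [
        {"Mentions": [{"BeginOffset": 0, "EndOffset": 9, "Score": 0.99,
                       "GroupScore": 1.0, "Text": "Acme Corp",
                       "Type": "ORGANIZATION"}]},
        {"Mentions": [{"BeginOffset": 34, "EndOffset": 38, "Score": 0.98,
                       "GroupScore": 1.0, "Text": "2021", "Type": "DATE"}]},
    ],
    "Events": [
        {"Type": "BANKRUPTCY",
         "Arguments": [{"EntityIndex": 0, "Role": "FILER", "Score": 0.95},
                       {"EntityIndex": 1, "Role": "DATE", "Score": 0.93}],
         "Triggers": [{"BeginOffset": 20, "EndOffset": 30, "Score": 0.97,
                       "Text": "bankruptcy", "GroupScore": 1.0,
                       "Type": "BANKRUPTCY"}]},
    ],
}

def event_summary(result):
    """Summarize each event as its type plus a {role: entity text} map."""
    entities = result["Entities"]
    return [
        {"Type": event["Type"],
         "Roles": {arg["Role"]: entities[arg["EntityIndex"]]["Mentions"][0]["Text"]
                   for arg in event["Arguments"]}}
        for event in result["Events"]
    ]

print(event_summary(result))
```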

## Supported types for entities, events, and arguments


### Entity types



| Type | Description | 
| --- | --- | 
| DATE | Any reference to a date or time, whether specific or general. | 
| FACILITY | Buildings, airports, highways, bridges, and other permanent man-made structures and real estate improvements. | 
| LOCATION | Physical locations such as streets, cities, states, countries, bodies of water, or geographic coordinates. | 
| MONETARY_VALUE | The value of something in US or other currency. The value can be specific or approximate. | 
| ORGANIZATION | Companies and other groups of people defined by an established organizational structure. | 
| PERSON | The names or nicknames of individuals or fictional characters. | 
| PERSON_TITLE | Any title that describes a person, usually an employment category (such as CEO) or honorific (such as Mr.). | 
| QUANTITY | A number or value and the unit of measurement. | 
| STOCK_CODE | A stock ticker symbol, such as AMZN, an International Securities Identification Number (ISIN), Committee on Uniform Securities Identification Procedures (CUSIP), or Stock Exchange Daily Official List (SEDOL). | 

### Event types



| Type | Description | 
| --- | --- | 
| BANKRUPTCY | A legal proceeding involving a person or company unable to repay outstanding debts. | 
| EMPLOYMENT | Occurs when an employee is hired, fired, retired, or otherwise changes employment state.  | 
| CORPORATE_ACQUISITION | Occurs when a company obtains possession of most or all of another company's shares or physical assets to gain control of that company. | 
| INVESTMENT_GENERAL | Occurs when a person or company purchases an asset with the prospect of generating future income or appreciation. | 
| CORPORATE_MERGER | Occurs when two or more companies unite to create a new legal entity. | 
| IPO | An initial public offering (IPO) of shares of a private corporation to the public in a new stock issuance. | 
| RIGHTS_ISSUE | A group of rights offered to existing shareholders to purchase additional stock shares, known as subscription warrants, in proportion to their existing holdings. | 
| SECONDARY_OFFERING | An offer of securities by a shareholder of a company. | 
| SHELF_OFFERING | A Securities and Exchange Commission (SEC) provision that allows an issuer to register a new issue of security and sell portions of the issue over a period of time without re-registering the security or incurring penalties. Also known as a shelf registration. | 
| TENDER_OFFERING | An offer to purchase some or all of shareholders' shares in a company. | 
| STOCK_SPLIT | Occurs when a company's board of directors increases the number of shares that are outstanding by issuing more shares to current shareholders. This event also applies to reverse stock splits. | 

### Argument types



**Argument types for BANKRUPTCY**  

| Argument type | Description | 
| --- | --- | 
| FILER | The person or company filing the bankruptcy.  | 
| DATE | The date or time of bankruptcy. | 
| PLACE | Location or facility where (or nearest to where) the bankruptcy took place. | 


**Argument types for EMPLOYMENT**  

| Type | Description | 
| --- | --- | 
| EMPLOYEE | The person employed by a company. | 
| EMPLOYEE_TITLE | The title of the employee. | 
| EMPLOYER | The person or company employing the employee. | 
| START_DATE | The start date or time of the employment. | 
| END_DATE | The end date or time of the employment. | 


**Argument types for CORPORATE_ACQUISITION and INVESTMENT_GENERAL**  

| Type | Description | 
| --- | --- | 
| AMOUNT | The monetary value associated with the transaction. | 
| INVESTEE | The person or company associated with the investment. | 
| INVESTOR | The person or company investing in the asset. | 
| DATE | The date or time of the acquisition or investment. | 
| PLACE | Location where (or nearest to where) the acquisition or investment took place. | 


**Argument types for CORPORATE_MERGER**  

| Type | Description | 
| --- | --- | 
| DATE | The date or time of the merger. | 
| NEW_COMPANY | The new legal entity resulting from the merger. | 
| PARTICIPANT | The company involved in the merger. | 


**Argument types for IPO, RIGHTS_ISSUE, SECONDARY_OFFERING, SHELF_OFFERING, and TENDER_OFFERING**  

| Type | Description | 
| --- | --- | 
| EXPIRE_DATE | The expiration date or time of the offering. | 
| INVESTOR | The person or company investing in the asset. | 
| OFFEREE | The person or company receiving the offering. | 
| OFFERING_AMOUNT | The monetary value associated with the offering. | 
| OFFERING_DATE | The date or time of the offering. | 
| OFFEROR | The person or company initiating the offering. | 
| OFFEROR_TOTAL_VALUE | The total monetary value associated with the offering. | 
| RECORD_DATE | The record date or time of the offering. | 
| SELLING_AGENT | The person or company facilitating the sale of the offering. | 
| SHARE_PRICE | The monetary value associated with the share price. | 
| SHARE_QUANTITY | The number of shares associated with the offering. | 
| UNDERWRITERS | The company associated with the underwriting of the offering. | 


**Argument types for STOCK_SPLIT**  

| Type | Description | 
| --- | --- | 
| COMPANY | The company issuing shares of the stock split. | 
| DATE | The date or time of the stock split. | 
| SPLIT_RATIO | The ratio of the increased new number of shares outstanding to the current number of shares before the stock split. | 

# Key phrases


A *key phrase* is a string containing a noun phrase that describes a particular thing. It generally consists of a noun and the modifiers that distinguish it. For example, "day" is a noun; "a beautiful day" is a noun phrase that includes an article ("a") and an adjective ("beautiful"). Each key phrase includes a score that indicates the level of confidence that Amazon Comprehend has that the string is a noun phrase. You can use the score to determine if the detection has high enough confidence for your application.

You can perform key phrase detection operations in any of the primary languages supported by Amazon Comprehend. All documents must be in the same language.

You can use any of the following operations to detect key phrases in a document or set of documents.
+ [DetectKeyPhrases](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DetectKeyPhrases.html)
+ [BatchDetectKeyPhrases](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_BatchDetectKeyPhrases.html)
+ [StartKeyPhrasesDetectionJob](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_StartKeyPhrasesDetectionJob.html)

The operations return a list of [KeyPhrase](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_KeyPhrase.html) objects, one for each key phrase in the document. The `BatchDetectKeyPhrases` operation returns a list of `KeyPhrase` objects, one list for each document in the batch. The `StartKeyPhrasesDetectionJob` operation starts an asynchronous job that produces a file containing a list of `KeyPhrase` objects for each document in the job.

The following example is the response from the `DetectKeyPhrases` operation.

```
{
    "LanguageCode": "en",
    "KeyPhrases": [
        {
            "Text": "today",
            "Score": 0.89,
            "BeginOffset": 14,
            "EndOffset": 19
        },
        {
            "Text": "Seattle",
            "Score": 0.91,
            "BeginOffset": 23,
            "EndOffset": 30
        }
    ]
}
```
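
Because `BeginOffset` is inclusive and `EndOffset` is exclusive, you can recover each key phrase from the source text with simple slicing. The following Python sketch assumes the response above came from the sentence used in the syntax analysis example.

```python
# Recover each key phrase from the source text using its offsets.
# BeginOffset is inclusive and EndOffset is exclusive, so Python slicing
# returns the phrase exactly. The source sentence is an assumption based
# on the examples in this guide.
text = "It is raining today in Seattle"
response = {
    "LanguageCode": "en",
    "KeyPhrases": [
        {"Text": "today", "Score": 0.89, "BeginOffset": 14, "EndOffset": 19},
        {"Text": "Seattle", "Score": 0.91, "BeginOffset": 23, "EndOffset": 30},
    ],
}

phrases = [text[p["BeginOffset"]:p["EndOffset"]] for p in response["KeyPhrases"]]
assert phrases == [p["Text"] for p in response["KeyPhrases"]]
print(phrases)  # ['today', 'Seattle']
```

For text containing multibyte characters, verify how the offsets map to your string indexing before relying on slicing.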

# Dominant language


You can use Amazon Comprehend to examine text to determine the dominant language. Amazon Comprehend identifies languages using identifiers from RFC 5646. If a language has a 2-letter ISO 639-1 identifier, with a regional subtag if necessary, Amazon Comprehend uses that identifier. Otherwise, it uses the 3-letter ISO 639-2 code.

For more information about RFC 5646, see [Tags for identifying languages](https://tools.ietf.org/html/rfc5646) on the *IETF Tools* web site.

The response includes a score that indicates the confidence level that Amazon Comprehend has that a particular language is the dominant language in the document. Each score is independent of the other scores. The score doesn't indicate that a language makes up a particular percentage of a document.

If a long document (such as a book) contains multiple languages, you can break the long document into smaller pieces and run the `DetectDominantLanguage` operation on the individual pieces. You can then aggregate the results to determine the percentage of each language in the longer document.
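
The aggregation step described above can be sketched in Python. This example assumes you have already split the document into chunks and collected one `DetectDominantLanguage`-shaped result per chunk; the per-chunk results shown are hypothetical.

```python
from collections import Counter

def aggregate_languages(chunk_results):
    """Given the DetectDominantLanguage result for each chunk, estimate
    the share of each language across the whole document."""
    # Take the highest-scoring language per chunk, then count chunks.
    top_codes = [
        max(result["Languages"], key=lambda lang: lang["Score"])["LanguageCode"]
        for result in chunk_results
    ]
    counts = Counter(top_codes)
    total = len(top_codes)
    return {code: count / total for code, count in counts.items()}

# Hypothetical per-chunk results for a document mixing English and French.
chunks = [
    {"Languages": [{"LanguageCode": "en", "Score": 0.98}]},
    {"Languages": [{"LanguageCode": "en", "Score": 0.95}]},
    {"Languages": [{"LanguageCode": "fr", "Score": 0.97}]},
    {"Languages": [{"LanguageCode": "fr", "Score": 0.99}]},
]
print(aggregate_languages(chunks))  # {'en': 0.5, 'fr': 0.5}
```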

Amazon Comprehend language detection has the following limitations:
+ It doesn't support phonetic language detection. For example, it doesn't detect "arigato" as Japanese or "nihao" as Chinese.
+ It may have difficulty distinguishing closely related language pairs, such as Indonesian and Malay, or Bosnian, Croatian, and Serbian.
+ For best results, provide at least 20 characters of input text.

Amazon Comprehend detects the following languages.


| Code | Language | 
| --- | --- | 
| af | Afrikaans | 
| am | Amharic | 
| ar | Arabic | 
| as | Assamese | 
| az | Azerbaijani | 
| ba | Bashkir | 
| be | Belarusian | 
| bn | Bengali | 
| bs | Bosnian | 
| bg | Bulgarian | 
| ca | Catalan | 
| ceb | Cebuano | 
| cs | Czech | 
| cv | Chuvash | 
| cy | Welsh | 
| da | Danish | 
| de | German | 
| el | Greek | 
| en | English | 
| eo | Esperanto | 
| et | Estonian | 
| eu | Basque | 
| fa | Persian | 
| fi | Finnish | 
| fr | French | 
| gd | Scottish Gaelic | 
| ga | Irish | 
| gl | Galician | 
| gu | Gujarati | 
| ht | Haitian | 
| he | Hebrew | 
| ha | Hausa | 
| hi | Hindi | 
| hr | Croatian | 
| hu | Hungarian | 
| hy | Armenian | 
| ilo | Iloko | 
| id | Indonesian | 
| is | Icelandic | 
| it | Italian | 
| jv | Javanese | 
| ja | Japanese | 
| kn | Kannada | 
| ka | Georgian | 
| kk | Kazakh | 
| km | Central Khmer | 
| ky | Kirghiz | 
| ko | Korean | 
| ku | Kurdish | 
| lo | Lao | 
| la | Latin | 
| lv | Latvian | 
| lt | Lithuanian | 
| lb | Luxembourgish | 
| ml | Malayalam | 
| mt | Maltese | 
| mr | Marathi | 
| mk | Macedonian | 
| mg | Malagasy | 
| mn | Mongolian | 
| ms | Malay | 
| my | Burmese | 
| ne | Nepali | 
| new | Newari | 
| nl | Dutch | 
| no | Norwegian | 
| or | Oriya | 
| om | Oromo | 
| pa | Punjabi | 
| pl | Polish | 
| pt | Portuguese | 
| ps | Pushto | 
| qu | Quechua | 
| ro | Romanian | 
| ru | Russian | 
| sa | Sanskrit | 
| si | Sinhala | 
| sk | Slovak | 
| sl | Slovenian | 
| sd | Sindhi | 
| so | Somali | 
| es | Spanish | 
| sq | Albanian | 
| sr | Serbian | 
| su | Sundanese | 
| sw | Swahili | 
| sv | Swedish | 
| ta | Tamil | 
| tt | Tatar | 
| te | Telugu | 
| tg | Tajik | 
| tl | Tagalog | 
| th | Thai | 
| tk | Turkmen | 
| tr | Turkish | 
| ug | Uighur | 
| uk | Ukrainian | 
| ur | Urdu | 
| uz | Uzbek | 
| vi | Vietnamese | 
| yi | Yiddish | 
| yo | Yoruba | 
| zh | Chinese (Simplified) | 
| zh-TW | Chinese (Traditional) | 

You can use any of the following operations to detect the dominant language in a document or set of documents.
+ [DetectDominantLanguage](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DetectDominantLanguage.html)
+ [BatchDetectDominantLanguage](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_BatchDetectDominantLanguage.html)
+ [StartDominantLanguageDetectionJob](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_StartDominantLanguageDetectionJob.html)

The `DetectDominantLanguage` operation returns a [DominantLanguage](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DominantLanguage.html) object. The `BatchDetectDominantLanguage` operation returns a list of `DominantLanguage` objects, one for each document in the batch. The `StartDominantLanguageDetectionJob` operation starts an asynchronous job that produces a file containing a list of `DominantLanguage` objects, one for each document in the job.

The following example is the response from the `DetectDominantLanguage` operation.

```
{
    "Languages": [
        {
            "LanguageCode": "en",
            "Score": 0.9793661236763
        }
    ]
}
```

# Sentiment


Use Amazon Comprehend to determine the sentiment of content in UTF-8 encoded text documents. For example, you can use sentiment analysis to determine the sentiments of comments on a blog posting to determine if your readers liked the post.

You can determine sentiment for documents in any of the primary languages supported by Amazon Comprehend. All documents in one job must be in the same language.

Sentiment determination returns the following values:
+ **Positive** – The text expresses an overall positive sentiment.
+ **Negative** – The text expresses an overall negative sentiment.
+ **Mixed** – The text expresses both positive and negative sentiments.
+ **Neutral** – The text does not express either positive or negative sentiments.

You can use any of the following API operations to detect the sentiment of a document or a set of documents.
+ [DetectSentiment](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DetectSentiment.html)
+ [BatchDetectSentiment](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_BatchDetectSentiment.html)
+ [StartSentimentDetectionJob](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_StartSentimentDetectionJob.html)

The operations return the most likely sentiment for the text, along with scores for each of the sentiments. The score represents the likelihood that the sentiment was correctly detected. In the example below, it is 95 percent likely that the text has a `Positive` sentiment, and less than 1 percent likely that it has a `Negative` sentiment. You can use the `SentimentScore` to determine whether the accuracy of the detection meets the needs of your application.

The `DetectSentiment` operation returns an object that contains the detected sentiment and a [SentimentScore](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_SentimentScore.html) object. The `BatchDetectSentiment` operation returns a list of sentiments and `SentimentScore` objects, one for each document in the batch. The `StartSentimentDetectionJob` operation starts an asynchronous job that produces a file containing a list of sentiments and `SentimentScore` objects, one for each document in the job.

The following example is the response from the `DetectSentiment` operation.

```
{
    "SentimentScore": {
        "Mixed": 0.030585512690246105,
        "Positive": 0.94992071056365967,
        "Neutral": 0.0141543131828308,
        "Negative": 0.00893945890665054
    },
    "Sentiment": "POSITIVE",
    "LanguageCode": "en"
}
```
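
As a sketch of one way to use `SentimentScore` (this helper is not an official API feature), the following Python example accepts the detected sentiment only when the top score clears a threshold. The 0.70 cutoff is an arbitrary value to tune for your application.

```python
def accept_sentiment(response, min_score=0.70):
    """Return the detected sentiment, or None when the top score is too
    low to trust. The 0.70 cutoff is an arbitrary, tunable value."""
    scores = response["SentimentScore"]
    top_label = max(scores, key=scores.get)  # e.g. "Positive"
    if scores[top_label] < min_score:
        return None
    return response["Sentiment"]

# Values mirror the DetectSentiment example above.
response = {
    "SentimentScore": {
        "Mixed": 0.030585512690246105,
        "Positive": 0.94992071056365967,
        "Neutral": 0.0141543131828308,
        "Negative": 0.00893945890665054,
    },
    "Sentiment": "POSITIVE",
    "LanguageCode": "en",
}
print(accept_sentiment(response))  # prints: POSITIVE
```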

# Targeted sentiment


*Targeted sentiment* provides a granular understanding of the sentiments associated with specific entities (such as brands or products) in your input documents. 

The difference between targeted sentiment and [sentiment](how-sentiment.md) is the level of granularity in the output data. Sentiment analysis determines the dominant sentiment for each input document, but doesn't provide data for further analysis. Targeted sentiment analysis determines the entity-level sentiment for specific entities in each input document. You can analyze the output data to determine the specific products and services that get positive or negative feedback.

For example, in a set of restaurant reviews, a customer provides the following review: "The tacos were delicious and the staff was friendly." Analysis of this review produces the following results:
+ **Sentiment analysis** determines whether the overall sentiment of each restaurant review is positive, negative, neutral, or mixed. In this example, the overall sentiment is positive. 
+ **Targeted sentiment analysis** determines sentiment for entities and attributes of the restaurant that customers mention in the reviews. In this example, the customer made positive comments about “tacos” and “staff”. 

Targeted sentiment provides the following outputs for each analysis job:
+ Identity of the entities mentioned in the documents.
+ Classification of the entity type for each entity mention.
+ The sentiment and a sentiment score for each entity mention.
+ Groups of mentions (co-reference groups) that correspond to a single entity.

You can use the [console](get-started-console.md) or the [API](using-api-targeted-sentiment.md) to run targeted sentiment analysis. The console and the API support real-time analysis and asynchronous analysis for targeted sentiment.

 Amazon Comprehend supports targeted sentiment for documents in the English language. 

For additional information about targeted sentiment, including a tutorial, see [ Extract granular sentiment in text with Amazon Comprehend Targeted Sentiment](https://aws.amazon.com/blogs/machine-learning/extract-granular-sentiment-in-text-with-amazon-comprehend-targeted-sentiment/) in the AWS machine learning blog. 

**Topics**
+ [Entity types](#how-targeted-sentiment-entities)
+ [Co-reference group](#how-targeted-sentiment-values)
+ [Output file organization](#how-targeted-sentiment-output)
+ [Real time analysis using the console](#how-targeted-sentiment-console)
+ [Targeted sentiment output example](#how-targeted-sentiment-example)

## Entity types


Targeted sentiment identifies the following entity types. It assigns entity type OTHER if the entity doesn’t belong in any other category. Each entity mention in the output file includes the entity type, such as `"Type": "PERSON"`.


**Entity type definitions**  

| Entity Type | Definition | 
| --- | --- | 
| PERSON | Examples include individuals, groups of people, nicknames, fictional characters, and animal names. | 
| LOCATION | Geographical locations such as countries, cities, states, addresses, geological formations, bodies of water, natural landmarks, and astronomical locations. | 
| ORGANIZATION | Examples include governments, companies, sports teams, and religions. | 
| FACILITY | Buildings, airports, highways, bridges, and other permanent man-made structures and real estate improvements. | 
| BRAND | Organization, group, or producer of a specific commercial item or line of products. | 
| COMMERCIAL_ITEM | Any non-generic purchasable or acquirable item, including vehicles, and large products that had only one item produced. | 
| MOVIE | A movie or television show. Entity could be the full name, a nickname, or a subtitle. | 
| MUSIC | A song, full or partial. Also, collections of individual music creations, such as an album or an anthology. | 
| BOOK | A book, published professionally or self-published. | 
| SOFTWARE | An officially released software product. | 
| GAME | A game, such as video games, board games, common games, or sports. | 
| PERSONAL_TITLE | Official titles and honorifics such as President, PhD, or Dr. | 
| EVENT | Examples include festival, concert, election, war, conference, and promotional event. | 
| DATE | Any reference to a date or time, whether specific or general, whether absolute or relative. | 
| QUANTITY | All measurements along with their units (currency, percent, number, bytes, etc.). | 
| ATTRIBUTE | An attribute, characteristic, or trait of an entity, such as the "quality" of a product, the "price" of a phone, or the "speed" of a CPU. | 
| OTHER | Entities that don’t belong in any of the other categories. | 

## Co-reference group


Targeted sentiment identifies co-reference groups in each input document. A co-reference group is a group of mentions in a document that correspond to one real-world entity.

**Example**  
In the following example of a customer review, “spa” is the entity, which has entity type `FACILITY`. The entity has two additional mentions, each as the pronoun “it”.

![\[Targeted sentiment co-reference group.\]](http://docs.aws.amazon.com/comprehend/latest/dg/images/gs-console-targeted-sentiment4.png)


## Output file organization


The targeted sentiment analysis job creates a JSON text output file. The file contains one JSON object for each of the input documents. Each JSON object contains the following fields:
+ **Entities** – An array of entities found in the document. 
+ **File** – The file name of the input document.
+ **Line** – If the input file is one document per line, the line number of the document in the file.

**Note**  
If targeted sentiment doesn't identify any entities in the input text, it returns an empty array as the Entities result.

The following example shows **Entities** for an input file with three lines of input. The input format is **ONE_DOC_PER_LINE**, so each line of input is a document.

```
{ "Entities":[
    {entityA},
    {entityB},
    {entityC}
    ],
  "File": "TargetSentimentInputDocs.txt",
  "Line": 0
}
{ "Entities": [
    {entityD},
    {entityE}
  ],
  "File": "TargetSentimentInputDocs.txt",
  "Line": 1
}
{ "Entities": [
    {entityF},
    {entityG}
    ],
  "File": "TargetSentimentInputDocs.txt",
  "Line": 2
}
```
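
Assuming the output file contains one JSON object per line (as with the **ONE_DOC_PER_LINE** input format), you can parse it with a few lines of Python. The sample lines below are simplified, with empty **Entities** arrays.

```python
import json

def parse_output_lines(lines):
    """Parse targeted sentiment output, assuming one JSON object
    per line. Blank lines are skipped."""
    return [json.loads(line) for line in lines if line.strip()]

# Hypothetical, simplified output lines with the fields described above.
sample = [
    '{"Entities": [], "File": "TargetSentimentInputDocs.txt", "Line": 0}',
    '{"Entities": [], "File": "TargetSentimentInputDocs.txt", "Line": 1}',
]
docs = parse_output_lines(sample)
print([doc["Line"] for doc in docs])  # [0, 1]
```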



An entity in the **Entities** array includes a logical grouping (called a co-reference group) of the entity mentions detected in the document. Each entity has the following overall structure:

```
{"DescriptiveMentionIndex": [0],
  "Mentions": [
     {mentionD},
     {mentionE}
    ]
}
```

An entity contains these fields:
+ **Mentions** – An array of mentions of the entity in the document. The array represents a co-reference group. See [Co-reference group](#how-targeted-sentiment-values) for an example. The order of mentions in the Mentions array is the order of their location (offset) in the document. Each mention includes the sentiment score and group score for that mention. The group score indicates the confidence level that these mentions belong to the same entity.
+ **DescriptiveMentionIndex** – One or more indexes into the Mentions array that provide the best name for the entity group. For example, an entity could have three mentions with **Text** values "ABC Hotel," "ABC Hotel," and "it." The best name is "ABC Hotel," so the DescriptiveMentionIndex value is [0,1]. 

Each mention includes the following fields:
+ **BeginOffset** – The offset into the document text where the mention begins.
+ **EndOffset** – The offset into the document text where the mention ends.
+ **GroupScore** – The confidence that all of the mentions in the group belong to the same entity.
+ **Text** – The text in the document that identifies the entity.
+ **Type** – The type of the entity. Amazon Comprehend supports a variety of [entity types](#how-targeted-sentiment-entities).
+ **Score** – Model confidence that the entity is relevant. Value range is zero to one, where one is highest confidence.
+ **MentionSentiment** – Contains the sentiment and sentiment score for the mention.
  + **Sentiment** – The sentiment of the mention. Values include: POSITIVE, NEUTRAL, NEGATIVE, and MIXED. 
  + **SentimentScore** – Provides the model confidence for each of the possible sentiments. Value range is zero to one, where one is highest confidence.

The **Sentiment** values have the following meaning:
+ **Positive** – The entity mention expresses a positive sentiment.
+ **Negative** – The entity mention expresses a negative sentiment.
+ **Mixed** – The entity mention expresses both positive and negative sentiments.
+ **Neutral** – The entity mention does not express either positive or negative sentiments.
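As a sketch of how these fields fit together, the following hypothetical helper resolves an entity group's best display name by following **DescriptiveMentionIndex** into the **Mentions** array. The `entity_best_name` name and the entity data (modeled on the "ABC Hotel" example above, with only the fields this code reads) are illustrative, not real API output.

```python
def entity_best_name(entity):
    """Return the entity group's best display name: the Text of the
    first mention listed in DescriptiveMentionIndex."""
    best_index = entity["DescriptiveMentionIndex"][0]
    return entity["Mentions"][best_index]["Text"]

# Hand-built entity resembling the "ABC Hotel" example above.
entity = {
    "DescriptiveMentionIndex": [0, 1],
    "Mentions": [
        {"Text": "ABC Hotel", "MentionSentiment": {"Sentiment": "POSITIVE"}},
        {"Text": "ABC Hotel", "MentionSentiment": {"Sentiment": "NEUTRAL"}},
        {"Text": "it", "MentionSentiment": {"Sentiment": "NEGATIVE"}},
    ],
}
name = entity_best_name(entity)
sentiments = [m["MentionSentiment"]["Sentiment"] for m in entity["Mentions"]]
```

Because the Mentions array is ordered by offset, collecting the per-mention sentiments this way also gives you the sentiment trajectory of the entity across the document.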

In the following example, an entity has only one mention in the input document, so the DescriptiveMentionIndex value is [0] (the index of the first mention in the Mentions array). The identified entity is a PERSON with the name "I." The sentiment is neutral.

```
{"Entities":[
  {
    "DescriptiveMentionIndex": [0],
    "Mentions": [
      {
       "BeginOffset": 0,
       "EndOffset": 1,
       "Score": 0.999997,
       "GroupScore": 1,
       "Text": "I",
       "Type": "PERSON",
       "MentionSentiment": {
         "Sentiment": "NEUTRAL",
         "SentimentScore": {
           "Mixed": 0,
           "Negative": 0,
           "Neutral": 1,
           "Positive": 0
         }
       }
     }
   ]
  }
 ],
 "File": "Input.txt",
 "Line": 0
}
```
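The **BeginOffset** and **EndOffset** values locate the mention in the source text. A minimal check of that mapping, assuming the offsets align with Python string indices (which holds for ASCII input; text with multi-byte characters may need byte-aware handling):

```python
def mention_text(document, mention):
    # Assumes the offsets map directly to Python string indices,
    # which holds for ASCII text.
    return document[mention["BeginOffset"]:mention["EndOffset"]]

document = "I enjoyed visiting the spa."
mention = {"BeginOffset": 0, "EndOffset": 1, "Text": "I", "Type": "PERSON"}
extracted = mention_text(document, mention)
```

The slice recovered from the offsets should match the mention's **Text** field, which is a useful sanity check when post-processing job output.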

## Real-time analysis using the console


You can use the Amazon Comprehend console to run [Targeted sentiment](realtime-console-analysis.md#realtime-analysis-console-targeted-sentiment) analysis in real time. Use the sample text or paste your own text into the input text box, then choose **Analyze**.

In the **Insights** panel, the console displays three views of the targeted sentiment analysis:
+ **Analyzed text** – Displays the analyzed text and underlines each entity. The color of the underline indicates the sentiment value (positive, neutral, negative, or mixed) that the analysis assigned to the entity. The console displays the color mappings at the top right corner of the analyzed text box. If you hover your cursor over an entity, the console displays a popup panel containing analysis values (entity type, sentiment score) for the entity.
+ **Results** – Displays a table containing a row for each entity mention identified in the text. For each entity, the table shows the [entity type](#how-targeted-sentiment-entities) and the entity score. The row also includes the primary sentiment and the score for each sentiment value. If there are multiple mentions of the same entity, known as a [Co-reference group](#how-targeted-sentiment-values), the table displays these mentions as a collapsible set of rows associated with the main entity. 

  If you hover over an entity row in the **Results** table, the console highlights the entity mention in the **Analyzed text** panel.
+ **Application integration** – Displays the parameter values of the API request and the structure of the JSON object returned in the API response. For a description of the fields in the JSON object, see [Output file organization](#how-targeted-sentiment-output).

### Console real-time analysis example


This example uses the following text as input, which is the default input text that the console provides.

```
Hello Zhang Wei, I am John. Your AnyCompany Financial Services, LLC credit card account 1111-0000-1111-0008 has a minimum payment 
  of $24.53 that is due by July 31st. Based on your autopay settings, we will withdraw your payment on the due date from your 
  bank account number XXXXXX1111 with the routing number XXXXX0000. 
  Customer feedback for Sunshine Spa, 123 Main St, Anywhere. Send comments to Alice at sunspa@mail.com. 
  I enjoyed visiting the spa. It was very comfortable but it was also very expensive. The amenities were ok but the service made 
  the spa a great experience.
```

The **Analyzed text** panel shows the following output for this example. Hover your mouse over the text `Zhang Wei` to view the popup panel for this entity.

![\[Targeted sentiment analyzed text.\]](http://docs.aws.amazon.com/comprehend/latest/dg/images/gs-console-targeted-sentiment2.png)


The **Results** table provides additional detail about each entity, including the entity score, the primary sentiment, and the score for each sentiment. 

![\[Targeted sentiment results table.\]](http://docs.aws.amazon.com/comprehend/latest/dg/images/gs-console-targeted-sentiment3.png)


In our example, targeted sentiment analysis recognizes that each mention of **your** in the input text is a reference to the person entity **Zhang Wei**. The console displays these mentions as a set of collapsible rows associated with the main entity.

![\[Targeted sentiment results table.\]](http://docs.aws.amazon.com/comprehend/latest/dg/images/gs-console-targeted-sentiment5.png)


The **Application integration** panel displays the JSON object that the DetectTargetedSentiment API generates. See the following section for a full example.

## Targeted sentiment output example


The following example shows the output file from a targeted sentiment analysis job. The input file consists of three simple documents:

```
The burger was very flavorful and the burger bun was excellent. However, customer service was slow.
My burger was good, and it was warm. The burger had plenty of toppings.
The burger was cooked perfectly but it was cold. The service was OK.
```

The targeted sentiment analysis of this input file produces the following output.

```
  {"Entities":[
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 4,
          "EndOffset": 10,
          "Score": 0.999991,
          "GroupScore": 1,
          "Text": "burger",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "POSITIVE",
            "SentimentScore": {
              "Mixed": 0,
              "Negative": 0,
              "Neutral": 0,
              "Positive": 1
            }
          }
        }
      ]
    },
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 38,
          "EndOffset": 44,
          "Score": 1,
          "GroupScore": 1,
          "Text": "burger",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "NEUTRAL",
            "SentimentScore": {
              "Mixed": 0.000005,
              "Negative": 0.000005,
              "Neutral": 0.999591,
              "Positive": 0.000398
            }
          }
        }
      ]
    },
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 45,
          "EndOffset": 48,
          "Score": 0.961575,
          "GroupScore": 1,
          "Text": "bun",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "POSITIVE",
            "SentimentScore": {
              "Mixed": 0.000327,
              "Negative": 0.000286,
              "Neutral": 0.050269,
              "Positive": 0.949118
            }
          }
        }
      ]
    },
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 73,
          "EndOffset": 89,
          "Score": 0.999988,
          "GroupScore": 1,
          "Text": "customer service",
          "Type": "ATTRIBUTE",
          "MentionSentiment": {
            "Sentiment": "NEGATIVE",
            "SentimentScore": {
              "Mixed": 0.000001,
              "Negative": 0.999976,
              "Neutral": 0.000017,
              "Positive": 0.000006
            }
          }
        }
      ]
    }
  ],
  "File": "TargetSentimentInputDocs.txt",
  "Line": 0
}
{
  "Entities": [
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 0,
          "EndOffset": 2,
          "Score": 0.99995,
          "GroupScore": 1,
          "Text": "My",
          "Type": "PERSON",
          "MentionSentiment": {
            "Sentiment": "NEUTRAL",
            "SentimentScore": {
              "Mixed": 0,
              "Negative": 0,
              "Neutral": 1,
              "Positive": 0
            }
          }
        }
      ]
    },
    {
      "DescriptiveMentionIndex": [
        0,
        2
      ],
      "Mentions": [
        {
          "BeginOffset": 3,
          "EndOffset": 9,
          "Score": 0.999999,
          "GroupScore": 1,
          "Text": "burger",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "POSITIVE",
            "SentimentScore": {
              "Mixed": 0.000002,
              "Negative": 0.000001,
              "Neutral": 0.000003,
              "Positive": 0.999994
            }
          }
        },
        {
          "BeginOffset": 24,
          "EndOffset": 26,
          "Score": 0.999756,
          "GroupScore": 0.999314,
          "Text": "it",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "POSITIVE",
            "SentimentScore": {
              "Mixed": 0,
              "Negative": 0.000003,
              "Neutral": 0.000006,
              "Positive": 0.999991
            }
          }
        },
        {
          "BeginOffset": 41,
          "EndOffset": 47,
          "Score": 1,
          "GroupScore": 0.531342,
          "Text": "burger",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "POSITIVE",
            "SentimentScore": {
              "Mixed": 0.000215,
              "Negative": 0.000094,
              "Neutral": 0.00008,
              "Positive": 0.999611
            }
          }
        }
      ]
    },
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 52,
          "EndOffset": 58,
          "Score": 0.965462,
          "GroupScore": 1,
          "Text": "plenty",
          "Type": "QUANTITY",
          "MentionSentiment": {
            "Sentiment": "NEUTRAL",
            "SentimentScore": {
              "Mixed": 0,
              "Negative": 0,
              "Neutral": 1,
              "Positive": 0
            }
          }
        }
      ]
    },
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 62,
          "EndOffset": 70,
          "Score": 0.998353,
          "GroupScore": 1,
          "Text": "toppings",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "NEUTRAL",
            "SentimentScore": {
              "Mixed": 0,
              "Negative": 0,
              "Neutral": 0.999964,
              "Positive": 0.000036
            }
          }
        }
      ]
    }
  ],
  "File": "TargetSentimentInputDocs.txt",
  "Line": 1
}
{
  "Entities": [
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 4,
          "EndOffset": 10,
          "Score": 1,
          "GroupScore": 1,
          "Text": "burger",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "POSITIVE",
            "SentimentScore": {
              "Mixed": 0.001515,
              "Negative": 0.000822,
              "Neutral": 0.000243,
              "Positive": 0.99742
            }
          }
        },
        {
          "BeginOffset": 36,
          "EndOffset": 38,
          "Score": 0.999843,
          "GroupScore": 0.999661,
          "Text": "it",
          "Type": "OTHER",
          "MentionSentiment": {
            "Sentiment": "NEGATIVE",
            "SentimentScore": {
              "Mixed": 0,
              "Negative": 0.999996,
              "Neutral": 0.000004,
              "Positive": 0
            }
          }
        }
      ]
    },
    {
      "DescriptiveMentionIndex": [
        0
      ],
      "Mentions": [
        {
          "BeginOffset": 53,
          "EndOffset": 60,
          "Score": 1,
          "GroupScore": 1,
          "Text": "service",
          "Type": "ATTRIBUTE",
          "MentionSentiment": {
            "Sentiment": "NEUTRAL",
            "SentimentScore": {
              "Mixed": 0.000033,
              "Negative": 0.000089,
              "Neutral": 0.993325,
              "Positive": 0.006553
            }
          }
        }
      ]
    }
  ],
  "File": "TargetSentimentInputDocs.txt",
  "Line": 2
}
```

# Syntax analysis


Use syntax analysis to parse the words from the document and return the part of speech, or syntactic function, for each word in the document. You can identify the nouns, verbs, adjectives, and so on in your document. Use this information to gain a richer understanding of the content of your documents, and to understand the relationship of the words in the document.

For example, you can look for the nouns in a document and then look for the verbs related to those nouns. In a sentence like "My grandmother moved her couch" you can see the nouns, "grandmother" and "couch," and the verb, "moved." You can use this information to build applications for analyzing text for word combinations that you are interested in.

To start the analysis, Amazon Comprehend parses the source text to find the individual words in the text. After the text is parsed, each word is assigned the part of speech that it takes in the source text.

Amazon Comprehend can identify the following parts of speech. 


| Token | Part of speech | 
| --- | --- | 
| ADJ | Adjective. Words that typically modify nouns. | 
| ADP | Adposition. The head of a prepositional or postpositional phrase. | 
| ADV | Adverb. Words that typically modify verbs. They may also modify adjectives and other adverbs. | 
| AUX | Auxiliary. Function words that accompany the verb of a verb phrase. | 
| CCONJ | Coordinating conjunction. A coordinating conjunction connects words, phrases, or clauses in a sentence without subordinating one to the other. | 
| CONJ | Conjunction. A conjunction connects words, phrases, or clauses in a sentence. | 
| DET | Determiner. Articles and other words that specify a particular noun phrase. | 
| INTJ | Interjection. Words used as an exclamation or part of an exclamation. | 
| NOUN | Noun. Words that specify a person, place, thing, animal, or idea. | 
| NUM | Numeral. Words, typically determiners, adjectives, or pronouns, that express a number. | 
| O | Other. Words that can't be assigned a part-of-speech category. | 
| PART | Particle. Function words associated with another word or phrase to impart meaning. | 
| PRON | Pronoun. Words that substitute for nouns or noun phrases. | 
| PROPN | Proper noun. A noun that is the name of a specific individual, place, or object. | 
| PUNCT | Punctuation. Non-alphabetical characters that delimit text. | 
| SCONJ | Subordinating conjunction. A conjunction that joins a dependent clause to a sentence. An example of a subordinating conjunction is "because." | 
| SYM | Symbol. Word-like entities such as the dollar sign ($) or mathematical symbols. | 
| VERB | Verb. Words that signal events and actions. | 

For more information about the parts of speech, see [Universal POS tags](http://universaldependencies.org/u/pos/) at the *Universal Dependencies* website.

The operations return tokens that identify the word and the part of speech that the word represents in the text. Each token represents a word in the source text. It provides the location of the word in the source, the part of speech that the word takes in the text, the confidence that Amazon Comprehend has that the part of speech was correctly identified, and the word that was parsed from the source text.

The following is the structure of the list of syntax tokens. One syntax token is generated for each word in the document. 

```
{
   "SyntaxTokens": [ 
      { 
         "BeginOffset": number,
         "EndOffset": number,
         "PartOfSpeech": { 
            "Score": number,
            "Tag": "string"
         },
         "Text": "string",
         "TokenId": number
      }
   ]
}
```

Each token provides the following information:
+ `BeginOffset` and `EndOffset`—Provides the location of the word in the input text. 
+ `PartOfSpeech`—Provides two pieces of information: the `Tag` that identifies the part of speech, and the `Score` that represents the confidence that Amazon Comprehend has that the part of speech was correctly identified.
+ `Text`—Provides the word that was identified.
+ `TokenId`—Provides an identifier for the token. The identifier is the position of the token in the list of tokens.
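Building on the grandmother/couch example earlier, the following sketch filters tokens by their part-of-speech tag, for instance to collect the nouns and verbs in a document. The `words_with_tag` helper and the hand-built sample response are illustrative; the sample simply follows the `SyntaxTokens` structure shown above.

```python
def words_with_tag(syntax_response, tag, min_score=0.9):
    """Return the words whose part-of-speech Tag matches, keeping only
    tokens the model scored above min_score."""
    return [t["Text"] for t in syntax_response["SyntaxTokens"]
            if t["PartOfSpeech"]["Tag"] == tag
            and t["PartOfSpeech"]["Score"] >= min_score]

# Hand-built response for the sentence "My grandmother moved her couch".
response = {"SyntaxTokens": [
    {"TokenId": 1, "Text": "My", "BeginOffset": 0, "EndOffset": 2,
     "PartOfSpeech": {"Tag": "PRON", "Score": 0.99}},
    {"TokenId": 2, "Text": "grandmother", "BeginOffset": 3, "EndOffset": 14,
     "PartOfSpeech": {"Tag": "NOUN", "Score": 0.99}},
    {"TokenId": 3, "Text": "moved", "BeginOffset": 15, "EndOffset": 20,
     "PartOfSpeech": {"Tag": "VERB", "Score": 0.99}},
    {"TokenId": 4, "Text": "her", "BeginOffset": 21, "EndOffset": 24,
     "PartOfSpeech": {"Tag": "PRON", "Score": 0.98}},
    {"TokenId": 5, "Text": "couch", "BeginOffset": 25, "EndOffset": 30,
     "PartOfSpeech": {"Tag": "NOUN", "Score": 0.99}},
]}
nouns = words_with_tag(response, "NOUN")
verbs = words_with_tag(response, "VERB")
```

The `min_score` threshold is a design choice: dropping low-confidence tags trades recall for precision, which is often the right trade when you feed the results into downstream matching.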