

# Syntax analysis
<a name="how-syntax"></a>

Use syntax analysis to parse the words from the document and return the part of speech, or syntactic function, for each word in the document. You can identify the nouns, verbs, adjectives and so on in your document. Use this information to gain a richer understanding of the content of your documents, and to understand the relationship of the words in the document.

For example, you can look for the nouns in a document and then look for the verbs related to those nouns. In a sentence like "My grandmother moved her couch" you can see the nouns, "grandmother" and "couch," and the verb, "moved." You can use this information to build applications for analyzing text for word combinations that you are interested in.
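As a minimal sketch of that idea, the following Python snippet picks the nouns and verbs out of a list of tokens. The token list is written by hand for illustration and is not actual Amazon Comprehend output. It follows the syntax token structure described later in this topic, and the helper name `words_with_tag` is hypothetical.

```python
# Hand-written tokens for "My grandmother moved her couch".
# These are illustrative values, not output returned by Amazon Comprehend.
SAMPLE_TOKENS = [
    {"Text": "My", "PartOfSpeech": {"Tag": "PRON"}},
    {"Text": "grandmother", "PartOfSpeech": {"Tag": "NOUN"}},
    {"Text": "moved", "PartOfSpeech": {"Tag": "VERB"}},
    {"Text": "her", "PartOfSpeech": {"Tag": "PRON"}},
    {"Text": "couch", "PartOfSpeech": {"Tag": "NOUN"}},
]


def words_with_tag(tokens, tag):
    """Return the text of every token whose part-of-speech tag equals `tag`."""
    return [token["Text"] for token in tokens if token["PartOfSpeech"]["Tag"] == tag]


nouns = words_with_tag(SAMPLE_TOKENS, "NOUN")
verbs = words_with_tag(SAMPLE_TOKENS, "VERB")

print(nouns)  # ['grandmother', 'couch']
print(verbs)  # ['moved']
```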

To start the analysis, Amazon Comprehend parses the source text to find the individual words in the text. After the text is parsed, each word is assigned the part of speech that it takes in the source text.

Amazon Comprehend can identify the following parts of speech. 


| Tag | Part of speech | 
| --- | --- | 
| ADJ | Adjective<br />Words that typically modify nouns. | 
| ADP | Adposition<br />The head of a prepositional or postpositional phrase. | 
| ADV | Adverb<br />Words that typically modify verbs. They may also modify adjectives and other adverbs. | 
| AUX | Auxiliary<br />Function words that accompany the verb of a verb phrase. | 
| CCONJ | Coordinating conjunction<br />A coordinating conjunction connects words, phrases, or clauses in a sentence without subordinating one to the other. | 
| CONJ | Conjunction<br />A conjunction connects words, phrases, or clauses in a sentence. | 
| DET | Determiner<br />Articles and other words that specify a particular noun phrase. | 
| INTJ | Interjection<br />Words used as an exclamation or part of an exclamation. | 
| NOUN | Noun<br />Words that specify a person, place, thing, animal, or idea. | 
| NUM | Numeral<br />Words, typically determiners, adjectives, or pronouns, that express a number. | 
| O | Other<br />Words that can't be assigned a part of speech category. | 
| PART | Particle<br />Function words associated with another word or phrase to impart meaning. | 
| PRON | Pronoun<br />Words that substitute for nouns or noun phrases. | 
| PROPN | Proper noun<br />A noun that is the name of a specific individual, place or object. | 
| PUNCT | Punctuation<br />Non-alphabetical characters that delimit text. | 
| SCONJ | Subordinating conjunction<br />A conjunction that joins a dependent clause to a sentence. An example of a subordinating conjunction is "because". | 
| SYM | Symbol<br />Word-like entities such as the dollar sign ($) or mathematical symbols. | 
| VERB | Verb<br />Words that signal events and actions. | 

For more information about the parts of speech, see [Universal POS tags](http://universaldependencies.org/u/pos/) at the *Universal Dependencies* website.
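If you want readable names in a report, one possible approach is to keep the table above as a lookup in your application. The following Python sketch assumes that you have already collected the tag strings from your tokens. The names `POS_NAMES` and `count_parts_of_speech` are hypothetical, and grouping unrecognized tags under "Other" is a design choice of this sketch, not service behavior.

```python
from collections import Counter

# Tag-to-name lookup copied from the table in this topic.
POS_NAMES = {
    "ADJ": "Adjective",
    "ADP": "Adposition",
    "ADV": "Adverb",
    "AUX": "Auxiliary",
    "CCONJ": "Coordinating conjunction",
    "CONJ": "Conjunction",
    "DET": "Determiner",
    "INTJ": "Interjection",
    "NOUN": "Noun",
    "NUM": "Numeral",
    "O": "Other",
    "PART": "Particle",
    "PRON": "Pronoun",
    "PROPN": "Proper noun",
    "PUNCT": "Punctuation",
    "SCONJ": "Subordinating conjunction",
    "SYM": "Symbol",
    "VERB": "Verb",
}


def count_parts_of_speech(tags):
    """Count tags by readable name; unrecognized tags are counted as 'Other'."""
    return Counter(POS_NAMES.get(tag, "Other") for tag in tags)


counts = count_parts_of_speech(["PRON", "NOUN", "VERB", "PRON", "NOUN"])
print(counts["Noun"])     # 2
print(counts["Pronoun"])  # 2
print(counts["Verb"])     # 1
```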

The syntax analysis operations return tokens that identify each word and the part of speech that the word represents in the text. Each token represents one word in the source text. It provides the location of the word in the source, the part of speech that the word takes in the text, the confidence that Amazon Comprehend has that the part of speech was correctly identified, and the word that was parsed from the source text.

The following is the structure of the list of syntax tokens. One syntax token is generated for each word in the document. 

```
{
   "SyntaxTokens": [ 
      { 
         "BeginOffset": number,
         "EndOffset": number,
         "PartOfSpeech": { 
            "Score": number,
            "Tag": "string"
         },
         "Text": "string",
         "TokenId": number
      }
   ]
}
```
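The following Python sketch shows one way you might retrieve this structure with the AWS SDK for Python (Boto3), using the `detect_syntax` method of the Comprehend client. The wrapper `detect_syntax_tokens` and the class `StubComprehendClient` are hypothetical names introduced here. The stub returns a canned, hand-written response so that the example runs without AWS credentials, and its values are not actual service output.

```python
def detect_syntax_tokens(text, language_code="en", client=None):
    """Return the SyntaxTokens list for `text`.

    `client` can be any object that exposes
    detect_syntax(Text=..., LanguageCode=...). When it is omitted,
    a Boto3 Comprehend client is created (requires AWS credentials).
    """
    if client is None:
        import boto3  # imported lazily so the helper can be tried offline

        client = boto3.client("comprehend")
    response = client.detect_syntax(Text=text, LanguageCode=language_code)
    return response["SyntaxTokens"]


class StubComprehendClient:
    """Hypothetical stand-in that returns a canned, hand-written response."""

    def detect_syntax(self, Text, LanguageCode):
        return {
            "SyntaxTokens": [
                {
                    "TokenId": 1,
                    "Text": "moved",
                    "BeginOffset": 0,
                    "EndOffset": 5,
                    "PartOfSpeech": {"Tag": "VERB", "Score": 0.99},
                }
            ]
        }


tokens = detect_syntax_tokens("moved", client=StubComprehendClient())
for token in tokens:
    print(token["TokenId"], token["Text"], token["PartOfSpeech"]["Tag"])
```

To call the service itself, omit the `client` argument. Passing the client in as a parameter is only a convenience of this sketch so that the parsing logic can be tested separately.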

Each token provides the following information:
+ `BeginOffset` and `EndOffset`—Provide the location of the word in the input text. 
+ `PartOfSpeech`—Provides two pieces of information: the `Tag` that identifies the part of speech, and the `Score` that represents the confidence that Amazon Comprehend has that the part of speech was correctly identified.
+ `Text`—Provides the word that was identified.
+ `TokenId`—Provides an identifier for the token. The identifier is the position of the token in the list of tokens.
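As an illustration of how these fields might be used together, the following Python sketch slices the input text with the offsets and flags tokens whose score falls below a threshold. It assumes that the offsets are character positions and that `EndOffset` is exclusive, which you should verify against real output. The tokens, scores, and helper names (`token_span`, `low_confidence_tokens`) are hand-written for this example and are not actual Amazon Comprehend output.

```python
TEXT = "My grandmother moved her couch"

# Hand-written tokens; the scores are invented for illustration.
TOKENS = [
    {"TokenId": 1, "Text": "My", "BeginOffset": 0, "EndOffset": 2,
     "PartOfSpeech": {"Tag": "PRON", "Score": 0.99}},
    {"TokenId": 2, "Text": "grandmother", "BeginOffset": 3, "EndOffset": 14,
     "PartOfSpeech": {"Tag": "NOUN", "Score": 0.98}},
    {"TokenId": 3, "Text": "moved", "BeginOffset": 15, "EndOffset": 20,
     "PartOfSpeech": {"Tag": "VERB", "Score": 0.99}},
    {"TokenId": 4, "Text": "her", "BeginOffset": 21, "EndOffset": 24,
     "PartOfSpeech": {"Tag": "PRON", "Score": 0.72}},
    {"TokenId": 5, "Text": "couch", "BeginOffset": 25, "EndOffset": 30,
     "PartOfSpeech": {"Tag": "NOUN", "Score": 0.97}},
]


def token_span(text, token):
    """Slice the input text using the token's offsets (end assumed exclusive)."""
    return text[token["BeginOffset"]:token["EndOffset"]]


def low_confidence_tokens(tokens, threshold):
    """Return the tokens whose part-of-speech score is below `threshold`."""
    return [t for t in tokens if t["PartOfSpeech"]["Score"] < threshold]


# Each slice of the input should reproduce the token's Text field.
spans_match = all(token_span(TEXT, t) == t["Text"] for t in TOKENS)
print(spans_match)  # True

uncertain = [t["Text"] for t in low_confidence_tokens(TOKENS, 0.9)]
print(uncertain)  # ['her']
```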