

# Text Detection and Document Analysis Response Objects
<a name="how-it-works-document-layout"></a>

When Amazon Textract processes a document, it creates a list of [Block](API_Block.md) objects for the detected or analyzed text. Each block contains information about a detected item, where it's located, and the confidence that Amazon Textract has in the accuracy of the processing.

A document is made up of the following types of `Block` objects.
+ [Pages](how-it-works-pages.md)
+  [Lines and words of text](how-it-works-lines-words.md) 
+  [Form Data (Key-value pairs)](how-it-works-kvp.md) 
+  [Tables and Cells](how-it-works-tables.md) 
+ [Selection elements](how-it-works-selectables.md)
+ [Queries](queryresponse.md)
+ [Layout](layoutresponse.md)

The contents of a block depend on the operation you call. If you call one of the text detection operations, the pages, lines, and words of detected text are returned. For more information, see [Detecting Text](how-it-works-detecting.md). If you call one of the document analysis operations, information about detected pages, key-value pairs, tables, selection elements, and text is returned. For more information, see [Analyzing Documents](how-it-works-analyzing.md).

Some `Block` object fields are common to both types of processing. For example, each block has a unique identifier.

For examples that show how to use `Block` objects, see [Tutorials](examples-blocks.md).

## Document Layout
<a name="hows-it-works-blocks-types.title"></a>

Amazon Textract returns a representation of a document as a list of different types of `Block` objects that are linked in a parent-to-child relationship or a key-value pair. Metadata that provides the number of pages in a document is also returned. The following is the JSON for a typical `Block` object of type `PAGE`.

```
{
    "Blocks": [
        {
            "Geometry": {
                "BoundingBox": {
                    "Width": 1.0, 
                    "Top": 0.0, 
                    "Left": 0.0, 
                    "Height": 1.0
                }, 
                "Polygon": [
                    {
                        "Y": 0.0, 
                        "X": 0.0
                    }, 
                    {
                        "Y": 0.0, 
                        "X": 1.0
                    }, 
                    {
                        "Y": 1.0, 
                        "X": 1.0
                    }, 
                    {
                        "Y": 1.0, 
                        "X": 0.0
                    }
                ]
            }, 
            "Relationships": [
                {
                    "Type": "CHILD", 
                    "Ids": [
                        "2602b0a6-20e3-4e6e-9e46-3be57fd0844b", 
                        "82aedd57-187f-43dd-9eb1-4f312ca30042", 
                        "52be1777-53f7-42f6-a7cf-6d09bdc15a30", 
                        "7ca7caa6-00ef-4cda-b1aa-5571dfed1a7c"
                    ]
                }
            ], 
            "BlockType": "PAGE", 
            "Id": "8136b2dc-37c1-4300-a9da-6ed8b276ea97"
        }..... 
        
    ], 
    "DocumentMetadata": {
        "Pages": 1
    }
}
```

A document is made from one or more `PAGE` blocks. Each page contains a list of child blocks for the primary items detected on the page, such as lines of text and tables. For more information, see [Pages](how-it-works-pages.md). 

You can determine the type of a `Block` object by inspecting the `BlockType` field.

A `Block` object contains a list of related `Block` objects in the `Relationships` field, which is an array of [Relationship](API_Relationship.md) objects. A `Relationships` array is either of type CHILD or of type VALUE. An array of type CHILD is used to list the items that are children of the current block. For example, if the current block is of type LINE, `Relationships` contains a list of IDs for the WORD blocks that make up the line of text. An array of type VALUE is used to contain key-value pairs. You can determine the type of the relationship by inspecting the `Type` field of the `Relationship` object. 

Child blocks don't have information about their parent `Block` objects.

For examples that show `Block` information, see [Processing Documents Synchronously](sync.md).
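
The relationship model described in this section can be explored with a few small helpers. The following is a minimal sketch, not an official client: it assumes the response has already been parsed into Python dictionaries, and the function names and the trimmed-down sample blocks are hypothetical. Because child blocks don't store their parent, the last helper inverts the CHILD relationships to build that lookup.

```python
from typing import Dict, List


def index_blocks(blocks: List[dict]) -> Dict[str, dict]:
    """Map each block's Id to the block so relationships can be followed."""
    return {block["Id"]: block for block in blocks}


def related_ids(block: dict, relationship_type: str) -> List[str]:
    """Collect the IDs listed under one relationship type, such as CHILD."""
    ids: List[str] = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] == relationship_type:
            ids.extend(relationship["Ids"])
    return ids


def build_parent_map(blocks: List[dict]) -> Dict[str, List[str]]:
    """Invert CHILD relationships: child Id -> list of parent Ids."""
    parents: Dict[str, List[str]] = {}
    for block in blocks:
        for child_id in related_ids(block, "CHILD"):
            parents.setdefault(child_id, []).append(block["Id"])
    return parents


# Hypothetical, trimmed-down blocks (short IDs instead of real UUIDs).
sample_blocks = [
    {"Id": "page-1", "BlockType": "PAGE",
     "Relationships": [{"Type": "CHILD", "Ids": ["line-1"]}]},
    {"Id": "line-1", "BlockType": "LINE",
     "Relationships": [{"Type": "CHILD", "Ids": ["word-1", "word-2"]}]},
    {"Id": "word-1", "BlockType": "WORD"},
    {"Id": "word-2", "BlockType": "WORD"},
]

by_id = index_blocks(sample_blocks)
parent_of = build_parent_map(sample_blocks)
print(related_ids(by_id["line-1"], "CHILD"))
print(parent_of["word-1"])
```

The parent map stores a list for each child because the same block can be listed by more than one parent, for example a `WORD` that is referenced by both a `LINE` and a table `CELL`.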

## Confidence
<a name="how-it-works-confidence"></a>

Amazon Textract operations return the percentage confidence that Amazon Textract has in the accuracy of the detected item. To get the confidence, use the `Confidence` field of the `Block` object. A higher value indicates a higher confidence. Depending on the scenario, detections with a low confidence might need visual confirmation by a human.
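
One way to route low-confidence detections to a human reviewer is to filter blocks on the `Confidence` field. The following is a minimal sketch; the function name, the sample blocks, and the threshold of 90 are assumptions chosen for illustration, and a suitable threshold depends on your scenario.

```python
from typing import List


def low_confidence_blocks(blocks: List[dict], threshold: float = 90.0) -> List[dict]:
    """Return blocks whose Confidence is below the threshold.

    Blocks that carry no Confidence field (the PAGE examples in this
    guide have none) are skipped rather than flagged.
    """
    flagged = []
    for block in blocks:
        confidence = block.get("Confidence")
        if confidence is not None and confidence < threshold:
            flagged.append(block)
    return flagged


# Hypothetical blocks with short IDs; the confidence values are illustrative.
sample_blocks = [
    {"Id": "page-1", "BlockType": "PAGE"},
    {"Id": "key-1", "BlockType": "KEY_VALUE_SET", "Confidence": 51.55965805053711},
    {"Id": "word-1", "BlockType": "WORD", "Confidence": 99.56285858154297},
]

needs_review = low_confidence_blocks(sample_blocks)
print([block["Id"] for block in needs_review])
```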

## Geometry
<a name="how-it-works-geometry"></a>

Amazon Textract operations (except for identity analysis) return information about the location of detected items on a document page. To get the location, use the `Geometry` field of the `Block` object. For more information, see [Locating Items on a Document Page](text-location.md). 
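
To draw a box around a detected item, the `BoundingBox` values need to be scaled to the size of the page image. The following is a minimal sketch that assumes, as the `PAGE` example earlier suggests, that `Left`, `Top`, `Width`, and `Height` are ratios of the overall page dimensions. The function name and the 800 x 600 image size are hypothetical.

```python
from typing import Tuple


def bounding_box_to_pixels(
    bounding_box: dict, image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Scale a ratio-based BoundingBox to (left, top, width, height) in pixels."""
    left = round(bounding_box["Left"] * image_width)
    top = round(bounding_box["Top"] * image_height)
    width = round(bounding_box["Width"] * image_width)
    height = round(bounding_box["Height"] * image_height)
    return left, top, width, height


# The PAGE block covers the whole page, so it maps to the full image.
page_box = {"Width": 1.0, "Top": 0.0, "Left": 0.0, "Height": 1.0}
print(bounding_box_to_pixels(page_box, 800, 600))
```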

# Pages
<a name="how-it-works-pages"></a>

A document consists of one or more pages. A [Block](API_Block.md) object of type `PAGE` exists for each page of the document. A `PAGE` block object contains a list of the child IDs for the lines of text, key-value pairs, tables, Queries, and Query Results that are detected on the document page. 

![\[Document structure diagram showing Page containing Line, Table, Key-Value Set, Query, and Queries Result components.\]](http://docs.aws.amazon.com/textract/latest/dg/images/pages-image.png)


The JSON for a `PAGE` block looks similar to the following.

```
{

    "Geometry": .... 
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "2602b0a6-20e3-4e6e-9e46-3be57fd0844b", // Line - Hello, world.
                "82aedd57-187f-43dd-9eb1-4f312ca30042", // Line - How are you?
                "52be1777-53f7-42f6-a7cf-6d09bdc15a30", 
                "7ca7caa6-00ef-4cda-b1aa-5571dfed1a7c"   
            ]
        }
    ], 
    "BlockType": "PAGE", 
    "Id": "8136b2dc-37c1-4300-a9da-6ed8b276ea97"  // Page identifier
},
```

If you're using asynchronous operations with a multipage document that's in PDF format, you can determine the page that a block is located on by inspecting the `Page` field of the `Block` object. A scanned image (an image in JPEG, PNG, PDF, or TIFF format) is considered to be a single-page document, even if there's more than one document page on the image. Asynchronous operations always return a `Page` value of 1 for scanned images.

The total number of pages is returned in the `Pages` field of `DocumentMetadata`. `DocumentMetadata` is returned with each list of `Block` objects returned by an Amazon Textract operation.
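
For a multipage document, it can be convenient to group blocks by the `Page` field before processing them. The following is a minimal sketch; the function name and sample response are hypothetical, and treating a missing `Page` field as page 1 is an assumption made for single-page input.

```python
from collections import defaultdict
from typing import Dict, List


def group_blocks_by_page(blocks: List[dict]) -> Dict[int, List[dict]]:
    """Group blocks by their Page field."""
    pages: Dict[int, List[dict]] = defaultdict(list)
    for block in blocks:
        # Assumption: a block without a Page field belongs to page 1.
        pages[block.get("Page", 1)].append(block)
    return dict(pages)


# Hypothetical, trimmed-down response for a two-page PDF.
sample_response = {
    "Blocks": [
        {"Id": "page-1", "BlockType": "PAGE", "Page": 1},
        {"Id": "line-1", "BlockType": "LINE", "Page": 1},
        {"Id": "page-2", "BlockType": "PAGE", "Page": 2},
        {"Id": "line-2", "BlockType": "LINE", "Page": 2},
    ],
    "DocumentMetadata": {"Pages": 2},
}

blocks_by_page = group_blocks_by_page(sample_response["Blocks"])
print(sorted(blocks_by_page))
```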

# Lines and Words of Text
<a name="how-it-works-lines-words"></a>

Detected text that's returned by Amazon Textract operations is returned in a list of [Block](API_Block.md) objects. These objects represent lines of text or textual words that are detected on a document page. The following text shows two lines of text that are made from multiple words.

This is text.

In two separate lines.

Detected text is returned in the `Text` field of a `Block` object. The `BlockType` field determines if the text is a line of text (LINE) or a word (WORD). A *WORD* is one or more ISO basic Latin script characters that aren't separated by spaces. A *LINE* is a string of tab-delimited and contiguous words.

Additionally, Amazon Textract determines whether a piece of text is handwritten or printed and reports the result in the `TextType` field. The possible values are HANDWRITING and PRINTED.

The other `Block` properties are common to all block types, such as the ID, confidence, and geometry information. For more information, see [Text Detection and Document Analysis Response Objects](how-it-works-document-layout.md). 

To detect only lines and words, you can use [DetectDocumentText](API_DetectDocumentText.md) or [StartDocumentTextDetection](API_StartDocumentTextDetection.md). For more information, see [Detecting Text](how-it-works-detecting.md). To get the detected text (lines and words) and information about how it relates to other parts of the document, such as tables, you can use [AnalyzeDocument](API_AnalyzeDocument.md) or [StartDocumentAnalysis](API_StartDocumentAnalysis.md). For more information, see [Analyzing Documents](how-it-works-analyzing.md).

`PAGE`, `LINE`, and `WORD` blocks are related to each other in a parent-to-child relationship. A `PAGE` block is the parent for all `LINE` block objects on a document page. Because a LINE can have one or more words, the `Relationships` array for a LINE block stores the IDs for child WORD blocks that make up the line of text. 

The following diagram shows how the line *Hello, world.* in the text *Hello, world. How are you?* is represented by `Block` objects. 

![\[Diagram showing text objects "PAGE", "LINE" with two instances, "WORD" with two instances, and "Hello, world." Labels and connections depict a hierarchical structure.\]](http://docs.aws.amazon.com/textract/latest/dg/images/hieroglyph-text-detection.png)




The following is the JSON output from `DetectDocumentText` when the sentence *Hello, world. How are you?* is detected. The first example is the JSON for the document page. You can use the CHILD IDs to navigate through the document.

```
{
    "Geometry": {...}, 
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "d7fbd604-d609-4d69-857d-247a3f591238", // Line - Hello, world.
                "b6c19a93-6493-4d8e-958f-853c8f7ca055" //  Line - How are you?
            ]
        }
    ], 
    "BlockType": "PAGE", 
    "Id": "56ec1d77-171f-4881-9852-2b5b7e761608"
},
```

The following is the JSON for the LINE block for the line *Hello, world.*: 

```
{
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "7f97e2ca-063e-47a8-981c-8beee31afc01", // Word - Hello,
                "4b990aa0-af96-4369-b90f-dbe02538ed21"  // Word - world.
            ]
        }
    ], 
    "Confidence": 99.63229370117188, 
    "Geometry": {...}, 
    "Text": "Hello, world.", 
    "BlockType": "LINE", 
    "Id": "d7fbd604-d609-4d69-857d-247a3f591238"
},
```

The following is the JSON for the WORD block for the word *Hello,*: 

```
{
    "Geometry": {...}, 
    "Text": "Hello,", 
    "TextType": "PRINTED",
    "BlockType": "WORD", 
    "Confidence": 99.74746704101562, 
    "Id": "7f97e2ca-063e-47a8-981c-8beee31afc01"
},
```

The final JSON is the WORD block for the word *world.*:

```
{
    "Geometry": {...}, 
    "Text": "world.",
    "TextType": "PRINTED",
    "BlockType": "WORD", 
    "Confidence": 99.5171127319336, 
    "Id": "4b990aa0-af96-4369-b90f-dbe02538ed21"
},
```
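
The navigation from a `LINE` block to its `WORD` blocks can be sketched in Python. This is a minimal illustration that assumes the blocks have been parsed into dictionaries; the function name is hypothetical, and the sample data reuses the IDs from the preceding JSON with the geometry omitted.

```python
from typing import Dict


def line_text_from_words(line_block: dict, blocks_by_id: Dict[str, dict]) -> str:
    """Rebuild a line's text by following its CHILD relationship to WORD blocks."""
    words = []
    for relationship in line_block.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for child_id in relationship["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    # Joining with a single space is a simplification of the original layout.
    return " ".join(words)


blocks = [
    {"Id": "d7fbd604-d609-4d69-857d-247a3f591238", "BlockType": "LINE",
     "Text": "Hello, world.",
     "Relationships": [{"Type": "CHILD", "Ids": [
         "7f97e2ca-063e-47a8-981c-8beee31afc01",
         "4b990aa0-af96-4369-b90f-dbe02538ed21"]}]},
    {"Id": "7f97e2ca-063e-47a8-981c-8beee31afc01", "BlockType": "WORD",
     "Text": "Hello,", "TextType": "PRINTED"},
    {"Id": "4b990aa0-af96-4369-b90f-dbe02538ed21", "BlockType": "WORD",
     "Text": "world.", "TextType": "PRINTED"},
]

blocks_by_id = {block["Id"]: block for block in blocks}
line = blocks_by_id["d7fbd604-d609-4d69-857d-247a3f591238"]
print(line_text_from_words(line, blocks_by_id))
```

In practice the `Text` field of the `LINE` block already holds the full line, so walking the children is mainly useful when you need per-word confidence, geometry, or `TextType`.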

# Form Data (Key-Value Pairs)
<a name="how-it-works-kvp"></a>

Amazon Textract can extract form data from documents as key-value pairs. For example, in the following text, Amazon Textract can identify a key (*Name:*) and a value (*Ana Carolina*).

*Name: Ana Carolina*

Detected key-value pairs are returned as [Block](API_Block.md) objects in the responses from [AnalyzeDocument](API_AnalyzeDocument.md) and [GetDocumentAnalysis](API_GetDocumentAnalysis.md). You can use the `FeatureTypes` input parameter to retrieve information about key-value pairs, tables, or both. For key-value pairs only, use the value `FORMS`. For an example, see [Extracting Key-Value Pairs from a Form Document](examples-extract-kvp.md). For general information about how a document is represented by `Block` objects, see [Text Detection and Document Analysis Response Objects](how-it-works-document-layout.md). 

Dates found through key-value pair detection are returned exactly as detected on the input document, with most date formats supported.

`Block` objects with the type `KEY_VALUE_SET` are the containers for KEY or VALUE `Block` objects that store information about linked text items detected in a document. You can use the `EntityTypes` attribute to determine if a block is a KEY or a VALUE. 
+ A *KEY* object contains information about the key for linked text. For example, *Name:*. A KEY block has two relationship lists. A relationship of type VALUE is a list that contains the ID of the VALUE block associated with the key. A relationship of type CHILD is a list of IDs for the WORD blocks that make up the text of the key.
+ A *VALUE* object contains information about the text associated with a key. In the preceding example, *Ana Carolina* is the value for the key *Name:*. A VALUE block has a relationship with a list of CHILD blocks that identify WORD blocks. Each WORD block contains one of the words that make up the text of the value. A `VALUE` object can also contain information about selected elements. For more information, see [Selection Elements](how-it-works-selectables.md).

Amazon Textract returns the same confidence value for both KEY and VALUE in a `KEY_VALUE_SET`, because the KEY and VALUE are evaluated as a pair. It returns a separate confidence value for each word in the WORD blocks. 

Each instance of a `KEY_VALUE_SET` `Block` object is a child of the PAGE `Block` object that corresponds to the current page.

The following diagram shows how the key-value pair *Name: Ana Carolina* is represented by `Block` objects.

![\[Diagram depicting the structure of a database table with a page containing keys and values, where keys are the words "Name:", "Ana", and "Carolina".\]](http://docs.aws.amazon.com/textract/latest/dg/images/hieroglyph-key-value-set.png)


The following examples show how the key-value pair *Name: Ana Carolina* is represented by JSON.

The PAGE block has CHILD blocks of type `KEY_VALUE_SET` for each KEY and VALUE block detected in the document. 

```
{
    "Geometry": .... 
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "2602b0a6-20e3-4e6e-9e46-3be57fd0844b", 
                "82aedd57-187f-43dd-9eb1-4f312ca30042", 
                "52be1777-53f7-42f6-a7cf-6d09bdc15a30", // Key - Name:
                "7ca7caa6-00ef-4cda-b1aa-5571dfed1a7c"  // Value - Ana Carolina
            ]
        }
    ], 
    "BlockType": "PAGE", 
    "Id": "8136b2dc-37c1-4300-a9da-6ed8b276ea97"  // Page identifier
},
```

The following JSON shows that the KEY block (52be1777-53f7-42f6-a7cf-6d09bdc15a30) has a relationship with the VALUE block (7ca7caa6-00ef-4cda-b1aa-5571dfed1a7c). It also has a CHILD block for the WORD block (c734fca6-c4c4-415c-b6c1-30f7510b72ee) that contains the text for the key (*Name:*).

```
{
    "Relationships": [
        {
            "Type": "VALUE", 
            "Ids": [
                "7ca7caa6-00ef-4cda-b1aa-5571dfed1a7c"  // Value identifier
            ]
        }, 
        {
            "Type": "CHILD", 
            "Ids": [
                "c734fca6-c4c4-415c-b6c1-30f7510b72ee"  // Name:
            ]
        }
    ], 
    "Confidence": 51.55965805053711, 
    "Geometry": ...., 
    "BlockType": "KEY_VALUE_SET", 
    "EntityTypes": [
        "KEY"
    ], 
    "Id": "52be1777-53f7-42f6-a7cf-6d09bdc15a30"  //Key identifier
},
```

The following JSON shows that VALUE block 7ca7caa6-00ef-4cda-b1aa-5571dfed1a7c has a CHILD list of IDs for the WORD blocks that make up the text of the value (*Ana* and *Carolina*).

```
{
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "db553509-64ef-4ecf-ad3c-bea62cc1cd8a", // Ana
                "e5d7646c-eaa2-413a-95ad-f4ae19f53ef3"  // Carolina
            ]
        }
    ], 
    "Confidence": 51.55965805053711, 
    "Geometry": ...., 
    "BlockType": "KEY_VALUE_SET", 
    "EntityTypes": [
        "VALUE"
    ], 
    "Id": "7ca7caa6-00ef-4cda-b1aa-5571dfed1a7c" // Value identifier
}
```

The following JSON shows the `Block` objects for the words *Name:*, *Ana*, and *Carolina*.

```
{
    "Geometry": {...}, 
    "Text": "Name:", 
    "TextType": "PRINTED",
    "BlockType": "WORD", 
    "Confidence": 99.56285858154297, 
    "Id": "c734fca6-c4c4-415c-b6c1-30f7510b72ee"
},
{
    "Geometry": {...}, 
    "Text": "Ana", 
    "TextType": "PRINTED",
    "BlockType": "WORD", 
    "Confidence": 99.52057647705078, 
    "Id": "db553509-64ef-4ecf-ad3c-bea62cc1cd8a"
}, 
{
    "Geometry": {...}, 
    "Text": "Carolina", 
    "TextType": "PRINTED",
    "BlockType": "WORD", 
    "Confidence": 99.84207916259766, 
    "Id": "e5d7646c-eaa2-413a-95ad-f4ae19f53ef3"
},
```
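
The KEY, VALUE, and WORD links shown above can be followed in code to recover each pair as text. The following is a minimal sketch under the assumption that the response has been parsed into dictionaries. The function names are hypothetical, and the sample uses short placeholder IDs rather than real UUIDs.

```python
from typing import Dict, List, Tuple


def words_text(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the Text of every WORD block listed in the block's CHILD relationship."""
    words = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for child_id in relationship["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(blocks: List[dict]) -> List[Tuple[str, str]]:
    """Return (key text, value text) for every KEY block in the list."""
    by_id = {block["Id"]: block for block in blocks}
    pairs = []
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        value_texts = []
        for relationship in block.get("Relationships", []):
            if relationship["Type"] == "VALUE":
                for value_id in relationship["Ids"]:
                    value_texts.append(words_text(by_id[value_id], by_id))
        pairs.append((words_text(block, by_id), " ".join(value_texts)))
    return pairs


# Hypothetical blocks that mirror the Name: Ana Carolina example.
sample_blocks = [
    {"Id": "key-1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["value-1"]},
                       {"Type": "CHILD", "Ids": ["word-1"]}]},
    {"Id": "value-1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["word-2", "word-3"]}]},
    {"Id": "word-1", "BlockType": "WORD", "Text": "Name:"},
    {"Id": "word-2", "BlockType": "WORD", "Text": "Ana"},
    {"Id": "word-3", "BlockType": "WORD", "Text": "Carolina"},
]

pairs = extract_key_values(sample_blocks)
print(pairs)
```

This sketch only collects `WORD` children, so a value that consists of a selection element would come back as an empty string.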

# Tables
<a name="how-it-works-tables"></a>

Use Amazon Textract to extract tables in a document and extract cells, merged cells, column headers, titles, section titles, footers, table type (structured or semistructured), and summary cells within a table. 

Detected tables are returned as [Block](API_Block.md) objects in the responses from [AnalyzeDocument](API_AnalyzeDocument.md) and [GetDocumentAnalysis](API_GetDocumentAnalysis.md). You can use the `FeatureTypes` input parameter to retrieve information about key-value pairs, tables, or both. For tables only, use the value `TABLES`. For an example, see [Exporting Tables into a CSV File](examples-export-table-csv.md). For general information about how a document is represented by `Block` objects, see [Text Detection and Document Analysis Response Objects](how-it-works-document-layout.md).

The following is an example of a table that could be detected by Amazon Textract.

![\[Balance sheet table showing transactions from 2022-12-24 to 2023-01-15, with starting balance of $11,000, credits of $1,040, debits of $1,040, and ending balance of $11,000 as of 2023-01-20.\]](http://docs.aws.amazon.com/textract/latest/dg/images/example_table.png)


The following diagram shows how a single cell in a table is represented by `Block` objects.

![\[Diagram depicting the structure of a table with cells, including a merged cell spanning 5 columns for the table title. The table comprises nested components like pages, cells, words, and a merged title cell.\]](http://docs.aws.amazon.com/textract/latest/dg/images/updated_table_diagram.png)


A cell contains `WORD` blocks for detected words, and where applicable, `TABLE_TITLE` blocks for table titles, `TABLE_FOOTER` blocks for table footers, and `SELECTION_ELEMENT` blocks for selection elements such as check boxes. 

The following is part of the JSON for the preceding table. The `PAGE` block object has a list of `CHILD` block IDs for the `TABLE` block and each `LINE` of text that's detected.

```
{
    "BlockType": "PAGE",
    "Geometry": {
        "BoundingBox": {
            "Width": 1.0,
            "Height": 1.0,
            "Left": 0.0,
            "Top": 0.0
        }
    },
    "Id": "8a5d3f57-97bc-4a05-b028-f72617877626",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "7499ac64-3fa9-46fd-8e3f-581ec9c316eb",
                "87ed4709-66f2-4b3d-abda-52c92a111474",
                "27a87eb3-bd21-475e-80fe-3f8e16958dcf",
                "d89894ea-2f37-4667-94b6-d90def01c5c1",
                "9f9d6383-ed6d-4bd0-ba8c-71fc3eec704e",
                "cdc74e1a-c568-439b-9eef-7bd54e060f18",
                "1b64f24c-5e84-4c7e-851a-cb1f5258a53c",
                "84a84878-04b4-4608-81b6-38117ead1629",
                ...
                "8cef603b-932e-452b-adc4-15f8e02ad1fe",
                "a3f97508-0d6b-4ae0-aa04-76078f9fe11a",
                "dd1f23c6-dfad-447b-8105-29ba136bd3a4",
                "46138f38-5b77-41a9-b068-f8394587122f",
                "a5e5247c-2637-4fa8-a271-ab46399cd77c",
                "63d7b889-71e3-422a-8cb7-2103ba0aa276",
                "033e5c86-371a-46fb-bbea-eb7f6b0cd092",
                "559b1354-ef94-4cb9-8e03-9eca83c6dba4",
                "55edc4fa-052f-40f9-9edd-739b100e6f75"
            ]
        }
    ]
},
```

To learn more about the table, access the `TABLE` block object. The table block includes four types of relationships: `CHILD`, `MERGED_CELL`, `TABLE_TITLE`, and `TABLE_FOOTER`. For relationship type `CHILD`, each child ID represents a single cell within the table. A merged cell is broken down into all the individual cells that are combined to make one merged cell. `TABLE_TITLE` and `TABLE_FOOTER` relationship types contain the block ID for the corresponding `TABLE_TITLE` and `TABLE_FOOTER` blocks, where information about the title and footer is stored. The table block has an `EntityTypes` value of either `STRUCTURED_TABLE` or `SEMI_STRUCTURED_TABLE` that identifies the type of table. 

The following JSON shows that the preceding table has 65 cells for 13 rows and 5 columns, which are listed in the `CHILD` relationship `Ids` array. For relationship type `MERGED_CELL`, each merged cell ID represents a single merged cell within the table. The following JSON shows that the table has 9 merged cells, which are listed in the `MERGED_CELL` relationship `Ids` array. The two additional relationship types, `TABLE_TITLE` and `TABLE_FOOTER`, list the IDs of the respective title and footer blocks. The following JSON also shows that the table is structured in the `EntityTypes` field.

```
{
    "BlockType": "TABLE",
    "Confidence": 99.8046875,
    "Geometry": {...},
    "Id": "55edc4fa-052f-40f9-9edd-739b100e6f75",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "c1c03d64-d365-4906-af7a-a852f1acc040",
                "8b415996-6b05-4183-a959-d27d12ccef79",
                "48b0e972-7dba-4db7-896e-ca7066e8c761",
                "69948207-47d8-4825-8929-1d7abb650a88",
                "b9ac9f14-8899-43b3-8572-0e997180e0a4",
                "6f06c024-0b36-4acd-b61f-4467203234dd",
                "c8a88487-dbc7-4662-a69b-21103049b61d",
                ...
                "2b41c8e1-f754-4b37-91b6-a97cdc413f91",
                "365a1bab-0c18-4cd8-a465-6f7bc7e25e60",
                "f08af959-cfac-4ad6-a63f-2771c7a8ff62",
                "e4f6fbfd-c7d8-4f64-9102-733d4806850f",
                "68c0b8ff-fd35-41ce-ba76-de08c26084d7",
                "44e80372-aa70-4a36-9aac-3a93aaa91bb1"
            ]
        },
        {
            "Type": "MERGED_CELL",
            "Ids": [
                "a27a3ecc-afd0-4f7c-9db2-6f8e6d31c605",
                "6c02cf21-40de-4480-b755-e94462ac4884",
                "6faad856-8d37-4751-b741-c4ad8d5dcbe3",
                "d777d6e2-7430-4c6e-a261-03ec5a612c8c",
                "f0f5a9fb-5bfa-4c80-8f41-1d4fad674b09",
                "83c7af02-8128-4479-89c9-962544ad4048",
                "b2b5126c-409f-4b67-9adf-e3e12f60bf86",
                "87d7f688-3d38-4198-b491-433af0da4d8b",
                "1c2436e2-a1fc-4b2a-9e73-cc8a1ca67568"
            ]
        },
        {
            "Type": "TABLE_TITLE",
            "Ids": [
                "cde34920-0131-4e68-a3ec-82922269afd4"
            ]
        },
        {
            "Type": "TABLE_FOOTER",
            "Ids": [
                "11dfd98c-6140-49e8-a544-e220d76bdd2f",
                "ad1b9c81-3b53-4fc7-a533-dabb3d29b0b1"
            ]
        }
    ],
    "EntityTypes": [
        "STRUCTURED_TABLE"
    ]
},
```

The block type for each table cell is `CELL`. A `CELL` block always has a row span of 1 and a column span of 1. The block object for each cell includes information about the cell location compared to other cells in the table. It also includes geometry information for the location of the cell on the document. In addition, cell blocks can have different `EntityTypes` that identify them as a particular type of cell, including `TABLE_TITLE`, `TABLE_FOOTER`, `TABLE_SECTION_TITLE`, `COLUMN_HEADER`, and `TABLE_SUMMARY`. For example, in the preceding table, the cell that contains the word “Date” is a column header, as shown in the following example. 

```
{
    "BlockType": "CELL",
    "Confidence": 81.8359375,
    "RowIndex": 2,
    "ColumnIndex": 1,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "Geometry": {...},
    "Id": "6f06c024-0b36-4acd-b61f-4467203234dd",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "c49f55d5-a7e4-41d5-9c29-d8244f56181c"
            ]
        }
    ],
    "EntityTypes": [
        "COLUMN_HEADER"
    ]
},
```

The cell that contains the word "Deposit" is not a title, footer, column header, section title, or summary cell. This is shown by the lack of the field `"EntityTypes"`.

```
{
    "BlockType": "CELL",
    "Confidence": 86.181640625,
    "RowIndex": 7,
    "ColumnIndex": 2,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "Geometry": {...},
    "Id": "7af5160b-bd60-45f5-a12c-bf376e9d742c",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "bb9bcaed-5998-44a6-9076-aa1ecc82fbc6"
            ]
        }
    ]
},
```
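
Because every `CELL` block carries a `RowIndex` and a `ColumnIndex`, a table can be rebuilt as a grid of strings. The following is a minimal sketch that assumes the indexes start at 1, as in the examples in this section. The function names and the small two-by-two sample table are hypothetical.

```python
from typing import Dict, List


def child_ids(block: dict) -> List[str]:
    """Return the IDs in the block's CHILD relationship, if any."""
    ids: List[str] = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] == "CHILD":
            ids.extend(relationship["Ids"])
    return ids


def cell_text(cell: dict, by_id: Dict[str, dict]) -> str:
    """Join the words in a cell; a blank cell yields an empty string."""
    words = [by_id[i]["Text"] for i in child_ids(cell)
             if by_id[i]["BlockType"] == "WORD"]
    return " ".join(words)


def table_to_grid(table: dict, by_id: Dict[str, dict]) -> List[List[str]]:
    """Place each CELL's text at [RowIndex - 1][ColumnIndex - 1]."""
    cells = [by_id[i] for i in child_ids(table)
             if by_id[i]["BlockType"] == "CELL"]
    if not cells:
        return []
    row_count = max(cell["RowIndex"] for cell in cells)
    column_count = max(cell["ColumnIndex"] for cell in cells)
    grid = [[""] * column_count for _ in range(row_count)]
    for cell in cells:
        grid[cell["RowIndex"] - 1][cell["ColumnIndex"] - 1] = cell_text(cell, by_id)
    return grid


# Hypothetical 2 x 2 table with short placeholder IDs.
sample_blocks = [
    {"Id": "table-1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c11", "c12", "c21", "c22"]}]},
    {"Id": "c11", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "c12", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
    {"Id": "c21", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 1},
    {"Id": "c22", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Date"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Description"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Deposit"},
]

by_id = {block["Id"]: block for block in sample_blocks}
grid = table_to_grid(by_id["table-1"], by_id)
print(grid)
```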

All the merged cells are listed under `"Type": "MERGED_CELL"` in the `TABLE` block. In the preceding example table, there are nine merged cells. 

```
{
    "Type": "MERGED_CELL",
    "Ids": [
        "a27a3ecc-afd0-4f7c-9db2-6f8e6d31c605",
        "6c02cf21-40de-4480-b755-e94462ac4884",
        "6faad856-8d37-4751-b741-c4ad8d5dcbe3",
        "d777d6e2-7430-4c6e-a261-03ec5a612c8c",
        "f0f5a9fb-5bfa-4c80-8f41-1d4fad674b09",
        "83c7af02-8128-4479-89c9-962544ad4048",
        "b2b5126c-409f-4b67-9adf-e3e12f60bf86",
        "87d7f688-3d38-4198-b491-433af0da4d8b",
        "1c2436e2-a1fc-4b2a-9e73-cc8a1ca67568"
    ]
},
```

To find specific details associated with each merged cell, go to `"BlockType": "MERGED_CELL"`. For the merged cell “Balance Sheet”, which is also a title cell, the ID associated with it is `"a27a3ecc-afd0-4f7c-9db2-6f8e6d31c605"`.

There are 5 cells that constitute this merged cell, as shown by the `ColumnSpan` of 5. To find the text within the merged cell, follow the IDs in its `Ids` array to the `CELL` blocks, and then follow each cell's child IDs to the `WORD` blocks.

```
{
    "BlockType": "MERGED_CELL",
    "Confidence": 77.44140625,
    "RowIndex": 1,
    "ColumnIndex": 1,
    "RowSpan": 1,
    "ColumnSpan": 5,
    "Geometry": {...},
    "Id": "a27a3ecc-afd0-4f7c-9db2-6f8e6d31c605",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "c1c03d64-d365-4906-af7a-a852f1acc040",
                "8b415996-6b05-4183-a959-d27d12ccef79",
                "48b0e972-7dba-4db7-896e-ca7066e8c761",
                "69948207-47d8-4825-8929-1d7abb650a88",
                "b9ac9f14-8899-43b3-8572-0e997180e0a4"
            ]
        }
    ],
    "EntityTypes": [
        "TABLE_TITLE"
    ]
},
```

On the cell level, there are 5 cells for the merged cell “Balance Sheet”. Each cell has an `EntityTypes` value of `TABLE_TITLE` because the title was identified in the merged cell. The cell with an `Id` of `48b0e972-7dba-4db7-896e-ca7066e8c761` contains two `CHILD` relationship IDs that correspond to the `WORD` blocks that make up this merged title cell. 

```
{
    "BlockType": "CELL",
    "Confidence": 77.44140625,
    "RowIndex": 1,
    "ColumnIndex": 1,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "Geometry": {...},
    "Id": "c1c03d64-d365-4906-af7a-a852f1acc040",
    "EntityTypes": [
        "TABLE_TITLE"
    ]
},
{
    "BlockType": "CELL",
    "Confidence": 77.44140625,
    "RowIndex": 1,
    "ColumnIndex": 2,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "Geometry": {...},
    "Id": "8b415996-6b05-4183-a959-d27d12ccef79",
    "EntityTypes": [
        "TABLE_TITLE"
    ]
},
{
    "BlockType": "CELL",
    "Confidence": 77.44140625,
    "RowIndex": 1,
    "ColumnIndex": 3,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "Geometry": {...},
    "Id": "48b0e972-7dba-4db7-896e-ca7066e8c761",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "998394ef-c6cf-491b-9bac-ec470c638ecd",
                "1c875a06-f8e5-4df7-8f6a-583c47cbd9fe"
            ]
        }
    ],
    "EntityTypes": [
        "TABLE_TITLE"
    ]
},
{
    "BlockType": "CELL",
    "Confidence": 77.44140625,
    "RowIndex": 1,
    "ColumnIndex": 4,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "Geometry": {...},
    "Id": "69948207-47d8-4825-8929-1d7abb650a88",
    "EntityTypes": [
        "TABLE_TITLE"
    ]
},
{
    "BlockType": "CELL",
    "Confidence": 77.44140625,
    "RowIndex": 1,
    "ColumnIndex": 5,
    "RowSpan": 1,
    "ColumnSpan": 1,
    "Geometry": {...},
    "Id": "b9ac9f14-8899-43b3-8572-0e997180e0a4",
    "EntityTypes": [
        "TABLE_TITLE"
    ]
},
```

On the word level, there are two words, “Balance” and “Sheet.” Because the cells in columns 1, 2, 4, and 5 are blank, there are no words associated with them. This is also shown in the previous JSON output, where only the third cell contains child IDs. 

```
{
    "BlockType": "WORD",
    "Confidence": 99.95711517333984,
    "Text": "Balance",
    "TextType": "PRINTED",
    "Geometry": {...},
    "Id": "998394ef-c6cf-491b-9bac-ec470c638ecd"
},
{
    "BlockType": "WORD",
    "Confidence": 99.87372589111328,
    "Text": "Sheet",
    "TextType": "PRINTED",
    "Geometry": {...},
    "Id": "1c875a06-f8e5-4df7-8f6a-583c47cbd9fe"
},
```
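
The two-step lookup from a `MERGED_CELL` block to its `CELL` blocks and then to their `WORD` blocks can be sketched as follows. This is a minimal illustration with hypothetical function names and short placeholder IDs; it mirrors the "Balance Sheet" example, where only the third of five cells has word children.

```python
from typing import Dict, List


def child_ids(block: dict) -> List[str]:
    """Return the IDs in the block's CHILD relationship, if any."""
    ids: List[str] = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] == "CHILD":
            ids.extend(relationship["Ids"])
    return ids


def merged_cell_text(merged_cell: dict, by_id: Dict[str, dict]) -> str:
    """Collect words from every CELL that makes up a MERGED_CELL."""
    words = []
    for cell_id in child_ids(merged_cell):
        cell = by_id[cell_id]
        # Blank cells have no Relationships field and contribute nothing.
        for word_id in child_ids(cell):
            word = by_id[word_id]
            if word["BlockType"] == "WORD":
                words.append(word["Text"])
    return " ".join(words)


# Hypothetical blocks: one merged title cell spanning five columns.
sample_blocks = [
    {"Id": "merged-1", "BlockType": "MERGED_CELL", "RowIndex": 1,
     "ColumnIndex": 1, "RowSpan": 1, "ColumnSpan": 5,
     "Relationships": [{"Type": "CHILD",
                        "Ids": ["c1", "c2", "c3", "c4", "c5"]}]},
    {"Id": "c1", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1},
    {"Id": "c2", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2},
    {"Id": "c3", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 3,
     "Relationships": [{"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "c4", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 4},
    {"Id": "c5", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 5},
    {"Id": "w1", "BlockType": "WORD", "Text": "Balance"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Sheet"},
]

by_id = {block["Id"]: block for block in sample_blocks}
title = merged_cell_text(by_id["merged-1"], by_id)
print(title)
```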

The `TABLE_TITLE` and `TABLE_FOOTER` block types contain information about title and footer cells, including `CHILD` relationships that point to the `WORD` blocks that make up the title or footer. This is shown in the following JSON response. 

In this example, the title is an in-table title, meaning it is found within the structure of the table itself, as opposed to outside of the table as a floating title. This means that the title also has a `CELL` block type that contains the child IDs of the word blocks that make up the title. See the previous JSON output for the five cell blocks that comprise the merged title cell, which includes the title cell block with the child IDs of the word blocks. The footer cells for this table would also be represented by cell blocks for each footer.

```
{
    "BlockType": "TABLE_TITLE",
    "Confidence": 97.802734375,
    "Geometry": {...},
    "Id": "cde34920-0131-4e68-a3ec-82922269afd4",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "998394ef-c6cf-491b-9bac-ec470c638ecd",
                "1c875a06-f8e5-4df7-8f6a-583c47cbd9fe"
            ]
        }
    ]
},
{
    "BlockType": "TABLE_FOOTER",
    "Confidence": 88.0859375,
    "Geometry": {...},
    "Id": "11dfd98c-6140-49e8-a544-e220d76bdd2f",
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "77a70b2d-c137-4161-8d9c-65170266e5ff",
                "d413ef1f-fa1b-44cb-87ed-809494fc87d8",
                "19616f50-1a34-431f-94bf-7e575106cd85",
                "35063ea4-a3c7-4e19-9d32-10eca92807b8",
                "48de1523-7776-49ef-96d9-fc19bcde89c5"
            ]
        }
    ]
},
```

# Selection Elements
<a name="how-it-works-selectables"></a>

Amazon Textract can detect selection elements such as option buttons (radio buttons), check boxes, and underlined or circled text on a document page. Selection elements can be detected in [form data](how-it-works-kvp.md) and in [tables](how-it-works-tables.md). For example, when the following table is detected on a form, Amazon Textract detects the check boxes in the table cells.


|  | Agree | Neutral | Disagree |
| --- | --- | --- | --- |
|  **Good Service**  |  ☑  |  ☐  |  ☐  | 
|  **Easy to Use**  |  ☐  |  ☑  |  ☐  | 
|  **Fair Price**  |  ☑  |  ☐  |  ☐  | 

Detected selection elements are returned as [Block](API_Block.md) objects in the responses from [AnalyzeDocument](API_AnalyzeDocument.md) and [GetDocumentAnalysis](API_GetDocumentAnalysis.md).

Below is a table that provides examples of the different selectable types supported by Amazon Textract.


| Selectable Type | Example |
| --- | --- |
| Radio Button |  Yes ○ No ●  | 
| Checkbox | Yes ☐ No ☑ | 
| Underlined Words | Yes *No* | 
| Circled Words |  ![\[Two buttons labeled "Yes" and "No" for making a binary choice.\]](http://docs.aws.amazon.com/textract/latest/dg/images/circleclickable.png) | 
| Crossed Out Words | ![\[Two buttons labeled "Yes" and "No", with "No" crossed out in blue.\]](http://docs.aws.amazon.com/textract/latest/dg/images/cutclickable.png) | 

Additionally, Amazon Textract can detect implicit clickables, which are clickables structured as questions and answered by marking one of several answers. These are returned in the same way as other clickables.

**Note**  
You can use the `FeatureTypes` input parameter to retrieve information about key-value pairs, tables, or both. For example, if you filter on tables, the response includes the selection elements that are detected in tables. Selection elements that are detected in key-value pairs aren't included in the response.
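
As a rough sketch of the filtering described in this note, the following Python builds the request parameters for `AnalyzeDocument` with only tables requested. The helper name `build_analyze_request`, the bucket name, and the document name are hypothetical placeholders. The commented-out call assumes the AWS SDK for Python (Boto3) is installed and credentials are configured.

```python
def build_analyze_request(bucket, document_name, feature_types):
    """Build keyword arguments for an AnalyzeDocument call on an S3 object."""
    return {
        "Document": {"S3Object": {"Bucket": bucket, "Name": document_name}},
        "FeatureTypes": list(feature_types),
    }


# Hypothetical bucket and document names, for illustration only.
request = build_analyze_request("amzn-s3-demo-bucket", "survey.png", ["TABLES"])
print(request["FeatureTypes"])

# With Boto3 available, the request could be sent like this:
# import boto3
# response = boto3.client("textract").analyze_document(**request)
```

With only `TABLES` requested, selection elements detected in key-value pairs would not appear in the response. Passing `["TABLES", "FORMS"]` requests both.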

Information about a selection element is contained in a `Block` object of type `SELECTION_ELEMENT`. To determine the status of a selection element, use the `SelectionStatus` field of the `SELECTION_ELEMENT` block. The status can be either *SELECTED* or *NOT_SELECTED*. For example, the value of `SelectionStatus` for the previous image is *SELECTED*.

A `SELECTION_ELEMENT` `Block` object is associated with either a key-value pair or a table cell. A `SELECTION_ELEMENT` `Block` object contains bounding box information for a selection element in the `Geometry` field. A `SELECTION_ELEMENT` `Block` object isn't a child of a `PAGE` `Block` object.
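
As a minimal sketch of reading the status, the following Python collects every `SELECTION_ELEMENT` block from a response that has already been parsed into a dictionary. The helper name `selection_statuses` and the trimmed sample response (with made-up short Ids) are illustrative assumptions, not part of the Amazon Textract API.

```python
def selection_statuses(response):
    """Return (block Id, SelectionStatus) pairs for every SELECTION_ELEMENT block."""
    return [
        (block["Id"], block["SelectionStatus"])
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "SELECTION_ELEMENT"
    ]


# Hand-written sample that mimics the shape of a document analysis response.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "page-1"},
        {"BlockType": "SELECTION_ELEMENT", "Id": "sel-1", "SelectionStatus": "SELECTED"},
        {"BlockType": "SELECTION_ELEMENT", "Id": "sel-2", "SelectionStatus": "NOT_SELECTED"},
    ]
}

for block_id, status in selection_statuses(sample_response):
    print(block_id, status)
```

This scan finds the elements but not what they belong to. To learn which key or table cell an element is attached to, follow the relationships described in the following sections.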

## Form Data (Key-Value Pairs)
<a name="how-it-works-selectable-kvp"></a>

A key-value pair is used to represent a selection element that's detected on a form. The `KEY` block contains the text for the selection element. The `VALUE` block contains the `SELECTION_ELEMENT` block. The following diagram shows how selection elements are represented by [Block](API_Block.md) objects.

![\[Diagram depicting the representation of a selection element on a form using a key-value pair data structure, with KEY containing the text, VALUE containing the SELECTION_ELEMENT, and PAGE as the parent object.\]](http://docs.aws.amazon.com/textract/latest/dg/images/hieroglyph-key-value-set-selectable.png)


For more information about key-value pairs, see [Form Data (Key-Value Pairs)](how-it-works-kvp.md).

The following JSON snippet shows the key for a key-value pair that contains a selection element (**male ☑**). The child ID (Id bd14cfd5-9005-498b-a7f3-45ceb171f0ff) is the ID of the WORD block that contains the text for the selection element (*male*). The value ID (Id 24aaac7f-fcce-49c7-a4f0-3688b05586d4) is the ID of the `VALUE` block that contains the `SELECTION_ELEMENT` block object.

```
{
    "Relationships": [
        {
            "Type": "VALUE", 
            "Ids": [
                "24aaac7f-fcce-49c7-a4f0-3688b05586d4"  // Value containing Selection Element
            ]
        }, 
        {
            "Type": "CHILD", 
            "Ids": [
                "bd14cfd5-9005-498b-a7f3-45ceb171f0ff"  // WORD - male
            ]
        }
    ], 
    "Confidence": 94.15619659423828, 
    "Geometry": {
        "BoundingBox": {
            "Width": 0.022914813831448555, 
            "Top": 0.08072036504745483, 
            "Left": 0.18966935575008392, 
            "Height": 0.014860388822853565
        }, 
        "Polygon": [
            {
                "Y": 0.08072036504745483, 
                "X": 0.18966935575008392
            }, 
            {
                "Y": 0.08072036504745483, 
                "X": 0.21258416771888733
            }, 
            {
                "Y": 0.09558075666427612, 
                "X": 0.21258416771888733
            }, 
            {
                "Y": 0.09558075666427612, 
                "X": 0.18966935575008392
            }
        ]
    }, 
    "BlockType": "KEY_VALUE_SET", 
    "EntityTypes": [
        "KEY"
    ], 
    "Id": "a118dc43-d5f7-49a2-a20a-5f876d9ffd79"
}
```

The following JSON snippet is the WORD block for the word *Male*. The WORD block also has a parent LINE block.

```
{
    "Geometry": {
        "BoundingBox": {
            "Width": 0.022464623674750328, 
            "Top": 0.07842985540628433, 
            "Left": 0.18863198161125183, 
            "Height": 0.01617223583161831
        }, 
        "Polygon": [
            {
                "Y": 0.07842985540628433, 
                "X": 0.18863198161125183
            }, 
            {
                "Y": 0.07842985540628433, 
                "X": 0.2110965996980667
            }, 
            {
                "Y": 0.09460209310054779, 
                "X": 0.2110965996980667
            }, 
            {
                "Y": 0.09460209310054779, 
                "X": 0.18863198161125183
            }
        ]
    }, 
    "Text": "Male", 
    "BlockType": "WORD", 
    "Confidence": 54.06439208984375, 
    "Id": "bd14cfd5-9005-498b-a7f3-45ceb171f0ff"
},
```

The VALUE block has a child (Id f2f5e8cd-e73a-4e99-a095-053acd3b6bfb) that is the `SELECTION_ELEMENT` block.

```
{
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "f2f5e8cd-e73a-4e99-a095-053acd3b6bfb"  // Selection element
            ]
        }
    ], 
    "Confidence": 94.15619659423828, 
    "Geometry": {
        "BoundingBox": {
            "Width": 0.017281491309404373, 
            "Top": 0.07643391191959381, 
            "Left": 0.2271782010793686, 
            "Height": 0.026274094358086586
        }, 
        "Polygon": [
            {
                "Y": 0.07643391191959381, 
                "X": 0.2271782010793686
            }, 
            {
                "Y": 0.07643391191959381, 
                "X": 0.24445968866348267
            }, 
            {
                "Y": 0.10270800441503525, 
                "X": 0.24445968866348267
            }, 
            {
                "Y": 0.10270800441503525, 
                "X": 0.2271782010793686
            }
        ]
    }, 
    "BlockType": "KEY_VALUE_SET", 
    "EntityTypes": [
        "VALUE"
    ], 
    "Id": "24aaac7f-fcce-49c7-a4f0-3688b05586d4"
}
```

The following JSON is the `SELECTION_ELEMENT` block. The value of `SelectionStatus` indicates that the check box is selected.

```
{
    "Geometry": {
        "BoundingBox": {
            "Width": 0.020316146314144135, 
            "Top": 0.07575977593660355, 
            "Left": 0.22590067982673645, 
            "Height": 0.027631107717752457
        }, 
        "Polygon": [
            {
                "Y": 0.07575977593660355, 
                "X": 0.22590067982673645
            }, 
            {
                "Y": 0.07575977593660355, 
                "X": 0.2462168186903
            }, 
            {
                "Y": 0.1033908873796463, 
                "X": 0.2462168186903
            }, 
            {
                "Y": 0.1033908873796463, 
                "X": 0.22590067982673645
            }
        ]
    }, 
    "BlockType": "SELECTION_ELEMENT", 
    "SelectionStatus": "SELECTED", 
    "Confidence": 74.14942932128906, 
    "Id": "f2f5e8cd-e73a-4e99-a095-053acd3b6bfb"
}
```
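
The chain shown above (KEY block, then VALUE block, then `SELECTION_ELEMENT` block) can be walked in code. The following Python is a sketch under assumptions: the helper names `related_ids` and `key_selection_status` are hypothetical, and the sample blocks reuse the Ids from the preceding snippets with `Geometry` and `Confidence` omitted for brevity.

```python
def related_ids(block, relationship_type):
    """Collect the Ids listed under one relationship type of a block."""
    ids = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] == relationship_type:
            ids.extend(relationship["Ids"])
    return ids


def key_selection_status(key_block, blocks_by_id):
    """Return (key text, SelectionStatus) for a KEY block.

    The status is None when the VALUE block holds no selection element.
    """
    words = [blocks_by_id[i] for i in related_ids(key_block, "CHILD")]
    key_text = " ".join(w["Text"] for w in words if w["BlockType"] == "WORD")
    for value_id in related_ids(key_block, "VALUE"):
        for child_id in related_ids(blocks_by_id[value_id], "CHILD"):
            child = blocks_by_id[child_id]
            if child["BlockType"] == "SELECTION_ELEMENT":
                return key_text, child["SelectionStatus"]
    return key_text, None


# Blocks from the snippets above, reduced to the fields this sketch needs.
sample_blocks = [
    {
        "BlockType": "KEY_VALUE_SET",
        "EntityTypes": ["KEY"],
        "Id": "a118dc43-d5f7-49a2-a20a-5f876d9ffd79",
        "Relationships": [
            {"Type": "VALUE", "Ids": ["24aaac7f-fcce-49c7-a4f0-3688b05586d4"]},
            {"Type": "CHILD", "Ids": ["bd14cfd5-9005-498b-a7f3-45ceb171f0ff"]},
        ],
    },
    {"BlockType": "WORD", "Text": "Male", "Id": "bd14cfd5-9005-498b-a7f3-45ceb171f0ff"},
    {
        "BlockType": "KEY_VALUE_SET",
        "EntityTypes": ["VALUE"],
        "Id": "24aaac7f-fcce-49c7-a4f0-3688b05586d4",
        "Relationships": [
            {"Type": "CHILD", "Ids": ["f2f5e8cd-e73a-4e99-a095-053acd3b6bfb"]}
        ],
    },
    {
        "BlockType": "SELECTION_ELEMENT",
        "SelectionStatus": "SELECTED",
        "Id": "f2f5e8cd-e73a-4e99-a095-053acd3b6bfb",
    },
]

blocks_by_id = {block["Id"]: block for block in sample_blocks}
key_block = blocks_by_id["a118dc43-d5f7-49a2-a20a-5f876d9ffd79"]
print(key_selection_status(key_block, blocks_by_id))  # ('Male', 'SELECTED')
```

Building the `blocks_by_id` dictionary once keeps each relationship lookup constant-time, which helps when a response contains many blocks.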

## Table Cells
<a name="how-it-works-selectable-table"></a>

Amazon Textract can detect selection elements inside a table cell. For example, the cells in the following table have check boxes.


|  |  |  |  | 
| --- |--- |--- |--- |
|     |  **Agree**  |  **Neutral**  |  **Disagree**  | 
|  **Good Service**  |  ☑  |  ☐  |  ☐  | 
|  **Easy to Use**  |  ☐  |  ☑  |  ☐  | 
|  **Fair Price**  |  ☑  |  ☐  |  ☐  | 

A `CELL` block can contain child `SELECTION_ELEMENT` objects for selection elements and child `WORD` blocks for detected text.

![\[Diagram showing a hierarchical structure of a page layout with components: page, table, cell, word, and selection element.\]](http://docs.aws.amazon.com/textract/latest/dg/images/hieroglyph-table-cell-selectable.png)


For more information about tables, see [Tables](how-it-works-tables.md).

The TABLE `Block` object for the previous table looks similar to this.

```
{
    "Geometry": {.....}, 
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "652c09eb-8945-473d-b1be-fa03ac055928", 
                "37efc5cc-946d-42cd-aa04-e68e5ed4741d", 
                "4a44940a-435a-4c5c-8a6a-7fea341fa295", 
                "2de20014-9a3b-4e26-b453-0de755144b1a", 
                "8ed78aeb-5c9a-4980-b669-9e08b28671d2", 
                "1f8e1c68-2c97-47b2-847c-a19619c02ca9", 
                "9927e1d1-6018-4960-ac17-aadb0a94f4d9", 
                "68f0ed8b-a887-42a5-b618-f68b494a6034", 
                "fcba16e0-6bd7-4ea5-b86e-36e8330b68ea", 
                "2250357c-ae34-4ed9-86da-45dac5a5e903", 
                "c63ad40d-5a14-4646-a8df-2d4304213dbc",   // Cell
                "2b8417dc-e65f-4fcd-aa0f-61a23f1e8cb0", 
                "26c62932-72f0-4dc2-9893-1ae27829c060", 
                "27f291cc-abf4-4c23-aa24-676abe99cb1e", 
                "7e5ce028-1bcd-4d9f-ad42-15ac181c5b47", 
                "bf32e3d2-efa2-4fc1-b09b-ab9cc52ff734"
            ]
        }
    ], 
    "BlockType": "TABLE", 
    "Confidence": 99.99993896484375, 
    "Id": "f66eac36-2e74-406e-8032-14d1c14e0b86"
}
```

The CELL `Block` object (Id c63ad40d-5a14-4646-a8df-2d4304213dbc) for the cell that contains the check box *Good Service* looks like the following. It includes a child `Block` (Id 26d122fd-c5f4-4b53-92c4-0ae92730ee1e) that is the `SELECTION_ELEMENT` `Block` object for the check box.

```
{
    "Geometry": {.....}, 
    "Relationships": [
        {
            "Type": "CHILD", 
            "Ids": [
                "26d122fd-c5f4-4b53-92c4-0ae92730ee1e"  // Selection Element
            ]
        }
    ], 
    "Confidence": 79.741689682006836, 
    "RowSpan": 1, 
    "RowIndex": 3, 
    "ColumnIndex": 3, 
    "ColumnSpan": 1, 
    "BlockType": "CELL", 
    "Id": "c63ad40d-5a14-4646-a8df-2d4304213dbc"
}
```

The `SELECTION_ELEMENT` `Block` object for the check box is as follows. The value of `SelectionStatus` indicates that the check box is selected.

```
{
    "Geometry": {.......}, 
    "BlockType": "SELECTION_ELEMENT", 
    "SelectionStatus": "SELECTED", 
    "Confidence": 88.79517364501953, 
    "Id": "26d122fd-c5f4-4b53-92c4-0ae92730ee1e"
}
```
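
To illustrate how the TABLE, CELL, and `SELECTION_ELEMENT` blocks above fit together, the following Python sketch maps each cell position to the status of the selection element it contains. The helper names `child_ids` and `cell_selections` are hypothetical. The sample data reuses the Ids from the preceding snippets, lists only one of the table's child cells, and omits `Geometry` and `Confidence`.

```python
def child_ids(block):
    """Collect the Ids listed under the CHILD relationships of a block."""
    ids = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] == "CHILD":
            ids.extend(relationship["Ids"])
    return ids


def cell_selections(table_block, blocks_by_id):
    """Map (RowIndex, ColumnIndex) to SelectionStatus for cells with a selection element."""
    selections = {}
    for cell_id in child_ids(table_block):
        cell = blocks_by_id.get(cell_id)
        if cell is None or cell["BlockType"] != "CELL":
            continue
        for element_id in child_ids(cell):
            element = blocks_by_id.get(element_id)
            if element is not None and element["BlockType"] == "SELECTION_ELEMENT":
                position = (cell["RowIndex"], cell["ColumnIndex"])
                selections[position] = element["SelectionStatus"]
    return selections


# Blocks from the snippets above, reduced to the fields this sketch needs.
sample_blocks = [
    {
        "BlockType": "TABLE",
        "Id": "f66eac36-2e74-406e-8032-14d1c14e0b86",
        "Relationships": [
            {"Type": "CHILD", "Ids": ["c63ad40d-5a14-4646-a8df-2d4304213dbc"]}
        ],
    },
    {
        "BlockType": "CELL",
        "RowIndex": 3,
        "ColumnIndex": 3,
        "Id": "c63ad40d-5a14-4646-a8df-2d4304213dbc",
        "Relationships": [
            {"Type": "CHILD", "Ids": ["26d122fd-c5f4-4b53-92c4-0ae92730ee1e"]}
        ],
    },
    {
        "BlockType": "SELECTION_ELEMENT",
        "SelectionStatus": "SELECTED",
        "Id": "26d122fd-c5f4-4b53-92c4-0ae92730ee1e",
    },
]

blocks_by_id = {block["Id"]: block for block in sample_blocks}
table_block = blocks_by_id["f66eac36-2e74-406e-8032-14d1c14e0b86"]
print(cell_selections(table_block, blocks_by_id))  # {(3, 3): 'SELECTED'}
```

Keying the result by row and column index makes it straightforward to rebuild the grid of check boxes, for example to find which column is selected in each row.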

# Queries
<a name="queryresponse"></a>

When you provide a query, Amazon Textract returns a specialized response object. This object repeats the question back to the user along with the alias for the question. It then provides the confidence that Amazon Textract has in the answer, the location of the answer on the page, and the text answer to the question. If no answer is found, this response element is left blank.

Detected queries are returned as `Block` objects in the responses from `AnalyzeDocument` and `GetDocumentAnalysis`. You can use the `FeatureTypes` input parameter to retrieve information about key-value pairs, tables, or queries. For general information about how a document is represented by `Block` objects, see [Text Detection and Document Analysis Response Objects](how-it-works-document-layout.md).

The following shows a diagram of how a query response is represented in `Block` objects.

![\[Diagram showing a query and two responses flowing from a page, with one response redirecting to an answer.\]](http://docs.aws.amazon.com/textract/latest/dg/images/query-response-image.png)


The following is an example of a query response, shown as part of a full document analysis response.

```
{
    "BlockType": "QUERY",
    "Id": "77cfbd28-168a-40fc-9c8a-863ba3066bd2",
    "Relationships": [
        {
            "Type": "ANSWER",
            "Ids": [
                "21396475-27ee-4da7-965b-f7631ef60fcc"
            ]
        }
    ],
    "Query": {
        "Text": "What is the patient first name?",
        "Alias": "PATIENT_FIRST_NAME"
    }
},
{
    "BlockType": "QUERY_RESULT",
    "Confidence": 1.0,
    "Text": "ALEJANDRO",
    "Id": "21396475-27ee-4da7-965b-f7631ef60fcc"
}
```
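
The `ANSWER` relationship in the example above can be followed to pair each query with its result. The following Python is a minimal sketch: the helper name `query_answers` is hypothetical, and the second sample query (with the made-up Id `query-without-answer`) is added only to show the unanswered case. The sketch assumes that an unanswered query has no `ANSWER` relationship, which you should confirm against your own responses.

```python
def query_answers(blocks):
    """Map each query alias (or its text, if no alias) to a list of (Text, Confidence) results."""
    blocks_by_id = {block["Id"]: block for block in blocks}
    answers = {}
    for block in blocks:
        if block["BlockType"] != "QUERY":
            continue
        label = block["Query"].get("Alias") or block["Query"]["Text"]
        results = []
        for relationship in block.get("Relationships", []):
            if relationship["Type"] != "ANSWER":
                continue
            for result_id in relationship["Ids"]:
                result = blocks_by_id.get(result_id)
                if result is not None:
                    results.append((result.get("Text", ""), result.get("Confidence")))
        answers[label] = results
    return answers


sample_blocks = [
    {
        "BlockType": "QUERY",
        "Id": "77cfbd28-168a-40fc-9c8a-863ba3066bd2",
        "Relationships": [
            {"Type": "ANSWER", "Ids": ["21396475-27ee-4da7-965b-f7631ef60fcc"]}
        ],
        "Query": {
            "Text": "What is the patient first name?",
            "Alias": "PATIENT_FIRST_NAME",
        },
    },
    {
        "BlockType": "QUERY_RESULT",
        "Confidence": 1.0,
        "Text": "ALEJANDRO",
        "Id": "21396475-27ee-4da7-965b-f7631ef60fcc",
    },
    # Hypothetical unanswered query, not taken from a real response.
    {
        "BlockType": "QUERY",
        "Id": "query-without-answer",
        "Query": {
            "Text": "What is the patient last name?",
            "Alias": "PATIENT_LAST_NAME",
        },
    },
]

print(query_answers(sample_blocks))
# {'PATIENT_FIRST_NAME': [('ALEJANDRO', 1.0)], 'PATIENT_LAST_NAME': []}
```

Returning a list per alias leaves room for a query that links to more than one `QUERY_RESULT` block. An empty list signals that no answer was found.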

We have compiled a list of example queries for common documents in the [Example Queries document](samples/Example%20Queries.zip).