# Ontology linking
<a name="comprehendmedical-ontologies"></a>

Use Amazon Comprehend Medical to detect entities in clinical text and link those entities to concepts in standardized medical ontologies, including the RxNorm, ICD-10-CM, and SNOMED CT knowledge bases. You can analyze single files, or run a batch analysis on large documents or multiple files stored in an Amazon Simple Storage Service (Amazon S3) bucket.

# ICD-10-CM linking
<a name="ontology-icd10"></a>

Use **InferICD10CM** to detect possible medical conditions as entities and link them to codes from the 2026 version of the [International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM)](https://www.cdc.gov/nchs/icd/icd-10-cm/?CDC_AAref_Val=https://www.cdc.gov/nchs/icd/icd-10-cm.htm). The ICD-10-CM is provided by the US Centers for Disease Control and Prevention (CDC).

When medical conditions are detected, `InferICD10CM` returns the matching ICD-10-CM codes and descriptions. The detected conditions are listed in descending order of confidence. The scores indicate the confidence in the accuracy of the entities matched to the concepts found in the text. Related information such as family history, signs, symptoms, and negation are recognized as traits. Additional information such as anatomical designations and acuity are listed as attributes.

InferICD10CM is well-suited for the following scenarios:
+ Assisting the professional medical coding of patient records
+ Clinical studies and trials
+ Integration with a medical software system 
+ Early detection and diagnosis 
+ Population health management 

## ICD-10-CM category
<a name="icd10-cm-category"></a>

**InferICD10CM** detects entities in the `MEDICAL_CONDITION` category. Additional related information is also detected and linked as attributes or traits.

## ICD-10-CM types
<a name="icd10-cm-type"></a>

 **InferICD10CM** detects entities of the types `DX_NAME` and `TIME_EXPRESSION`.

## ICD-10-CM traits
<a name="icd10-cm-traits"></a>

**InferICD10CM** detects the following contextual information as traits: 
+ `DIAGNOSIS`: An identification of a medical condition that is determined by evaluation of the symptoms.
+ `HYPOTHETICAL`: An indication that a medical condition is expressed as a hypothesis.
+ `LOW_CONFIDENCE`: An indication that a medical condition is expressed as having high uncertainty. This is not directly related to the confidence scores provided.
+ `NEGATION`: An indication that a medical condition is not present.
+ `PERTAINS_TO_FAMILY`: An indication that a medical condition is relevant to the patient’s family, not the patient.
+ `SIGN`: A medical condition that is reported by the physician.
+ `SYMPTOM`: A medical condition that is reported by the patient.

## ICD-10-CM attributes
<a name="icd10-cm-attributes"></a>

**InferICD10CM** detects the following contextual information as attributes: 
+ `DIRECTION`: Directional terms. For example, left, right, medial, lateral, upper, lower, posterior, anterior, distal, proximal, contralateral, bilateral, ipsilateral, dorsal, or ventral.
+ `SYSTEM_ORGAN_SITE`: Anatomical location.
+ `ACUITY`: Determination of a disease instance, such as chronic, acute, sudden, persistent, or gradual. This only applies to the `MEDICAL_CONDITION` category.
+ `QUALITY`: Any descriptive term of the medical condition, such as stage or grade. 

## Time expression category
<a name="time-expression-icd10-cm"></a>

The `TIME_EXPRESSION` category detects entities related to time. This includes entities such as dates, and time expressions such as "three days ago," "today," "currently," "day of admission," "last month," or "16 days." Results in this category are only returned if they are associated with an entity. For example, the expression, "Yesterday, the patient was diagnosed with influenza" would return `Yesterday` as a `TIME_EXPRESSION` entity that overlaps with the `DX_NAME` entity, "influenza." However, "yesterday" would not be recognized as an entity in the expression, "yesterday, the patient walked their dog."

## Types
<a name="time-expression-icd10cm-categories"></a>

The recognized type of `TIME_EXPRESSION` is `TIME_TO_DX_NAME`: the date that a medical condition occurred. The attribute for this type is `DX_NAME`.

## Relationship type
<a name="time-expression-icd10cm-relationship-type"></a>

The `RELATIONSHIP_TYPE` refers to the relationship between an entity and an attribute. The recognized `RELATIONSHIP_TYPE` is `OVERLAP`: the `TIME_EXPRESSION` concurs with the entity detected.
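
As a rough illustration of the relationship described above, the following Python sketch pairs each time expression with the condition it overlaps. It assumes a response in which a `TIME_EXPRESSION` entity carries the related `DX_NAME` entry in its `Attributes` list with a `RelationshipType` of `OVERLAP`. That shape is an assumption based on this section, so confirm it against the API reference. The function name `time_linked_conditions` and the sample data are hypothetical.

```python
def time_linked_conditions(response):
    """Return (time expression, condition) pairs joined by an OVERLAP relationship."""
    pairs = []
    for entity in response.get("Entities", []):
        if entity.get("Category") != "TIME_EXPRESSION":
            continue
        for attr in entity.get("Attributes", []):
            if attr.get("RelationshipType") == "OVERLAP":
                pairs.append((entity["Text"], attr["Text"]))
    return pairs


# Hypothetical, trimmed response for the "Yesterday ... influenza" sentence.
sample = {
    "Entities": [
        {
            "Text": "Yesterday",
            "Category": "TIME_EXPRESSION",
            "Type": "TIME_TO_DX_NAME",
            "Attributes": [
                {"Type": "DX_NAME", "RelationshipType": "OVERLAP", "Text": "influenza"}
            ],
        },
        {
            "Text": "influenza",
            "Category": "MEDICAL_CONDITION",
            "Type": "DX_NAME",
            "Attributes": [],
        },
    ]
}

print(time_linked_conditions(sample))
```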

## Input and response examples
<a name="icd10cminput-med"></a>

**Note**  
For specific API input and response syntax, see [InferICD10CM](https://docs.aws.amazon.com/comprehend-medical/latest/api/API_InferICD10CM.html) in the *Amazon Comprehend Medical API Reference*.

The following example input text shows how the `InferICD10CM` operation works. To view all input text, scroll over the **Copy** button.

```
"The patient is a 71-year-old female patient of Dr. X. The patient presented to the emergency room last evening with approximately 7 to 8 day history of abdominal pain which has been persistent. She has had no nausea and vomiting, but has had persistent associated anorexia. She is passing flatus, but had some obstipation symptoms with the last bowel movement two days ago. She denies any bright red blood per rectum and no history of recent melena. Her last colonoscopy was approximately 5 years ago with Dr. Y. She has had no definite fevers or chills and no history of jaundice. The patient denies any significant recent weight loss."
```

The `InferICD10CM` operation returns the following output in JSON format (abbreviated).

```
{
    "Entities": [
        {
            "Id": 1,
            "Text": "abdominal pain",
            "Category": "MEDICAL_CONDITION",
            "Type": "DX_NAME",
            "Score": Float,
            "BeginOffset": 153,
            "EndOffset": 167,
            "Attributes": [
                {
                    "Type": "ACUITY",
                    "Score": Float,
                    "RelationshipScore": Float,
                    "Id": 2,
                    "BeginOffset": 183,
                    "EndOffset": 193,
                    "Text": "persistent",
                    "Traits": []
                }
            ],
            "Traits": [
                {
                    "Name": "SYMPTOM",
                    "Score": Float
                }
            ],
            "ICD10CMConcepts": [
                {
                    "Description": "Unspecified abdominal pain",
                    "Code": "R10.9",
                    "Score": Float
                },
                {
                    "Description": "Epigastric pain",
                    "Code": "R10.13",
                    "Score": Float
                },
                {
                    "Description": "Lower abdominal pain, unspecified",
                    "Code": "R10.30",
                    "Score": Float
                },
                {
                    "Description": "Generalized abdominal pain",
                    "Code": "R10.84",
                    "Score": Float
                },
                {
                    "Description": "Upper abdominal pain, unspecified",
                    "Code": "R10.10",
                    "Score": Float
                }
            ]
        }
        ...
    ],
    "ModelVersion": "3.3.0.20251001"
}
```
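
As a minimal sketch of consuming this response in Python, the helper below picks the highest-scoring ICD-10-CM concept for each entity. The live call assumes the AWS SDK for Python (boto3) `comprehendmedical` client and its `infer_icd10_cm` method. The function names `infer_icd10cm` and `best_codes`, the default Region, and the sample scores are hypothetical and used only for illustration.

```python
def infer_icd10cm(text, region_name="us-east-1"):
    """Call InferICD10CM through boto3 (requires AWS credentials)."""
    import boto3  # imported lazily so the helper below also works offline

    client = boto3.client("comprehendmedical", region_name=region_name)
    return client.infer_icd10_cm(Text=text)


def best_codes(response):
    """Return (entity text, code, description) for the top concept of each entity."""
    results = []
    for entity in response.get("Entities", []):
        concepts = entity.get("ICD10CMConcepts", [])
        if not concepts:
            continue
        # Select the highest score explicitly instead of relying on list order.
        top = max(concepts, key=lambda c: c["Score"])
        results.append((entity["Text"], top["Code"], top["Description"]))
    return results


# Hypothetical response shaped like the example above (scores are made up).
sample = {
    "Entities": [
        {
            "Text": "abdominal pain",
            "ICD10CMConcepts": [
                {"Description": "Unspecified abdominal pain", "Code": "R10.9", "Score": 0.67},
                {"Description": "Epigastric pain", "Code": "R10.13", "Score": 0.49},
            ],
        }
    ]
}

print(best_codes(sample))
```

With live credentials, `best_codes(infer_icd10cm(text))` would apply the same selection to a real response.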

`InferICD10CM` also recognizes when an entity is negated in text. For instance, if a patient is not experiencing a symptom, both `SYMPTOM` and `NEGATION` are identified as traits, and each is listed with a confidence score. Based on the input for the previous example, the entity `nausea` is returned with the `NEGATION` trait because the patient isn't experiencing nausea.

```
{
    "Id": 3,
    "Text": "nausea",
    "Category": "MEDICAL_CONDITION",
    "Type": "DX_NAME",
    "Score": Float,
    "BeginOffset": 210,
    "EndOffset": 216,
    "Attributes": [],
    "Traits": [
        {
            "Name": "SYMPTOM",
            "Score": Float
        },
        {
            "Name": "NEGATION",
            "Score": Float
        }
    ],
    "ICD10CMConcepts": [
        {
            "Description": "Nausea with vomiting, unspecified",
            "Code": "R11.2",
            "Score": Float
        },
        {
            "Description": "Nausea",
            "Code": "R11.0",
            "Score": Float
        }
    ]
}
```
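
One way to act on the `NEGATION` trait is to drop negated entities before further processing. The following Python sketch shows that idea. The helper names `has_trait` and `present_conditions`, the `0.5` threshold, and the sample scores are hypothetical; choose a threshold that suits your use case.

```python
def has_trait(entity, name, min_score=0.0):
    """Return True if the entity carries the named trait at or above min_score."""
    return any(
        trait["Name"] == name and trait["Score"] >= min_score
        for trait in entity.get("Traits", [])
    )


def present_conditions(entities, negation_threshold=0.5):
    """Keep only the entities that are not confidently negated."""
    return [
        entity
        for entity in entities
        if not has_trait(entity, "NEGATION", negation_threshold)
    ]


# Hypothetical entities shaped like the examples above (scores are made up).
entities = [
    {
        "Text": "abdominal pain",
        "Traits": [{"Name": "SYMPTOM", "Score": 0.88}],
    },
    {
        "Text": "nausea",
        "Traits": [
            {"Name": "SYMPTOM", "Score": 0.91},
            {"Name": "NEGATION", "Score": 0.95},
        ],
    },
]

print([entity["Text"] for entity in present_conditions(entities)])
```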

# RxNorm linking
<a name="ontology-RxNorm"></a>

Use the **InferRxNorm** operation to identify medications that are listed in a patient record as entities. The operation also links those entities to concept identifiers (RxCUI) from [the RxNorm database from the National Library of Medicine](https://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html). The source for each RxCUI is the 2022-11-07 RxNorm and RxTerms Release. Each RxCUI is unique for different strengths and dose forms. Amazon Comprehend Medical lists the top potentially matching RxCUIs for each medication that it detects in descending order by confidence score. Use the RxCUI codes for downstream analysis that is not possible with unstructured text. Related information such as strength, frequency, dose, dose form, and route of administration is listed as attributes in JSON format.

 You can use **InferRxNorm** for the following scenarios:
+  Screening for medications the patient has taken. 
+  Preventing potentially negative reactions between newly prescribed drugs and drugs the patient is currently taking.
+  Screening for inclusion in clinical trials based on drug history using the RxCUI. 
+  Checking whether the dosage and frequency of a drug is appropriate. 
+  Screening for uses, indications, and side effects of drugs. 
+ Managing population health.

## Important notice
<a name="important-notice"></a>

The **InferRxNorm** operation of Amazon Comprehend Medical is not a substitute for professional medical advice, diagnosis, or treatment. Identify the right confidence threshold for your use case, and use high confidence thresholds in situations that require high accuracy. Use Amazon Comprehend Medical operations in patient care scenarios only *after* trained medical professionals have reviewed the results for accuracy and applied sound medical judgment.
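
To illustrate what applying a confidence threshold might look like, the following Python sketch keeps only medications and RxCUIs whose scores meet caller-supplied thresholds. The function name `confident_rxcuis`, the default threshold values, and the sample scores are hypothetical; they are not recommended settings.

```python
def confident_rxcuis(response, entity_threshold=0.9, concept_threshold=0.5):
    """Map each confidently detected medication to its RxCUIs above a threshold."""
    selected = {}
    for entity in response.get("Entities", []):
        if entity["Score"] < entity_threshold:
            continue  # skip medications the model is less sure it detected
        codes = [
            concept["Code"]
            for concept in entity.get("RxNormConcepts", [])
            if concept["Score"] >= concept_threshold
        ]
        if codes:
            selected[entity["Text"]] = codes
    return selected


# Hypothetical response (scores are made up for illustration).
sample = {
    "Entities": [
        {
            "Text": "warfarin",
            "Score": 0.99,
            "RxNormConcepts": [
                {"Description": "warfarin", "Code": "11289", "Score": 0.97},
                {"Description": "warfarin sodium 2 MG Oral Tablet", "Code": "855302", "Score": 0.21},
            ],
        },
        {
            "Text": "fluoride",
            "Score": 0.62,
            "RxNormConcepts": [
                {"Description": "sodium fluoride", "Code": "9873", "Score": 0.80},
            ],
        },
    ]
}

print(confident_rxcuis(sample))
```

Results that fall below the thresholds are candidates for human review rather than silent removal.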

## RxNorm category
<a name="medication-v2-rxnorm"></a>

**InferRxNorm** detects entities in the `MEDICATION` category. It also detects additional related information that is linked as attributes or traits.

## RxNorm types
<a name="medication-type-rxnorm"></a>

The types of entities in the `MEDICATION` category are:
+ `BRAND_NAME`: The copyrighted brand name of the medication or therapeutic agent.
+ `GENERIC_NAME`: Non-brand name, ingredient name, or formula mixture of the medication or therapeutic agent.

## RxNorm attributes
<a name="medication-attribute-rxnorm"></a>

**InferRxNorm** detects the following contextual information as attributes:
+ `DOSAGE`: The amount of medication ordered.
+ `DURATION`: How long the medication should be administered.
+ `FORM`: The form of the medication.
+ `FREQUENCY`: How often to administer the medication. 
+ `RATE`: The administration rate of the medication (primarily for medication infusions or IVs).
+ `ROUTE_OR_MODE`: The administration method of a medication.
+ `STRENGTH`: The medication strength.

## RxNorm traits
<a name="medication-trait-v2-rxnorm"></a>

**InferRxNorm** detects the following contextual information as traits:
+ `NEGATION`: Any indication that the patient is *not* taking a medication.
+ `PAST_HISTORY`: An indication that a medication detected is from the patient’s past (prior to current encounter).

## Input and response examples
<a name="rxnorminput"></a>

**Note**  
For specific API input and response syntax, see [InferRxNorm](https://docs.aws.amazon.com/comprehend-medical/latest/api/API_InferRxNorm.html) in the *Amazon Comprehend Medical API Reference*.

The following example input text shows how the `InferRxNorm` operation works. To view all input text, scroll over the **Copy** button.

```
"fluoride topical ( fluoride 1.1 % topical gel ) 1 application Topically daily Brush onto teeth before bed time , spit , do not rinse, eat or drink for 20-30 minutes"
```

The `InferRxNorm` operation returns the following output in JSON format:

```
{
    "Entities": [
        {
            "Id": 1,
            "Text": "fluoride",
            "Category": "MEDICATION",
            "Type": "GENERIC_NAME",
            "Score": Float,
            "BeginOffset": 19,
            "EndOffset": 27,
            "Attributes": [],
            "Traits": [],
            "RxNormConcepts": [
                {
                    "Description": "fluorine",
                    "Code": "1310123",
                    "Score": Float
                },
                {
                    "Description": "sodium fluoride",
                    "Code": "9873",
                    "Score": Float
                },
                {
                    "Description": "magnesium fluoride",
                    "Code": "1435860",
                    "Score": Float
                },
                {
                    "Description": "sulfuryl fluoride",
                    "Code": "2289224",
                    "Score": Float
                },
                {
                    "Description": "acidulated phosphate fluoride",
                    "Code": "236",
                    "Score": Float
                }
            ]
        }
    ],
    "ModelVersion": "3.3.0.20221107"
}
```

Using the following input text, the `InferRxNorm` operation recognizes the negation trait, too.

```
'patient is not on warfarin'
```

The `InferRxNorm` operation returns the following output in JSON format:

```
{
    "Entities": [
        {
            "Id": 1,
            "Text": "warfarin",
            "Category": "MEDICATION",
            "Type": "GENERIC_NAME",
            "Score": Float,
            "BeginOffset": 18,
            "EndOffset": 26,
            "Attributes": [],
            "Traits": [
                {
                    "Name": "NEGATION",
                    "Score": Float
                }
            ],
            "RxNormConcepts": [
                {
                    "Description": "warfarin",
                    "Code": "11289",
                    "Score": Float
                },
                {
                    "Description": "warfarin sodium 2 MG Oral Tablet",
                    "Code": "855302",
                    "Score": Float
                },
                {
                    "Description": "warfarin sodium 10 MG Oral Tablet",
                    "Code": "855296",
                    "Score": Float
                },
                {
                    "Description": "warfarin sodium 2 MG Oral Tablet [Coumadin]",
                    "Code": "855304",
                    "Score": Float
                },
                {
                    "Description": "warfarin sodium 10 MG Oral Tablet [Jantoven]",
                    "Code": "855300",
                    "Score": Float
                }
            ]
        }
    ],
    "ModelVersion": "3.3.0.20221107"
}
```
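
The following Python sketch shows one way to condense an `InferRxNorm` response into a short medication list that records the top RxCUI and whether the medication was negated or historical. The function name `medication_summary`, the output keys, and the sample scores are hypothetical. For a live call, the boto3 `comprehendmedical` client provides an `infer_rx_norm` method that takes the same `Text` parameter.

```python
def medication_summary(response):
    """Summarize each medication with its top RxCUI and trait flags."""
    summary = []
    for entity in response.get("Entities", []):
        if entity.get("Category") != "MEDICATION":
            continue
        traits = {trait["Name"] for trait in entity.get("Traits", [])}
        concepts = entity.get("RxNormConcepts", [])
        top = max(concepts, key=lambda c: c["Score"]) if concepts else None
        summary.append(
            {
                "text": entity["Text"],
                "rxcui": top["Code"] if top else None,
                "negated": "NEGATION" in traits,
                "past_history": "PAST_HISTORY" in traits,
            }
        )
    return summary


# Hypothetical response shaped like the warfarin example (scores are made up).
sample = {
    "Entities": [
        {
            "Text": "warfarin",
            "Category": "MEDICATION",
            "Type": "GENERIC_NAME",
            "Traits": [{"Name": "NEGATION", "Score": 0.96}],
            "RxNormConcepts": [
                {"Description": "warfarin", "Code": "11289", "Score": 0.95},
                {"Description": "warfarin sodium 2 MG Oral Tablet", "Code": "855302", "Score": 0.18},
            ],
        }
    ]
}

print(medication_summary(sample))
```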

# SNOMED CT linking
<a name="ontology-linking-snomed"></a>

 Use **InferSNOMEDCT** to detect medical entities and link them to concepts from the 2022-03 version of the Systematized Nomenclature of Medicine, Clinical Terms (SNOMED CT). SNOMED CT provides you with a comprehensive vocabulary of medical concepts, including medical conditions and anatomy, medical tests, treatments, and procedures. To learn more about SNOMED CT, visit [SNOMED CT](https://www.snomed.org/value-of-snomedct). 

For each detected medical entity, Amazon Comprehend Medical lists the top five SNOMED CT concept IDs and descriptions, each with a confidence score that indicates the model's confidence in its prediction. The concept IDs are listed in descending order of confidence. When you use them with the SNOMED CT poly-hierarchy, you can use the concept IDs to structure patient clinical data for medical coding, reporting, or clinical analytics.
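
As a small sketch of working with the ranked concepts, the Python helper below sorts an entity's `SNOMEDCTConcepts` by score and returns the concept IDs with their descriptions. The function name `ranked_concepts` and the sample scores are hypothetical. The sort is defensive, so the helper also works if the input is not already ordered.

```python
def ranked_concepts(entity, limit=5):
    """Return up to `limit` (concept ID, description) pairs, best score first."""
    concepts = sorted(
        entity.get("SNOMEDCTConcepts", []),
        key=lambda concept: concept["Score"],
        reverse=True,
    )
    return [(c["Code"], c["Description"]) for c in concepts[:limit]]


# Hypothetical entity (scores are made up; order deliberately shuffled).
entity = {
    "Text": "oropharyngeal lesion",
    "SNOMEDCTConcepts": [
        {"Code": "764786007", "Score": 0.12, "Description": "Oropharyngeal (intended site)"},
        {"Code": "33431000119109", "Score": 0.71, "Description": "Lesion of oropharynx (disorder)"},
        {"Code": "31389004", "Score": 0.34, "Description": "Oropharyngeal structure (body structure)"},
    ],
}

print(ranked_concepts(entity, limit=2))
```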

**InferSNOMEDCT** is available for customers in the US. For information on SNOMED CT in other countries, and information on SNOMED CT licensing, see [SNOMED CT](https://www.snomed.org/value-of-snomedct).

**InferSNOMEDCT** is well suited for the following scenarios:
+  Assistance for professional medical coding in patient records 
+  Clinical studies and trials 
+  Population health management

**InferSNOMEDCT** detects entities in the following categories. Additional contextual information is also detected and linked as attributes or traits.
+ `MEDICAL_CONDITION`: The signs, symptoms, and diagnoses of medical conditions. 
+ `ANATOMY`: The parts of the body or body systems and the locations of those parts or systems.
+ `TEST_TREATMENT_PROCEDURE`: The procedures that are used to determine a medical condition.

## Anatomy category
<a name="anatomy-snomed"></a>

The `ANATOMY` category detects references to the parts of the body or body systems and the locations of those parts or systems. 

### Attributes
<a name="anatomy-attributes-snomed"></a>

The following attributes are detected for the `ANATOMY` category:
+ `DIRECTION`: Directional terms. For example, left, right, medial, lateral, upper, lower, posterior, anterior, distal, proximal, contralateral, bilateral, ipsilateral, dorsal, or ventral.
+ `SYSTEM_ORGAN_SITE`: Body systems, anatomic locations or regions, and body sites.

## Medical condition category
<a name="snomed-med-cond"></a>

The `MEDICAL_CONDITION` category detects the signs, symptoms, and diagnoses of medical conditions.

### Type
<a name="med-cond-type-snomed"></a>

For the `MEDICAL_CONDITION` category, the following type is detected:
+ `DX_NAME`: An identification of a medical condition that is determined by evaluation of the symptoms.

### Attributes
<a name="med-cond-attributes-snomed"></a>

The following attributes are detected for the `MEDICAL_CONDITION` category:
+ `ACUITY`: Determination of disease instance, such as chronic, acute, sudden, persistent, or gradual.
+ `QUALITY`: Any descriptive term of the medical condition, such as stage or grade.
+ `DIRECTION`: Directional terms. For example, left, right, medial, lateral, upper, lower, posterior, anterior, distal, proximal, contralateral, bilateral, ipsilateral, dorsal, or ventral.
+ `SYSTEM_ORGAN_SITE`: Body systems, anatomic locations or regions, and body sites.

### Traits
<a name="med-cond-traits"></a>

The following traits are detected for the `MEDICAL_CONDITION` category:
+ `DIAGNOSIS`: A medical condition that is determined as the cause or result of the symptoms. Symptoms can be found through physical findings, laboratory or radiological reports, or other means. 
+ `HYPOTHETICAL`: An indication that a medical condition is expressed as a hypothesis.
+ `LOW_CONFIDENCE`: An indication that a medical condition is expressed as having high uncertainty. This is not directly related to the confidence scores provided.
+ `NEGATION`: An indication that a medical condition is not present.
+ `PERTAINS_TO_FAMILY`: An indication that a medical condition is relevant to the patient’s family, not the patient.
+ `SIGN`: A medical condition that is reported by the physician.
+ `SYMPTOM`: A medical condition that is reported by the patient.

## Test, treatment, and procedure category
<a name="ttt-snomed"></a>

The `TEST_TREATMENT_PROCEDURE` category detects the procedures that are used to determine a medical condition.

### Type
<a name="ttt-type-snomed"></a>

For the `TEST_TREATMENT_PROCEDURE` category, the following types are detected:
+ `PROCEDURE_NAME`: Interventions performed on the patient to treat a medical condition or to provide patient care.
+ `TEST_NAME`: Procedures performed on a patient for diagnosis, measurement, screening, or a rating that might have a resulting value. This includes any procedure, process, evaluation, or rating to determine a diagnosis, to rule out or find a condition, or to scale or score a patient.
+ `TREATMENT_NAME`: Interventions performed to combat a disease or disorder. This includes medications, such as antivirals and vaccinations.

### Attributes
<a name="ttt-attributes-snomed"></a>

For the `TEST_TREATMENT_PROCEDURE` category, the following attributes are detected:
+ `TEST_NAME`: The diagnostic test performed.
+ `TEST_VALUE`: The numeric results from a diagnostic test.
+ `TEST_UNIT`: The units associated with a `TEST_VALUE` result.
+ `PROCEDURE_NAME`: The name of a surgery or medical procedure performed.
+ `TREATMENT_NAME`: The name of a treatment administered to a patient.

### Traits
<a name="ttt-traits-snomed"></a>

For the `TEST_TREATMENT_PROCEDURE` category, the following traits are detected:
+ `FUTURE`: An indication that a test, treatment, or procedure refers to an action or event that will occur after the subject of the notes.
+ `HYPOTHETICAL`: An indication that a test, treatment, or procedure is expressed as a hypothesis.
+ `NEGATION`: An indication that a result or action is negative or not being performed.
+ `PAST_HISTORY`: An indication that a test, treatment, or procedure is from the patient’s past (prior to the current encounter).

## SNOMED CT details
<a name="snomed-details"></a>

The JSON response includes the SNOMED CT details, which contain the following information:
+ `EDITION`: Only the US edition is supported.
+ `VERSIONDATE`: The date stamp of the SNOMED CT version used.
+ `LANGUAGE`: Analysis of English (US-EN) text is supported.

## Input and response examples
<a name="snomed-example"></a>

**Note**  
For specific API input and response syntax, see [InferSNOMEDCT](https://docs.aws.amazon.com/comprehend-medical/latest/api/API_InferSNOMEDCT.html) in the *Amazon Comprehend Medical API Reference*.

The following example input text shows how the `InferSNOMEDCT` operation works. To view all input text, scroll over the **Copy** button.

```
"HEENT : Boggy inferior turbinates, No oropharyngeal lesion"
```

The `InferSNOMEDCT` operation returns the following output in JSON format.

```
{
    "Entities": [
        {
            "Category": "ANATOMY",
            "BeginOffset": 0,
            "EndOffset": 5,
            "Text": "HEENT",
            "Traits": [],
            "SNOMEDCTConcepts": [
                {
                    "Code": "69536005",
                    "Score": Float,
                    "Description": "Head structure (body structure)"
                },
                {
                    "Code": "429031000124106",
                    "Score": Float,
                    "Description": "Review of systems, head, ear, eyes, nose and throat (procedure)"
                },
                {
                    "Code": "385383008",
                    "Score": Float,
                    "Description": "Ear, nose and throat structure (body structure)"
                },
                {
                    "Code": "64237003",
                    "Score": Float,
                    "Description": "Structure of left half of head (body structure)"
                },
                {
                    "Code": "113028003",
                    "Score": Float,
                    "Description": "Ear, nose and throat examination (procedure)"
                }
            ],
            "Score": Float,
            "Attributes": [],
            "Type": "SYSTEM_ORGAN_SITE",
            "Id": 0
        },
        {
            "Category": "MEDICAL_CONDITION",
            "BeginOffset": 8,
            "EndOffset": 33,
            "Text": "Boggy inferior turbinates",
            "Traits": [
                {
                    "Score": Float,
                    "Name": "SIGN"
                }
            ],
            "SNOMEDCTConcepts": [
                {
                    "Code": "254477009",
                    "Score": Float,
                    "Description": "Tumor of inferior turbinate (disorder)"
                },
                {
                    "Code": "260762006",
                    "Score": Float,
                    "Description": "Choroidal invasion status (attribute)"
                },
                {
                    "Code": "2455009",
                    "Score": Float,
                    "Description": "Revision of lumbosubarachnoid shunt (procedure)"
                },
                {
                    "Code": "19883003",
                    "Score": Float,
                    "Description": "Atrophy of nasal turbinates (disorder)"
                },
                {
                    "Code": "256723009",
                    "Score": Float,
                    "Description": "Inferior turbinate flap (substance)"
                }
            ],
            "Score": Float,
            "Attributes": [
                {
                    "Category": "ANATOMY",
                    "RelationshipScore": Float,
                    "EndOffset": 5,
                    "Text": "HEENT",
                    "Traits": [],
                    "SNOMEDCTConcepts": [
                        {
                            "Code": "69536005",
                            "Score": Float,
                            "Description": "Head structure (body structure)"
                        },
                        {
                            "Code": "429031000124106",
                            "Score": Float,
                            "Description": "Review of systems, head, ear, eyes, nose and throat (procedure)"
                        },
                        {
                            "Code": "385383008",
                            "Score": Float,
                            "Description": "Ear, nose and throat structure (body structure)"
                        },
                        {
                            "Code": "64237003",
                            "Score": Float,
                            "Description": "Structure of left half of head (body structure)"
                        },
                        {
                            "Code": "113028003",
                            "Score": Float,
                            "Description": "Ear, nose and throat examination (procedure)"
                        }
                    ],
                    "Score": Float,
                    "RelationshipType": "SYSTEM_ORGAN_SITE",
                    "Type": "SYSTEM_ORGAN_SITE",
                    "Id": 0,
                    "BeginOffset": 0
                }
            ],
            "Type": "DX_NAME",
            "Id": 1
        },
        {
            "Category": "ANATOMY",
            "BeginOffset": 23,
            "EndOffset": 33,
            "Text": "turbinates",
            "Traits": [],
            "SNOMEDCTConcepts": [
                {
                    "Code": "310607007",
                    "Score": Float,
                    "Description": "Sarcoidosis of inferior turbinates (disorder)"
                },
                {
                    "Code": "80153006",
                    "Score": Float,
                    "Description": "Segmented neutrophil (cell)"
                },
                {
                    "Code": "46607005",
                    "Score": Float,
                    "Description": "Nasal turbinate structure (body structure)"
                },
                {
                    "Code": "6553002",
                    "Score": Float,
                    "Description": "Inferior nasal turbinate structure (body structure)"
                },
                {
                    "Code": "254477009",
                    "Score": Float,
                    "Description": "Tumor of inferior turbinate (disorder)"
                }
            ],
            "Score": Float,
            "Attributes": [],
            "Type": "SYSTEM_ORGAN_SITE",
            "Id": 3
        },
        {
            "Category": "ANATOMY",
            "BeginOffset": 39,
            "EndOffset": 52,
            "Text": "oropharyngeal",
            "Traits": [],
            "SNOMEDCTConcepts": [
                {
                    "Code": "31389004",
                    "Score": Float,
                    "Description": "Oropharyngeal structure (body structure)"
                },
                {
                    "Code": "33431000119109",
                    "Score": Float,
                    "Description": "Lesion of oropharynx (disorder)"
                },
                {
                    "Code": "263376008",
                    "Score": Float,
                    "Description": "Entire oropharynx (body structure)"
                },
                {
                    "Code": "716151000",
                    "Score": Float,
                    "Description": "Structure of oropharynx and/or hypopharynx and/or larynx (body structure)"
                },
                {
                    "Code": "764786007",
                    "Score": Float,
                    "Description": "Oropharyngeal (intended site)"
                }
            ],
            "Score": Float,
            "Attributes": [],
            "Type": "SYSTEM_ORGAN_SITE",
            "Id": 5
        },
        {
            "Category": "MEDICAL_CONDITION",
            "BeginOffset": 39,
            "EndOffset": 59,
            "Text": "oropharyngeal lesion",
            "Traits": [
                {
                    "Score": Float,
                    "Name": "SIGN"
                }
            ],
            "SNOMEDCTConcepts": [
                {
                    "Code": "31389004",
                    "Score": Float,
                    "Description": "Oropharyngeal structure (body structure)"
                },
                {
                    "Code": "33431000119109",
                    "Score": Float,
                    "Description": "Lesion of oropharynx (disorder)"
                },
                {
                    "Code": "764786007",
                    "Score": Float,
                    "Description": "Oropharyngeal (intended site)"
                },
                {
                    "Code": "418664002",
                    "Score": Float,
                    "Description": "Oropharyngeal route (qualifier value)"
                },
                {
                    "Code": "110162001",
                    "Score": Float,
                    "Description": "Abrasion of oropharynx (disorder)"
                }
            ],
            "Score": Float,
            "Attributes": [
                {
                    "Category": "ANATOMY",
                    "RelationshipScore": Float,
                    "EndOffset": 5,
                    "Text": "HEENT",
                    "Traits": [],
                    "SNOMEDCTConcepts": [
                        {
                            "Code": "69536005",
                            "Score": Float,
                            "Description": "Head structure (body structure)"
                        },
                        {
                            "Code": "429031000124106",
                            "Score": Float,
                            "Description": "Review of systems, head, ear, eyes, nose and throat (procedure)"
                        },
                        {
                            "Code": "385383008",
                            "Score": Float,
                            "Description": "Ear, nose and throat structure (body structure)"
                        },
                        {
                            "Code": "64237003",
                            "Score": Float,
                            "Description": "Structure of left half of head (body structure)"
                        },
                        {
                            "Code": "113028003",
                            "Score": Float,
                            "Description": "Ear, nose and throat examination (procedure)"
                        }
                    ],
                    "Score": Float,
                    "RelationshipType": "SYSTEM_ORGAN_SITE",
                    "Type": "SYSTEM_ORGAN_SITE",
                    "Id": 0,
                    "BeginOffset": 0
                }
            ],
            "Type": "DX_NAME",
            "Id": 4
        }
    ],
    "SNOMEDCTDetails": {
        "Edition": "US",
        "VersionDate": "20200901",
        "Language": "en"
    },
    "Characters": {
        "OriginalTextCharacters": 59
    },
    "ModelVersion": "3.3.0.20220301"
}
```
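
As a rough illustration of consuming a response shaped like the one above, the following Python sketch picks the highest-scoring SNOMED CT concept for each detected entity. It assumes the response has already been parsed into a dictionary with a top-level `Entities` list. The `top_concepts` helper, the entity text, and the scores in the sample are hypothetical and not part of the service.

```python
def top_concepts(response):
    """Return (entity text, best concept code, description) for each linked entity."""
    results = []
    for entity in response.get("Entities", []):
        concepts = entity.get("SNOMEDCTConcepts", [])
        if not concepts:
            continue  # entity was detected but has no linked concepts
        best = max(concepts, key=lambda concept: concept["Score"])
        results.append((entity["Text"], best["Code"], best["Description"]))
    return results


# Hypothetical, trimmed-down response. Real text and scores come from the service.
sample = {
    "Entities": [
        {
            "Text": "example entity text",
            "SNOMEDCTConcepts": [
                {"Code": "110162001", "Score": 0.4,
                 "Description": "Abrasion of oropharynx (disorder)"},
                {"Code": "418664002", "Score": 0.9,
                 "Description": "Oropharyngeal route (qualifier value)"},
            ],
        },
        {"Text": "unlinked entity", "SNOMEDCTConcepts": []},
    ]
}

for text, code, description in top_concepts(sample):
    print(f"{text}: {code} ({description})")
```

Whether the single highest score is good enough depends on the use case. A coding workflow might instead keep every concept above a chosen threshold for human review.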

# Ontology linking batch analysis
<a name="ontologies-batchapi"></a>

Use Amazon Comprehend Medical to detect entities in clinical text stored in an Amazon Simple Storage Service (Amazon S3) bucket and to link those entities to standardized ontologies. You can use ontology linking batch analysis to analyze either a collection of documents or a single document with up to 20,000 characters. By using either the console or the ontology linking batch API operations, you can start, stop, list, and describe batch analysis jobs.

 For pricing information for batch analysis and other Amazon Comprehend Medical operations, see [Amazon Comprehend Medical Pricing](https://aws.amazon.com/comprehend/medical/pricing/).

## Performing batch analysis
<a name="performing-batch-analysis-ontology-linking"></a>

You can run a batch analysis job using either the Amazon Comprehend Medical console or the Amazon Comprehend Medical batch API operations.

### Performing batch analysis using the API operations
<a name="batch-api-ontology-linking"></a>

**Prerequisites**

When you use the Amazon Comprehend Medical API, create an AWS Identity and Access Management (IAM) policy and attach it to an IAM role. To learn more about IAM roles and trust policies, see [IAM Policies and Permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html).

1. Upload your data into an S3 bucket.

1. To start a new analysis job, use the **StartICD10CMInferenceJob**, **StartSNOMEDCTInferenceJob**, or **StartRxNormInferenceJob** operation. Provide the name of the Amazon S3 bucket that contains the input files and the name of the Amazon S3 bucket where you want to send the output files.

1. Monitor the progress of the job by using the **DescribeICD10CMInferenceJob**, **DescribeSNOMEDCTInferenceJob**, or **DescribeRxNormInferenceJob** operation. Additionally, you can use **ListICD10CMInferenceJobs**, **ListSNOMEDCTInferenceJobs**, and **ListRxNormInferenceJobs** to see the status of all ontology linking batch analysis jobs.

1. If you need to stop a job in progress, use **StopICD10CMInferenceJob**, **StopSNOMEDCTInferenceJob**, or **StopRxNormInferenceJob** to stop analysis.

1. To view the results of your analysis job, see the output S3 bucket that you configured when you started the job.
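
The steps above can be sketched in Python roughly as follows, here for ICD-10-CM. This is a minimal sketch, not a definitive implementation: it assumes a boto3 `comprehendmedical` client is passed in, and the `run_icd10cm_job` helper, the polling interval, and the choice to stop the job after `max_polls` checks are illustrative assumptions.

```python
import time

# Job states after which polling can stop.
TERMINAL_STATES = {"COMPLETED", "PARTIAL_SUCCESS", "FAILED", "STOPPED"}


def run_icd10cm_job(client, input_bucket, output_bucket, role_arn,
                    poll_seconds=30, max_polls=120):
    """Start an ICD-10-CM batch job and poll until it reaches a terminal state."""
    job_id = client.start_icd10_cm_inference_job(
        InputDataConfig={"S3Bucket": input_bucket},
        OutputDataConfig={"S3Bucket": output_bucket},
        DataAccessRoleArn=role_arn,
        LanguageCode="en",
    )["JobId"]

    for _ in range(max_polls):
        props = client.describe_icd10_cm_inference_job(JobId=job_id)[
            "ComprehendMedicalAsyncJobProperties"
        ]
        if props["JobStatus"] in TERMINAL_STATES:
            return job_id, props["JobStatus"]
        time.sleep(poll_seconds)

    # Assumption for this sketch: give up waiting and ask the service to stop the job.
    client.stop_icd10_cm_inference_job(JobId=job_id)
    return job_id, "STOP_REQUESTED"
```

With boto3 installed and credentials configured, `client` would typically be created with `boto3.client("comprehendmedical")`. The same pattern should apply to SNOMED CT and RxNorm by swapping in the `start_snomedct_inference_job` or `start_rx_norm_inference_job` family of methods.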

### Performing batch analysis using the console
<a name="batch-api-ontology-linking-console"></a>

1. Upload your data into an S3 bucket.

1. To start a new analysis job, select the type of analysis you will be performing. Then, provide the name of the S3 bucket that contains the input files and the name of the S3 bucket where you want to send the output files.

1. Monitor the status of your job while it is ongoing. From the console, you can view all batch analysis operations and their status, including when analysis was started and ended.

1. To see the results of your analysis job, see the output S3 bucket that you configured when you started the job. 

## IAM policies for batch operations
<a name="batch-iam-ontology-linking"></a>

The IAM role that calls the Amazon Comprehend Medical batch API operations must have a policy that grants access to the S3 buckets that contain the input and output files. The IAM role must also be assigned a trust relationship so that the Amazon Comprehend Medical service can assume the role. To learn more about IAM roles and trust policies, see [IAM Roles](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html).

The role must have the following policy:

```
{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::input-bucket/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::input-bucket",
                "arn:aws:s3:::output-bucket"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::output-bucket/*"
            ],
            "Effect": "Allow"
        }
    ]
}
```
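
If the bucket names vary between environments, the policy above could be generated rather than edited by hand. The following Python sketch builds the same three statements from two bucket names. The `build_s3_access_policy` helper is a hypothetical name, and `input-bucket` and `output-bucket` are the placeholders from the policy above.

```python
import json


def build_s3_access_policy(input_bucket, output_bucket):
    """Build the S3 permissions policy shown above for the given bucket names."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # read the input documents
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{input_bucket}/*"],
            },
            {   # list both buckets
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{input_bucket}",
                    f"arn:aws:s3:::{output_bucket}",
                ],
            },
            {   # write the output files
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{output_bucket}/*"],
            },
        ],
    }


print(json.dumps(build_s3_access_policy("input-bucket", "output-bucket"), indent=4))
```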


The role must have the following trust relationship. We recommend that you use the `aws:SourceAccount` and `aws:SourceArn` condition keys to prevent the confused deputy security issue. To learn more about the confused deputy problem and how to protect your AWS account, see [The confused deputy problem](https://docs.aws.amazon.com/IAM/latest/UserGuide/confused-deputy.html) in the IAM documentation.

```
{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Principal":{
            "Service":[
               "comprehendmedical.amazonaws.com"
            ]
         },
         "Action":"sts:AssumeRole",
         "Condition": {
            "StringEquals": {
               "aws:SourceAccount": "account_id"
            },
            "ArnLike": {
               "aws:SourceArn": "arn:aws:comprehendmedical:us-east-1:account_id:*"
            }
         }
      }
   ]
}
```
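
The `account_id` placeholder appears in both condition keys, so it can help to fill it in programmatically. The following Python sketch builds the trust relationship above for a given account ID and Region. The `build_trust_policy` helper is a hypothetical name, and `111122223333` is a placeholder account ID, not a real account.

```python
import json


def build_trust_policy(account_id, region="us-east-1"):
    """Build the trust relationship shown above, scoped to one account and Region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": ["comprehendmedical.amazonaws.com"]},
                "Action": "sts:AssumeRole",
                "Condition": {
                    # Both keys narrow which requests can assume the role.
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn":
                            f"arn:aws:comprehendmedical:{region}:{account_id}:*"
                    },
                },
            }
        ],
    }


print(json.dumps(build_trust_policy("111122223333"), indent=3))
```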


## Batch analysis output files
<a name="batch-ouput-ontology-linking"></a>

Amazon Comprehend Medical creates one output file for each input file in the batch. The file has the extension `.out`. Amazon Comprehend Medical first creates a directory in the output S3 bucket using the *AwsAccountId*-*JobType*-*JobId* as the name, and then it writes all of the output files for the batch to this directory. Amazon Comprehend Medical creates this new directory so that output from one job doesn't overwrite the output of another job.

A batch operation produces the same output as a synchronous operation.

Each batch operation produces the following three manifest files that contain information about the job:
+ `Manifest` – Summarizes the job. Provides information about the parameters used for the job, the total size of the job, and the number of files processed.
+ `Success` – Provides information about the files that were successfully processed. Includes the input and output file name and the size of the input file.
+ `Unprocessed` – Lists the files that the batch job did not process, with an error code and error message for each file.

Amazon Comprehend Medical writes the files to the output directory that you specified for the batch job. The summary manifest file is written to the output folder, along with a folder titled `Manifest_AccountId-Operation-JobId`. The manifest folder contains the `success` folder, which holds the success manifest, and the `failed` folder, which holds the unprocessed file manifest. The following sections show the structure of the manifest files.

### Batch manifest file
<a name="batch-manifest-ontology-linking"></a>

The following is the JSON structure of the batch manifest file.

```
{
    "Summary" : {
        "Status" : "COMPLETED | FAILED | PARTIAL_SUCCESS | STOPPED",
        "JobType" : "ICD10CMInference | RxNormInference | SNOMEDCTInference",
        "InputDataConfiguration" : {
            "Bucket" : "input bucket",
            "Path" : "path to files/account ID-job type-job ID"
        },
        "OutputDataConfiguration" : {
            "Bucket" : "output bucket",
            "Path" : "path to files"
        },
        "InputFileCount" : number of files in input bucket,
        "TotalMeteredCharacters" : total characters processed from all files,
        "UnprocessedFilesCount" : number of files not processed,
        "SuccessFilesCount" : total number of files processed,
        "TotalDurationSeconds" : time required for processing,
        "SuccessfulFilesListLocation" : "path to file",
        "UnprocessedFilesListLocation" : "path to file",
        "FailedJobErrorMessage" : "error message or, if not applicable, The status of the job is completed"
    }
}
```
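
As one way to use this file, the following Python sketch reads a batch manifest and reports whether the job needs follow-up. It assumes the manifest has already been downloaded as text and that the actual file is valid JSON with the fields shown above. The `summarize_manifest` helper and all sample values are hypothetical.

```python
import json


def summarize_manifest(manifest_text):
    """Condense the batch manifest into a few fields worth checking after a job."""
    summary = json.loads(manifest_text)["Summary"]
    total = summary["InputFileCount"]
    succeeded = summary["SuccessFilesCount"]
    return {
        "status": summary["Status"],
        "succeeded": succeeded,
        "unprocessed": summary["UnprocessedFilesCount"],
        "success_ratio": succeeded / total if total else 0.0,
        # Anything other than COMPLETED is worth a closer look.
        "needs_review": summary["Status"] != "COMPLETED",
    }


# Hypothetical manifest contents, trimmed to the fields used above.
sample_manifest = json.dumps({
    "Summary": {
        "Status": "PARTIAL_SUCCESS",
        "JobType": "ICD10CMInference",
        "InputFileCount": 4,
        "SuccessFilesCount": 3,
        "UnprocessedFilesCount": 1,
    }
})

print(summarize_manifest(sample_manifest))
```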

### Success manifest file
<a name="batch-success-ontology-linking"></a>

The following is the JSON structure of the file that contains information about successfully processed files.

```
{
    "Files": [{
            "Input": "input path/input file name",
            "Output": "output path/output file name",
            "InputSize": size in bytes of input file
        },
        {
            "Input": "input path/input file name",
            "Output": "output path/output file name",
            "InputSize": size in bytes of input file
     }]
}
```
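
A common use of the success manifest is to find the output file for a given input file. The following Python sketch builds that lookup. It assumes the manifest text is valid JSON in the structure above. The `outputs_by_input` helper and the sample paths are hypothetical.

```python
import json


def outputs_by_input(success_manifest_text):
    """Map each successfully processed input file to its output file."""
    files = json.loads(success_manifest_text)["Files"]
    return {entry["Input"]: entry["Output"] for entry in files}


# Hypothetical manifest contents. Real paths depend on your buckets and job.
sample_success = json.dumps({
    "Files": [
        {"Input": "notes/a.txt", "Output": "results/a.txt.out", "InputSize": 120},
        {"Input": "notes/b.txt", "Output": "results/b.txt.out", "InputSize": 98},
    ]
})

lookup = outputs_by_input(sample_success)
print(lookup["notes/a.txt"])
```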

### Unprocessed manifest file
<a name="batch-unprocessed-ontology-linking"></a>

The following is the JSON structure of the manifest file that contains information about unprocessed files.

```
{
  "Files" : [ {
      "Input": "file_name_that_failed",
      "ErrorCode": "error code for exception",
      "ErrorMessage": "explanation of the error code and suggestions"
  }, 
  { ...}
  ]
}
```
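
When many files fail, grouping them by error code can show whether one cause accounts for most failures. The following Python sketch does that for an unprocessed manifest in the structure above. The `group_failures` helper, the file names, and the error codes in the sample are made up for illustration and are not actual service error codes.

```python
import json
from collections import defaultdict


def group_failures(unprocessed_manifest_text):
    """Group unprocessed input files by the error code reported for them."""
    grouped = defaultdict(list)
    for entry in json.loads(unprocessed_manifest_text)["Files"]:
        grouped[entry["ErrorCode"]].append(entry["Input"])
    return dict(grouped)


# Hypothetical manifest contents with placeholder error codes.
sample_unprocessed = json.dumps({
    "Files": [
        {"Input": "notes/c.txt", "ErrorCode": "EXAMPLE_ERROR_A", "ErrorMessage": "..."},
        {"Input": "notes/d.txt", "ErrorCode": "EXAMPLE_ERROR_B", "ErrorMessage": "..."},
        {"Input": "notes/e.txt", "ErrorCode": "EXAMPLE_ERROR_A", "ErrorMessage": "..."},
    ]
})

for code, inputs in group_failures(sample_unprocessed).items():
    print(code, inputs)
```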