

# Querying HealthLake data with Amazon Athena
SQL index and query

During a HealthLake import job, nested FHIR JSON data undergoes an ETL process and is stored in [Apache Iceberg open table format](https://iceberg.apache.org/), where each FHIR resource type is represented as an individual table in Athena. This enables users to query the FHIR data using SQL, but without having to export it first. This is valuable, as it empowers clinicians and scientists to query FHIR data to validate their decisions or advance their research. For more information about how Apache Iceberg tables function in Athena, see [Query Apache Iceberg tables](https://docs.aws.amazon.com/athena/latest/ug/querying-iceberg.html) in the *Athena User Guide*.

**Note**  
HealthLake supports FHIR R4 `read` interaction on your HealthLake data in Athena. For more information, see [Reading a FHIR resource](managing-fhir-resources-read.md).

The topics in this section describe how to connect your HealthLake data store to Athena, how to query it using SQL, and how to connect results with other AWS services for further analysis.

**Topics**
+ [Getting started](integrating-athena-getting-started.md)
+ [Querying with SQL](integrating-athena-query-sql.md)
+ [Example queries](integrating-athena-complex-filtering.md)

# Getting started with Amazon Athena
Getting started

To integrate HealthLake with Amazon Athena, you must set up permissions. To do this, you'll create an Athena user, group or role, and grant them access to FHIR resources located within a HealthLake data store.
+ [Granting a user, group, or role access to a HealthLake data store (AWS Lake Formation Console)](#getting-started-athena-admin)
+ [Setting up an Athena account](#getting-started-athena-user)

## Granting a user, group, or role access to a HealthLake data store (AWS Lake Formation Console)
Granting access

**Persona: HealthLake administrator**  
The HealthLake administrator persona is a data lake administrator in AWS Lake Formation. They grant access to HealthLake data stores in Lake Formation.

For each data store created, there are two entries visible in the AWS Lake Formation console. One entry is a *resource link*. Resource link names are always displayed in *italics*. Each resource link is displayed with the name and owner of its linked shared resource. For all HealthLake data stores, the shared resource owner is the HealthLake service account. The other entry is the HealthLake data store in the HealthLake service account. The steps in this procedure use the data store that is the resource link.

To learn more about resource links, see [How resource links work in Lake Formation](https://docs.aws.amazon.com/lake-formation/latest/dg/resource-links-about.html) in the *AWS Lake Formation Developer Guide*.

For a user, group, or role to be able to query data in Athena, you must grant **Describe** permission on the resource database. Then, you must grant **Select** and **Describe** on the tables.

**STEP 1: To grant **DESCRIBE** permissions on a HealthLake data store resource link database**

1. Open the AWS Lake Formation console: [https://console.aws.amazon.com/lakeformation/](https://console.aws.amazon.com//lakeformation)

1. In the primary navigation bar, choose **Databases**.

1. On the **Databases** page, choose the radio button next to the name of the data store that is in italics.

1. Choose **Actions (▼)**.

1. Choose **Grant**.

1. On the **Grant data permissions** page, under **Principals**, choose **IAM users or roles**.

1. Under **IAM users or roles**, use the **down arrow (▼)**, or search for the IAM user, role, or group that you want to be able to make queries on in Athena.

1. Under **LF-Tags or catalog resources** card, choose the **Named data catalog resources** option.

1. Under **Databases**, use the **down arrow (▼)** to choose the HealthLake data store database that you want to share access to.

1. In the **Resource link permissions** card, under **Resource link permissions**, choose **Describe**.

When the grant is successful, the **Grant permission success** banner appears. To view the permission you just granted, choose **Data lake permissions**. Find the user, group, and role in the table. Under the **Permissions** column, you will see **Describe** listed.

Now you must use **Grant on target** to grant **Select** and **Describe** on all tables in the database.

**STEP 2: Grant access to all tables in a HealthLake data store resource link**

1. Open the AWS Lake Formation console: [https://console.aws.amazon.com/lakeformation/](https://console.aws.amazon.com//lakeformation)

1. In the primary navigation bar, choose **Databases**.

1. On the **Databases** page, choose the radio button next to the name of the data store that is in italics.

1. Choose **Actions (▼)**.

1. Choose **Grant on target**.

1. On the **Grant data permissions** page, under **Principals**, choose **IAM users or roles**.

1. Under **IAM users or roles**, use the **down arrow (▼)** or search for the IAM user, group, or role that you want to be able to make queries on in Athena.

1. Under **LF-Tags or catalog resources** card, choose the **Named data catalog resources** option.

1. Under **Databases**, use the **down arrow (▼)** to choose the HealthLake data store database that you want to grant access to.

1. Under **Tables**, choose **All tables** to share all tables with a HealthLake user.

1. In the **Table permissions** card, under **Table permissions**, choose **Describe** and **Select**.

1. Choose **Grant**.

After choosing grant,a **Grant permissions success** banner appears. The specified user can now make queries on a HealthLake data store in Athena.

## Getting started with Athena


**HealthLake user**  
The HealthLake user will use the Athena console, AWS CLI, or AWS SDKs to query a HealthLake data store shared with them by the HealthLake administrator.

To query a data store using Athena, you must do the following three things.
+ Grant the IAM user or role access to the HealthLake data store via Lake Formation. To learn more, see [Granting a user, group, or role access to a HealthLake data store (AWS Lake Formation Console)](#getting-started-athena-admin).
+ Create a workgroup for your HealthLake data store.
+ Designate an Amazon S3 bucket to store your query results.

To get started with Athena, add the **AmazonAthenaFullAccess** and **AmazonS3FullAccess** AWS managed policies to your user, group or role. Using an AWS managed policy is great way to get started using a new service. Keep in mind that AWS managed policies might not grant least-privilege permissions for your specific use cases because they are available for use by all AWS customers. When you set permissions with IAM policies, grant only the permissions required to perform a task. To learn more about IAM and applying least-privilege, see [Apply least-privilege permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#bp-use-aws-defined-policies) in the *IAM User Guide*.

**Important**  
To query a HealthLake data store in Athena, you must use **Athena engine version 3**.

Workgroups are resources, and therefore you can use IAM-based policies to control access to specific workgroups. To learn more, see [Using workgroups to control query access and costs](https://docs.aws.amazon.com/athena/latest/ug/manage-queries-control-costs-with-workgroups.html) in the *Athena User Guide*.

To learn more about setting up workgroups, see [https://docs.aws.amazon.com/athena/latest/ug/workgroups-procedure.html](https://docs.aws.amazon.com/athena/latest/ug/workgroups-procedure.html) in the *Athena User Guide*.

**Note**  
The region your Amazon S3 bucket is in and the Athena console must match.

Before you can run a query, a query result bucket location in Amazon S3 must be specified, or you must use a workgroup that has specified a bucket and whose configuration overrides client settings. Output files are saved automatically for every query that runs.

For more details on specifying query result locations in the Athena console, see [Specifying a query result location using the Athena console](https://docs.aws.amazon.com/athena/latest/ug/querying.html#query-results-specify-location-console) in the *Amazon Athena User Guide*.

To see examples of how to query your HealthLake data store in Athena, see [Querying HealthLake data with SQL](integrating-athena-query-sql.md).

# Querying HealthLake data with SQL
Querying with SQL

When you import your FHIR data into HealthLake data store, the nested JSON FHIR data simultaneously undergoes an ETL process and is stored in Apache Iceberg open table format in Amazon S3. Each FHIR resource type from your HealthLake data store is converted into a table, where it can be queried using Amazon Athena. The tables can be queried individually or as group using SQL-based queries. Because of the structure of data stores, your data is imported into Athena as multiple different data types. To learn more about creating SQL queries that can access these data types, see [Query arrays with complex types and nested structures](https://docs.aws.amazon.com/athena/latest/ug/rows-and-structs.html) in the *Amazon Athena User Guide*.

**Note**  
All examples in this topic use fictionalized data created using Synthea. To learn more about creating a data store preloaded with Synthea data, see [Creating a HealthLake data store](managing-data-stores-create.md).

For each element in a resource type, the FHIR specification defines a cardinality. The cardinality of an element defines the lower and upper bounds of how many times this element can appear. When constructing a SQL query, you must take this into account. For example, let's look at some elements in [Resource type: Patient](https://hl7.org/fhir/R4/patient.html).
+ **Element: Name** The FHIR specification sets the cardinality as `0..*`.

  The element is captured as an array.

  ```
  [{
  	id = null,
  	extension = null,
  	use = official,
  	_use = null,
  	text = null,
  	_text = null,
  	family = Wolf938,
  	_family = null,
  	given = [Noel608],
  	_given = null,
  	prefix = null,
  	_prefix = null,
  	suffix = null,
  	_suffix = null,
  	period = null
  }]
  ```

  In Athena, to see how a resource type has been ingested, search for it under **Tables and views**. To access elements in this array, you can use dot notation. Here's a simple example that would access the values for `given` and `family`.

  ```
  SELECT
      name[1].given as FirstName,
      name[1].family as LastName
  FROM Patient
  ```
+ **Element: MaritalStatus** The FHIR specification sets the cardinality as `0..1`.

  This element is captured as JSON.

  ```
  {
  	id = null,
  	extension = null,
  	coding = [
  		{
  			id = null,
  			extension = null,
  			system = http: //terminology.hl7.org/CodeSystem/v3-MaritalStatus,
  				_system = null,
  			version = null,
  			_version = null,
  			code = S,
  			_code = null,
  			display = Never Married,
  			_display = null,
  			userSelected = null,
  			_userSelected = null
  		}
  
  	],
  	text = Never Married,
  	_text = null
  }
  ```

  In Athena, to see how a resource type has been ingested, search for it under **Tables and views**. To access key-value pairs in the JSON, you can use dot notation. Because it isn't an array, no array index is required. Here's a simple example that would access the value for `text`.

  ```
  SELECT
      maritalstatus.text as MaritalStatus
  FROM Patient
  ```

To learn more about accessing and searching JSON, see [Querying JSON](https://docs.aws.amazon.com//athena/latest/ug/querying-JSON.html) in the *Athena User Guide*.

Athena Data Manipulation Language (DML) query statements are based on Trino. Athena does not support all of Trino's features, and there are *significant* differences. To learn more, see [DML queries, functions, and operators](https://docs.aws.amazon.com/athena/latest/ug/functions-operators-reference-section.html) in the *Amazon Athena User Guide*.

Furthermore, Athena supports multiple data types that you may encounter when creating queries of your HealthLake data store. To learn more about data types in Athena, see [Data types in Amazon Athena](https://docs.aws.amazon.com/athena/latest/ug/data-types.html) in the *Amazon Athena User Guide*.

To learn more about how SQL queries work in Athena, see [SQL reference for Amazon Athena](https://docs.aws.amazon.com/athena/latest/ug/ddl-sql-reference.html) in the *Amazon Athena User Guide*.

Each tab shows examples of how to search on the specified resource types and associated elements using Athena.

------
#### [ Element: Extension ]

The element `extension` is used to create custom fields in a data store.

This example shows you how to access the features of the `extension` element found in the `Patient` resource type.

When your HealthLake data store is imported into Athena, the elements of a resource type are parsed differently. Because the structure of the `element` is variable, it cannot be fully specified in the schema. To handle that variability, the elements inside the array are passed as strings.

In the table description of `Patient`, you can see the element `extension` described as `array<string>`, which means you can access the elements of array by using an index value. To access the elements of the string, however, you must use `json_extract`.

Here is a single entry from the `extension` element found in the patient table.

```
[{
		"valueString": "Kerry175 Cummerata161",
		"url": "http://hl7.org/fhir/StructureDefinition/patient-mothersMaidenName"
	},
	{
		"valueAddress": {
			"country": "DE",
			"city": "Hamburg",
			"state": "Hamburg"
		},
		"url": "http://hl7.org/fhir/StructureDefinition/patient-birthPlace"
	},
	{
		"valueDecimal": 0.0,
		"url": "http://synthetichealth.github.io/synthea/disability-adjusted-life-years"
	},
	{
		"valueDecimal": 5.0,
		"url": "http://synthetichealth.github.io/synthea/quality-adjusted-life-years"
	}
]
```

Even though this is valid JSON, Athena treats it as a string.

This SQL query example demonstrates how you can create a table that contains the `patient-mothersMaidenName` and `patient-birthPlace` elements. To access these elements, you need to use different array indices and `json_extract.`

```
SELECT
    extension[1],
    json_extract(extension[1], '$.valueString') AS MothersMaidenName,
    extension[2],
    json_extract(extension[2], '$.valueAddress.city') AS birthPlace
FROM patient
```

To learn more about queries that involve JSON, see [Extracting data from JSON](https://docs.aws.amazon.com/athena/latest/ug/extracting-data-from-JSON.html) in the *Amazon Athena User Guide*.

------
#### [ Element: birthDate (Age) ]

Age is *not* an element of the Patient resource type in FHIR. Here are two examples for searches that filter based on age.

Because age is not an element, we use the `birthDate` for the SQL queries. To see how an element has been ingested into FHIR, search for the table name under **Tables and views**. You can see that it is of type **string**.

**Example 1**: Calculating a value for age

In this sample SQL query, we use a built-in SQL tool, `current_date` and `year` to extract those components. Then, we subtract them to return a patient's actual age as a column called `age`.

```
SELECT
	(year(current_date) - year(date(birthdate))) as age
FROM patient
```

**Example 2**: Filtering for patients who are born before `2019-01-01` and are `male`.

The SQL query shows you how to use the `CAST` function to cast the `birthDate` element as type `DATE`, and how to filter based on two criteria in the `WHERE` clause. Because the element is ingested as type **string** by default, we must `CAST` it as type `DATE`. Then you can use the `<` operator to compare it to a different date, `2019-01-01`. By using `AND`, you can add a second criteria to the `WHERE` clause.

```
SELECT birthdate
FROM patient
-- we convert birthdate (varchar) to date  > cast that as date too
WHERE CAST(birthdate AS DATE) < CAST('2019-01-01' AS DATE) AND gender = 'male'
```

------
#### [ Resource type: Location ]

This example shows searches for locations within the Location resource type where the city name is Attleboro.

```
SELECT *
FROM Location
WHERE address.city='ATTLEBORO'
LIMIT 10;
```

------
#### [ Element: Age ]

```
SELECT birthdate
FROM patient
-- we convert birthdate (varchar) to date  > cast that as date too
WHERE CAST(birthdate AS DATE) < CAST('2019-01-01' AS DATE) AND gender = 'male'
```

------
#### [ Resource type: Condition ]

The resource type condition stores diagnosis data related to issues that have risen to a level of concern. HealthLake's integrated medical natural language processing (NLP) generates *new* `Condition` resources based on details found in the DocumentReference resource type. When new resource are generated, HealthLake appends the tag `SYSTEM_GENERATED` to the `meta` element. This sample SQL query demonstrates how you can search the condition table and return results where the `SYSTEM_GENERATED` results have been removed.

To learn more about HealthLake's integrated natural language processing (NLP), see [Integrated natural language processing (NLP) for HealthLake](integrating-nlp.md).

```
SELECT *
FROM condition
WHERE meta.tag[1] is NULL
```

You can also search within a specified string element to filter your query further. The `modifierextension` element contains details about which `DocumentReference` resource was used to generate a set of conditions. Again, you must use `json_extract` to access the nested JSON elements that are brought into Athena as a string.

This sample SQL query demonstrates how you can search for all the `Condition` that has been generated based off of a specific `DocumentReference`. Use `CAST` to set the JSON element as a string so that you can use `LIKE` to compare.

```
SELECT
    meta.tag[1].display as SystemGenerated,
    json_extract(modifierextension[4], '$.valueReference.reference') as DocumentReference
FROM condition
WHERE meta.tag[1].display = 'SYSTEM_GENERATED'

AND CAST(json_extract(modifierextension[4], '$.valueReference.reference') as VARCHAR) LIKE '%DocumentReference/67aa0278-8111-40d0-8adc-43055eb9d18d%'
```

------
#### [ Resource type: Observation ]

The resource type, Observation stores measurements and simple assertions made about a patient, device, or other subject. HealthLake's integrated natural language processing (NLP) generates *new* `Observation` resources based on details found in a `DocumentReference` resource. This sample SQL query includes `WHERE meta.tag[1] is NULL` commented out, which means that the `SYSTEM_GENERATED` results are included.

```
SELECT valueCodeableConcept.coding[1].code
FROM Observation
WHERE  valueCodeableConcept.coding[1].code = '266919005'
-- WHERE meta.tag[1] is NULL
```

This column was imported as an [https://iceberg.apache.org/spec/#schemas-and-data-types](https://iceberg.apache.org/spec/#schemas-and-data-types). Therefore, you can access elements inside it using dot notation.

------
#### [ Resource type: MedicationStatement ]

MedicationStatement is a FHIR resource type that you can use to store details about medications a patient has taken, is taking, or will take in the future. HealthLake's integrated medical natural language processing (NLP) generates new MedicationStatement resources based on documents found in the DocumentReference resource type. When new resources are generated, HealthLake appends the tag `SYSTEM_GENERATED` to the `meta` element. This sample SQL query demonstrates how to create a query that filters based off of a single patient by using their identifier and finds resources that have been added by HealthLake's integrated NLP.

```
SELECT *
FROM medicationstatement
WHERE meta.tag[1].display = 'SYSTEM_GENERATED' AND subject.reference = 'Patient/0679b7b7-937d-488a-b48d-6315b8e7003b';
```

To learn more about HealthLake's integrated natural language processing (NLP), see [Integrated natural language processing (NLP) for HealthLake](integrating-nlp.md).

------

# Example SQL queries with complex filtering
Example queries

The following examples demonstrate how to use Amazon Athena SQL queries with complex filtering to locate FHIR data from a HealthLake data store.

**Example Create filtering criteria based on demographic data**  
Identifying the correct patient demographics is important when creating a patient cohort. This sample query demonstrates how you can use Trino dot notation and `json_extract` to filter data in your HealthLake data store.  

```
SELECT
    id
    , CONCAT(name[1].family, ' ', name[1].given[1]) as name
    , (year(current_date) - year(date(birthdate))) as age
    , gender as gender
    , json_extract(extension[1], '$.valueString') as MothersMaidenName
    , json_extract(extension[2], '$.valueAddress.city') as birthPlace
    , maritalstatus.coding[1].display as maritalstatus
    , address[1].line[1] as addressline
    , address[1].city as city
    , address[1].district as district
    , address[1].state as state
    , address[1].postalcode as postalcode
    , address[1].country as country
    , json_extract(address[1].extension[1], '$.extension[0].valueDecimal') as latitude
    , json_extract(address[1].extension[1], '$.extension[1].valueDecimal') as longitude
    , telecom[1].value as telNumber
    , deceasedboolean as deceasedIndicator
    , deceaseddatetime
FROM database.patient;
```
Using the Athena Console, you can further sort and download the results.

**Example Create filters for a patient and their related conditions**  
The following example query demonstrates how you can find and sort all the related conditions for the patients found in a HealthLake data store.  

```
SELECT
	patient.id as patientId
    , condition.id  as conditionId
    , CONCAT(name[1].family, ' ', name[1].given[1]) as name
    , condition.meta.tag[1].display
    , json_extract(condition.modifierextension[1], '$.valueDecimal') AS confidenceScore
    , category[1].coding[1].code as categoryCode
    , category[1].coding[1].display as categoryDescription
    , code.coding[1].code as diagnosisCode
    , code.coding[1].display as diagnosisDescription
    , onsetdatetime
    , severity.coding[1].code as severityCode
    , severity.coding[1].display as severityDescription
    , verificationstatus.coding[1].display as verificationStatus
    , clinicalstatus.coding[1].display as clinicalStatus
    , encounter.reference as encounterId
    , encounter.type as encountertype
FROM database.patient, condition
WHERE CONCAT('Patient/', patient.id) = condition.subject.reference
ORDER BY name;
```
You can use the Athena console to further sort the results or download them for further analysis.

**Example Create filters for patients and their related observations**  
The following example query demonstrates how to find and sort all related observations for patients found in a HealthLake data store.  

```
SELECT
	patient.id as patientId
    , observation.id as observationId
    , CONCAT(name[1].family, ' ', name[1].given[1]) as name
    , meta.tag[1].display
    , json_extract(modifierextension[1], '$.valueDecimal') AS confidenceScore
    , status
    , category[1].coding[1].code as categoryCode
    , category[1].coding[1].display as categoryDescription
    , code.coding[1].code as observationCode
    , code.coding[1].display as observationDescription
    , effectivedatetime
    , CASE
		WHEN valuequantity.value IS NOT NULL THEN CONCAT(CAST(valuequantity.value AS VARCHAR),' ',valuequantity.unit)
      	WHEN valueCodeableConcept.coding [ 1 ].code IS NOT NULL THEN CAST(valueCodeableConcept.coding [ 1 ].code AS VARCHAR)
      	WHEN valuestring IS NOT NULL THEN CAST(valuestring AS VARCHAR)
      	WHEN valueboolean IS NOT NULL THEN CAST(valueboolean AS VARCHAR)
      	WHEN valueinteger IS NOT NULL THEN CAST(valueinteger AS VARCHAR)
      	WHEN valueratio IS NOT NULL THEN CONCAT(CAST(valueratio.numerator.value AS VARCHAR),'/',CAST(valueratio.denominator.value AS VARCHAR))
      	WHEN valuerange IS NOT NULL THEN CONCAT(CAST(valuerange.low.value AS VARCHAR),'-',CAST(valuerange.high.value AS VARCHAR))
      	WHEN valueSampledData IS NOT NULL THEN CAST(valueSampledData.data AS VARCHAR)
      	WHEN valueTime IS NOT NULL THEN CAST(valueTime AS VARCHAR)
      	WHEN valueDateTime IS NOT NULL THEN CAST(valueDateTime AS VARCHAR)
      	WHEN valuePeriod IS NOT NULL THEN valuePeriod.start
      	WHEN component[1] IS NOT NULL THEN CONCAT(CAST(component[2].valuequantity.value AS VARCHAR),' ',CAST(component[2].valuequantity.unit AS VARCHAR), '/', CAST(component[1].valuequantity.value AS VARCHAR),' ',CAST(component[1].valuequantity.unit AS VARCHAR))
    END AS observationvalue
	, encounter.reference as encounterId
    , encounter.type as encountertype
FROM database.patient, observation
WHERE CONCAT('Patient/', patient.id) = observation.subject.reference
ORDER BY name;
```

**Example Create filtering conditions for a patient and their related procedures**  
Connecting procedures to patients is an important aspect of healthcare. The following SQL example query demonstrates how to use FHIR `Patient` and `Procedure` resource types to accomplish this. The following SQL query will return all patients and their related procedures found in your HealthLake data store.  

```
SELECT
	patient.id  as patientId
	, PROCEDURE.id as procedureId
	, CONCAT(name[1].family, ' ', name[1].given[1]) as name
	, status
	, category.coding[1].code as categoryCode
	, category.coding[1].display as categoryDescription
	, code.coding[1].code as procedureCode
	, code.coding[1].display as procedureDescription
	, performeddatetime
	, performer[1]
	, encounter.reference as encounterId
	, encounter.type as encountertype
FROM database.patient, procedure
WHERE CONCAT('Patient/', patient.id) = procedure.subject.reference
ORDER BY name;
```
You can use the Athena console to download the results for further analysis or sort them to better understand the results.

**Example Create filtering conditions for a patient and their related prescriptions**  
Seeing a current list of medications that patients are taking is important. Using Athena, you can write a SQL query that uses both the `Patient` and `MedicationRequest` resource types found in your HealthLake data store.  
The following SQL query joins the `Patient` and `MedicationRequest` tables imported into Athena. It also organizes the prescriptions into their individual entries by using dot notation.  

```
SELECT
	patient.id  as patientId
	, medicationrequest.id  as medicationrequestid
	, CONCAT(name[1].family, ' ', name[1].given[1]) as name
	, status
	, statusreason.coding[1].code as categoryCode
	, statusreason.coding[1].display as categoryDescription
	, category[1].coding[1].code as categoryCode
	, category[1].coding[1].display as categoryDescription
	, priority
	, donotperform
	, encounter.reference as encounterId
	, encounter.type as encountertype
	, medicationcodeableconcept.coding[1].code as medicationCode
	, medicationcodeableconcept.coding[1].display as medicationDescription
	, dosageinstruction[1].text as dosage
FROM database.patient, medicationrequest
WHERE CONCAT('Patient/', patient.id ) = medicationrequest.subject.reference
ORDER BY name
```
You can use the Athena console to sort the results or download them for further analysis.

**Example See medications found in the `MedicationStatement` resource type**  
The following example query shows you how to organize the nested JSON imported into Athena using SQL. The query uses the FHIR `meta` element to indicate when a medication has been added by HealthLake's integrated natural language processing (NLP). It also uses `json_extract` to search for data inside the array of JSON strings. For more information, see [Natural language processing](integrating-nlp.md).  

```
SELECT
	medicationcodeableconcept.coding[1].code as medicationCode
	, medicationcodeableconcept.coding[1].display as medicationDescription
	, meta.tag[1].display
	, json_extract(modifierextension[1], '$.valueDecimal') AS confidenceScore
FROM medicationstatement;
```
You can use the Athena console to download these results or sort them.

**Example Filter for a specific disease type**  
The example shows how you can find a group of patients, aged 18 to 75, who have been diagnosed with diabetes.  

```
SELECT patient.id as patientId,
	condition.id as conditionId,
	CONCAT(name [ 1 ].family, ' ', name [ 1 ].given [ 1 ]) as name,
	(year(current_date) - year(date(birthdate))) AS age,
	CASE
		WHEN condition.encounter.reference IS NOT NULL THEN condition.encounter.reference
		WHEN observation.encounter.reference IS NOT NULL THEN observation.encounter.reference
	END as encounterId,
	CASE
		WHEN condition.encounter.type IS NOT NULL THEN observation.encounter.type
		WHEN observation.encounter.type IS NOT NULL THEN observation.encounter.type
	END AS encountertype,
	condition.code.coding [ 1 ].code as diagnosisCode,
	condition.code.coding [ 1 ].display as diagnosisDescription,
	observation.category [ 1 ].coding [ 1 ].code as categoryCode,
	observation.category [ 1 ].coding [ 1 ].display as categoryDescription,
	observation.code.coding [ 1 ].code as observationCode,
	observation.code.coding [ 1 ].display as observationDescription,
	effectivedatetime AS observationDateTime,
	CASE
      WHEN valuequantity.value IS NOT NULL THEN CONCAT(CAST(valuequantity.value AS VARCHAR),' ',valuequantity.unit)
      WHEN valueCodeableConcept.coding [ 1 ].code IS NOT NULL THEN CAST(valueCodeableConcept.coding [ 1 ].code AS VARCHAR)
      WHEN valuestring IS NOT NULL THEN CAST(valuestring AS VARCHAR)
      WHEN valueboolean IS NOT NULL THEN CAST(valueboolean AS VARCHAR)
      WHEN valueinteger IS NOT NULL THEN CAST(valueinteger AS VARCHAR)
      WHEN valueratio IS NOT NULL THEN CONCAT(CAST(valueratio.numerator.value AS VARCHAR),'/',CAST(valueratio.denominator.value AS VARCHAR))
      WHEN valuerange IS NOT NULL THEN CONCAT(CAST(valuerange.low.value AS VARCHAR),'-',CAST(valuerange.high.value AS VARCHAR))
      WHEN valueSampledData IS NOT NULL THEN CAST(valueSampledData.data AS VARCHAR)
      WHEN valueTime IS NOT NULL THEN CAST(valueTime AS VARCHAR)
      WHEN valueDateTime IS NOT NULL THEN CAST(valueDateTime AS VARCHAR)
      WHEN valuePeriod IS NOT NULL THEN valuePeriod.start
      WHEN component[1] IS NOT NULL THEN CONCAT(CAST(component[2].valuequantity.value AS VARCHAR),' ',CAST(component[2].valuequantity.unit AS VARCHAR), '/', CAST(component[1].valuequantity.value AS VARCHAR),' ',CAST(component[1].valuequantity.unit AS VARCHAR))
    END AS observationvalue,
	CASE
		WHEN condition.meta.tag [ 1 ].display = 'SYSTEM GENERATED' THEN 'YES'
		WHEN condition.meta.tag [ 1 ].display IS NULL THEN 'NO'
		WHEN observation.meta.tag [ 1 ].display = 'SYSTEM GENERATED' THEN 'YES'
		WHEN observation.meta.tag [ 1 ].display IS NULL THEN 'NO'
  	END AS IsSystemGenerated,
  CAST(
    json_extract(
      condition.modifierextension [ 1 ],
      '$.valueDecimal'
    ) AS int
  ) AS confidenceScore
FROM database.patient,
	database.condition,
	database.observation
WHERE CONCAT('Patient/', patient.id) = condition.subject.reference
	AND CONCAT('Patient/', patient.id) = observation.subject.reference
  	AND (year(current_date) - year(date(birthdate))) >= 18
  	AND (year(current_date) - year(date(birthdate))) <= 75
  	AND condition.code.coding [ 1 ].display like ('%diabetes%');
```
Now you can use the Athena console to sort the results or download them for further analysis.