

# Custom vocabularies
<a name="custom-vocabulary"></a>

Use custom vocabularies to improve transcription accuracy for one or more specific words. These are generally domain-specific terms, such as brand names and acronyms, proper nouns, and words that Amazon Transcribe isn't rendering correctly.

Custom vocabularies can be used with all supported languages. Note that only the characters listed in your language's [character set](charsets.md) can be used in a custom vocabulary.

**Important**  
You are responsible for the integrity of your own data when you use Amazon Transcribe. Do not enter confidential information, personal information (PII), or protected health information (PHI) into a custom vocabulary.

Considerations when creating a custom vocabulary:
+ You can have up to 100 custom vocabulary files per AWS account
+ The size limit for each custom vocabulary file is 50 KB
+ If using the API to create your custom vocabulary, your vocabulary file must be in text (\*.txt) format. If using the AWS Management Console, your vocabulary file can be in text (\*.txt) format or comma-separated value (\*.csv) format.
+ Each entry within a custom vocabulary cannot exceed 256 characters
+ A custom vocabulary can only be used in the AWS Region where it was created, so create it in the same Region as your transcription.
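
The file-level limits above can be checked locally before you upload anything. The following is a minimal sketch, not an official validator: the function name `check_vocabulary_file` is hypothetical, "50 KB" is assumed to mean 50 * 1024 bytes, and an "entry" is assumed to be one tab- or comma-delimited cell. The per-account limit of 100 vocabularies is not checked here.

```python
from pathlib import PurePath

MAX_FILE_BYTES = 50 * 1024   # assumption: "50 KB" read as 50 * 1024 bytes
MAX_ENTRY_CHARS = 256


def check_vocabulary_file(filename, data, via_console=False):
    """Return a list of problems found in the file contents (empty if none)."""
    problems = []
    # The API accepts only .txt; the console also accepts .csv.
    allowed = {".txt", ".csv"} if via_console else {".txt"}
    suffix = PurePath(filename).suffix.lower()
    if suffix not in allowed:
        problems.append(f"unsupported file extension {suffix!r}")
    if len(data) > MAX_FILE_BYTES:
        problems.append(f"file is {len(data)} bytes; the limit is {MAX_FILE_BYTES}")
    for number, line in enumerate(data.decode("utf-8").splitlines(), start=1):
        # Assumption: an "entry" is one tab- or comma-delimited cell.
        for cell in line.replace("\t", ",").split(","):
            if len(cell) > MAX_ENTRY_CHARS:
                problems.append(
                    f"line {number}: entry longer than {MAX_ENTRY_CHARS} characters"
                )
    return problems
```

To use it, read your file in binary mode and pass the bytes, for example `check_vocabulary_file("my-vocabulary-file.txt", open("my-vocabulary-file.txt", "rb").read())`.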

**Tip**  
You can test your custom vocabulary using the AWS Management Console. Once your custom vocabulary is ready to use, sign in to the AWS Management Console, select **Real-time transcription**, scroll to **Customizations**, toggle on **Custom vocabulary**, and select your custom vocabulary from the dropdown list. Then select **Start streaming**. Speak some of the words in your custom vocabulary into your microphone to see if they render correctly.

## Custom vocabulary tables versus lists
<a name="custom-vocabulary-tables-lists"></a>

**Important**  
Custom vocabularies in list format are being deprecated. If you're creating a new custom vocabulary, use the [table format](custom-vocabulary-create-table.md).

Tables give you more options for, and more control over, the input and output of words within your custom vocabulary. With tables, you specify separate categories (Phrase and DisplayAs), allowing you to fine-tune your output.

Lists don't have additional options, so you can only type in entries as you want them to appear in your transcript, replacing all spaces with hyphens.

The AWS Management Console, AWS CLI, and AWS SDKs all use custom vocabulary tables in the same way; lists are used differently for each method and thus may require additional formatting for successful use between methods.

For more information, see [Creating a custom vocabulary using a table](custom-vocabulary-create-table.md) and [Creating a custom vocabulary using a list](custom-vocabulary-create-list.md).

To dive a little deeper and learn how to use Amazon Augmented AI with custom vocabularies, see:

[![AWS Videos](http://img.youtube.com/vi/65eVesNiJzY/0.jpg)](http://www.youtube.com/watch?v=65eVesNiJzY)


**API operations specific to custom vocabularies**  
[`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html), [`DeleteVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_DeleteVocabulary.html), [`GetVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_GetVocabulary.html), [`ListVocabularies`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_ListVocabularies.html), [`UpdateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_UpdateVocabulary.html)
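
As an illustration of these operations, the sketch below lists the custom vocabularies in one Region with the Boto3 `list_vocabularies` method, following `NextToken` until all pages are read. The helper name `iter_vocabularies` is hypothetical, and the client is passed in so that you can create it for whichever Region you use.

```python
def iter_vocabularies(client, state=None, name_contains=None):
    """Yield vocabulary summaries from ListVocabularies, page by page."""
    kwargs = {}
    if state:
        kwargs["StateEquals"] = state          # for example 'READY'
    if name_contains:
        kwargs["NameContains"] = name_contains
    while True:
        page = client.list_vocabularies(**kwargs)
        yield from page.get("Vocabularies", [])
        token = page.get("NextToken")
        if not token:
            break
        # Ask for the next page on the following call.
        kwargs["NextToken"] = token


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   transcribe = boto3.client('transcribe', 'us-west-2')
#   for vocabulary in iter_vocabularies(transcribe, state='READY'):
#       print(vocabulary['VocabularyName'], vocabulary['LanguageCode'])
```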

# Creating a custom vocabulary using a table
<a name="custom-vocabulary-create-table"></a>

Using a table format is the preferred way to create your custom vocabulary. Vocabulary tables must consist of four columns (Phrase, SoundsLike, IPA, and DisplayAs), which can be included in any order:


| Phrase | SoundsLike | IPA | DisplayAs | 
| --- | --- | --- | --- | 
|  Required. Every row in your table must contain an entry in this column. Do not use spaces in this column. If your entry contains multiple words, separate each word with a hyphen (-). For example, **Andorra-la-Vella** or **Los-Angeles**. For acronyms, any pronounced letters must be separated by a period. The last letter must also be followed by a period. If your acronym is plural, you must use a hyphen between the acronym and the 's'. For example, 'CLI' is **C.L.I.** (not **C.L.I**) and 'ABCs' is **A.B.C.-s** (not **A.B.C-s**). If your phrase consists of both a word and an acronym, these two components must be separated by a hyphen. For example, 'DynamoDB' is **Dynamo-D.B.**. Do not include digits in this column; numbers must be spelled out. For example, 'VX02Q' is **V.X.-zero-two-Q.**.  |  `SoundsLike` is no longer supported for Custom Vocabulary. Please leave the column empty. Any values in this column will be ignored. We will remove the support for this column in the future.  |  `IPA` is no longer supported for Custom Vocabulary. Please leave the column empty. Any values in this column will be ignored. We will remove the support for this column in the future.  |  Optional. Rows in this column can be left empty. You can use spaces in this column. Defines how you want your entry to look in your transcription output. For example, **Andorra-la-Vella** in the `Phrase` column is **Andorra la Vella** in the `DisplayAs` column. If a row in this column is empty, Amazon Transcribe uses the contents of the `Phrase` column to determine output. You can include digits (`0-9`) in this column.  | 

Things to note when creating your table:
+ Your table must contain all four column headers (Phrase, SoundsLike, IPA, and DisplayAs). The `Phrase` column must contain an entry on each row. Pronunciation inputs through `IPA` and `SoundsLike` are no longer supported, so leave these columns empty. Any values in these columns are ignored.
+ Each column must be TAB or comma (,) delimited; this applies to every row in your custom vocabulary file. If a row contains empty columns, you must still include a delimiter (TAB or comma) for each column.
+ Spaces are only allowed within the `IPA` and `DisplayAs` columns. Do not use spaces to separate columns.
+ Support for the `IPA` and `SoundsLike` columns will be removed in the future.
+ The `DisplayAs` column supports symbols and special characters (for example, C++). All other columns support the characters that are listed on your language's [character set](charsets.md) page.
+ If you want to include numbers in the `Phrase` column, you must spell them out. Digits (`0-9`) are only supported in the `DisplayAs` column.
+ You must save your table as a plaintext (\*.txt) file in `LF` format. If you use any other format, such as `CRLF`, your custom vocabulary can't be processed.
+ You must upload your custom vocabulary file into an Amazon S3 bucket and process it using [`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html) before you can include it in a transcription request. Refer to [Creating custom vocabulary tables](#custom-vocabulary-create-table-examples) for instructions.

**Note**  
Enter acronyms, or other words whose letters should be pronounced individually, as single letters separated by periods (**A.B.C.**). To enter the plural form of an acronym, such as 'ABCs', separate the 's' from the acronym with a hyphen (**A.B.C.-s**). You can use upper or lower case letters to define an acronym. Acronyms are not supported in all languages; refer to [Supported languages and language-specific features](supported-languages.md).
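
The `Phrase` rules described above (no spaces, no digits, a period after every acronym letter) lend themselves to a quick local check. The following is a heuristic sketch only: `check_phrase` is a hypothetical helper, it does not validate characters against your language's character set, and it may flag entries that Amazon Transcribe would accept.

```python
import re

# An acronym component such as "C.L.I.": every letter followed by a period.
_ACRONYM = re.compile(r"(?:[^\W\d_]\.)+")


def check_phrase(phrase):
    """Return a list of likely formatting problems for one Phrase entry."""
    if not phrase:
        return ["Phrase is required"]
    problems = []
    if " " in phrase:
        problems.append("contains a space; separate words with hyphens")
    if any(ch.isdigit() for ch in phrase):
        problems.append("contains a digit; spell numbers out")
    for part in phrase.split("-"):
        # Any component that contains a period is treated as an acronym.
        if "." in part and not _ACRONYM.fullmatch(part):
            problems.append(f"{part!r}: each acronym letter needs a trailing period")
    return problems
```

For example, `check_phrase("A.B.C-s")` reports the missing period after the final letter, while `check_phrase("A.B.C.-s")` returns an empty list.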

Here is a sample custom vocabulary table (where **[TAB]** represents a tab character):

```
Phrase[TAB]SoundsLike[TAB]IPA[TAB]DisplayAs
Los-Angeles[TAB][TAB][TAB]Los Angeles
Eva-Maria[TAB][TAB][TAB]
A.B.C.-s[TAB][TAB][TAB]ABCs
Amazon-dot-com[TAB][TAB][TAB]Amazon.com
C.L.I.[TAB][TAB][TAB]CLI
Andorra-la-Vella[TAB][TAB][TAB]Andorra la Vella
Dynamo-D.B.[TAB][TAB][TAB]DynamoDB
V.X.-zero-two[TAB][TAB][TAB]VX02
V.X.-zero-two-Q.[TAB][TAB][TAB]VX02Q
```

For visual clarity, here is the same table with aligned columns. **Do not** add spaces between columns in your custom vocabulary table; your table should look misaligned like the preceding example.

```
Phrase          [TAB]SoundsLike          [TAB]IPA                [TAB]DisplayAs  
Los-Angeles     [TAB]                    [TAB]                   [TAB]Los Angeles   
Eva-Maria       [TAB]                    [TAB]                   [TAB]
A.B.C.-s        [TAB]                    [TAB]                   [TAB]ABCs  
Amazon-dot-com  [TAB]                    [TAB]                   [TAB]Amazon.com
C.L.I.          [TAB]                    [TAB]                   [TAB]CLI   
Andorra-la-Vella[TAB]                    [TAB]                   [TAB]Andorra la Vella
Dynamo-D.B.     [TAB]                    [TAB]                   [TAB]DynamoDB
V.X.-zero-two   [TAB]                    [TAB]                   [TAB]VX02
V.X.-zero-two-Q.[TAB]                    [TAB]                   [TAB]VX02Q
```
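
If you generate your table from a script rather than a text editor, you can avoid stray spaces and `CRLF` line endings. The sketch below is one possible approach, assuming your terms are available as (Phrase, DisplayAs) pairs; `build_table` and `write_table` are hypothetical helper names. Uploading the resulting file to Amazon S3 is not shown.

```python
HEADER = ("Phrase", "SoundsLike", "IPA", "DisplayAs")


def build_table(entries):
    """Build table text from (phrase, display_as) pairs; display_as may be ''."""
    rows = ["\t".join(HEADER)]
    for phrase, display_as in entries:
        # SoundsLike and IPA stay empty, but their delimiters are still written.
        rows.append("\t".join((phrase, "", "", display_as)))
    return "\n".join(rows) + "\n"


def write_table(path, entries):
    """Write the table as UTF-8 text with LF line endings."""
    # newline="\n" prevents translation to CRLF on Windows.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(build_table(entries))
```

For example, `write_table("my-vocabulary-table.txt", [("Los-Angeles", "Los Angeles"), ("C.L.I.", "CLI")])` writes a header row and two entries, each with exactly three tab characters.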

## Creating custom vocabulary tables
<a name="custom-vocabulary-create-table-examples"></a>

To process a custom vocabulary table for use with Amazon Transcribe, see the following examples:

### AWS Management Console
<a name="vocab-create-table-console"></a>

1. Sign in to the [AWS Management Console](https://console.aws.amazon.com/transcribe/).

1. In the navigation pane, choose **Custom vocabulary**. This opens the **Custom vocabulary** page where you can view existing vocabularies or create a new one.

1. Select **Create vocabulary**.  
![\[Amazon Transcribe console screenshot: the 'custom vocabulary' page.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-console.png)

   This takes you to the **Create vocabulary** page. Enter a name for your new custom vocabulary.

   Here, you have three options:

   1. Upload a txt or csv file from your computer.

      You can either create your custom vocabulary from scratch or download a template to help you get started. Your vocabulary is then auto-populated in the **View and edit vocabulary** pane.  
![\[Amazon Transcribe console screenshot: the 'create and import vocabulary' page.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-console-upload.png)

   1. Import a txt or csv file from an Amazon S3 location.

      You can either create your custom vocabulary from scratch or download a template to help you get started. Upload your finished vocabulary file to an Amazon S3 bucket and specify its URI in your request. Your vocabulary is then auto-populated in the **View and edit vocabulary** pane.  
![\[Amazon Transcribe console screenshot: the 'create and import vocabulary' page.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-console-s3.png)

   1. Manually create your vocabulary in the console.

      Scroll to the **View and edit vocabulary** pane and select **Add 10 rows**. You can now manually enter terms.  
![\[Amazon Transcribe console screenshot: the 'create and import vocabulary' page.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-console-manual.png)

1. You can edit your vocabulary in the **View and edit vocabulary** pane. To make changes, click on the entry you want to modify.  
![\[Amazon Transcribe console screenshot: the 'create and edit vocabulary' pane.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-edit2.png)

   If you make an error, you get a detailed error message so you can correct any issues prior to processing your vocabulary. Note that if you don't correct all errors before selecting **Create vocabulary**, your vocabulary request fails.  
![\[Amazon Transcribe console screenshot: the 'create and edit vocabulary' pane.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-edit3.png)

   Select the check mark (✓) to save your changes or the 'X' to discard your changes.

1. Optionally, add tags to your custom vocabulary. Once you have all fields completed and are happy with your vocabulary, select **Create vocabulary** at the bottom of the page. This takes you back to the **Custom vocabulary** page where you can view the status of your custom vocabulary. When the status changes from 'Pending' to 'Ready' your custom vocabulary can be used with a transcription.  
![\[Amazon Transcribe console screenshot: custom vocabulary in pending status while processing.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-console-pending.png)

1. If the status changes to 'Failed', select the name of your custom vocabulary to go to its information page.  
![\[Amazon Transcribe console screenshot: 'custom vocabulary' page showing one vocabulary as complete and one as failed.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-console-failed.png)

   There is a **Failure reason** banner at the top of this page that provides information on why your custom vocabulary failed. Correct the error in your text file and try again.  
![\[Amazon Transcribe console screenshot: vocabulary's information page shows failure reason.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-create-console-failed2.png)

### AWS CLI
<a name="vocab-create-table-cli"></a>

This example uses the [create-vocabulary](https://docs.aws.amazon.com/cli/latest/reference/transcribe/create-vocabulary.html) command with a table-formatted vocabulary file. For more information, see [`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html).

To use an existing custom vocabulary in a transcription job, set the `VocabularyName` in the [`Settings`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_Settings.html) field when you call the [`StartTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html) operation or, from the AWS Management Console, choose the custom vocabulary from the dropdown list.

```
aws transcribe create-vocabulary \
--vocabulary-name my-first-vocabulary \
--vocabulary-file-uri s3://amzn-s3-demo-bucket/my-vocabularies/my-vocabulary-file.txt \
--language-code en-US
```

Here's another example using the [create-vocabulary](https://docs.aws.amazon.com/cli/latest/reference/transcribe/create-vocabulary.html) command, and a request body that creates your custom vocabulary.

```
aws transcribe create-vocabulary \
--cli-input-json file://filepath/my-first-vocab-table.json
```

The file *my-first-vocab-table.json* contains the following request body.

```
{
  "VocabularyName": "my-first-vocabulary",
  "VocabularyFileUri": "s3://amzn-s3-demo-bucket/my-vocabularies/my-vocabulary-table.txt",
  "LanguageCode": "en-US"
}
```

Once `VocabularyState` changes from `PENDING` to `READY`, your custom vocabulary is ready to use with a transcription. To view the current status of your custom vocabulary, run:

```
aws transcribe get-vocabulary \
--vocabulary-name my-first-vocabulary
```

### AWS SDK for Python (Boto3)
<a name="vocab-create-table-python-batch"></a>

This example uses the AWS SDK for Python (Boto3) to create a custom vocabulary from a table using the [create\_vocabulary](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.create_vocabulary) method. For more information, see [`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html).

To use an existing custom vocabulary in a transcription job, set the `VocabularyName` in the [`Settings`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_Settings.html) field when you call the [`StartTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html) operation or, from the AWS Management Console, choose the custom vocabulary from the dropdown list.

For additional examples using the AWS SDKs, including feature-specific, scenario, and cross-service examples, refer to the [Code examples for Amazon Transcribe using AWS SDKs](service_code_examples.md) chapter.

```
import time

import boto3

# Create the client in the same AWS Region as your transcription.
transcribe = boto3.client('transcribe', 'us-west-2')
vocab_name = "my-first-vocabulary"

# Submit the table-formatted vocabulary file for processing.
response = transcribe.create_vocabulary(
    LanguageCode = 'en-US',
    VocabularyName = vocab_name,
    VocabularyFileUri = 's3://amzn-s3-demo-bucket/my-vocabularies/my-vocabulary-table.txt'
)

# Poll until processing finishes, successfully or not.
while True:
    status = transcribe.get_vocabulary(VocabularyName = vocab_name)
    if status['VocabularyState'] in ['READY', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
```
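
The polling loop above runs until the vocabulary reaches a final state and treats `READY` and `FAILED` alike. A variation, sketched here under the assumption that you want a timeout and an exception on failure, wraps the same `get_vocabulary` call in a helper. The name `wait_for_vocabulary` and its default values are hypothetical.

```python
import time


def wait_for_vocabulary(client, name, timeout=300, delay=5, sleep=time.sleep):
    """Poll GetVocabulary until READY; raise on FAILED or when time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        response = client.get_vocabulary(VocabularyName=name)
        state = response["VocabularyState"]
        if state == "READY":
            return response
        if state == "FAILED":
            reason = response.get("FailureReason", "no reason given")
            raise RuntimeError(f"Vocabulary {name!r} failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Vocabulary {name!r} still {state} after {timeout}s")
        sleep(delay)


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   transcribe = boto3.client('transcribe', 'us-west-2')
#   wait_for_vocabulary(transcribe, 'my-first-vocabulary')
```

Raising on `FAILED` surfaces the `FailureReason` field immediately instead of leaving it inside a printed response.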

**Note**  
If you create a new Amazon S3 bucket for your custom vocabulary files, make sure the IAM role making the [`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html) request has permissions to access this bucket. If the role doesn't have the correct permissions, your request fails. You can optionally specify an IAM role within your request by including the `DataAccessRoleArn` parameter. For more information on IAM roles and policies in Amazon Transcribe, see [Amazon Transcribe identity-based policy examples](security_iam_id-based-policy-examples.md).
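
As a sketch of how the optional `DataAccessRoleArn` parameter might be added to a request, the helper below builds the keyword arguments for `create_vocabulary`. The function name is hypothetical, and the role ARN in the usage comment is a placeholder rather than a real role.

```python
def build_create_vocabulary_request(name, language_code, file_uri, role_arn=None):
    """Return keyword arguments for create_vocabulary, with an optional role."""
    request = {
        "VocabularyName": name,
        "LanguageCode": language_code,
        "VocabularyFileUri": file_uri,
    }
    if role_arn:
        # Only include the parameter when a role is supplied.
        request["DataAccessRoleArn"] = role_arn
    return request


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   transcribe = boto3.client('transcribe', 'us-west-2')
#   transcribe.create_vocabulary(**build_create_vocabulary_request(
#       'my-first-vocabulary', 'en-US',
#       's3://amzn-s3-demo-bucket/my-vocabularies/my-vocabulary-table.txt',
#       role_arn='arn:aws:iam::111122223333:role/ExampleRole'))
```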

# Creating a custom vocabulary using a list
<a name="custom-vocabulary-create-list"></a>

**Important**  
Custom vocabularies in list format are being deprecated, so if you're creating a new custom vocabulary, we strongly recommend using the [table format](custom-vocabulary-create-table.md).

You can create custom vocabularies from lists using the AWS Management Console, AWS CLI, or AWS SDKs.
+ **AWS Management Console**: You must create and upload a text file containing your custom vocabulary. You can use line-separated or comma-separated entries. Note that your list must be saved as a text (\*.txt) file in `LF` format. If you use any other format, such as `CRLF`, your custom vocabulary is not accepted by Amazon Transcribe.
+ **AWS CLI** and **AWS SDKs**: You must include your custom vocabulary as comma-separated entries within your API call using the [`Phrases`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html#transcribe-CreateVocabulary-request-Phrases) parameter.

If an entry contains multiple words, you must hyphenate each word. For example, you include 'Los Angeles' as **Los-Angeles** and 'Andorra la Vella' as **Andorra-la-Vella**.

Here are examples of the two valid list formats. Refer to [Creating custom vocabulary lists](#custom-vocabulary-create-list-examples) for method-specific examples.
+ Comma-separated entries:

  ```
  Los-Angeles,CLI,Eva-Maria,ABCs,Andorra-la-Vella
  ```
+ Line-separated entries:

  ```
  Los-Angeles
  CLI
  Eva-Maria
  ABCs
  Andorra-la-Vella
  ```
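
If you keep a list in either format and need it as a Python list for the `Phrases` parameter, a small parser can handle both. This is a minimal sketch: `parse_vocabulary_list` is a hypothetical helper, and hyphenating multi-word entries automatically is a convenience assumed here, not something Amazon Transcribe does for you.

```python
def parse_vocabulary_list(text):
    """Split comma- or line-separated entries into a list of phrases."""
    # Treat line breaks (LF or CRLF) the same as commas.
    normalized = text.replace("\r\n", "\n").replace("\n", ",")
    entries = []
    for raw in normalized.split(","):
        entry = raw.strip()
        if not entry:
            continue  # skip blank lines and trailing separators
        # Multi-word entries must be hyphenated, for example 'Los-Angeles'.
        entries.append("-".join(entry.split()))
    return entries
```

For example, `parse_vocabulary_list("Los Angeles\nCLI\n")` returns `['Los-Angeles', 'CLI']`.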

**Important**  
You can only use characters that are supported for your language. Refer to your language's [character set](charsets.md) for details.

Custom vocabulary lists are not supported with the [`CreateMedicalVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateMedicalVocabulary.html) operation. If creating a custom medical vocabulary, you must use a table format; refer to [Creating a custom vocabulary using a table](custom-vocabulary-create-table.md) for instructions.

## Creating custom vocabulary lists
<a name="custom-vocabulary-create-list-examples"></a>

To process a custom vocabulary list for use with Amazon Transcribe, see the following examples:

### AWS CLI
<a name="vocab-create-list-cli"></a>

This example uses the [create-vocabulary](https://docs.aws.amazon.com/cli/latest/reference/transcribe/create-vocabulary.html) command with a list-formatted custom vocabulary file. For more information, see [`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html).

```
aws transcribe create-vocabulary \
--vocabulary-name my-first-vocabulary \
--language-code en-US \
--phrases {CLI,Eva-Maria,ABCs}
```

Here's another example using the [create-vocabulary](https://docs.aws.amazon.com/cli/latest/reference/transcribe/create-vocabulary.html) command, and a request body that creates your custom vocabulary.

```
aws transcribe create-vocabulary \
--cli-input-json file://filepath/my-first-vocab-list.json
```

The file *my-first-vocab-list.json* contains the following request body.

```
{
  "VocabularyName": "my-first-vocabulary",
  "LanguageCode": "en-US",
  "Phrases": [
        "CLI","Eva-Maria","ABCs"
  ]
}
```

Once `VocabularyState` changes from `PENDING` to `READY`, your custom vocabulary is ready to use with a transcription. To view the current status of your custom vocabulary, run:

```
aws transcribe get-vocabulary \
--vocabulary-name my-first-vocabulary
```

### AWS SDK for Python (Boto3)
<a name="vocab-create-list-python-batch"></a>

This example uses the AWS SDK for Python (Boto3) to create a custom vocabulary from a list using the [create\_vocabulary](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.create_vocabulary) method. For more information, see [`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html).

For additional examples using the AWS SDKs, including feature-specific, scenario, and cross-service examples, refer to the [Code examples for Amazon Transcribe using AWS SDKs](service_code_examples.md) chapter.

```
import time

import boto3

# Create the client in the same AWS Region as your transcription.
transcribe = boto3.client('transcribe', 'us-west-2')
vocab_name = "my-first-vocabulary"

# Submit the list entries directly in the request.
response = transcribe.create_vocabulary(
    LanguageCode = 'en-US',
    VocabularyName = vocab_name,
    Phrases = [
        'CLI','Eva-Maria','ABCs'
    ]
)

# Poll until processing finishes, successfully or not.
while True:
    status = transcribe.get_vocabulary(VocabularyName = vocab_name)
    if status['VocabularyState'] in ['READY', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
```

**Note**  
If you create a new Amazon S3 bucket for your custom vocabulary files, make sure the IAM role making the [`CreateVocabulary`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_CreateVocabulary.html) request has permissions to access this bucket. If the role doesn't have the correct permissions, your request fails. You can optionally specify an IAM role within your request by including the `DataAccessRoleArn` parameter. For more information on IAM roles and policies in Amazon Transcribe, see [Amazon Transcribe identity-based policy examples](security_iam_id-based-policy-examples.md).

# Using a custom vocabulary
<a name="custom-vocabulary-using"></a>

Once your custom vocabulary is created, you can include it in your transcription requests; refer to the following sections for examples.

The language of the custom vocabulary you're including in your request must match the language code you specify for your media. If the languages don't match, your custom vocabulary is not applied to your transcription and there are no warnings or errors.
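
Because a language mismatch produces no warning, it can help to compare language codes yourself before starting a job. The following is a minimal sketch that reads the `LanguageCode` and `VocabularyState` fields returned by `get_vocabulary`; the helper name `check_vocabulary_language` is hypothetical.

```python
def check_vocabulary_language(client, vocabulary_name, job_language_code):
    """Raise ValueError if the vocabulary can't be applied to this language."""
    vocabulary = client.get_vocabulary(VocabularyName=vocabulary_name)
    actual = vocabulary["LanguageCode"]
    if actual != job_language_code:
        raise ValueError(
            f"Vocabulary {vocabulary_name!r} is {actual}, "
            f"but the request uses {job_language_code}"
        )
    if vocabulary["VocabularyState"] != "READY":
        raise ValueError(
            f"Vocabulary {vocabulary_name!r} is {vocabulary['VocabularyState']}, not READY"
        )


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   transcribe = boto3.client('transcribe', 'us-west-2')
#   check_vocabulary_language(transcribe, 'my-first-vocabulary', 'en-US')
```

Calling this with the same Regional client that you use to start the transcription should also surface a vocabulary that was created in a different Region, since the lookup is expected to fail there.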

## Using a custom vocabulary in a batch transcription
<a name="custom-vocabulary-using-batch"></a>

To use a custom vocabulary with a batch transcription, see the following for examples:

### AWS Management Console
<a name="vocab-using-console-batch"></a>

1. Sign in to the [AWS Management Console](https://console.aws.amazon.com/transcribe/).

1. In the navigation pane, choose **Transcription jobs**, then select **Create job** (top right). This opens the **Specify job details** page.  
![\[Amazon Transcribe console screenshot: the 'specify job details' page.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/console-batch-job-details-1.png)

   Name your job and specify your input media. Optionally include any other fields, then choose **Next**.

1. At the bottom of the **Configure job** page, in the **Customization** panel, toggle on **Custom vocabulary**.  
![\[Amazon Transcribe console screenshot: the 'configure job' page.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/console-batch-configure-job-vocab.png)

1. Select your custom vocabulary from the dropdown menu.

   Select **Create job** to run your transcription job. 

### AWS CLI
<a name="vocab-using-cli"></a>

This example uses the [start-transcription-job](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/transcribe/start-transcription-job.html) command and `Settings` parameter with the `VocabularyName` sub-parameter. For more information, see [`StartTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html) and [`Settings`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_Settings.html).

```
aws transcribe start-transcription-job \
--region us-west-2 \
--transcription-job-name my-first-transcription-job \
--media MediaFileUri=s3://amzn-s3-demo-bucket/my-input-files/my-media-file.flac \
--output-bucket-name amzn-s3-demo-bucket \
--output-key my-output-files/ \
--language-code en-US \
--settings VocabularyName=my-first-vocabulary
```

Here's another example using the [start-transcription-job](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/transcribe/start-transcription-job.html) command, and a request body that includes your custom vocabulary with that job.

```
aws transcribe start-transcription-job \
--region us-west-2 \
--cli-input-json file://my-first-vocabulary-job.json
```

The file *my-first-vocabulary-job.json* contains the following request body.

```
{
  "TranscriptionJobName": "my-first-transcription-job",
  "Media": {
        "MediaFileUri": "s3://amzn-s3-demo-bucket/my-input-files/my-media-file.flac"
  },
  "OutputBucketName": "amzn-s3-demo-bucket",
  "OutputKey": "my-output-files/", 
  "LanguageCode": "en-US",
  "Settings": {
        "VocabularyName": "my-first-vocabulary"
   }
}
```

### AWS SDK for Python (Boto3)
<a name="vocab-using-python-batch"></a>

This example uses the AWS SDK for Python (Boto3) to include a custom vocabulary using the `Settings` argument for the [start\_transcription\_job](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.start_transcription_job) method. For more information, see [`StartTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html) and [`Settings`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_Settings.html).

For additional examples using the AWS SDKs, including feature-specific, scenario, and cross-service examples, refer to the [Code examples for Amazon Transcribe using AWS SDKs](service_code_examples.md) chapter.

```
import time

import boto3

# Create the client in the AWS Region where your custom vocabulary was created.
transcribe = boto3.client('transcribe', 'us-west-2')
job_name = "my-first-transcription-job"
job_uri = "s3://amzn-s3-demo-bucket/my-input-files/my-media-file.flac"

# Start the job; VocabularyName in Settings applies the custom vocabulary.
transcribe.start_transcription_job(
    TranscriptionJobName = job_name,
    Media = {
        'MediaFileUri': job_uri
    },
    OutputBucketName = 'amzn-s3-demo-bucket',
    OutputKey = 'my-output-files/',
    LanguageCode = 'en-US',
    Settings = {
        'VocabularyName': 'my-first-vocabulary'
    }
)

# Poll until the job finishes, successfully or not.
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName = job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
```

## Using a custom vocabulary in a streaming transcription
<a name="custom-vocabulary-using-stream"></a>

To use a custom vocabulary with a streaming transcription, see the following for examples:

### AWS Management Console
<a name="vocab-using-console-stream"></a>

1. Sign in to the [AWS Management Console](https://console.aws.amazon.com/transcribe/).

1. In the navigation pane, choose **Real-time transcription**. Scroll down to **Customizations** and expand this field if it is minimized.  
![\[Amazon Transcribe console screenshot: the 'real-time transcription' page.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/stream-main.png)

1. Toggle on **Custom vocabulary** and select a custom vocabulary from the dropdown menu.  
![\[Amazon Transcribe console screenshot: the expanded 'customizations' pane.\]](http://docs.aws.amazon.com/transcribe/latest/dg/images/vocab-stream2.png)

   Include any other settings that you want to apply to your stream.

1. You're now ready to transcribe your stream. Select **Start streaming** and begin speaking. To end your dictation, select **Stop streaming**.

### HTTP/2 stream
<a name="vocab-using-http2"></a>

This example creates an HTTP/2 request that includes your custom vocabulary. For more information on using HTTP/2 streaming with Amazon Transcribe, see [Setting up an HTTP/2 stream](streaming-setting-up.md#streaming-http2). For more detail on parameters and headers specific to Amazon Transcribe, see [`StartStreamTranscription`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_streaming_StartStreamTranscription.html).

```
POST /stream-transcription HTTP/2
host: transcribestreaming.us-west-2.amazonaws.com
X-Amz-Target: com.amazonaws.transcribe.Transcribe.StartStreamTranscription
Content-Type: application/vnd.amazon.eventstream
X-Amz-Content-Sha256: string
X-Amz-Date: 20220208T235959Z
Authorization: AWS4-HMAC-SHA256 Credential=access-key/20220208/us-west-2/transcribe/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date;x-amz-target;x-amz-security-token, Signature=string
x-amzn-transcribe-language-code: en-US
x-amzn-transcribe-media-encoding: flac
x-amzn-transcribe-sample-rate: 16000
x-amzn-transcribe-vocabulary-name: my-first-vocabulary
transfer-encoding: chunked
```

Parameter definitions can be found in the [API Reference](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_Reference.html); parameters common to all AWS API operations are listed in the [Common Parameters](https://docs.aws.amazon.com/transcribe/latest/APIReference/CommonParameters.html) section.

### WebSocket stream
<a name="vocab-using-websocket"></a>

This example creates a presigned URL that applies your custom vocabulary to a WebSocket stream. Line breaks have been added for readability. For more information on using WebSocket streams with Amazon Transcribe, see [Setting up a WebSocket stream](streaming-setting-up.md#streaming-websocket). For more detail on parameters, see [`StartStreamTranscription`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_streaming_StartStreamTranscription.html).

```
GET wss://transcribestreaming.us-west-2.amazonaws.com:8443/stream-transcription-websocket?
&X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=AKIAIOSFODNN7EXAMPLE%2F20220208%2Fus-west-2%2Ftranscribe%2Faws4_request
&X-Amz-Date=20220208T235959Z
&X-Amz-Expires=300
&X-Amz-Security-Token=security-token
&X-Amz-Signature=string
&X-Amz-SignedHeaders=content-type%3Bhost%3Bx-amz-date
&language-code=en-US
&media-encoding=flac
&sample-rate=16000
&vocabulary-name=my-first-vocabulary
```

Parameter definitions can be found in the [API Reference](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_Reference.html); parameters common to all AWS API operations are listed in the [Common Parameters](https://docs.aws.amazon.com/transcribe/latest/APIReference/CommonParameters.html) section.
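
The transcription-specific part of the query string above can be assembled with the Python standard library. This sketch covers only the unsigned parameters (`language-code`, `media-encoding`, `sample-rate`, and `vocabulary-name`); the `X-Amz-*` signing parameters are deliberately omitted, and `streaming_query` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def streaming_query(language_code, media_encoding, sample_rate, vocabulary_name=None):
    """Return the URL-encoded transcription parameters for a WebSocket request."""
    params = {
        "language-code": language_code,
        "media-encoding": media_encoding,
        "sample-rate": str(sample_rate),
    }
    if vocabulary_name:
        # Omit the parameter entirely when no custom vocabulary is used.
        params["vocabulary-name"] = vocabulary_name
    return urlencode(params)
```

A presigned URL must include these parameters when the signature is calculated, so treat the returned string as one input to your signing code rather than as a complete URL.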