

# Kinesis Video Streams: How it works
<a name="how-it-works"></a>

**Topics**
+ [Kinesis Video Streams API and producer libraries support](how-it-works-kinesis-video-api-producer-sdk.md)
+ [Kinesis Video Streams data model](how-data.md)

You can use Amazon Kinesis Video Streams, a fully managed AWS service, to stream live video from devices to the AWS Cloud and durably store it. You can then build your own applications for real-time video processing or perform batch-oriented video analytics.

The following diagram provides an overview of how Kinesis Video Streams works.





![\[Diagram showing interaction of producers and consumers in Kinesis Video Streams.\]](http://docs.aws.amazon.com/kinesisvideostreams/latest/dg/images/acuity-arch-3a.png)


The diagram demonstrates the interaction among the following components:
+ **Producer** – Any source that puts data into a Kinesis video stream. A producer can be any video-generating device, such as a security camera, a body-worn camera, a smartphone camera, or a dashboard camera. A producer can also send non-video data, such as audio feeds, images, or RADAR data.

  A single producer can generate one or more video streams. For example, a video camera can push video data to one Kinesis video stream and audio data to another.
  + **Kinesis Video Streams producer libraries** – A set of software and libraries that you can install and configure on your devices. You can use these libraries to securely connect and reliably stream video in different ways, including in real time, after buffering it for a few seconds, or as after-the-fact media uploads.
+ **Kinesis video stream** – A resource that you can use to transport live video data, optionally store it, and make the data available for consumption both in real time and on a batch or one-time basis. In a typical configuration, a Kinesis video stream has only one producer publishing data into it. 

  The stream can carry audio, video, and similar time-encoded data streams, such as depth sensing feeds, RADAR feeds, and more. You create a Kinesis video stream using the AWS Management Console or programmatically using the AWS SDKs.

  Multiple independent applications can consume a Kinesis video stream in parallel. 
+ **Consumer** – Gets data, such as fragments and frames, from a Kinesis video stream to view, process, or analyze it. Generally, these consumers are called Kinesis Video Streams applications. You can write applications that consume and process data in Kinesis Video Streams in real time, or, when low-latency processing isn't required, after the data is stored and time-indexed. You can create these consumer applications to run on Amazon EC2 instances.
  + [Watch output from cameras using parser library](parser-library.md) – Enables Kinesis Video Streams applications to reliably get media from a Kinesis video stream in a low-latency manner. Additionally, it parses the frame boundaries in the media so that applications can focus on processing and analyzing the frames themselves.
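
The stream-creation step mentioned above ("You create a Kinesis video stream using the AWS Management Console or programmatically using the AWS SDKs") can be sketched with the AWS SDK for Python (Boto3). This is a minimal sketch, not the producer libraries' API; the stream name and retention period are illustrative values.

```python
# Sketch: creating a Kinesis video stream programmatically.
# Assumes the AWS SDK for Python (Boto3) and valid AWS credentials;
# the stream name and retention period below are illustrative.

def create_video_stream(kvs_client, stream_name, retention_hours=24):
    """Create a stream and return its ARN. A retention of 0 disables persistence."""
    response = kvs_client.create_stream(
        StreamName=stream_name,
        MediaType="video/h264",               # content type required for console playback
        DataRetentionInHours=retention_hours,
    )
    return response["StreamARN"]

if __name__ == "__main__":
    import boto3
    kvs = boto3.client("kinesisvideo")        # control-plane client
    print(create_video_stream(kvs, "my-stream"))
```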

# Kinesis Video Streams API and producer libraries support
<a name="how-it-works-kinesis-video-api-producer-sdk"></a>

Kinesis Video Streams provides APIs for you to create and manage streams and to read or write media data to and from a stream. The Kinesis Video Streams console, in addition to administration functionality, also supports live and video-on-demand playback. Kinesis Video Streams also provides a set of producer libraries that you can use in your application code to extract data from your media sources and upload it to your Kinesis video stream.

**Topics**
+ [Kinesis Video Streams API](#how-it-works-kinesis-video-api)
+ [Endpoint discovery pattern](#how-it-works-api-pattern)
+ [Producer libraries](#how-it-works-producer-sdk)

## Kinesis Video Streams API
<a name="how-it-works-kinesis-video-api"></a>

Kinesis Video Streams provides APIs for creating and managing Kinesis video streams. It also provides APIs for reading and writing media data to and from a stream, as follows:
+ **Producer API** – Kinesis Video Streams provides a `PutMedia` API to write media data to a Kinesis video stream. In a `PutMedia` request, the producer sends a stream of media fragments. A *fragment* is a self-contained sequence of frames. The frames belonging to a fragment should have no dependency on any frames from other fragments. For more information, see [PutMedia](API_dataplane_PutMedia.md).

  As fragments arrive, Kinesis Video Streams assigns each one a unique fragment number, in increasing order. It also stores producer-side and server-side timestamps for each fragment, as Kinesis Video Streams-specific metadata. 
+ **Consumer APIs** – Consumers can use the following APIs to get data from a stream:
  + `GetMedia` - When using this API, consumers must identify the starting fragment. The API then returns fragments in the order in which they were added to the stream (in increasing order by fragment number). The media data in the fragments is packed into a structured format such as [Matroska (MKV)](https://www.matroska.org/technical/specs/index.html). For more information, see [GetMedia](API_dataplane_GetMedia.md).
    **Note**  
    `GetMedia` knows where the fragments are (archived in the data store or available in real time). For example, if `GetMedia` determines that the starting fragment is archived, it starts returning fragments from the data store. When it must return newer fragments that aren't archived yet, `GetMedia` switches to reading fragments from an in-memory stream buffer. 

    This is an example of a continuous consumer, which processes fragments in the order that they are ingested by the stream.

    `GetMedia` enables video-processing applications to fail or fall behind, and then catch up with no additional effort. Using `GetMedia`, applications can process data that's archived in the data store, and as the application catches up, `GetMedia` continues to feed media data in real time as it arrives. 
  + `GetMediaFromFragmentList` (and `ListFragments`) - Batch-processing applications are considered offline consumers. Offline consumers might choose to explicitly fetch particular media fragments or ranges of video by combining the `ListFragments` and `GetMediaFromFragmentList` APIs. These APIs enable an application to identify segments of video for a particular time range or fragment range, and then fetch those fragments either sequentially or in parallel for processing. This approach is suitable for MapReduce application suites, which must quickly process large amounts of data in parallel.

    For example, suppose that a consumer wants to process one day's worth of video fragments. The consumer would do the following:

    1. Get a list of fragments by calling the `ListFragments` API and specifying a time range to select the desired collection of fragments.

       The API returns metadata from all the fragments in the specified time range. The metadata provides information such as fragment number, producer-side and server-side timestamps, and so on. 

    1. Take the fragment metadata list and retrieve the fragments, in any order. For example, to process all the fragments for the day, the consumer might split the list into sublists and have workers (for example, multiple Amazon EC2 instances) fetch and process the fragments in parallel using the `GetMediaFromFragmentList` API.
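
The two offline-consumer steps above can be sketched with Boto3. This is a hedged sketch, not the parser library: the stream name, batch size, and time range are illustrative, and the `batch_fragment_numbers` helper is a hypothetical name introduced here to show the split-into-sublists step.

```python
# Sketch of the batch (offline) consumer pattern: ListFragments over a time
# range, then GetMediaFromFragmentList per batch. Assumes Boto3 and AWS
# credentials when run as a script; stream name and sizes are illustrative.

def batch_fragment_numbers(fragments, batch_size):
    """Split ListFragments results into sublists of fragment numbers
    so multiple workers can fetch them in parallel (step 2 above)."""
    numbers = sorted(f["FragmentNumber"] for f in fragments)
    return [numbers[i:i + batch_size] for i in range(0, len(numbers), batch_size)]

def list_fragments_for_range(archived_client, stream_name, start, end):
    """Page through ListFragments for a producer-timestamp range (step 1 above)."""
    fragments, token = [], None
    while True:
        kwargs = {
            "StreamName": stream_name,
            "FragmentSelector": {
                "FragmentSelectorType": "PRODUCER_TIMESTAMP",
                "TimestampRange": {"StartTimestamp": start, "EndTimestamp": end},
            },
        }
        if token:
            kwargs["NextToken"] = token
        page = archived_client.list_fragments(**kwargs)
        fragments.extend(page["Fragments"])
        token = page.get("NextToken")
        if not token:
            return fragments

if __name__ == "__main__":
    import datetime
    import boto3
    kvs = boto3.client("kinesisvideo")
    endpoint = kvs.get_data_endpoint(
        StreamName="my-stream", APIName="LIST_FRAGMENTS")["DataEndpoint"]
    archived = boto3.client("kinesis-video-archived-media", endpoint_url=endpoint)
    end = datetime.datetime.utcnow()
    start = end - datetime.timedelta(days=1)           # one day's worth of fragments
    frags = list_fragments_for_range(archived, "my-stream", start, end)
    for batch in batch_fragment_numbers(frags, batch_size=100):
        media = archived.get_media_for_fragment_list(
            StreamName="my-stream", Fragments=batch)
        # media["Payload"] streams the MKV data for the requested fragments
```

In a real deployment, each batch would be handed to a separate worker rather than processed in this single loop.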

The following diagram shows the data flow for fragments and chunks during these API calls.

![\[Diagram showing data flow for fragments and chunks during API calls\]](http://docs.aws.amazon.com/kinesisvideostreams/latest/dg/images/arch-20.png)


When a producer sends a `PutMedia` request, it sends media metadata in the payload, and then sends a sequence of media data fragments. Upon receiving the data, Kinesis Video Streams stores incoming media data as Kinesis Video Streams chunks. Each chunk consists of the following:
+ A copy of the media metadata
+ A fragment
+ Kinesis Video Streams-specific metadata; for example, the fragment number and server-side and producer-side timestamps

When a consumer requests media data, Kinesis Video Streams returns a stream of chunks, starting with the fragment number that you specify in the request.

If you enable data persistence for the stream, after receiving a fragment on the stream, Kinesis Video Streams also saves a copy of the fragment to the data store. 
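
The continuous-consumer flow described above (`GetMedia` returning a stream of chunks) can be sketched with Boto3. This is a minimal sketch under assumptions: the stream name is illustrative, and the MKV payload is consumed as raw bytes rather than parsed into frames.

```python
# Sketch: reading the chunk stream that GetMedia returns.
# Assumes Boto3 and AWS credentials when run as a script; the stream name is
# illustrative, and each chunk is raw MKV bytes handed to a downstream parser.

def read_chunks(payload, chunk_size=8192):
    """Yield raw MKV bytes from a GetMedia payload stream until it ends."""
    while True:
        data = payload.read(chunk_size)
        if not data:
            return
        yield data

if __name__ == "__main__":
    import boto3
    kvs = boto3.client("kinesisvideo")
    endpoint = kvs.get_data_endpoint(
        StreamName="my-stream", APIName="GET_MEDIA")["DataEndpoint"]
    media = boto3.client("kinesis-video-media", endpoint_url=endpoint)
    # Start from the latest chunk; a FRAGMENT_NUMBER or PRODUCER_TIMESTAMP
    # selector can start from archived fragments instead.
    response = media.get_media(
        StreamName="my-stream",
        StartSelector={"StartSelectorType": "NOW"},
    )
    for chunk in read_chunks(response["Payload"]):
        pass  # hand each MKV chunk to a parser or decoder
```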

## Endpoint discovery pattern
<a name="how-it-works-api-pattern"></a>

**Control Plane REST APIs**

To access the [Kinesis Video Streams Control Plane REST APIs](https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/API_Operations_Amazon_Kinesis_Video_Streams.html), use the [Kinesis Video Streams service endpoints](https://docs.aws.amazon.com/general/latest/gr/akv.html#akv_region).

**Data Plane REST APIs**

Kinesis Video Streams is built on a [cellular architecture](https://docs.aws.amazon.com/wellarchitected/latest/reducing-scope-of-impact-with-cell-based-architecture/what-is-a-cell-based-architecture.html) for better scaling and traffic isolation. Because each stream is mapped to a specific cell in a Region, your application must use the cell-specific endpoints that the stream has been mapped to. When you access the Data Plane REST APIs, you must discover and manage these endpoints yourself. This process, the endpoint discovery pattern, works as follows:

1. The endpoint discovery pattern starts with a call to one of the endpoint retrieval actions (`GetDataEndpoint` or `GetSignalingChannelEndpoint`). These actions belong to the Control Plane.

   1. If you are retrieving the endpoints for the [Amazon Kinesis Video Streams Media](API_Operations_Amazon_Kinesis_Video_Streams_Media.md) or [Amazon Kinesis Video Streams Archived Media](API_Operations_Amazon_Kinesis_Video_Streams_Archived_Media.md) services, use [GetDataEndpoint](API_GetDataEndpoint.md).

   1. If you are retrieving the endpoints for [Amazon Kinesis Video Signaling Channels](API_Operations_Amazon_Kinesis_Video_Signaling_Channels.md), [Amazon Kinesis Video WebRTC Storage](API_Operations_Amazon_Kinesis_Video_WebRTC_Storage.md), or [Kinesis Video Signaling](https://docs.aws.amazon.com/kinesisvideostreams-webrtc-dg/latest/devguide/kvswebrtc-websocket-apis.html), use [GetSignalingChannelEndpoint](API_GetSignalingChannelEndpoint.md).

1. Cache and reuse the endpoint.

1. If the cached endpoint no longer works, make a new call to the appropriate endpoint retrieval action to refresh the endpoint.
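
The three steps above can be sketched as a small cache keyed by stream and API name. This is an illustrative sketch, not an official client: the `EndpointCache` class is a hypothetical name, and the Boto3 `GetDataEndpoint` call in the script section is one concrete way to fetch an endpoint.

```python
# Sketch of the endpoint discovery pattern: fetch an endpoint once, cache and
# reuse it per (stream, API), and refresh only when a cached endpoint fails.
# The class name and fetcher signature are illustrative, not an AWS API.

class EndpointCache:
    """Caches data-plane endpoints so GetDataEndpoint is called only when needed."""

    def __init__(self, fetch_endpoint):
        self._fetch = fetch_endpoint   # callable: (stream_name, api_name) -> endpoint URL
        self._cache = {}

    def get(self, stream_name, api_name):
        key = (stream_name, api_name)
        if key not in self._cache:
            self._cache[key] = self._fetch(stream_name, api_name)  # step 1
        return self._cache[key]                                    # step 2

    def refresh(self, stream_name, api_name):
        """Call when a cached endpoint stops working (step 3)."""
        self._cache.pop((stream_name, api_name), None)
        return self.get(stream_name, api_name)

if __name__ == "__main__":
    import boto3
    kvs = boto3.client("kinesisvideo")

    def fetch(stream, api):
        return kvs.get_data_endpoint(StreamName=stream, APIName=api)["DataEndpoint"]

    cache = EndpointCache(fetch)
    endpoint = cache.get("my-stream", "GET_MEDIA")  # one control-plane call
    endpoint = cache.get("my-stream", "GET_MEDIA")  # served from the cache
```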

## Producer libraries
<a name="how-it-works-producer-sdk"></a>

After you create a Kinesis video stream, you can start sending data to it. In your application code, you can use the producer libraries to extract data from your media sources and upload it to your Kinesis video stream. For more information about the available producer libraries, see [Upload to Kinesis Video Streams](producer-sdk.md).

# Kinesis Video Streams data model
<a name="how-data"></a>

The producer libraries (see [Upload to Kinesis Video Streams](producer-sdk.md)) and the stream parser library (see [Watch output from cameras using parser library](parser-library.md)) send and receive video data in a format that supports embedding information alongside the video data. This format is based on the Matroska (MKV) specification.

The [MKV format](https://en.wikipedia.org/wiki/Matroska) is an open specification for media data. All the libraries and code examples in the *Amazon Kinesis Video Streams Developer Guide* send or receive data in the MKV format. 

The producer libraries (see [Upload to Kinesis Video Streams](producer-sdk.md)) use the `StreamDefinition` and `Frame` types to produce MKV stream headers, frame headers, and frame data.

For information about the full MKV specification, see [Matroska Specifications](https://www.matroska.org/technical/specs/index.html).

The following sections describe the components of MKV-formatted data produced by the [C++ producer library](producer-sdk-cpp.md).

**Topics**
+ [Stream header elements](#how-data-header-streamdefinition)
+ [Stream track data](#how-data-header-streamtrack)
+ [Frame header elements](#how-data-header-frame)
+ [MKV frame data](#how-data-frame)

## Stream header elements
<a name="how-data-header-streamdefinition"></a>

The following MKV header elements are used by `StreamDefinition` (defined in `StreamDefinition.h`).


| Element | Description | Typical values | 
| --- | --- | --- | 
| `stream_name` | Corresponds to the name of the Kinesis video stream. | my-stream | 
| `retention_period` | The duration, in hours, that stream data is persisted by Kinesis Video Streams. Specify 0 for a stream that doesn't retain data. | 24 | 
| `tags` | A key-value collection of user data. This data is displayed in the AWS Management Console and can be read by client applications to filter or get information about a stream. |  | 
| `kms_key_id` | If present, the user-defined AWS KMS key used to encrypt data on the stream. If absent, the data is encrypted with the Kinesis-supplied key (`aws/kinesisvideo`). | 01234567-89ab-cdef-0123-456789ab | 
| `streaming_type` | Currently, the only valid streaming type is `STREAMING_TYPE_REALTIME`. | `STREAMING_TYPE_REALTIME` | 
| `content_type` | The user-defined content type. For streaming video data to play in the console, the content type must be video/h264. | video/h264 | 
| `max_latency` | This value isn't currently used and should be set to 0. | 0 | 
| `fragment_duration` | An estimate of how long your fragments should be, used for optimization. The actual fragment duration is determined by the streaming data. | 2 | 
| `timecode_scale` | Indicates the scale used by frame timestamps. The default is 1 millisecond. Specifying `0` also assigns the default value of 1 millisecond. This value can be between 100 nanoseconds and 1 second. For more information, see [TimecodeScale](https://matroska.org/technical/specs/notes.html#TimecodeScale) in the Matroska documentation. |  | 
| `key_frame_fragmentation` | If true, the stream starts a new cluster when a keyframe is received. | true | 
| `frame_timecodes` | If true, Kinesis Video Streams uses the presentation timestamp (pts) and decoding timestamp (dts) values of the received frames. If false, Kinesis Video Streams stamps the frames with system-generated time values when they are received. | true | 
| `absolute_fragment_time` | If true, the cluster timecodes are interpreted as absolute time (for example, from the producer's system clock). If false, the cluster timecodes are interpreted as relative to the start time of the stream. | true | 
| `fragment_acks` | If true, acknowledgements (ACKs) are sent when Kinesis Video Streams receives the data. The ACKs can be received using the `KinesisVideoStreamFragmentAck` or `KinesisVideoStreamParseFragmentAck` callbacks. | true | 
| `restart_on_error` | Indicates whether the stream should resume transmission after a stream error is raised. | true | 
| `nal_adaptation_flags` | Indicates whether NAL (Network Abstraction Layer) adaptation or codec private data is present in the content. Valid flags include `NAL_ADAPTATION_ANNEXB_NALS` and `NAL_ADAPTATION_ANNEXB_CPD_NALS`. | `NAL_ADAPTATION_ANNEXB_NALS` | 
| `frame_rate` | An estimate of the content frame rate. This value is used for optimization; the actual frame rate is determined by the rate of incoming data. Specifying 0 assigns the default of 24. | 24 | 
| `avg_bandwidth_bps` | An estimate of the content bandwidth, in Mbps. This value is used for optimization; the actual rate is determined by the bandwidth of incoming data. For example, for a 720p video stream running at 25 FPS, you can expect the average bandwidth to be 5 Mbps. | 5 | 
| `buffer_duration` | The duration that content is buffered on the producer. If network latency is low, this value can be reduced. If network latency is high, increasing this value prevents frames from being dropped when they can't be allocated into a smaller buffer before they are sent. |  | 
| `replay_duration` | The amount of time the video data stream is "rewound" if the connection is lost. This value can be zero if lost frames due to connection loss are not a concern. The value can be increased if the consuming application can remove redundant frames. This value should be less than the buffer duration; otherwise, the buffer duration is used. |  | 
| `connection_staleness` | The duration that a connection is maintained when no data is received. |  | 
| `codec_id` | The codec used by the content. For more information, see [CodecID](https://matroska.org/technical/specs/codecid/index.html) in the Matroska specification. | `V_MPEG2` | 
| `track_name` | The user-defined name of the track. | my_track | 
| `codecPrivateData` | Data provided by the encoder used to decode the frame data, such as the frame width and height in pixels, which is needed by many downstream consumers. In the [C++ producer library](producer-sdk-cpp.md), the `gMkvTrackVideoBits` array in `MkvStatics.cpp` includes pixel width and height for the frame. |  | 
| `codecPrivateDataSize` | The size of the data in the `codecPrivateData` parameter. |  | 
| `track_type` | The type of the track for the stream. | `MKV_TRACK_INFO_TYPE_AUDIO` or `MKV_TRACK_INFO_TYPE_VIDEO` | 
| `segment_uuid` | User-defined segment UUID (16 bytes). |  | 
| `default_track_id` | Unique nonzero number for the track. | 1 | 
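
As a quick reference, the table's representative values can be collected into a plain mapping. This is illustrative only: the real `StreamDefinition` is a C++ type defined in `StreamDefinition.h`, and the Python dictionary below mirrors the table's field names, not an actual Python API.

```python
# Illustrative only: a Python dictionary mirroring representative values from
# the StreamDefinition table above. The real type is a C++ struct defined in
# StreamDefinition.h; these names follow the table, not a real Python binding.

stream_definition = {
    "stream_name": "my-stream",
    "retention_period": 24,            # hours; 0 disables persistence
    "streaming_type": "STREAMING_TYPE_REALTIME",
    "content_type": "video/h264",      # required for console playback
    "max_latency": 0,                  # currently unused
    "fragment_duration": 2,            # optimization hint; actual duration follows the data
    "key_frame_fragmentation": True,   # start a new cluster on each keyframe
    "frame_timecodes": True,           # use producer pts/dts rather than server time
    "absolute_fragment_time": True,    # cluster timecodes from the producer clock
    "fragment_acks": True,             # receive ACKs as data is ingested
    "restart_on_error": True,
    "frame_rate": 24,                  # estimate; 0 also assigns 24
    "avg_bandwidth_bps": 5,            # Mbps estimate, e.g. 720p at 25 FPS
    "codec_id": "V_MPEG2",
    "track_name": "my_track",
    "track_type": "MKV_TRACK_INFO_TYPE_VIDEO",
    "default_track_id": 1,
}
```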

## Stream track data
<a name="how-data-header-streamtrack"></a>

The following MKV track elements are used by `StreamDefinition` (defined in `StreamDefinition.h`).


| Element | Description | Typical values | 
| --- | --- | --- | 
| `track_name` | The user-defined track name. For example, "audio" for the audio track. | audio | 
| `codec_id` | The codec ID for the track. For example, "A_AAC" for an audio track. | `A_AAC` | 
| `cpd` | Data provided by the encoder used to decode the frame data. This data can include the frame width and height in pixels, which is needed by many downstream consumers. In the [C++ producer library](https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/producer-sdk-cpp.html), the `gMkvTrackVideoBits` array in `MkvStatics.cpp` includes pixel width and height for the frame. |  | 
| `cpd_size` | The size of the data in the `cpd` parameter. |  | 
| `track_type` | The type of the track. For example, you can use the enum value `MKV_TRACK_INFO_TYPE_AUDIO` for audio. | `MKV_TRACK_INFO_TYPE_AUDIO` | 

## Frame header elements
<a name="how-data-header-frame"></a>

The following MKV header elements are used by `Frame` (defined in the `KinesisVideoPic` package, in `mkvgen/Include.h`):
+ **Frame Index:** A monotonically increasing value.
+ **Flags:** The type of frame. Valid values include the following:
  + `FRAME_FLAGS_NONE`
  + `FRAME_FLAG_KEY_FRAME`: If `key_frame_fragmentation` is set on the stream, key frames start a new fragment.
  + `FRAME_FLAG_DISCARDABLE_FRAME`: Tells the decoder that it can discard this frame if decoding is slow.
  + `FRAME_FLAG_INVISIBLE_FRAME`: Duration of this block is 0.
+ **Decoding Timestamp:** The timestamp of when this frame was decoded. If previous frames depend on this frame for decoding, this timestamp might be earlier than those of the previous frames. This value is relative to the start of the fragment.
+ **Presentation Timestamp:** The timestamp of when this frame is displayed. This value is relative to the start of the fragment.
+ **Duration:** The playback duration of the frame.
+ **Size:** The size of the frame data, in bytes.
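
The frame header fields above can be mirrored in a small sketch. This is hypothetical: the real `Frame` type is a C struct in the `KinesisVideoPic` package (`mkvgen/Include.h`), and the numeric flag values below are illustrative, not the SDK's actual constants.

```python
# Hypothetical Python mirror of the Frame header fields listed above.
# The real Frame is a C struct in mkvgen/Include.h; the flag bit values
# here are illustrative, not the SDK's actual constants.
from dataclasses import dataclass
from enum import IntFlag

class FrameFlags(IntFlag):
    FRAME_FLAGS_NONE = 0
    FRAME_FLAG_KEY_FRAME = 1          # may start a new fragment
    FRAME_FLAG_DISCARDABLE_FRAME = 2  # decoder may drop this frame if slow
    FRAME_FLAG_INVISIBLE_FRAME = 4    # block duration of 0

@dataclass
class Frame:
    index: int                 # monotonically increasing
    flags: FrameFlags
    decoding_ts: int           # relative to the start of the fragment
    presentation_ts: int       # relative to the start of the fragment
    duration: int              # playback duration
    size: int                  # frame data size in bytes

key_frame = Frame(index=0, flags=FrameFlags.FRAME_FLAG_KEY_FRAME,
                  decoding_ts=0, presentation_ts=0, duration=40, size=4096)
```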

## MKV frame data
<a name="how-data-frame"></a>

The data in `frame.frameData` might contain only media data for the frame, or it might contain further nested header information, depending on the encoding scheme used. To be displayed in the AWS Management Console, the data must be encoded in the [H.264](https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC) codec, but Kinesis Video Streams can receive time-serialized data streams in any format.