# GPU acceleration for vector indexing
<a name="gpu-acceleration-vector-index"></a>

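GPU acceleration helps you build large-scale vector databases faster and more efficiently. You can enable it on new or existing OpenSearch domains and OpenSearch Serverless collections. The feature uses GPUs to reduce the time needed to index data into vector indexes.

With GPU acceleration, you can increase vector indexing speed by up to 10X at a quarter of the indexing cost.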

## Prerequisites
<a name="gpu-acceleration-prerequisites"></a>

GPU acceleration is supported on OpenSearch domains and OpenSearch Serverless collections running OpenSearch version `3.1` or later. For more information, see [Upgrading Amazon OpenSearch Service domains](version-migration.md) and the [UpdateDomainConfig](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_UpdateDomainConfig.html) and [UpdateCollection](https://docs.aws.amazon.com/opensearch-service/latest/ServerlessAPIReference/API_UpdateCollection.html) APIs.

## How it works
<a name="gpu-acceleration-how-it-works"></a>

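Vector indexing requires significant compute resources to build data structures such as Hierarchical Navigable Small World (HNSW) graphs. When you enable GPU acceleration on a domain or collection, OpenSearch automatically detects opportunities to accelerate index builds and offloads those builds to GPU instances. OpenSearch Service manages the GPU instances on your behalf and assigns them to your domain or collection when needed. This means that you don't manage utilization or pay for idle time.

You pay only for useful processing, billed through OpenSearch Compute Units (OCUs) - Vector Acceleration. Each Vector Acceleration OCU is a combination of approximately 8 GiB of CPU memory, 2 vCPUs, and 6 GiB of GPU memory. For more information, see [GPU acceleration pricing](#gpu-acceleration-pricing).

To enable GPU acceleration for your domain or collection, see [Enabling GPU acceleration](gpu-acceleration-enabling.md).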

## GPU acceleration pricing
<a name="gpu-acceleration-pricing"></a>

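AWS charges you when OpenSearch detects opportunities to accelerate index build workloads for your domain or collection. Each Vector Acceleration OCU is a combination of approximately 8 GiB of CPU memory, 2 vCPUs, and 6 GiB of GPU memory.

AWS bills OCUs at per-second granularity. In your account statement, you see an entry for compute in OCU-hours.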

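For example, when you use GPU acceleration for one hour to build an index and use 2 vCPUs and 1 GiB of GPU memory, you're charged for 1 OCU. If you use 9 GiB of CPU memory while using GPU acceleration, you're charged for 2 OCUs.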
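The arithmetic behind this example can be sketched in a few lines. This is only an illustration: it assumes that usage is rounded up to the smallest whole number of OCUs that covers every resource dimension, which is consistent with the two cases above but is not a documented billing formula. The function names `estimate_ocus` and `estimate_ocu_hours` are hypothetical.

```python
import math

# Approximate resources bundled in one Vector Acceleration OCU (from the text).
OCU_CPU_MEM_GIB = 8
OCU_VCPUS = 2
OCU_GPU_MEM_GIB = 6


def estimate_ocus(cpu_mem_gib: float, vcpus: float, gpu_mem_gib: float) -> int:
    """Hypothetical estimate: the smallest whole number of OCUs that covers
    every resource dimension. The actual billing logic may differ."""
    needed = max(
        math.ceil(cpu_mem_gib / OCU_CPU_MEM_GIB),
        math.ceil(vcpus / OCU_VCPUS),
        math.ceil(gpu_mem_gib / OCU_GPU_MEM_GIB),
    )
    return max(1, needed)


def estimate_ocu_hours(ocus: int, seconds: float) -> float:
    """Convert per-second usage into OCU-hours, the unit on the statement."""
    return ocus * seconds / 3600


if __name__ == "__main__":
    # 2 vCPUs and 1 GiB of GPU memory fit inside a single OCU.
    print(estimate_ocus(cpu_mem_gib=8, vcpus=2, gpu_mem_gib=1))
    # 9 GiB of CPU memory exceeds one OCU's 8 GiB, so a second OCU is needed.
    print(estimate_ocus(cpu_mem_gib=9, vcpus=2, gpu_mem_gib=1))
```

Because billing is per second, a build that holds 2 OCUs for 30 minutes would appear as 1 OCU-hour under this sketch.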

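OpenSearch Serverless adds OCUs in increments of 1 OCU, based on the compute power and storage needed to support your collections. To control costs, you can configure a maximum number of OCUs for your account.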

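**Note**  
The number of OCUs provisioned at any time can vary and isn't exact. Over time, the algorithms that OpenSearch and OpenSearch Serverless use will continue to improve in order to better minimize system usage.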

For full pricing details, see [Amazon OpenSearch Service pricing](https://aws.amazon.com/opensearch-service/pricing/).

## GPU acceleration and write operations
<a name="gpu-acceleration-write-operations"></a>

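GPU acceleration is activated when the vector ingestion rate (MB/second) into OpenSearch falls within a range. On OpenSearch domains, you have the flexibility to [configure this range](https://docs.opensearch.org/3.2/vector-search/remote-index-build/#using-the-remote-index-build-service) through `index.knn.remote_index_build.size.min` and `index.knn.remote_index_build.size.max`. For example, with the default lower bound of 50 MB, writing 15,000 full-precision vectors with 768 dimensions between [refresh intervals](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/bp.html#bp-perf) triggers GPU acceleration by default.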
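As a rough illustration, the two settings named above could be assembled into a settings body like the following. This is a sketch under assumptions: it presumes the values are applied through an index settings update on a domain, the helper name `remote_build_size_settings` is hypothetical, and the `10gb` upper bound is a placeholder rather than a documented default. Only `50mb` reflects the default lower bound mentioned in the text.

```python
import json


def remote_build_size_settings(min_size: str, max_size: str) -> dict:
    """Build a settings body for the two size thresholds named in the text.

    Hypothetical helper: how and where the body is submitted (for example,
    an index settings update request) is an assumption, not shown here.
    """
    return {
        "index.knn.remote_index_build.size.min": min_size,
        "index.knn.remote_index_build.size.max": max_size,
    }


# "50mb" mirrors the default lower bound from the text.
# "10gb" is an illustrative placeholder, not a documented default.
body = remote_build_size_settings("50mb", "10gb")

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```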

You can write data using the following API operations:
+ [Flush](https://docs.opensearch.org/latest/api-reference/index-apis/flush/)
+ [Bulk](https://docs.opensearch.org/latest/api-reference/document-apis/bulk/)
+ [Reindex](https://docs.opensearch.org/latest/api-reference/document-apis/reindex/)
+ [Index](https://docs.opensearch.org/latest/api-reference/index-apis/index/)
+ [Update](https://docs.opensearch.org/latest/api-reference/document-apis/update-document/)
+ [Delete](https://docs.opensearch.org/latest/api-reference/document-apis/delete-document/)
+ [Force merge](https://docs.opensearch.org/latest/api-reference/index-apis/force-merge/)

GPU acceleration can be activated by both automatic and [manual](https://docs.opensearch.org/latest/api-reference/index-apis/force-merge/) segment merges.

## Supported index configurations
<a name="gpu-acceleration-index-configurations"></a>

The [Faiss](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#faiss-engine) engine supports GPU acceleration.

The following configurations don't support GPU acceleration:
+ [Faiss product quantization](https://docs.opensearch.org/latest/vector-search/optimizing-storage/faiss-product-quantization/)
+ [Inverted file index (IVF)](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#ivf-parameters)
+ [Non-Metric Space Library (NMSLIB)](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#nmslib-engine-deprecated)
+ [Lucene engine](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#lucene-engine)
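The rules above can be expressed as a small pre-flight check on a `knn_vector` field's `method` definition. This is a minimal sketch, not a service API: the function `supports_gpu_acceleration` is hypothetical, it assumes the usual `method` mapping shape (`name`, `engine`, `parameters.encoder.name`), and it conservatively treats a missing `engine` as unsupported.

```python
def supports_gpu_acceleration(method: dict) -> bool:
    """Return True if a knn_vector `method` block passes the rules listed above.

    Hypothetical helper: it checks only the engine, the method name, and the
    encoder name. A missing engine is treated as unsupported (an assumption).
    """
    engine = method.get("engine")
    name = method.get("name")
    encoder = method.get("parameters", {}).get("encoder", {}).get("name")

    if engine != "faiss":   # excludes the Lucene and NMSLIB engines
        return False
    if name == "ivf":       # inverted file index is not supported
        return False
    if encoder == "pq":     # Faiss product quantization is not supported
        return False
    return True


if __name__ == "__main__":
    print(supports_gpu_acceleration({"name": "hnsw", "engine": "faiss"}))
    print(supports_gpu_acceleration({"name": "hnsw", "engine": "lucene"}))
```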

## Supported AWS Regions
<a name="gpu-acceleration-regions"></a>

GPU acceleration is available in the following AWS Regions:
+ US East (N. Virginia)
+ US West (Oregon)
+ Asia Pacific (Sydney)
+ Asia Pacific (Tokyo)
+ Europe (Ireland)

## Best practices
<a name="gpu-acceleration-best-practices"></a>

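Follow these best practices to get the most benefit from GPU acceleration for your vector search workloads:
+ **Increase indexing clients** - To fully utilize GPUs during index builds, increase the number of indexing clients that ingest data into OpenSearch. This improves parallelism and utilization of GPU resources.
+ **Adjust the approximate threshold** - Change the `index.knn.advanced.approximate_threshold` setting so that index builds don't occur for smaller segments, which improves overall ingestion speed. A value of 10,000 is a good starting point. For collections, you must explicitly specify a value for this setting.
+ **Optimize shard size** - Try to create shards that contain at least 1 million documents. Shards with fewer documents might not see an overall benefit from GPU acceleration.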
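The first two practices can be sketched as follows. This is an illustration under assumptions: the field name `embedding`, the 768-dimension vector, and the helpers `build_index_body` and `split_for_clients` are hypothetical, and the code only prepares a request body and partitions documents. It does not send anything to a cluster.

```python
def build_index_body(dimension: int, approximate_threshold: int = 10_000) -> dict:
    """Create-index body with the approximate threshold from the list above.

    Hypothetical example: the field name "embedding" is illustrative, and the
    method uses Faiss HNSW because Faiss is the engine the text says is supported.
    """
    return {
        "settings": {
            "index.knn": True,
            "index.knn.advanced.approximate_threshold": approximate_threshold,
        },
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {"name": "hnsw", "engine": "faiss"},
                }
            }
        },
    }


def split_for_clients(docs: list, num_clients: int) -> list:
    """Round-robin partition of documents, one slice per indexing client."""
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    return [docs[i::num_clients] for i in range(num_clients)]


if __name__ == "__main__":
    body = build_index_body(dimension=768)
    print(body["settings"])
    # Each slice would be handed to a separate indexing client or worker.
    print(split_for_clients(list(range(5)), 2))
```

Each slice returned by `split_for_clients` would typically be sent by its own client process or thread, so that several bulk requests are in flight at once.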