$vectorSearch - Amazon DocumentDB

$vectorSearch

New from version 8.0

Not supported on elastic clusters.

The $vectorSearch operator in Amazon DocumentDB performs vector search, a method used in machine learning to find similar data points by comparing their vector representations with distance or similarity metrics. It combines the flexible, rich querying of a JSON-based document database with vector search, so you can build machine learning and generative AI use cases such as semantic search and product recommendation.

Parameters

  • <exact> (optional): Flag that specifies whether to run Exact Nearest Neighbor (ENN) or Approximate Nearest Neighbor (ANN) search. Value can be one of the following:

      • false - to run ANN search

      • true - to run ENN search

    If omitted or set to false, numCandidates is required.

  • <index>: Name of the vector search index to use.

  • <limit>: Number of documents to return in the results.

  • <numCandidates> (optional): Number of nearest neighbors to use during the search. Required if exact is false or omitted. Value must be less than or equal to (<=) 10000 and can't be less than the number of documents to return (limit).

  • <path>: Indexed vector type field to search.

  • <queryVector>: Array of numbers that represent the query vector.
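The parameter constraints above can be checked on the client before a query is sent. The following is a minimal sketch, not part of any driver: the helper name build_vector_search_stage and the constant MAX_NUM_CANDIDATES are hypothetical, and the choice to leave numCandidates out when exact is true is an assumption, because the parameter list does not say whether the two can be combined.

```python
from typing import Any, Dict, List, Optional

MAX_NUM_CANDIDATES = 10000  # upper bound stated in the parameter list


def build_vector_search_stage(
    index: str,
    path: str,
    query_vector: List[float],
    limit: int,
    num_candidates: Optional[int] = None,
    exact: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: build a {'$vectorSearch': {...}} stage document."""
    stage: Dict[str, Any] = {
        "index": index,
        "limit": limit,
        "path": path,
        "queryVector": list(query_vector),
    }
    if exact:
        # ENN search. Assumption: numCandidates is simply left out here,
        # even if the caller supplied a value.
        stage["exact"] = True
    else:
        # ANN search: numCandidates is required and bounded.
        if num_candidates is None:
            raise ValueError("numCandidates is required when exact is false or omitted")
        if num_candidates > MAX_NUM_CANDIDATES:
            raise ValueError("numCandidates must be <= 10000")
        if num_candidates < limit:
            raise ValueError("numCandidates can't be less than limit")
        stage["numCandidates"] = num_candidates
    return {"$vectorSearch": stage}


ann_stage = build_vector_search_stage(
    "description_index", "description_vector", [0.1, 0.2, 0.3],
    limit=2, num_candidates=10,
)
enn_stage = build_vector_search_stage(
    "description_index", "description_vector", [0.1, 0.2, 0.3],
    limit=2, exact=True,
)
print(ann_stage)
print(enn_stage)
```

Each returned dictionary is a single pipeline stage, so it could be placed in the list passed to an aggregate call.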

Example (MongoDB Shell)

The following example demonstrates how to use the $vectorSearch operator to find similar product descriptions based on their vector representations.

Create sample documents

    db.products.insertMany([
      {
        _id: 1,
        name: "Product A",
        description: "A high-quality, eco-friendly product for your home.",
        description_vector: [0.2, 0.5, 0.8]
      },
      {
        _id: 2,
        name: "Product B",
        description: "An innovative and modern kitchen appliance.",
        description_vector: [0.7, 0.3, 0.9]
      },
      {
        _id: 3,
        name: "Product C",
        description: "A comfortable and stylish piece of furniture.",
        description_vector: [0.1, 0.2, 0.4]
      }
    ]);

Create vector search index

    db.runCommand({
      createIndexes: "products",
      indexes: [{
        key: { "description_vector": "vector" },
        vectorOptions: {
          type: "hnsw",
          dimensions: 3,
          similarity: "cosine",
          m: 16,
          efConstruction: 64
        },
        name: "description_index"
      }]
    });
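The index above declares dimensions: 3, and every sample vector has three components. The following sketch checks that correspondence on the client before documents are inserted. The helper name find_dimension_mismatches is hypothetical, and the check rests on the assumption that stored vectors should match the index dimensions; this page does not describe what the service does when they differ.

```python
from typing import Any, Dict, List

INDEX_DIMENSIONS = 3  # matches vectorOptions.dimensions in the index definition

sample_products = [
    {"_id": 1, "name": "Product A", "description_vector": [0.2, 0.5, 0.8]},
    {"_id": 2, "name": "Product B", "description_vector": [0.7, 0.3, 0.9]},
    {"_id": 3, "name": "Product C", "description_vector": [0.1, 0.2, 0.4]},
]


def find_dimension_mismatches(
    docs: List[Dict[str, Any]], path: str, dimensions: int
) -> List[Any]:
    """Hypothetical helper: return the _id of every document whose vector
    at `path` is missing or does not have exactly `dimensions` components."""
    mismatched = []
    for doc in docs:
        vector = doc.get(path)
        if not isinstance(vector, list) or len(vector) != dimensions:
            mismatched.append(doc["_id"])
    return mismatched


mismatches = find_dimension_mismatches(
    sample_products, "description_vector", INDEX_DIMENSIONS
)
print(mismatches)
```

For the three sample documents the list is empty, because each vector has three components.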

Query example

    db.products.aggregate([
      {
        $vectorSearch: {
          index: "description_index",
          limit: 2,
          numCandidates: 10,
          path: "description_vector",
          queryVector: [0.1, 0.2, 0.3]
        }
      }
    ]);

Output

    [
      {
        "_id": 1,
        "name": "Product A",
        "description": "A high-quality, eco-friendly product for your home.",
        "description_vector": [0.2, 0.5, 0.8]
      },
      {
        "_id": 3,
        "name": "Product C",
        "description": "A comfortable and stylish piece of furniture.",
        "description_vector": [0.1, 0.2, 0.4]
      }
    ]
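To see why Product A and Product C are returned, the cosine similarity between the query vector and each sample vector can be computed locally. The following is a brute-force illustration of the similarity metric only, using the sample data from this page. It is not how Amazon DocumentDB executes the query, and the names cosine_similarity, products, and top_two are chosen for this sketch.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# _id -> description_vector, copied from the sample documents.
products = {
    1: [0.2, 0.5, 0.8],
    2: [0.7, 0.3, 0.9],
    3: [0.1, 0.2, 0.4],
}
query = [0.1, 0.2, 0.3]

# Rank every document by similarity to the query, most similar first.
ranked = sorted(
    products,
    key=lambda pid: cosine_similarity(query, products[pid]),
    reverse=True,
)
top_two = ranked[:2]  # corresponds to limit: 2
print(top_two)  # [1, 3]
```

The similarities are approximately 0.998 for Product A, 0.991 for Product C, and 0.907 for Product B, which agrees with the order of the output above.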

Code examples

The following examples show how to use the $vectorSearch operator from Node.js and Python.

Node.js

    const { MongoClient } = require('mongodb');

    async function findSimilarProducts(queryVector) {
      // Replace the placeholders with your credentials and cluster endpoint.
      const client = await MongoClient.connect(
        'mongodb://<username>:<password>@<cluster-endpoint>:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false'
      );

      try {
        const db = client.db('test');
        const collection = db.collection('products');

        // Return the 2 most similar documents, using 10 ANN candidates.
        const result = await collection.aggregate([
          {
            $vectorSearch: {
              index: "description_index",
              limit: 2,
              numCandidates: 10,
              path: "description_vector",
              queryVector: queryVector
            }
          }
        ]).toArray();

        console.log(result);
      } finally {
        // Close the connection even if the query fails.
        await client.close();
      }
    }

    findSimilarProducts([0.1, 0.2, 0.3]).catch(console.error);

Python

    from pymongo import MongoClient

    def find_similar_products(query_vector):
        # Replace the placeholders with your credentials and cluster endpoint.
        client = MongoClient('mongodb://<username>:<password>@<cluster-endpoint>:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false')

        try:
            db = client.test
            collection = db.products

            # Return the 2 most similar documents, using 10 ANN candidates.
            result = list(collection.aggregate([
                {
                    '$vectorSearch': {
                        'index': "description_index",
                        'limit': 2,
                        'numCandidates': 10,
                        'path': "description_vector",
                        'queryVector': query_vector
                    }
                }
            ]))

            print(result)
        finally:
            # Close the connection even if the query fails.
            client.close()

    find_similar_products([0.1, 0.2, 0.3])