SingleStore is a high-performance distributed SQL database designed to run in both cloud and on-premises environments.

A standout feature of SingleStore is its support for vector storage and operations, which makes it well suited to AI applications such as text similarity matching. Built-in vector functions like `dot_product` and `euclidean_distance` let developers compute similarity directly in the database. A tutorial on working with vector data in SingleStore is available.

This tutorial covers the vector store in SingleStoreDB, which supports searches based on vector similarity. Vector indexes speed up these queries. The vector store also integrates with Lucene-based full-text indexing for text similarity search, and search results can be filtered on selected fields of each document's metadata object.

Vector and full-text search can be combined in several ways: prefiltering by text or by vector similarity and then selecting the most relevant data, or computing a final similarity score as a weighted sum of the two.
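To make the two built-in vector functions mentioned above concrete, here is a minimal pure-Python sketch of what a dot product and a Euclidean distance compute and how each one orders results. It does not call SingleStore; the helper names and the toy vectors are assumptions chosen for illustration.

```python
import math


def dot_product(a, b):
    """Scalar product of two vectors; larger means more similar for normalized vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy unit-length vectors (illustrative data, not real embeddings).
query = [1.0, 0.0]
docs = {"a": [1.0, 0.0], "b": [0.6, 0.8], "c": [0.0, 1.0]}

# Dot product ranks in descending order, distance in ascending order.
by_dot = sorted(docs, key=lambda k: dot_product(query, docs[k]), reverse=True)
by_dist = sorted(docs, key=lambda k: euclidean_distance(query, docs[k]))

print(by_dot, by_dist)
```

The two metrics sort in opposite directions: a higher dot product and a lower Euclidean distance both indicate a closer match. For unit-length vectors such as these, the two rankings agree.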
Class | Package | JS support |
---|---|---|
SingleStoreVectorStore | langchain_singlestore | ✅ |

For the older `SingleStoreDB` class (deprecated), see the v0.2 documentation.

To use this vector store, install the `langchain-singlestore` integration package.
%pip install -qU "langchain-singlestore"
To initialize `SingleStoreVectorStore`, you need an `Embeddings` object and connection parameters for the SingleStore database.

- (`Embeddings`): A text embedding model.
- (`DistanceStrategy`): Strategy for calculating vector distances. Defaults to `DOT_PRODUCT`. Options:
  - `DOT_PRODUCT`: Computes the scalar product of two vectors.
  - `EUCLIDEAN_DISTANCE`: Computes the Euclidean distance between two vectors.
- (`str`): Name of the table. Defaults to `embeddings`.
- (`str`): Field for storing content. Defaults to `content`.
- (`str`): Field for storing metadata. Defaults to `metadata`.
- (`str`): Field for storing vectors. Defaults to `vector`.
- (`str`): Field for storing IDs. Defaults to `id`.
- `use_vector_index` (`bool`): Enables vector indexing (requires SingleStore 8.5+). Defaults to `False`.
- (`str`): Name of the vector index. Ignored if `use_vector_index` is `False`.
- (`dict`): Options for the vector index. Ignored if `use_vector_index` is `False`.
- `vector_size` (`int`): Size of the vector. Required if `use_vector_index` is `True`.
- (`bool`): Enables full-text indexing on content. Defaults to `False`.
- `pool_size` (`int`): Number of active connections in the pool. Defaults to `5`.
- (`int`): Maximum connections beyond `pool_size`. Defaults to `10`.
- (`float`): Connection timeout in seconds. Defaults to `30`.
- (`str`): Hostname, IP, or URL for the database.
- (`str`): Database username.
- (`str`): Database password.
- (`int`): Database port. Defaults to `3306`.
- (`str`): Database name.
- (`bool`): Enables pure Python mode.
- (`bool`): Allows local file uploads.
- (`str`): Character set for string values.
- (`str`): Paths to SSL files.
- (`bool`): Disables SSL.
- (`bool`): Verifies the server's certificate.
- (`bool`): Verifies the server's identity.
- (`bool`): Enables autocommits.
- (`str`): Structure of query results (e.g., `tuples`, `dicts`).

`SingleStoreVectorStore` assumes that a Document's ID is an integer. Below are examples of how to manage the vector store.

By setting `use_vector_index=True` during vector store object creation, you can enable vector indexing (requires SingleStore 8.5+). If your vectors differ in dimensionality from the default OpenAI embedding size of 1536, specify the `vector_size` parameter accordingly.
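As a small illustration of the point about dimensionality, the sketch below derives `vector_size` from a sample embedding instead of hard-coding it. The helper `vector_index_kwargs` is hypothetical and not part of any library; only the `use_vector_index` and `vector_size` names come from the text above.

```python
def vector_index_kwargs(sample_embedding):
    """Build keyword arguments that enable the vector index.

    Hypothetical helper: it measures the embedding length so that
    `vector_size` always matches the model actually in use.
    """
    dim = len(sample_embedding)
    if dim == 0:
        raise ValueError("embedding must not be empty")
    return {"use_vector_index": True, "vector_size": dim}


# A 384-dimensional embedding (placeholder values) differs from the 1536 default.
small = vector_index_kwargs([0.1] * 384)
# A 1536-dimensional embedding matches the default OpenAI size.
openai_sized = vector_index_kwargs([0.0] * 1536)

print(small)
print(openai_sized["vector_size"])
```

The resulting dictionary could then be passed, alongside the `Embeddings` object and the connection parameters, when creating the vector store.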

SingleStoreDB supports several search strategies:

- `VECTOR_ONLY`: uses vector operations such as `dot_product` or `euclidean_distance` to calculate similarity scores directly between vectors.
- `TEXT_ONLY`: uses Lucene-based full-text search, which suits text-centric applications.
- `FILTER_BY_TEXT`: first filters results by text similarity, then compares vectors. Requires a full-text index.
- `FILTER_BY_VECTOR`: first filters results by vector similarity, then assesses text similarity to pick the best matches. Requires a full-text index.
- `WEIGHTED_SUM`: computes the final similarity score as a weighted sum of the vector and text similarities. It works only with `dot_product` distance calculations and also requires a full-text index.

The hybrid strategies (`FILTER_BY_TEXT`, `FILTER_BY_VECTOR`, and `WEIGHTED_SUM`) combine vector and text-based search, so you can tune retrieval to the needs of your application.
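The following pure-Python sketch mimics how the five strategies described above differ in what they filter on and what they rank by. It is an illustration under stated assumptions, not the library's implementation: `text_score` is a toy term-overlap stand-in for Lucene scoring, and the `threshold`, `text_weight`, and `vector_weight` arguments, like the sample documents, are hypothetical.

```python
def dot(a, b):
    """Dot product, used here as the vector similarity score."""
    return sum(x * y for x, y in zip(a, b))


def text_score(query, content):
    """Toy stand-in for a full-text score: fraction of query terms present."""
    terms = query.lower().split()
    words = set(content.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def search(docs, query_text, query_vec, strategy, k=2,
           threshold=0.5, text_weight=0.5, vector_weight=0.5):
    """Return the top-k (content, score) pairs for the given strategy."""
    scored = [(d["content"], dot(query_vec, d["vector"]),
               text_score(query_text, d["content"])) for d in docs]
    if strategy == "VECTOR_ONLY":
        ranked = [(c, v) for c, v, t in scored]
    elif strategy == "TEXT_ONLY":
        ranked = [(c, t) for c, v, t in scored]
    elif strategy == "FILTER_BY_TEXT":
        # Keep documents that pass the text filter, then rank by vector score.
        ranked = [(c, v) for c, v, t in scored if t >= threshold]
    elif strategy == "FILTER_BY_VECTOR":
        # Keep documents that pass the vector filter, then rank by text score.
        ranked = [(c, t) for c, v, t in scored if v >= threshold]
    elif strategy == "WEIGHTED_SUM":
        ranked = [(c, vector_weight * v + text_weight * t) for c, v, t in scored]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative documents with made-up two-dimensional vectors.
docs = [
    {"content": "rainbow after the rain", "vector": [0.9, 0.1]},
    {"content": "heavy rain and storm", "vector": [0.2, 0.8]},
    {"content": "sunny beach holiday", "vector": [0.8, 0.3]},
]

for name in ("VECTOR_ONLY", "TEXT_ONLY", "FILTER_BY_TEXT",
             "FILTER_BY_VECTOR", "WEIGHTED_SUM"):
    hits = search(docs, "rain storm", [1.0, 0.0], name)
    print(name, [content for content, _ in hits])
```

On this toy data the strategies return different top results, which shows why the choice matters: the text filter drops the document with no matching terms, while the vector filter drops the document whose vector is far from the query.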