Cloud SQL is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. It offers MySQL, PostgreSQL, and SQL Server database engines. Extend your database application to build AI-powered experiences leveraging Cloud SQL’s Langchain integrations.This notebook goes over how to use Cloud SQL for MySQL to save, load and delete langchain documents with
MySQLLoader
and MySQLDocumentSaver
.
Learn more about the package on GitHub.
langchain-google-cloud-sql-mysql
package, so we need to install it.
gcloud config list
.gcloud projects list
.MySQLEngine
configures a connection pool to your Cloud SQL database, enabling successful connections from your application and following industry best practices.
To create a MySQLEngine
using MySQLEngine.from_instance()
you need to provide only 4 things:
project_id
: Project ID of the Google Cloud Project where the Cloud SQL instance is located.region
: Region where the Cloud SQL instance is located.instance
: The name of the Cloud SQL instance.database
: The name of the database to connect to on the Cloud SQL instance.user
and password
arguments to MySQLEngine.from_instance()
:
user
: Database user to use for built-in database authentication and loginpassword
: Database password to use for built-in database authentication and login.MySQLEngine.init_document_table(<table_name>)
. Table Columns:
overwrite_existing=True
flag means the newly initialized table will replace any existing table of the same name.
MySQLDocumentSaver.add_documents(<documents>)
. To initialize MySQLDocumentSaver
class you need to provide 2 things:
engine
- An instance of a MySQLEngine
engine.table_name
- The name of the table within the Cloud SQL database to store langchain documents.MySQLLoader.load()
or MySQLLoader.lazy_load()
. lazy_load
returns a generator that only queries database during the iteration. To initialize MySQLLoader
class you need to provide:
engine
- An instance of a MySQLEngine
engine.table_name
- The name of the table within the Cloud SQL database to store langchain documents.MySQLDocumentSaver.delete(<documents>)
.
For table with default schema (page_content, langchain_metadata), the deletion criteria is:
A row
should be deleted if there exists a document
in the list, such that
document.page_content
equals row[page_content]
document.metadata
equals row[langchain_metadata]
MySQLLoader
from this example table, the page_content
of loaded documents will be the first column of the table, and metadata
will be consisting of key-value pairs of all the other columns.
content_columns
and metadata_columns
when initializing the MySQLLoader
.
content_columns
: The columns to write into the page_content
of the document.metadata_columns
: The columns to write into the metadata
of the document.content_columns
will be joined together into a space-separated string, as page_content
of loaded documents, and metadata
of loaded documents will only contain key-value pairs of columns specified in metadata_columns
.
MySQLEngine.init_document_table()
, and specify the list of metadata_columns
we want it to have. In this example, the created table will have table columns:
MySQLEngine.init_document_table()
to create the table:
table_name
: The name of the table within the Cloud SQL database to store langchain documents.metadata_columns
: A list of sqlalchemy.Column
indicating the list of metadata columns we need.content_column
: The name of column to store page_content
of langchain document. Default: page_content
.metadata_json_column
: The name of JSON column to store extra metadata
of langchain document. Default: langchain_metadata
.MySQLDocumentSaver.add_documents(<documents>)
. As you can see in this example,
document.page_content
will be saved into description
column.document.metadata.fruit_name
will be saved into fruit_name
column.document.metadata.organic
will be saved into organic
column.document.metadata.fruit_id
will be saved into other_metadata
column in JSON format.MySQLDocumentSaver.delete(<documents>)
. The deletion criteria is:
A row
should be deleted if there exists a document
in the list, such that
document.page_content
equals row[page_content]
k
in document.metadata
document.metadata[k]
equals row[k]
or document.metadata[k]
equals row[langchain_metadata][k]
row
but not in document.metadata
.