- Connect to Oracle Database from Python
- Load documents using OracleDocLoader (from database tables, files, or directories)
- Chunk content with OracleTextSplitter to prepare for embeddings and retrieval
Why in-database document processing?
You can apply Oracle Database capabilities (security, transactions, scalability, and high availability) to the same pipeline that loads, chunks, and stores content for AI search and retrieval.
If you are just starting with Oracle Database, consider exploring the free Oracle 26 AI, which provides a simple way to get set up. When working with the database, avoid using the SYSTEM user for application workloads; instead, create a dedicated user with the minimum required privileges. For an end-to-end setup walkthrough, see the Oracle AI Vector Search demo notebook. For background on user administration, refer to the official Oracle guide.
Prerequisites
Install langchain-oracledb. The python-oracledb driver will be installed automatically as a dependency.
Connect to Oracle Database
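A minimal connection sketch, assuming credentials come from environment variables. The variable names ORACLE_USER, ORACLE_PASSWORD, and ORACLE_DSN and the two helper functions are hypothetical and chosen only for illustration; oracledb.connect and oracledb.init_oracle_client are the python-oracledb calls being demonstrated.

```python
import os


def read_credentials(env=os.environ):
    """Collect connection settings from a mapping of environment variables.

    ORACLE_USER, ORACLE_PASSWORD and ORACLE_DSN are hypothetical names used
    only in this sketch; substitute your own configuration source.
    """
    return {
        "user": env["ORACLE_USER"],
        "password": env["ORACLE_PASSWORD"],
        "dsn": env["ORACLE_DSN"],  # for example "hostname:1521/service_name"
    }


def connect(credentials, thick=False):
    """Open a python-oracledb connection (Thin mode unless thick=True)."""
    # Imported lazily so this module can be loaded without the driver installed.
    import oracledb

    if thick:
        # Thick mode loads Oracle Client libraries, which must already be
        # installed and discoverable on this machine.
        oracledb.init_oracle_client()
    return oracledb.connect(**credentials)
```

With the variables set, `conn = connect(read_credentials())` would return a connection to pass to the loader and splitter. Use a dedicated application user rather than SYSTEM.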
To connect to Oracle Database from Python, use the python-oracledb driver. By default, python-oracledb runs in Thin mode, which connects directly to Oracle Database and does not need Oracle Client libraries. When python-oracledb uses Oracle Client libraries, it is said to be in Thick mode, and some additional functionality becomes available. Both modes have comprehensive functionality supporting the Python Database API v2.0 Specification. See the python-oracledb documentation for the features supported in each mode. You might want to switch to Thick mode if you are unable to use Thin mode.
Load documents
You can load documents from Oracle Database, from a file system, or from both by configuring the loader parameters. For details on these parameters, consult the Oracle AI Vector Search Guide. A significant advantage of OracleDocLoader is that it can process over 150 file formats, so you do not need a separate loader for each document type. For the complete list, refer to Oracle Text Supported Document Formats.
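The three document sources described above can be sketched as parameter dictionaries handed to OracleDocLoader. This is an illustrative sketch, not a definitive reference: the helper function names are hypothetical. The import path and the parameter keys (owner, tablename, colname, file, dir) are assumptions based on the langchain-oracledb package and should be checked against the Oracle AI Vector Search Guide.

```python
def table_source(owner, table, column):
    """Parameters for documents stored in one column of a database table."""
    return {"owner": owner, "tablename": table, "colname": column}


def file_source(path):
    """Parameters for a single file on the file system."""
    return {"file": path}


def directory_source(path):
    """Parameters for every supported file in a directory."""
    return {"dir": path}


def load_documents(conn, params):
    """Run OracleDocLoader over the given source and return its documents."""
    # Import path assumed from the langchain-oracledb package layout;
    # imported lazily so the helpers above work without the package.
    from langchain_oracledb.document_loaders.oracleai import OracleDocLoader

    loader = OracleDocLoader(conn=conn, params=params)
    return loader.load()
```

For example, `load_documents(conn, table_source("APP_USER", "demo_tab", "data"))` would read from a hypothetical demo_tab table, and `load_documents(conn, directory_source("/path/to/docs"))` would read from a directory, where conn is an open python-oracledb connection.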
Split documents
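Chunking with OracleTextSplitter can be sketched as follows. The helper names and default values are hypothetical. The parameter keys (by, max, overlap, split, normalize) are assumed to mirror the chunking options described in the Oracle AI Vector Search Guide, so verify them there before relying on this.

```python
def chunking_params(by="words", max_size=100, overlap=0,
                    split="recursively", normalize="all"):
    """Build a splitter parameter dictionary.

    Keys and value formats are assumptions for this sketch; the defaults
    are illustrative, not recommendations.
    """
    return {
        "by": by,               # unit used to measure chunk size
        "max": str(max_size),   # upper bound on chunk size, in `by` units
        "overlap": str(overlap),
        "split": split,         # where a chunk is allowed to end
        "normalize": normalize,
    }


def chunk_documents(conn, docs, params):
    """Split each document's text into chunks inside the database."""
    # Import path assumed from the langchain-oracledb package layout.
    from langchain_oracledb.document_loaders.oracleai import OracleTextSplitter

    splitter = OracleTextSplitter(conn=conn, params=params)
    chunks = []
    for doc in docs:
        chunks.extend(splitter.split_text(doc.page_content))
    return chunks
```

Calling `chunk_documents(conn, docs, chunking_params(max_size=300))` would return a flat list of text chunks ready for embedding, where docs is the list returned by the loader.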
Documents vary in size from small to very large. Users often chunk documents into smaller sections to facilitate embedding generation. Many customization options are available for the splitting process. For details on these parameters, consult the Oracle AI Vector Search Guide.
End to end demo
To build an end-to-end RAG pipeline with Oracle AI Vector Search, see the Oracle AI Vector Search End-to-End Demo Guide.