- Connect to Oracle Database from Python
- Load documents using OracleDocLoader (from database tables, files, or directories)
- Chunk content with OracleTextSplitter to prepare for embeddings and retrieval
Why in-database document processing? You can apply Oracle Database capabilities (security, transactions, scalability, and high availability) to the same pipeline that loads, chunks, and stores content for AI search and retrieval.
If you are just starting with Oracle Database, consider exploring the free Oracle 26 AI, which provides a simple way to get set up. Avoid using the SYSTEM user for application workloads; instead, create a dedicated user with the minimum required privileges. For an end-to-end setup walkthrough, see the Oracle AI Vector Search demo notebook. For background on user administration, refer to the official Oracle guide.
Prerequisites
Install langchain-oracledb. The python-oracledb driver is installed automatically as a dependency.
Connect to Oracle Database
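A minimal connection sketch, assuming the standard `oracledb.connect` call with an Easy Connect string (`host:port/service_name`). The helper names `build_dsn` and `connect` are hypothetical and exist only for illustration; the credentials and host are placeholders you would replace with your own.

```python
def build_dsn(host: str, port: int, service_name: str) -> str:
    """Return an Easy Connect string of the form host:port/service_name."""
    return f"{host}:{port}/{service_name}"


def connect(user: str, password: str, dsn: str, thick: bool = False):
    """Open a python-oracledb connection (Thin mode unless thick=True)."""
    # Imported here so build_dsn stays usable where the driver is not installed.
    import oracledb

    if thick:
        # Thick mode requires Oracle Client libraries on this machine and
        # must be enabled before the first connection is created.
        oracledb.init_oracle_client()
    return oracledb.connect(user=user, password=password, dsn=dsn)


# Example (needs a reachable database and real credentials):
# conn = connect("<username>", "<password>",
#                build_dsn("<hostname>", 1521, "<service_name>"))
```

Leaving `thick=False` keeps the driver in its default mode, so no client libraries are needed for a first connection.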
Connect to Oracle Database using the python-oracledb driver. By default, python-oracledb runs in Thin mode, which connects directly to Oracle Database and does not need Oracle Client libraries. When python-oracledb uses those libraries, it is said to be in Thick mode, and some additional functionality becomes available. Both modes have comprehensive functionality supporting the Python Database API v2.0 Specification. See the python-oracledb documentation for the features supported in each mode. Switch to Thick mode if you are unable to use Thin mode.
Load documents
You can load documents from Oracle Database, a file system, or both by configuring the loader parameters. For details on these parameters, consult the Oracle AI Vector Search Guide. A significant advantage of OracleDocLoader is that it can process over 150 file formats, eliminating the need for multiple loaders for different document types. For the complete list of supported formats, refer to the Oracle Text Supported Document Formats. Below is a sample code snippet that demonstrates how to use OracleDocLoader:
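A minimal sketch of that usage. It assumes the import path `langchain_oracledb.document_loaders.oracleai` and the parameter keys `owner`/`tablename`/`colname`, `file`, and `dir`; check both against the Oracle AI Vector Search Guide for your version. The helper functions are hypothetical names introduced for illustration, and `conn` is an open python-oracledb connection.

```python
from typing import Any, Dict, List


def table_source(owner: str, tablename: str, colname: str) -> Dict[str, str]:
    """Parameters for loading documents stored in a database table column."""
    return {"owner": owner, "tablename": tablename, "colname": colname}


def file_source(path: str) -> Dict[str, str]:
    """Parameters for loading a single file from the file system."""
    return {"file": path}


def dir_source(path: str) -> Dict[str, str]:
    """Parameters for loading the files in a directory."""
    return {"dir": path}


def load_documents(conn: Any, params: Dict[str, str]) -> List[Any]:
    """Run OracleDocLoader over one source and return LangChain Documents."""
    # Assumed import path; imported lazily so the helpers above work
    # even where langchain-oracledb is not installed.
    from langchain_oracledb.document_loaders.oracleai import OracleDocLoader

    loader = OracleDocLoader(conn=conn, params=params)
    return loader.load()


# Example (needs an open connection and an existing table):
# docs = load_documents(conn, table_source("<owner>", "<table>", "<column>"))
# print(f"Number of docs loaded: {len(docs)}")
```

Because only the parameter dictionary changes between sources, the same `load_documents` call covers tables, files, and directories, which is how one loader can replace several format-specific ones.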