Airbyte is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.

This covers how to load any source from Airbyte into LangChain documents.
To use AirbyteLoader, you need to install the langchain-airbyte integration package.
Note: Currently, the airbyte library does not support Pydantic v2. Please downgrade to Pydantic v1 to use this package.

Note: This package also currently requires Python 3.10+.
By default, AirbyteLoader loads any structured data from a stream and outputs YAML-formatted documents.
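A minimal sketch of this default loading path follows. The source name source-faker, the users stream, and the count setting are assumed example values (Airbyte's sample-data connector), not something this page prescribes. The import is deferred into the function so the snippet can be read and imported without the package installed.

```python
from typing import Any, Dict, List

# Assumed example values: "source-faker" is Airbyte's sample-data connector,
# "users" is one of its streams, and "count" caps the number of fake records.
EXAMPLE_KWARGS: Dict[str, Any] = {
    "source": "source-faker",
    "stream": "users",
    "config": {"count": 10},
}


def load_all_documents() -> List[Any]:
    """Eagerly load every record of the stream as a LangChain document."""
    # Deferred import: requires the langchain-airbyte package.
    from langchain_airbyte import AirbyteLoader

    loader = AirbyteLoader(**EXAMPLE_KWARGS)
    # .load() materializes the whole stream into a list in memory.
    return loader.load()
```

Calling load_all_documents() should return one document per record, with the record rendered as YAML in each document's page_content.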
One notable feature of AirbyteLoader is its ability to load large documents from upstream sources. When working with large datasets, the default .load() behavior can be slow and memory-intensive. To avoid this, you can use the .lazy_load() method to load documents in a more memory-efficient manner.
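One way to benefit from lazy loading is to process documents in fixed-size batches rather than collecting them all first. The sketch below assumes only that the loader exposes .lazy_load() returning an iterator. FakeLoader and in_batches are hypothetical names introduced for illustration, so the example runs without any connector.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def in_batches(docs: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield lists of at most `size` documents without materializing the rest."""
    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


class FakeLoader:
    """Hypothetical stand-in exposing the same .lazy_load() shape as a loader."""

    def __init__(self, n: int) -> None:
        self.n = n

    def lazy_load(self) -> Iterator[str]:
        for i in range(self.n):
            yield f"document {i}"


# With a real loader this would be: in_batches(loader.lazy_load(), 100)
batch_sizes = [len(batch) for batch in in_batches(FakeLoader(5).lazy_load(), 2)]
print(batch_sizes)  # [2, 2, 1]
```

Only one batch is held in memory at a time, which is the point of preferring .lazy_load() over .load() for large streams.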
You can also iterate over documents asynchronously with .alazy_load().
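Asynchronous iteration might look like the sketch below. It assumes only that .alazy_load() returns an async iterator of documents. FakeAsyncLoader and count_documents are hypothetical names used so the example runs on its own.

```python
import asyncio
from typing import Any, AsyncIterator


class FakeAsyncLoader:
    """Hypothetical stand-in exposing the same .alazy_load() shape as a loader."""

    async def alazy_load(self) -> AsyncIterator[str]:
        for i in range(3):
            await asyncio.sleep(0)  # hand control back to the event loop
            yield f"document {i}"


async def count_documents(loader: Any) -> int:
    """Stream documents one at a time and count them without storing them."""
    count = 0
    async for _doc in loader.alazy_load():
        count += 1
    return count


total = asyncio.run(count_documents(FakeAsyncLoader()))
print(total)  # 3
```

With a real loader, the same async for loop would run inside an existing event loop, letting other tasks proceed while records are fetched.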
AirbyteLoader can be configured with the following options:

- source (str, required): The name of the Airbyte source to load from.
- stream (str, required): The name of the stream to load from (Airbyte sources can return multiple streams).
- config (dict, required): The configuration for the Airbyte source.
- template (PromptTemplate, optional): A custom prompt template for formatting documents.
- include_metadata (bool, optional, default True): Whether to include all fields as metadata in the output documents.

The majority of the configuration will be in config, and you can find the specific configuration options in the “Config field reference” for each source in the Airbyte documentation.
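The options above can be combined as in the following sketch. The source and stream names, the name and height record fields, and the template wording are assumed example values. The imports are deferred so the snippet loads without the packages installed. The preview at the end uses plain str.format on a made-up record only to show what a simple template would render.

```python
from typing import Any, Dict

# Assumed field names: "name" and "height" are illustrative record fields.
TEMPLATE_TEXT = "My name is {name} and I am {height} meters tall."


def build_templated_loader(config: Dict[str, Any]) -> Any:
    """Create a loader that formats each record with TEMPLATE_TEXT."""
    # Deferred imports: require langchain-airbyte and langchain-core.
    from langchain_airbyte import AirbyteLoader
    from langchain_core.prompts import PromptTemplate

    return AirbyteLoader(
        source="source-faker",  # assumed example source
        stream="users",  # assumed example stream
        config=config,
        template=PromptTemplate.from_template(TEMPLATE_TEXT),
        include_metadata=False,  # keep record fields out of document metadata
    )


# Preview of the formatting with a made-up record, using plain str.format.
sample_record = {"name": "Ada", "height": 1.7}
preview = TEMPLATE_TEXT.format(**sample_record)
print(preview)  # My name is Ada and I am 1.7 meters tall.
```

The keys accepted in config depend on the chosen source, so build_templated_loader takes it as a parameter rather than hard-coding it.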