Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
PowerScaleDocumentLoader | powerscale-rag-connector | ✅ | ❌ | ❌ |
PowerScaleUnstructuredLoader | powerscale-rag-connector | ✅ | ❌ | ❌ |

Source | Document Lazy Loading | Native Async Support |
---|---|---|
PowerScaleDocumentLoader | ✅ | ✅ |
PowerScaleUnstructuredLoader | ✅ | ✅ |

`PowerScaleUnstructuredLoader` can be used to locate the changed files and automatically process them, producing elements of each source file. This is done using LangChain's `UnstructuredLoader` class.

- `es_host_url` is the endpoint of the MetadataIQ Elasticsearch database
- `es_index_index` is the name of the index where PowerScale writes its file system metadata
- `es_api_key` is the encoded version of your Elasticsearch API key
- `folder_path` is the path on PowerScale to be queried for changes

Notes on the returned `Document` objects:

- The `metadata` fields in the returned `Document` hold the path on PowerScale that contains the modified file. Use this path to read the data via NFS (or S3) and process it in your application (e.g., create chunks and embeddings).
- The `source` field is the path on PowerScale and not necessarily the path on your local system (depending on your mount strategy); OneFS expresses the entire storage system as a single tree rooted at `/ifs`.
- The `change_types` property tells you what change occurred since the last one, e.g. new, modified or delete.

Use `change_types` to add, update or delete entries in your chunk and vector store.
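
The notes above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the connector's API: `Doc` is a hypothetical stand-in for a LangChain `Document`, `change_types` is assumed to be a list of strings stored in `metadata` with the values `new`, `modified` and `delete`, the mount point `/mnt/powerscale` is invented for the example, and a plain dict stands in for the vector store.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (page_content + metadata)."""
    page_content: str = ""
    metadata: dict = field(default_factory=dict)


def to_local_path(source: str, onefs_root: str = "/ifs",
                  mount_point: str = "/mnt/powerscale") -> str:
    """Map a OneFS path (rooted at /ifs) to an assumed local mount point."""
    relative = PurePosixPath(source).relative_to(onefs_root)
    return str(PurePosixPath(mount_point) / relative)


def apply_changes(docs, store: dict) -> dict:
    """Add, update, or delete entries in `store`, keyed by the PowerScale path."""
    for doc in docs:
        source = doc.metadata["source"]
        changes = doc.metadata.get("change_types", [])  # assumed: list of strings
        if "delete" in changes:
            store.pop(source, None)
        elif "new" in changes or "modified" in changes:
            # A real application would read the file from the local path here,
            # then chunk and embed it; this sketch only records the path.
            store[source] = to_local_path(source)
    return store


docs = [
    Doc(metadata={"source": "/ifs/data/a.pdf", "change_types": ["new"]}),
    Doc(metadata={"source": "/ifs/data/b.pdf", "change_types": ["delete"]}),
]
store = apply_changes(docs, {"/ifs/data/b.pdf": "/mnt/powerscale/data/b.pdf"})
```

Keying the store by the PowerScale path rather than the local path keeps entries stable if the mount point changes between runs.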

When using `PowerScaleUnstructuredLoader`, the `page_content` field is filled with data from the Unstructured loader.

A `Document` is returned, as with the load function, with all the same properties mentioned above.
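
Because the Unstructured loader produces elements rather than whole files, an application may want to regroup them per file before chunking. The sketch below is one way to do that under assumptions: it assumes each element carries its file's `source` path in `metadata`, and it uses plain dicts as hypothetical stand-ins for `Document` objects; `group_elements_by_source` is an illustrative helper, not part of the connector.

```python
from collections import defaultdict


def group_elements_by_source(elements):
    """Collect element texts per source path, preserving input order."""
    grouped = defaultdict(list)
    for element in elements:
        grouped[element["metadata"]["source"]].append(element["page_content"])
    return dict(grouped)


# Hypothetical elements, shaped like Document fields (page_content + metadata).
elements = [
    {"page_content": "Title", "metadata": {"source": "/ifs/data/a.pdf"}},
    {"page_content": "First paragraph.", "metadata": {"source": "/ifs/data/a.pdf"}},
    {"page_content": "Notes", "metadata": {"source": "/ifs/data/b.txt"}},
]
by_file = group_elements_by_source(elements)
```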