Azure AI Document Intelligence (formerly known as `Azure Form Recognizer`) is a machine-learning based service that extracts texts (including handwriting), tables, document structures (e.g., titles, section headings, etc.) and key-value pairs from digital or scanned PDFs, images, Office and HTML files. Document Intelligence supports `JPEG/JPG`, `PNG`, `BMP`, `TIFF`, `HEIF`, `DOCX`, `XLSX`, `PPTX` and `HTML`.
This current implementation of a loader using `Document Intelligence` can incorporate content page-wise and turn it into LangChain documents. The default output format is markdown, which can be easily chained with `MarkdownHeaderTextSplitter` for semantic document chunking. You can also use `mode="single"` or `mode="page"` to return pure texts in a single page or document split by page.
## Prerequisite

An Azure AI Document Intelligence resource in one of the 3 preview regions: **East US**, **West US2**, **West Europe** - follow this document to create one if you don't have one. You will be passing `<endpoint>` and `<key>` as parameters to the loader.
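You will also need the relevant packages installed; a typical setup (package names assumed from the `langchain-community` ecosystem):

```shell
pip install --upgrade langchain-community azure-ai-documentintelligence
```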
## Example 1

The first example uses a local file, which will be sent to Azure AI Document Intelligence. With the initialized document analysis client, we can proceed to create an instance of the `DocumentIntelligenceLoader`:
## Example 2

The input file can also be a public URL path. E.g., raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/layout.png.
## Example 3

You can also specify `mode="page"` to load the document by pages.
## Example 4

You can also specify `analysis_features=["ocrHighResolution"]` to enable add-on capabilities. For more information, see: aka.ms/azsdk/python/documentintelligence/analysisfeature.