GROBID is a machine learning library for extracting, parsing, and restructuring raw documents. It is designed for parsing academic papers, where it works particularly well. Note: large documents (e.g. dissertations) that exceed a certain number of elements might not be processed. This loader uses Grobid to parse PDFs into Documents that retain the metadata associated with each section of text.
The best approach is to install Grobid via Docker; see grobid.readthedocs.io/en/latest/Grobid-docker/. For additional instructions, see the Grobid provider page. Once Grobid is up and running, you can interact with it as described below. Now we can use the data loader.
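As a minimal sketch of that workflow, the snippet below first checks that a Grobid server is reachable and then loads a folder of PDFs. It rests on several assumptions that are not stated above: that Grobid listens on `http://localhost:8070` (the port commonly published by the Docker image), that the server exposes the `/api/isalive` and `/api/processFulltextDocument` endpoints, and that `GrobidParser` accepts the `segment_sentences` and `grobid_server` arguments as in recent `langchain_community` releases. The helper names `grobid_is_alive` and `load_papers` are hypothetical and exist only for illustration.

```python
import urllib.error
import urllib.request

# Assumption: Grobid was started via Docker with port 8070 published locally.
GROBID_BASE_URL = "http://localhost:8070"


def grobid_is_alive(base_url: str = GROBID_BASE_URL, timeout: float = 5.0) -> bool:
    """Return True if the Grobid health endpoint answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/api/isalive"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False


def load_papers(directory: str, base_url: str = GROBID_BASE_URL):
    """Parse every PDF in `directory` with Grobid (hypothetical helper)."""
    # Imported lazily so the health check above works without LangChain installed.
    from langchain_community.document_loaders.generic import GenericLoader
    from langchain_community.document_loaders.parsers import GrobidParser

    parser = GrobidParser(
        segment_sentences=False,  # one Document per paragraph rather than per sentence
        # Assumption: the parser posts PDFs to this full-text endpoint.
        grobid_server=f"{base_url.rstrip('/')}/api/processFulltextDocument",
    )
    loader = GenericLoader.from_filesystem(
        directory,
        glob="*",
        suffixes=[".pdf"],
        parser=parser,
    )
    return loader.load()
```

With a server running, something like `docs = load_papers("../Papers/")` (the directory name is only a placeholder) should return a list of Documents, and `docs[0].metadata` can then be inspected for the section-level metadata the parser attaches. Calling `grobid_is_alive()` first gives a clearer failure than letting the parser raise when the server is down.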

