Apify Dataset is a scalable, append-only storage with sequential access, built for storing structured web scraping results, such as a list of products or Google SERPs, and then exporting them to various formats like JSON, CSV, or Excel. Datasets are mainly used to save the results of Apify Actors, which are serverless cloud programs for various web scraping, crawling, and data extraction use cases.

This notebook shows how to load Apify datasets into LangChain.
First, import `ApifyDatasetLoader` into your source code:
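A minimal sketch of that import, assuming a recent LangChain release where the loader lives in the `langchain_community` package (older releases exposed it under `langchain.document_loaders`). The availability check and the `HAS_LANGCHAIN` flag are illustrative additions so the snippet also runs on a machine where the packages are not installed yet.

```python
import importlib.util

# True when the langchain-community package can be imported (illustrative flag).
HAS_LANGCHAIN = importlib.util.find_spec("langchain_community") is not None

if HAS_LANGCHAIN:
    # The loader itself, plus the Document class that the mapping function returns.
    from langchain_community.document_loaders import ApifyDatasetLoader
    from langchain_core.documents import Document
else:
    print("Install the dependencies first: pip install langchain-community apify-client")
```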
Then provide a function that maps Apify dataset record fields to the LangChain `Document` format.
For example, if your dataset items have a known structure, the mapping function converts each item to the LangChain `Document` format, so that you can use them further with any LLM (e.g. for question answering).
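The mapping step might look like the sketch below. The item shape (a `text` field and a `url` field), the sample values, and the helper names `item_to_document` and `build_loader` are assumptions made for illustration, so adjust them to your own dataset. `dataset_id` and `dataset_mapping_function` are the loader's constructor arguments. The fallback `Document` class exists only so the sketch runs without LangChain installed.

```python
from dataclasses import dataclass, field

try:
    from langchain_core.documents import Document
except ImportError:
    # Stand-in with the same two fields, used only when LangChain is missing.
    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def item_to_document(dataset_item: dict) -> Document:
    """Map one dataset item (assumed keys: 'text', 'url') to a Document."""
    return Document(
        page_content=dataset_item["text"],
        metadata={"source": dataset_item["url"]},
    )


def build_loader(dataset_id: str):
    """Create a loader for an existing Apify dataset (hypothetical helper).

    Imported lazily because it needs langchain-community and apify-client;
    the loader also expects an Apify API token (APIFY_API_TOKEN) at runtime.
    """
    from langchain_community.document_loaders import ApifyDatasetLoader

    return ApifyDatasetLoader(
        dataset_id=dataset_id,
        dataset_mapping_function=item_to_document,
    )


# Hypothetical item, used to exercise the mapping without calling Apify.
sample_item = {"url": "https://example.com", "text": "Example page text."}
doc = item_to_document(sample_item)
```

With credentials in place, `build_loader("your-dataset-id").load()` would return one `Document` per dataset item, where `"your-dataset-id"` is a placeholder for the ID of a dataset you own.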