Web Loaders

These loaders are used to load web resources. They do not involve the local file system.
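To make the idea concrete, the following is a minimal sketch of what "loading a web resource" can look like: text is fetched over HTTP and wrapped in a document-like object, with no local file access. Everything here is an assumption made for illustration: `SimpleDoc`, `htmlToDoc`, and `loadWebPage` are hypothetical names, not part of LangChain's API, and the regex-based tag stripping is far cruder than what the loaders listed below do.

```typescript
// Hypothetical document shape for this sketch; not a LangChain type.
interface SimpleDoc {
  pageContent: string;
  metadata: { source: string };
}

// Turn raw HTML into a document: drop <script>/<style> blocks, strip the
// remaining tags, and collapse whitespace. Illustrative only; a real
// loader would use a proper HTML parser.
function htmlToDoc(source: string, html: string): SimpleDoc {
  const text = html
    .replace(/<(script|style)[^>]*>[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  return { pageContent: text, metadata: { source } };
}

// Load one URL as a list of documents. The fetchText callback is an
// assumption of this sketch: it keeps the network call swappable, so the
// function can be exercised without real HTTP access.
async function loadWebPage(
  url: string,
  fetchText: (url: string) => Promise<string>
): Promise<SimpleDoc[]> {
  const html = await fetchText(url);
  return [htmlToDoc(url, html)];
}
```

In an environment with a global `fetch` (Node.js 18+ or a browser), the callback could be `(u) => fetch(u).then((r) => r.text())`. The source URL is kept in `metadata` so that downstream steps can trace each document back to where it came from.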

If you'd like to write your own document loader, see this how-to. If you'd like to contribute an integration, see Contributing integrations.

All web loaders

| Name | Description |
| --- | --- |
| Playwright | Only available on Node.js. |
| Apify Dataset | This guide shows how to use Apify with LangChain to load documents fr... |
| AssemblyAI Audio Transcript | This covers how to load audio (and video) transcripts as document obj... |
| Azure Blob Storage Container | Only available on Node.js. |
| Azure Blob Storage File | Only available on Node.js. |
| Browserbase Loader | Description |
| College Confidential | This example goes over how to load data from the College Confidential... |
| Confluence | Only available on Node.js. |
| Couchbase | Couchbase is an award-winning distributed NoSQL cloud database that d... |
| Figma | This example goes over how to load data from a Figma file. |
| FireCrawl | This notebook provides a quick overview for getting started with |
| GitBook | This example goes over how to load data from any GitBook, using Cheer... |
| GitHub | This example goes over how to load data from a GitHub repository. |
| Hacker News | This example goes over how to load data from the Hacker News website,... |
| IMSDB | This example goes over how to load data from the internet movie scrip... |
| LangSmith | This notebook provides a quick overview for getting started with the |
| Notion API | This guide will take you through the steps required to load documents... |
| PDF files | This notebook provides a quick overview for getting started with |
| RecursiveUrlLoader | This notebook provides a quick overview for getting started with |
| S3 File | Only available on Node.js. |
| SearchApi Loader | This guide shows how to use SearchApi with LangChain to load web sear... |
| SerpAPI Loader | This guide shows how to use SerpAPI with LangChain to load web search... |
| Sitemap Loader | This notebook goes over how to use the SitemapLoader class to load si... |
| Sonix Audio | Only available on Node.js. |
| Blockchain Data | This example shows how to load blockchain data, including NFT metadat... |
| Spider | Spider is the fastest crawler. It converts any website into pure HTML... |
| Taskade | Taskade is the ultimate tool for AI-driven writing, project managemen... |
| Cheerio | This notebook provides a quick overview for getting started with |
| Puppeteer | This notebook provides a quick overview for getting started with |
| YouTube transcripts | This covers how to load YouTube transcripts into LangChain documents. |
