Adaptive Web Scraping

Scrapling: One library, zero compromises.

An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl. Its parser learns from website changes and automatically relocates your elements when pages update. Its fetchers bypass anti-bot systems like Cloudflare Turnstile out of the box.

50.5K GitHub Stars
92% Test Coverage
1,441+ Commits
100% Type Hints

What is Scrapling?

Scrapling is an adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl. Its parser learns from website changes and automatically relocates your elements when pages update.

Its fetchers bypass anti-bot systems like Cloudflare Turnstile out of the box, and its spider framework lets you scale up to concurrent, multi-session crawls with pause/resume and automatic proxy rotation, all in a few lines of Python.

Language: Python 3.10+
License: BSD-3-Clause
Author: Karim Shoair
Install: pip install scrapling

# Fetch and scrape in 3 lines
from scrapling.fetchers import Fetcher

page = Fetcher.get('https://example.com')
items = page.css('.product', auto_save=True)

# Adaptive — survives website redesigns
# items = page.css('.product', adaptive=True)

Powerful, Yet Remarkably Simple

Scrapling's expressive API lets you scrape websites, build crawlers, and extract data with minimal code. Whether you need a single request or a full-scale crawling pipeline with concurrent sessions, proxy rotation, and pause/resume — it's just a few lines away.

Everything You Need for Web Scraping

From adaptive element tracking to anti-bot bypass, Scrapling is built by Web Scrapers for Web Scrapers. One library, zero compromises.

Adaptive Parser

Smart element tracking that learns from website changes and automatically relocates your elements when pages update — no code changes needed.

Anti-bot Bypass

Built-in stealth capabilities with fingerprint spoofing. Easily bypass Cloudflare Turnstile, Interstitial, and other anti-bot protections.

Full Crawling Framework

Scrapy-like Spider API with concurrent requests, multi-session support, pause/resume, checkpoints, streaming mode, and built-in JSON export.

Proxy Rotation

Built-in ProxyRotator with cyclic or custom rotation strategies across all session types, plus per-request proxy overrides for flexible routing.
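
To make the idea of cyclic rotation with a per-request override concrete, here is a minimal standalone sketch that uses only the standard library. The `CyclicRotator` class and the proxy URLs are hypothetical illustrations and are not Scrapling's `ProxyRotator` API.

```python
from itertools import cycle
from typing import Iterator, Optional, Sequence


class CyclicRotator:
    """Hypothetical round-robin proxy picker (illustration only)."""

    def __init__(self, proxies: Sequence[str]) -> None:
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._pool: Iterator[str] = cycle(proxies)

    def pick(self, override: Optional[str] = None) -> str:
        # A per-request override wins and does not advance the rotation.
        if override is not None:
            return override
        return next(self._pool)


rotator = CyclicRotator(["http://proxy-a:8080", "http://proxy-b:8080"])
picks = [
    rotator.pick(),
    rotator.pick(override="http://special:9000"),
    rotator.pick(),
    rotator.pick(),
]
```

In this sketch the override request leaves the cycle untouched, so the next regular request continues with the proxy that would have been used anyway.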

MCP Server for AI

Built-in MCP server for AI-assisted Web Scraping and data extraction. Leverage Scrapling to extract targeted content before passing it to AI, reducing token usage.

Blazing Fast

Optimized performance that outperforms most Python scraping libraries, with 10x faster JSON serialization, memory-efficient data structures, and lazy loading.

Battle-Tested Architecture

Scrapling is used daily by hundreds of Web Scrapers. With 92% test coverage and full type hints, it's built for production at scale.

Spiders — Full Crawling Framework

Define spiders with start_urls, async parse callbacks, and Request/Response objects. Configurable concurrency limits, per-domain throttling, and download delays.
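
A concurrency limit combined with a download delay can be sketched with plain `asyncio`. This is an assumption-laden illustration of the general technique, not Scrapling's Spider API: `crawl_all`, its parameters, and the URLs are hypothetical, and `asyncio.sleep` stands in for real network I/O.

```python
import asyncio


async def crawl_all(urls: list, max_concurrency: int = 2, delay: float = 0.01) -> dict:
    """Fetch all URLs while never running more than max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)
    active = 0
    peak = 0
    results = {}

    async def fetch(url: str) -> None:
        nonlocal active, peak
        async with semaphore:  # blocks when the limit is reached
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(delay)  # stand-in for download delay plus I/O
            results[url] = f"parsed {url}"
            active -= 1

    await asyncio.gather(*(fetch(u) for u in urls))
    return {"results": results, "peak": peak}


report = asyncio.run(crawl_all([f"https://example.com/{i}" for i in range(5)]))
```

The `peak` counter is only there to show that the semaphore holds the number of in-flight requests at or below the configured limit.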

Multi-Session Support

Unified interface for HTTP requests and stealthy headless browsers in a single spider. Route requests to different sessions by ID.
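
Routing requests to sessions by ID amounts to a lookup in a registry. The sketch below illustrates that pattern with hypothetical names (`SESSIONS`, `dispatch`, and both stand-in session functions); it does not reproduce Scrapling's session classes.

```python
from typing import Callable, Dict


def http_session(url: str) -> str:
    """Stand-in for a plain HTTP session."""
    return f"http-session fetched {url}"


def stealth_session(url: str) -> str:
    """Stand-in for a stealthy headless-browser session."""
    return f"stealth-session fetched {url}"


# Hypothetical registry mapping session IDs to handlers.
SESSIONS: Dict[str, Callable[[str], str]] = {
    "fast": http_session,
    "stealth": stealth_session,
}


def dispatch(url: str, session_id: str = "fast") -> str:
    """Send the request through the session registered under session_id."""
    if session_id not in SESSIONS:
        raise ValueError(f"unknown session id: {session_id!r}")
    return SESSIONS[session_id](url)


listing = dispatch("https://example.com/products")
protected = dispatch("https://example.com/checkout", session_id="stealth")
```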

Pause & Resume

Checkpoint-based crawl persistence. Press Ctrl+C for a graceful shutdown; restart to resume from where you left off.
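
Checkpoint-based persistence generally means writing the crawl frontier to disk so a later run can reload it. The following is a minimal sketch of that idea under stated assumptions: the JSON layout, the file name, and the helper functions are hypothetical and say nothing about how Scrapling stores its checkpoints.

```python
import json
import tempfile
from collections import deque
from pathlib import Path


def save_checkpoint(path: Path, pending: deque, seen: set) -> None:
    """Persist the crawl frontier so a later run can pick it up."""
    state = {"pending": list(pending), "seen": sorted(seen)}
    path.write_text(json.dumps(state), encoding="utf-8")


def load_checkpoint(path: Path, start_urls: list) -> tuple:
    """Resume from disk if a checkpoint exists, otherwise start fresh."""
    if path.exists():
        state = json.loads(path.read_text(encoding="utf-8"))
        return deque(state["pending"]), set(state["seen"])
    return deque(start_urls), set()


checkpoint = Path(tempfile.mkdtemp()) / "crawl_state.json"
pending, seen = load_checkpoint(
    checkpoint, ["https://example.com/a", "https://example.com/b"]
)

# Process one URL, then simulate an interruption such as Ctrl+C.
url = pending.popleft()
seen.add(url)
save_checkpoint(checkpoint, pending, seen)

# A restarted process reloads the same state and continues from there.
resumed_pending, resumed_seen = load_checkpoint(checkpoint, [])
```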

Streaming Mode

Stream scraped items as they arrive via async for, with real-time stats. Ideal for UIs, pipelines, and long-running crawls.
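
Consuming items as they arrive maps naturally onto a Python async generator. The sketch below shows the pattern in isolation; `crawl` and the item fields are hypothetical and are not Scrapling's streaming interface.

```python
import asyncio
from typing import AsyncIterator


async def crawl(urls: list) -> AsyncIterator[dict]:
    """Hypothetical crawl that yields each item as soon as it is ready."""
    for index, url in enumerate(urls, start=1):
        await asyncio.sleep(0)  # stand-in for network I/O
        yield {"url": url, "scraped_so_far": index}


async def main() -> list:
    received = []
    async for item in crawl(["https://example.com/1", "https://example.com/2"]):
        received.append(item)  # a UI or pipeline would consume the item here
    return received


items = asyncio.run(main())
```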

Stealth & Anti-bot

Advanced stealth capabilities with fingerprint spoofing. Automatically bypass all types of Cloudflare Turnstile and Interstitial challenges.

Adaptive Element Tracking

Relocate elements after website changes using intelligent similarity algorithms. Select elements with CSS selectors, XPath, filter-based search, text search, and regex.
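
One way to picture similarity-based relocation: save a description of the element, then, after the page changes, pick the candidate that resembles it most. The sketch below uses `difflib.SequenceMatcher` from the standard library as a stand-in scoring function. It is an illustration of the general idea only, not Scrapling's algorithm, and the dictionary element format is an assumption.

```python
from difflib import SequenceMatcher


def signature(element: dict) -> str:
    """Flatten tag, sorted attributes, and text into one comparable string."""
    attrs = " ".join(f"{k}={v}" for k, v in sorted(element["attrs"].items()))
    return f"{element['tag']} {attrs} {element['text']}"


def relocate(saved: dict, candidates: list) -> dict:
    """Return the candidate whose signature is most similar to the saved one."""
    target = signature(saved)
    return max(
        candidates,
        key=lambda c: SequenceMatcher(None, target, signature(c)).ratio(),
    )


saved = {"tag": "div", "attrs": {"class": "product"}, "text": "Blue Widget $19"}

# After a redesign the class name changed, so the old selector would miss it.
new_page = [
    {"tag": "nav", "attrs": {"class": "menu"}, "text": "Home About"},
    {"tag": "div", "attrs": {"class": "product-card"}, "text": "Blue Widget $19"},
    {"tag": "footer", "attrs": {"class": "site-footer"}, "text": "Contact us"},
]
match = relocate(saved, new_page)
```

A production implementation would likely weigh more signals (position in the tree, sibling and parent context) and apply a minimum score before accepting a match.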

Interactive Shell & CLI

Built-in IPython shell with Scrapling integration, shortcuts, and tools. Extract pages to a file directly from the command line, without writing code.

DNS & Ad Blocking

Block requests to specific domains or enable built-in ad blocking (~3,500 known ad/tracker domains). DNS-over-HTTPS for DNS leak prevention.
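
Domain blocking usually comes down to checking a request's host, and each of its parent domains, against a blocklist. Here is a minimal sketch of that check; the blocklist entries and the `is_blocked` helper are hypothetical and do not reflect Scrapling's built-in list or API.

```python
from urllib.parse import urlsplit

BLOCKED_DOMAINS = {"ads.example", "tracker.example"}  # hypothetical blocklist


def is_blocked(url: str, blocked: set = BLOCKED_DOMAINS) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Check the host and every parent domain: a.b.c -> a.b.c, b.c, c
    return any(".".join(parts[i:]) in blocked for i in range(len(parts)))


blocked_subdomain = is_blocked("https://cdn.ads.example/banner.js")
allowed_page = is_blocked("https://example.com/")
lookalike = is_blocked("https://notads.example/")
```

Matching on whole domain labels rather than substrings keeps a lookalike host such as `notads.example` from being blocked by accident.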

Development Mode

Cache responses to disk on the first run and replay them on subsequent runs. Iterate on parsing logic without re-hitting target servers.
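
The cache-then-replay idea can be sketched in a few lines: key each response by a hash of its URL, write it to disk on the first request, and read it back afterwards. Everything named here (`cached_fetch`, `fake_fetch`, the cache directory) is a hypothetical stand-in, not Scrapling's development mode.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable

CACHE_DIR = Path(tempfile.mkdtemp())  # hypothetical cache location
calls = []


def fake_fetch(url: str) -> str:
    """Stand-in for a real network request; records each call."""
    calls.append(url)
    return f"<html><body>content of {url}</body></html>"


def cached_fetch(url: str, fetch: Callable[[str], str] = fake_fetch) -> str:
    """Fetch once, then replay the stored body on later calls."""
    path = CACHE_DIR / (hashlib.sha256(url.encode()).hexdigest() + ".html")
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url)
    path.write_text(body, encoding="utf-8")
    return body


first = cached_fetch("https://example.com")
second = cached_fetch("https://example.com")
```

The second call never reaches `fake_fetch`, which is what lets parsing logic be iterated on without hitting the target server again.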

Docker Ready

Official Docker image with all browsers and extras pre-installed. Automatically built and pushed with each release.

Blazing Fast By Design

Scrapling isn't just powerful — it's also blazing fast. The following benchmarks compare Scrapling's parser with the latest versions of other popular libraries.

#  Library            Time (ms)  vs Scrapling
1  Scrapling               2.02  1.0x
2  Parsel/Scrapy           2.04  1.01x
3  Raw Lxml                2.54  1.26x
4  PyQuery                24.17  ~12x
5  Selectolax             82.63  ~41x
6  MechanicalSoup       1549.71  ~767x
7  BS4 with Lxml        1584.31  ~784x
8  BS4 with html5lib    3391.91  ~1679x

#  Library            Time (ms)  vs Scrapling
1  Scrapling               2.39  1.0x
2  AutoScraper            12.45  5.21x

Start Building with Scrapling

Install Scrapling now and experience the most adaptive web scraping framework ever built. One library, zero compromises.
