You ask Claude Code to research a competitor's pricing page. It searches the web and returns a snippet: "Starts at $29/month." That's not enough. You need the full pricing table, the feature comparison, the enterprise tier — the actual page content.
Web search gives you summaries. Web crawl gives you the page.
Here's how to add web crawling to Claude Code — so your agent can read full web pages, extract structured data, and feed that research directly into its workflow.
Web Search vs Web Crawl: What's the Difference?
They're related but do different jobs:
| Web Search | Web Crawl | |
|---|---|---|
| What it returns | Snippets, links, citations | Full page content as clean Markdown |
| Best for | Quick answers, discovery, fact-checking | Deep research, content extraction, competitor analysis |
| Speed | Seconds | Seconds to a minute (full page fetch) |
| Data depth | Surface-level | Complete — every heading, paragraph, table |
| Use case | "What's the pricing for X?" | "Extract the entire pricing page and compare it to our pricing" |
Your agent needs both. Search to find the right pages. Crawl to read them properly.
Why Claude Code Needs Web Crawl
Claude Code reasons about your codebase. It can refactor functions, write tests, and debug issues across files. But when it needs to research something — a competitor's API docs, a library's changelog, a product's feature list — it hits a wall.
Web search helps, but snippets only go so far. A pricing page might have 12 tiers. A docs page might have 40 sections. A changelog might span 3 years of releases. A 150-character snippet tells you one thing. The full page tells you everything.
Web crawl gives your agent the complete page. It can then:
- Extract structured data (pricing tiers, feature lists, API endpoints)
- Compare competitor offerings point by point
- Feed documentation into code generation ("implement the authentication exactly as described in the docs")
- Monitor changes over time (crawl the same page weekly, diff the results)
Method 1: Manual Web Scraping (The Fragile Way)
You can configure Claude Code to call a scraping service directly. Pick a provider (Firecrawl, Jina, ScrapingBee), sign up, get an API key, and wire it into your agent.
The manual approach:
- Sign up for a scraping service
- Get an API key
- Write a shell script or MCP config that Claude Code can call
- Handle rate limits, retries, and failed fetches
- Parse the response and feed it back into the agent context
This works for occasional use. It breaks when you scale — different websites block different scrapers, rate limits vary by provider, and maintaining the integration consumes time you wanted to spend building.
Method 2: MCP Server for Crawling
MCP servers for web crawling bundle the scraping logic into a reusable integration. Firecrawl's MCP server is the most common — Claude Code calls it, and it returns clean Markdown from any URL.
The setup is lighter than manual API wiring, but you're still managing:
- One MCP server per capability (crawl is separate from search)
- Provider-specific rate limits and authentication
- Format inconsistencies when switching between scraping providers
Method 3: One CLI for Search + Crawl (The AnyCap Way)
This approach bundles search and crawl into one command surface. Your agent searches to find pages, then crawls to read them fully — all through the same CLI.
# Step 1: Search for relevant pages
anycap search --prompt "competitor pricing pages SaaS 2026" --citations
# Step 2: Crawl the most relevant result for full content
anycap crawl --url "https://competitor.com/pricing" -o pricing.md
The runtime handles:
- Structured output. Pages convert to clean Markdown — headings, paragraphs, tables, and code blocks preserved.
- JavaScript rendering. Dynamic pages (SPAs, React apps) render before extraction.
- Clean content. Navigation, ads, and boilerplate are stripped. What's left is the article body.
- Consistent format. Every crawled page returns the same Markdown structure, regardless of the source.
Install:
npm i -g anycap
anycap login
anycap skill install --target ~/.claude/skills/anycap-cli/
→ Install AnyCap free — 250 credits for new users
Real Use Case: Competitor Research Pipeline
Your agent needs to compare your product's pricing against three competitors. Here's the full workflow:
# 1. Search for competitor pricing pages
anycap search --prompt "competitor A pricing plans 2026" --citations
anycap search --prompt "competitor B pricing plans 2026" --citations
anycap search --prompt "competitor C pricing plans 2026" --citations
# 2. Crawl each pricing page for full content
anycap crawl --url "https://competitor-a.com/pricing" -o competitor-a.md
anycap crawl --url "https://competitor-b.com/pricing" -o competitor-b.md
anycap crawl --url "https://competitor-c.com/pricing" -o competitor-c.md
# 3. Feed the crawled content to Claude Code for analysis
# Claude Code now has the full pricing data and can produce:
# - A comparison table
# - Pricing positioning recommendations
# - Feature gap analysis
Your agent researched, crawled, analyzed, and recommended — all in one session. No manual browser tabs. No copy-pasting.
Real Use Case: Documentation-Driven Development
Your agent needs to implement an API integration. Instead of guessing the auth flow, it crawls the official docs:
# Crawl the API authentication docs
anycap crawl --url "https://api.provider.com/docs/auth" -o auth-docs.md
# Crawl the endpoint reference
anycap crawl --url "https://api.provider.com/docs/endpoints" -o endpoints.md
# Claude Code now implements the integration from the actual docs,
# not from its training data which may be outdated
This is the difference between "Claude Code, implement the Stripe integration" (works from training data, may be outdated) and "Claude Code, crawl the latest Stripe docs and implement the integration exactly as described" (accurate, current, reliable).
Real Use Case: Competitive Monitoring
Set up a recurring research workflow. Your agent crawls competitor pages on a schedule and diffs the results:
# Crawl competitor changelog
anycap crawl --url "https://competitor.com/changelog" -o competitor-changelog-$(date +%Y%m%d).md
# Crawl competitor feature page
anycap crawl --url "https://competitor.com/features" -o competitor-features-$(date +%Y%m%d).md
# Compare against last week's crawl
diff competitor-features-20260511.md competitor-features-20260518.md
Run this weekly. Your agent flags new features, changed pricing, updated messaging — before your product team hears about it from a customer.
Search + Crawl: The Complete Research Stack
Web search finds. Web crawl reads. Together, they form a complete research capability for your agent:
| Step | Command | What it does |
|---|---|---|
| 1. Discover | anycap search |
Finds relevant pages with grounded citations |
| 2. Extract | anycap crawl |
Pulls full page content as clean Markdown |
| 3. Analyze | Claude Code | Reasons about the extracted content |
| 4. Act | Claude Code | Implements, compares, or reports based on findings |
This is grounded research — your agent doesn't rely on training data or partial snippets. It works from the actual, current content of the pages that matter.
When to Crawl vs When to Search
| Use search when... | Use crawl when... |
|---|---|
| You need a quick answer | You need the complete page |
| You're discovering which pages exist | You know which page you need and want all of it |
| You need cited, grounded summaries | You need structured data extraction |
| Speed is the priority | Depth is the priority |
| The answer fits in a snippet | The answer is a table, a list, or spans multiple sections |
Most research workflows use both: search to discover, crawl to extract.
FAQ
Does web crawl work on JavaScript-rendered pages?
Yes. The runtime renders dynamic content (React, Vue, SPAs) before extracting. What you see in a browser is what your agent gets.
How is web crawl different from Claude Code's built-in web search?
Claude Code's built-in web search returns snippets and summaries. Web crawl returns the full page content as Markdown — every heading, paragraph, table, and code block. Use search for quick answers. Use crawl when you need depth.
Can I crawl multiple pages in one session?
Yes. Run anycap crawl once per URL. Your agent can loop through a list of URLs and crawl them sequentially. All results are saved as local Markdown files.
What if a page blocks crawlers?
Some pages block automated access. The runtime respects robots.txt and handles access restrictions gracefully. If a page can't be crawled, your agent gets a clear error message — it doesn't silently fail.
Does this work across Cursor and Codex too?
Yes. anycap crawl uses the same CLI and works across Claude Code, Cursor, and Codex. One install, all agents.
The Bottom Line
Web search tells your agent what exists. Web crawl lets your agent read it. For competitive research, documentation-driven development, and content extraction, search alone isn't enough.
Give your agent both. Search to discover. Crawl to understand.
→ Give Claude Code full web access — search + crawl through one CLI
📖 What to Read Next
- How to Give Your AI Agent Web Search Capability — One CLI Command — The web search companion to this crawl guide.
- Competitive Monitoring with AI Agents — Build a cron job that watches your competitors automatically.
- AI-Powered Search for AI Agents: Grounded Search vs Traditional RAG — When to use grounded search vs vector search.
Related Articles
- How to Generate Video with Claude Code: The Complete 2026 Guide — The capabilities keep stacking.
- What Is a Capability Runtime? — The infrastructure that bundles search, crawl, image, video, and storage into one CLI.
Written by the AnyCap team. We build the capability runtime that gives your agent web search with citations, full-page crawling, and everything it needs to research without you.