LLM Web Scraper
Overview
The LLM Web Scraper API combines web scraping with language model (LLM) processing to extract and structure web content. It analyzes a web page and returns data structured according to the instructions in your prompt.
Base URL
Authentication
All requests require an API key passed in the x-api-key header.
Request Headers
| Header | Required | Description |
| --- | --- | --- |
| Content-Type | Yes | Must be application/json |
| x-api-key | Yes | Your API authentication key |
Request Body
Request Parameters
| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | The URL of the website to scrape |
| output_type | string | No | plaintext | Either "plaintext" or "markdown" |
| use_llm | boolean | Yes | - | Must be set to true for LLM processing |
| prompt | string | Yes | - | Instructions for the LLM about what to extract |
| openai_key_id | string | No | null | Optional ID of your registered OpenAI key |
| use_cache | boolean | No | false | If true, returns the cached result if available |
| webhook | string | No | null | URL that receives a webhook notification when processing completes |
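As an illustration of how these parameters fit together, here is a minimal Python sketch that assembles a request without sending it. The endpoint URL is a hypothetical placeholder (the base URL is not given here), the use of POST is an assumption, and `build_scrape_request` is an illustrative helper, not part of any official client.

```python
import json
import urllib.request

# Hypothetical placeholder: the real base URL is not stated in this document.
ENDPOINT = "https://api.example.com/llm-scraper"


def build_scrape_request(api_key, url, prompt, output_type="plaintext",
                         openai_key_id=None, webhook=None, endpoint=ENDPOINT):
    """Build (but do not send) a scrape request from the documented parameters."""
    if output_type not in ("plaintext", "markdown"):
        raise ValueError('output_type must be "plaintext" or "markdown"')

    # Required fields; use_llm must be true for LLM processing.
    payload = {
        "url": url,
        "use_llm": True,
        "prompt": prompt,
        "output_type": output_type,
    }
    # Optional fields are only included when the caller supplies them.
    if openai_key_id is not None:
        payload["openai_key_id"] = openai_key_id
    if webhook is not None:
        payload["webhook"] = webhook

    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",  # assumption: the HTTP method is not stated in this document
    )
```

The returned object can be passed to `urllib.request.urlopen` once the real endpoint is substituted.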
Important Parameter Notes
Cache Behavior
When use_cache is true, all other parameters except url are ignored
Returns the most recent cached result for the URL
Returns a 404 error if no cache exists
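The cache rule can be pictured with a small Python sketch that reports which fields of a request body would actually take effect. `effective_parameters` is a hypothetical helper written for illustration; it mirrors the rule as described here and does not call the API.

```python
def effective_parameters(payload):
    """Return the subset of a request body that is honored under the cache rule."""
    if payload.get("use_cache") is True:
        # With use_cache set to true, only url is used; everything else is ignored.
        return {k: v for k, v in payload.items() if k in ("url", "use_cache")}
    return dict(payload)
```

For example, a body containing url, prompt, and use_cache set to true reduces to url and use_cache, so changing the prompt does not change which cached result is returned.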
OpenAI Key ID
Optional parameter
If provided, uses the specified OpenAI key from your account
If not provided, uses your most recently added OpenAI key
Manage multiple keys through your yetanotherapi dashboard
Webhook
Optional callback URL for asynchronous processing
Receives full results when processing completes
Must be a publicly accessible HTTPS endpoint
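A client can catch obvious webhook mistakes before sending a request. The sketch below is a hypothetical pre-flight check in Python: it only verifies the HTTPS scheme and rejects a couple of loopback host names, and it cannot confirm that the endpoint is reachable from the public internet.

```python
from urllib.parse import urlparse

# Hosts that are clearly not publicly reachable (illustrative, not exhaustive).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_plausible_webhook_url(url):
    """Return True if url uses HTTPS and names a non-loopback host."""
    parts = urlparse(url)
    host = parts.hostname
    return parts.scheme == "https" and bool(host) and host not in LOCAL_HOSTS
```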
Responses
Immediate Success Response (HTTP 200)
When processing completes within 20 seconds:
Processing Response (HTTP 202)
When processing takes longer than 20 seconds:
Error Response (HTTP 4XX/5XX)
Error Codes
| Code | Description | HTTP Status |
| --- | --- | --- |
| E001 | Invalid request format | 400 |
| E003 | Invalid URL format | 400 |
| E004 | Authentication error | 401 |
| E008 | Content processing failed | 500 |
| E009 | Validation error | 400 |
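One way a client might branch on the status codes and error codes listed above is sketched below in Python. The layout of the response body is not shown in this document, so the helper takes the HTTP status and an already extracted error code as arguments; `describe_outcome` is a hypothetical name.

```python
# Error codes and their HTTP statuses, as listed in this document.
ERROR_CODES = {
    "E001": ("Invalid request format", 400),
    "E003": ("Invalid URL format", 400),
    "E004": ("Authentication error", 401),
    "E008": ("Content processing failed", 500),
    "E009": ("Validation error", 400),
}


def describe_outcome(status, error_code=None):
    """Summarize a response given its HTTP status and optional error code."""
    if status == 200:
        return "complete"    # finished within 20 seconds
    if status == 202:
        return "processing"  # took longer than 20 seconds
    if error_code in ERROR_CODES:
        message, _expected_status = ERROR_CODES[error_code]
        return f"error {error_code}: {message}"
    # Covers cases without a listed code, such as a 404 on a cache miss.
    return f"error: HTTP {status}"
```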
Webhook Integration
When providing a webhook URL, you'll receive a POST request with the complete results:
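A receiving endpoint could look roughly like the following Python sketch, built on the standard library's `http.server`. It assumes the POST body is a JSON object, which this document does not confirm, and it serves plain HTTP, so it would need to sit behind a TLS-terminating proxy to meet the HTTPS requirement. The names `parse_webhook_body`, `WebhookHandler`, and `serve`, and the port 8000, are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(raw):
    """Decode a webhook body; return a dict, or None if it is not a JSON object."""
    try:
        data = json.loads(raw.decode("utf-8"))
    except ValueError:  # covers both invalid UTF-8 and invalid JSON
        return None
    return data if isinstance(data, dict) else None


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender declared.
        length = int(self.headers.get("Content-Length", 0))
        result = parse_webhook_body(self.rfile.read(length))
        # Acknowledge valid payloads with 200, reject anything else with 400.
        self.send_response(200 if result is not None else 400)
        self.end_headers()
        if result is not None:
            print("webhook received with keys:", sorted(result))


def serve(port=8000):
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener; the URL of the proxy in front of it is what would go in the webhook parameter.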