Webpage Links Scraper
This API endpoint allows you to extract content from websites in either plaintext or markdown format.
Endpoint
Headers
Content-Type (required): Must be application/json
x-api-key (required): Your API authentication key
Request Body
You do not need to ask for links explicitly: they are scraped by default. Call the API with a JSON payload built from the parameters below.
Parameters
url (string, required): The URL of the website to scrape
output_type (string, optional): Either "plaintext" (default) or "markdown"
use_cache (boolean, optional): If true, returns a cached result when one is available. Default: false
webhook (string, optional): URL that receives a webhook notification when processing is complete
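As an illustration of how the headers and parameters above fit together, here is a minimal Python sketch that assembles the request. The endpoint URL is not shown on this page, so `endpoint_url` is a caller-supplied placeholder, and the function name `build_scrape_request` is hypothetical rather than part of any official client.

```python
import json
import urllib.request
from typing import Optional

VALID_OUTPUT_TYPES = {"plaintext", "markdown"}


def build_scrape_request(
    endpoint_url: str,  # assumption: supplied by the caller, not documented here
    api_key: str,
    url: str,
    output_type: str = "plaintext",
    use_cache: bool = False,
    webhook: Optional[str] = None,
) -> urllib.request.Request:
    """Assemble (but do not send) a POST request for the scraper endpoint."""
    if output_type not in VALID_OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {sorted(VALID_OUTPUT_TYPES)}")

    # Only `url` is required; the other fields mirror the documented defaults.
    payload = {"url": url, "output_type": output_type, "use_cache": use_cache}
    if webhook is not None:
        payload["webhook"] = webhook

    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately from sending it keeps the payload easy to inspect and test.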
Response
Immediate Response (HTTP 200)
If processing completes within 20 seconds, you'll receive the full result:
Processing Response (HTTP 202)
If processing takes longer than 20 seconds:
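The two success paths can be handled with one small helper. The sketch below is hedged: it assumes the HTTP 202 body carries a `request_id`, and that the status check endpoint returns a body whose `status` becomes "completed". Neither detail is spelled out on this page. The `check_status` callable is a hypothetical stand-in for whatever code calls the status check endpoint.

```python
import time
from typing import Callable


def wait_for_result(
    status_code: int,
    body: dict,
    check_status: Callable[[str], dict],  # hypothetical: wraps the status check endpoint
    interval: float = 5.0,
    max_attempts: int = 12,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Return the final result, polling if the first response was HTTP 202."""
    if status_code == 200:
        # Processing finished within the synchronous window.
        return body
    if status_code != 202:
        raise RuntimeError(f"unexpected HTTP status {status_code}")

    request_id = body["request_id"]  # assumption: present in the 202 body
    for _ in range(max_attempts):
        sleep(interval)
        result = check_status(request_id)
        if result.get("status") == "completed":
            return result
    raise TimeoutError(f"request {request_id} still processing after {max_attempts} polls")
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays. If a webhook is configured, polling may not be needed at all.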
Error Response (HTTP 4XX/5XX)
Common error codes:
E001: Invalid request format
E003: Invalid URL format
E006: Storage service error
E008: Content processing failed
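A client might translate these codes into readable messages for logs. The sketch below uses only the four codes listed above. How the code appears in the error body (field name, nesting) is not shown on this page, so the helper takes the code string directly, and `describe_error` is a hypothetical name.

```python
# Codes and descriptions copied from the list above; other codes may exist.
ERROR_DESCRIPTIONS = {
    "E001": "Invalid request format",
    "E003": "Invalid URL format",
    "E006": "Storage service error",
    "E008": "Content processing failed",
}


def describe_error(code: str) -> str:
    """Map a documented error code to its description, with a fallback."""
    return ERROR_DESCRIPTIONS.get(code, f"Unrecognized error code: {code}")
```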
Response Fields
request_id (string): Unique identifier for the request
url (string): The URL that was scraped
status (string): Status of the request ("completed" or "processing")
timestamp (number): Unix timestamp of when the request was processed
text_content (string): The extracted text content (if output_type=plaintext)
meta (object): Metadata from the page
links (array): Array of links found on the page
images (array): Array of images found on the page
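To show how these fields might be consumed, here is a small Python sketch that loads a response body into a typed structure. It assumes `timestamp` is expressed in seconds and makes no assumption about the shape of the items inside `links` and `images`, since this page does not specify either. `ScrapeResult` and `parse_result` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ScrapeResult:
    request_id: str
    url: str
    status: str
    timestamp: float
    text_content: Optional[str] = None
    meta: dict = field(default_factory=dict)
    links: list = field(default_factory=list)
    images: list = field(default_factory=list)

    @property
    def processed_at(self) -> datetime:
        # Assumption: the Unix timestamp is in seconds, not milliseconds.
        return datetime.fromtimestamp(self.timestamp, tz=timezone.utc)


def parse_result(body: dict) -> ScrapeResult:
    """Build a ScrapeResult, tolerating absent optional fields."""
    return ScrapeResult(
        request_id=body["request_id"],
        url=body["url"],
        status=body["status"],
        timestamp=body["timestamp"],
        text_content=body.get("text_content"),
        meta=body.get("meta") or {},
        links=body.get("links") or [],
        images=body.get("images") or [],
    )
```

Optional fields fall back to empty values, so a body with status "processing" that lacks content can still be parsed.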
Example Curl Request
Notes
The API supports both synchronous and asynchronous processing
For pages requiring longer processing time, use the status check endpoint to poll for results
Use webhooks for automatic notification when processing completes
Cached results are available for 15 days