Status Check

Use this endpoint to check the status of a previously submitted scraping request and, once processing has completed, to retrieve its results.

Endpoint

GET https://api.yetanotherapi.com/web-scrapper/{request_id}

Headers

| Header | Required | Description |
| --- | --- | --- |
| x-api-key | Yes | Your API authentication key |

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| request_id | string | The request ID returned from the scraper API |

Response

Success Response (HTTP 200)

For completed requests:

{
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com",
    "status": "completed",
    "timestamp": 1635545600,
    "content": {
        "text": "Extracted text content...",
        "markdown": "# Extracted markdown content...", // If markdown was requested
        "meta": {
            "title": "Page Title",
            "description": "Meta description..."
        },
        "links": [
            {
                "text": "Link text",
                "url": "https://example.com/link",
                "type": "internal"
            }
        ],
        "images": [
            {
                "url": "https://example.com/image.jpg",
                "alt": "Image description"
            }
        ]
    },
    "llm_output": { // Only present if LLM processing was requested
        // Structured JSON output based on the prompt
    }
}

Processing Response (HTTP 200)

For requests still processing:

{
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com",
    "status": "processing",
    "timestamp": 1635545600
}

Error Response (HTTP 4XX/5XX)

{
    "error": "ERROR_CODE: Error message"
}

Common error codes:

E001: Invalid request format
E002: Request ID not found
E006: Storage service error
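
The `"ERROR_CODE: Error message"` string can be split into its two parts before deciding how to react. The following is a minimal sketch, not part of the API itself: `ApiError` and `parse_error` are hypothetical names, and the pattern assumes every code has the shape `E` followed by digits, as in the codes listed above.

```python
import re
from typing import NamedTuple, Optional


class ApiError(NamedTuple):
    """An error string split into its code and human-readable message."""
    code: Optional[str]  # None when the string does not start with a code
    message: str


# Assumption: codes look like "E001", i.e. the letter E followed by digits.
_ERROR_PATTERN = re.compile(r"^(E\d+):\s*(.*)$", re.DOTALL)


def parse_error(raw: str) -> ApiError:
    """Split 'E002: Request ID not found' into ('E002', 'Request ID not found')."""
    text = raw.strip()
    match = _ERROR_PATTERN.match(text)
    if match is None:
        # Unrecognized shape: keep the whole text as the message.
        return ApiError(None, text)
    return ApiError(match.group(1), match.group(2))
```

For example, `parse_error("E002: Request ID not found").code` evaluates to `"E002"`, which a caller could use to distinguish an unknown request ID from a storage error.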

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| request_id | string | Unique identifier for the request |
| url | string | The URL that was scraped |
| status | string | Current status of the request |
| timestamp | number | Unix timestamp of last status update |
| content | object | Contains extracted content if status is "completed" |
| llm_output | object | Present only if LLM processing was requested |
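
A client may find it convenient to map these fields onto a typed structure. The sketch below is one possible shape, assuming the four always-present fields from the examples above; `StatusResult` and `parse_status_result` are illustrative names, not part of the API.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class StatusResult:
    request_id: str
    url: str
    status: str
    timestamp: float  # Unix timestamp of the last status update
    content: Optional[Dict[str, Any]] = None      # set when status is "completed"
    llm_output: Optional[Dict[str, Any]] = None   # set only if LLM processing was requested


def parse_status_result(payload: Dict[str, Any]) -> StatusResult:
    """Convert a decoded JSON status body into a StatusResult."""
    # These four keys appear in every documented status body; a KeyError
    # here suggests the body was an error response instead.
    return StatusResult(
        request_id=payload["request_id"],
        url=payload["url"],
        status=payload["status"],
        timestamp=payload["timestamp"],
        content=payload.get("content"),
        llm_output=payload.get("llm_output"),
    )
```

The optional fields default to `None`, so the same function handles both the "processing" and the "completed" bodies.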

Possible Status Values

| Status | Description |
| --- | --- |
| received | Request has been received but not yet processed |
| processing | Request is currently being processed |
| completed | Processing has completed successfully |
| failed | Processing failed with an error |
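
Of these four values, two mean "keep waiting" and two mean "stop polling". A small helper can make that distinction explicit; this is a sketch with hypothetical names (`is_terminal` and the two sets), and it deliberately rejects any status not listed above rather than guessing.

```python
# Statuses after which the request may still change state.
PENDING_STATUSES = frozenset({"received", "processing"})
# Statuses after which the request will not change again.
TERMINAL_STATUSES = frozenset({"completed", "failed"})


def is_terminal(status: str) -> bool:
    """Return True when no further polling is needed for this status."""
    if status in TERMINAL_STATUSES:
        return True
    if status in PENDING_STATUSES:
        return False
    # Assumption: the four documented values are exhaustive.
    raise ValueError(f"Unknown status: {status!r}")
```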

Example cURL Request

curl --location 'https://api.yetanotherapi.com/web-scrapper/550e8400-e29b-41d4-a716-446655440000' \
--header 'x-api-key: your-api-key'
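
The same request can be made from Python using only the standard library. This is a minimal sketch: `build_status_request` and `fetch_status` are illustrative helper names, and the 30-second timeout and UTF-8 decoding are assumptions rather than documented requirements.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.yetanotherapi.com/web-scrapper"


def build_status_request(request_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for the status of one scraping request."""
    # Percent-encode the ID so it always stays a single path segment.
    url = f"{BASE_URL}/{urllib.parse.quote(request_id, safe='')}"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def fetch_status(request_id: str, api_key: str, timeout_s: float = 30.0) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_status_request(request_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4XX and 5XX responses, so the error body has to be read from the exception object rather than from the return value.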

Error Handling

If processing fails, the response has status "failed" and includes the error details:

{
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com",
    "status": "failed",
    "timestamp": 1635545600,
    "error": "E008: Content processing failed",
    "error_trace": "Detailed error information" // Only in development environment
}
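
Client code will often want to turn a "failed" body into an exception so that it cannot be mistaken for a result. The following sketch shows one way to do that; `ScrapeFailedError` and `raise_if_failed` are hypothetical names, and the fallback strings are placeholders for fields that might be absent.

```python
from typing import Any, Dict, Optional


class ScrapeFailedError(Exception):
    """Raised when a status body reports status == "failed"."""

    def __init__(self, request_id: str, error: str, trace: Optional[str] = None):
        super().__init__(f"{request_id}: {error}")
        self.request_id = request_id
        self.error = error
        self.trace = trace  # error_trace is only sent in development environments


def raise_if_failed(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the payload unchanged unless it describes a failed request."""
    if payload.get("status") != "failed":
        return payload
    raise ScrapeFailedError(
        request_id=payload.get("request_id", "<unknown>"),
        error=payload.get("error", "no error details provided"),
        trace=payload.get("error_trace"),
    )
```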

Notes

Polling interval should be at least 5 minutes
Results are available for 15 hours after completion
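
A polling loop that respects the 5-minute minimum might look like the sketch below. It is written against a caller-supplied `fetch` callable (any function that returns the decoded status body), so nothing here is tied to a particular HTTP client; `poll_until_done`, `max_attempts`, and the injectable `sleep` are illustrative choices, not part of the API.

```python
import time
from typing import Any, Callable, Dict

MIN_POLL_INTERVAL_S = 5 * 60  # the documented minimum polling interval


def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],
    interval_s: float = MIN_POLL_INTERVAL_S,
    max_attempts: int = 12,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until the status is "completed" or "failed"."""
    if interval_s < MIN_POLL_INTERVAL_S:
        raise ValueError("interval_s must be at least 5 minutes (300 seconds)")
    for attempt in range(max_attempts):
        result = fetch()
        if result.get("status") in ("completed", "failed"):
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"request still pending after {max_attempts} attempts")
```

Because results are kept for 15 hours after completion, a loop like this retrieves them well inside the retention window; the `sleep` parameter exists mainly so the loop can be tested without real delays.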
