Status Check

This endpoint returns the current status of a previously submitted scraping request and, once processing has completed, its results.

Endpoint

GET https://api.yetanotherapi.com/web-scrapper/{request_id}

Headers

Header       Required    Description
x-api-key    Yes         Your API authentication key

Path Parameters

Parameter     Type      Description
request_id    string    The request ID returned from the scraper API
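
As a minimal sketch of how the endpoint, header, and path parameter fit together, the following Python builds the GET request using only the standard library. The helper name build_status_request is illustrative and not part of the API; the URL and the x-api-key header are taken from the sections above.

```python
import urllib.parse
import urllib.request

# Base URL as documented in the Endpoint section.
BASE_URL = "https://api.yetanotherapi.com/web-scrapper"


def build_status_request(request_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a status check.

    Hypothetical helper for illustration only.
    """
    # Percent-encode the ID so unexpected characters cannot alter the path.
    path_id = urllib.parse.quote(request_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/{path_id}",
        headers={"x-api-key": api_key},
        method="GET",
    )


req = build_status_request("550e8400-e29b-41d4-a716-446655440000", "your-api-key")
print(req.get_method(), req.full_url)
```

The request can then be sent with urllib.request.urlopen(req) and the body decoded with json.load. Note that urlopen raises urllib.error.HTTPError for 4XX/5XX responses, so error bodies have to be read from the exception object.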

Response

Success Response (HTTP 200)

For completed requests:

{
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com",
    "status": "completed",
    "timestamp": 1635545600,
    "content": {
        "text": "Extracted text content...",
        "markdown": "# Extracted markdown content...", // If markdown was requested
        "meta": {
            "title": "Page Title",
            "description": "Meta description..."
        },
        "links": [
            {
                "text": "Link text",
                "url": "https://example.com/link",
                "type": "internal"
            }
        ],
        "images": [
            {
                "url": "https://example.com/image.jpg",
                "alt": "Image description"
            }
        ]
    },
    "llm_output": { // Only present if LLM processing was requested
        // Structured JSON output based on the prompt
    }
}
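
To illustrate how a client might consume a completed response, here is a small sketch that collects link URLs of a given type from the content object. The function name link_urls is hypothetical, and "internal" is the only link type shown in the example above; other type values are not documented here.

```python
from typing import Any, Dict, List


def link_urls(payload: Dict[str, Any], link_type: str = "internal") -> List[str]:
    """Return the URLs of all links in payload["content"] with the given type.

    Hypothetical helper; tolerates responses that have no content yet.
    """
    content = payload.get("content") or {}
    return [
        link["url"]
        for link in content.get("links", [])
        if link.get("type") == link_type and "url" in link
    ]


# Trimmed-down version of the completed response shown above.
completed = {
    "status": "completed",
    "content": {
        "links": [
            {"text": "Link text", "url": "https://example.com/link", "type": "internal"}
        ]
    },
}
print(link_urls(completed))
```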

Processing Response (HTTP 200)

For requests still processing:

{
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com",
    "status": "processing",
    "timestamp": 1635545600
}

Error Response (HTTP 4XX/5XX)

{
    "error": "ERROR_CODE: Error message"
}

Common error codes:

  • E001: Invalid request format

  • E002: Request ID not found

  • E006: Storage service error
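
Because the error field combines the code and the message in one string, a client may want to split them. The sketch below assumes every code has the form "E" followed by digits, which is inferred from the examples on this page rather than stated by the API; parse_error and ApiError are illustrative names.

```python
import re
from typing import NamedTuple, Optional


class ApiError(NamedTuple):
    code: Optional[str]  # e.g. "E002", or None if no code prefix was found
    message: str


# Assumed format: "E<digits>: <message>", inferred from the documented examples.
_ERROR_RE = re.compile(r"^(E\d+):\s*(.*)$", re.DOTALL)


def parse_error(error: str) -> ApiError:
    """Split an error string into its code and human-readable message."""
    text = error.strip()
    match = _ERROR_RE.match(text)
    if match is None:
        # Unknown shape: keep the whole string as the message.
        return ApiError(None, text)
    return ApiError(match.group(1), match.group(2))


print(parse_error("E002: Request ID not found"))
```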

Response Fields

Field         Type      Description
request_id    string    Unique identifier for the request
url           string    The URL that was scraped
status        string    Current status of the request
timestamp     number    Unix timestamp of last status update
content       object    Contains extracted content if status is "completed"
llm_output    object    Present only if LLM processing was requested

Possible Status Values

Status        Description
received      Request has been received but not yet processed
processing    Request is currently being processed
completed     Processing has completed successfully
failed        Processing failed with an error
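
The status values above suggest a simple polling loop: keep checking while the status is "received" or "processing", and stop on "completed" or "failed". The following is a sketch under stated assumptions, not a reference client. The function poll_until_done, its parameters, and the attempt limit are illustrative; the 300-second default follows the minimum polling interval given in the Notes. The fetch function is injected so the loop can be exercised without network access.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which no further polling is needed.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 300.0,  # 5 minutes, per the Notes section
    max_attempts: int = 12,           # arbitrary illustrative limit
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until it reports a terminal status, then return it."""
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")


# Demonstration with canned responses and a no-op sleep.
responses = iter([
    {"status": "received"},
    {"status": "processing"},
    {"status": "completed", "content": {"text": "Extracted text content..."}},
])
result = poll_until_done(lambda: next(responses), sleep=lambda seconds: None)
print(result["status"])
```

In real use, fetch_status would send the GET request described under Endpoint and return the decoded JSON body.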

Example cURL Request

curl --location 'https://api.yetanotherapi.com/web-scrapper/550e8400-e29b-41d4-a716-446655440000' \
--header 'x-api-key: your-api-key'

Error Handling

If processing failed, the response will include error details:

{
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
    "url": "https://example.com",
    "status": "failed",
    "timestamp": 1635545600,
    "error": "E008: Content processing failed",
    "error_trace": "Detailed error information" // Only in development environment
}
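
One way to surface a failed request in client code is to turn the payload into an exception. The sketch below uses hypothetical names (ScrapeFailedError, content_or_raise); it treats error_trace as optional, since the example above marks it as present only in the development environment.

```python
from typing import Any, Dict, Optional


class ScrapeFailedError(Exception):
    """Raised when a status response reports status "failed"."""

    def __init__(self, request_id: str, error: str, trace: Optional[str] = None):
        super().__init__(f"request {request_id} failed: {error}")
        self.request_id = request_id
        self.error = error
        self.trace = trace  # may be None outside development


def content_or_raise(payload: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return content for completed requests, None while pending, raise on failure."""
    status = payload.get("status")
    if status == "failed":
        raise ScrapeFailedError(
            payload.get("request_id", ""),
            payload.get("error", "unknown error"),
            payload.get("error_trace"),
        )
    if status == "completed":
        return payload.get("content")
    return None  # "received" or "processing"


failed = {
    "request_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "failed",
    "error": "E008: Content processing failed",
}
try:
    content_or_raise(failed)
except ScrapeFailedError as exc:
    print(exc.error)
```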

Notes

  • Poll no more often than once every 5 minutes

  • Results are available for 15 hours after completion
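
The two limits above can be turned into simple arithmetic. This sketch assumes that the timestamp field of a completed response marks the completion time and that the 15-hour window is measured from it; the page describes timestamp only as the time of the last status update, so treat this as an approximation. All names are illustrative.

```python
from datetime import datetime, timedelta, timezone

RESULT_TTL = timedelta(hours=15)          # availability window from the Notes
MIN_POLL_INTERVAL = timedelta(minutes=5)  # minimum spacing between polls


def results_expire_at(completed_timestamp: float) -> datetime:
    """Estimate when results stop being available (assumption: TTL starts at timestamp)."""
    completed_at = datetime.fromtimestamp(completed_timestamp, tz=timezone.utc)
    return completed_at + RESULT_TTL


def is_expired(completed_timestamp: float, now: datetime) -> bool:
    """True if `now` (timezone-aware) is at or past the estimated expiry."""
    return now >= results_expire_at(completed_timestamp)


def max_polls_in_window() -> int:
    """Number of minimum-interval polls that fit into the availability window."""
    return RESULT_TTL // MIN_POLL_INTERVAL


expires = results_expire_at(1635545600)
print(expires.isoformat())
print(max_polls_in_window())
```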
