Web Scraper

Web Scraper API Documentation

Overview

The Web Scraper API allows you to extract specific information from web pages using natural language prompts. It combines web scraping with optional Large Language Model (LLM) processing to provide structured data based on your requirements.

Cost

  • Cost per API call: 1 credit

Base URL

https://api.yetanotherapi.com/v1/llm-web-scrapper

Authentication

Authentication is required for all API requests. Use your API key in the x-api-key header:

x-api-key: YOUR_API_KEY_HERE

Endpoints

Scrape Web Page

Scrapes a specified URL and extracts information based on a given prompt.

  • HTTP Method: POST

  • Endpoint: /

Request Headers

  • x-api-key: YOUR_API_KEY_HERE

  • Content-Type: application/json

Request Body

  • url: The URL of the web page to scrape.

  • prompt: The natural language prompt specifying the information to extract.

  • use_llm: A boolean value indicating whether to use an LLM to process the scraped content.

  • webhook: (Optional) A webhook URL to send the response to.

Example Request

curl --location 'https://api.yetanotherapi.com/v1/llm-web-scrapper' \
--header 'x-api-key: YOUR_API_KEY_HERE' \
--header 'Content-Type: application/json' \
--data '{
  "url": "https://www.amazon.in/AMVR-Controller-Compatible-Accessories-Adjustable/dp/B0CJRK7B8J/ref=pd_rhf_gw_s_pd_crcd_d_sccl_1_3/261-2157292-0625645",
  "prompt": "product name and 5 of its features",
  "use_llm": true,
  "webhook": "https://connect.pabbly.com/workflow/sendwebhookdata/IjU3NjYwNTZkMDYzNTA0M2M1MjZiNTUzNjUxMzYi_pc"
}'
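
For readers calling the API from code rather than the command line, the request above can be sketched in Python with only the standard library. The helper names `build_scrape_request` and `send` are illustrative and not part of the API; the URL, header names, and body fields are taken from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.yetanotherapi.com/v1/llm-web-scrapper"


def build_scrape_request(api_key, url, prompt, use_llm=True, webhook=None):
    """Build (but do not send) the POST request described above."""
    payload = {"url": url, "prompt": prompt, "use_llm": use_llm}
    if webhook is not None:
        # webhook is optional, so it is only included when provided
        payload["webhook"] = webhook
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON body of the reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `send(build_scrape_request("YOUR_API_KEY_HERE", "https://example.com", "page title"))`. The 30-second timeout is an arbitrary choice for this sketch.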

Response

The API attempts to process the scraped content and return JSON within 20 seconds. If processing does not finish within that time, a request ID is returned instead, which can be used to call the following endpoint to get the output. If a webhook URL was provided, the API also sends the response to it.

Example Request (retrieve output by request ID)

curl --location 'https://api.yetanotherapi.com/v1/llm-web-scrapper/c07ab203-1b4a-42f2-99dd-6628268668d2' \
--header 'x-api-key: YOUR_API_KEY_HERE'
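
The fallback to a request ID suggests a simple polling loop on the client side. The following is a minimal sketch, not a documented client. It assumes that a finished response contains the `llm_json_structure` key (as in the example response on this page) and that a body without it means the job is still pending; this page does not specify the pending format. The function names, polling interval, and attempt limit are all illustrative.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.yetanotherapi.com/v1/llm-web-scrapper"


def fetch_result(request_id, api_key, timeout=30):
    """GET the output for a request ID, using the same URL as the curl call above."""
    request = urllib.request.Request(
        f"{BASE_URL}/{request_id}",
        # the docs state that authentication is required for all requests
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def poll_for_result(fetch, request_id, interval=5.0, max_attempts=12, sleep=time.sleep):
    """Call fetch(request_id) until the body contains a result.

    Assumption: a finished response carries "llm_json_structure";
    a body without that key is treated as still pending.
    """
    for attempt in range(max_attempts):
        body = fetch(request_id)
        if "llm_json_structure" in body:
            return body
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(f"no result for {request_id} after {max_attempts} attempts")
```

Usage might look like `poll_for_result(lambda rid: fetch_result(rid, "YOUR_API_KEY_HERE"), request_id)`. Passing `fetch` and `sleep` as parameters keeps the loop testable without network access.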

Response Structure

The API returns a JSON object with the following structure:

Example Response

{
    "request_id": "c03995eb-e117-4eca-85c8-e6d398a968d9",
    "llm_json_structure": {
        "dowJonesIndexValue": 42313.0
    }
}

Error Handling

In case of errors, the API will return a JSON object with an error message and HTTP status code.

Example Error Response:

{
  "error": "Invalid URL format",
  "status_code": 400
}
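
A client may want to turn this error object into an exception. The sketch below assumes that every error body contains an `error` key and, usually, a `status_code` key, as in the example above. `ScraperAPIError` and `raise_for_error` are hypothetical names introduced for illustration only.

```python
import json


class ScraperAPIError(Exception):
    """Raised when a response body looks like the error object shown above."""

    def __init__(self, message, status_code):
        super().__init__(f"{status_code}: {message}")
        self.message = message
        self.status_code = status_code


def raise_for_error(body):
    """Raise ScraperAPIError for an error body; otherwise return the body unchanged."""
    if isinstance(body, dict) and "error" in body:
        # status_code may be missing in some error bodies, hence .get()
        raise ScraperAPIError(body["error"], body.get("status_code"))
    return body
```

Wrapping every decoded response in `raise_for_error(...)` keeps error checks in one place instead of scattering key lookups through the calling code.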

Rate Limiting

Currently, there are no rate limits enforced during the beta phase. However, users should design their applications to handle potential rate limiting in the future.
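
One common way to prepare for future rate limiting is to retry with exponential backoff. The sketch below assumes the API would signal a rate limit with HTTP status 429, which is the conventional code but is not confirmed anywhere on this page. `call_with_backoff` is a hypothetical helper, and the retry count and delays are arbitrary.

```python
import time
import urllib.error


def call_with_backoff(call, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Run call() and retry on HTTP 429 with exponentially growing delays.

    Assumption: rate limiting surfaces as urllib.error.HTTPError with code 429.
    Any other error, or a 429 on the final attempt, is re-raised unchanged.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except urllib.error.HTTPError as exc:
            if exc.code != 429 or attempt == max_retries:
                raise
            # delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

Any zero-argument callable that performs the HTTP request with `urllib.request` can be passed as `call`.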

Beta Version Notice

This is a beta version of the API. Users may encounter occasional issues or bugs.

  • The beta release has limited scope and features.

  • Subscription plans will be introduced once the API is out of beta.

  • Not all websites are supported for scraping.

Support

For additional questions or to report issues, please contact support at hey@manojlk.work.

