# Web Scraper \[deprecated]

## **Web Scraper API Documentation**

**Overview**\
The Web Scraper API allows you to extract specific information from web pages using natural language prompts. It combines web scraping with optional Large Language Model (LLM) processing to provide structured data based on your requirements.

**Cost**

* Cost per API call: 1 credit

**Base URL**\
`https://api.yetanotherapi.com/v1/llm-web-scrapper`

**Authentication**\
Authentication is required for all API requests. Use your API key in the `x-api-key` header.\
`x-api-key: YOUR_API_KEY_HERE`

**Endpoints**\
**Scrape Web Page**\
Scrapes a specified URL and extracts information based on a given prompt.

* **HTTP Method**: POST
* **Endpoint**: /

**Request Headers**

* `x-api-key`: YOUR\_API\_KEY\_HERE
* `Content-Type`: application/json

**Request Body**

* `url`: The URL of the web page to scrape.
* `prompt`: The natural language prompt specifying the information to extract.
* `use_llm`: A boolean value indicating whether to use LLM for processing.
* `webhook`: (Optional) A webhook URL to send the response to.

**Example Request**

```bash
curl --location 'https://api.yetanotherapi.com/v1/llm-web-scrapper' \
--header 'x-api-key: YOUR_API_KEY_HERE' \
--header 'Content-Type: application/json' \
--data '{
  "url": "https://www.amazon.in/AMVR-Controller-Compatible-Accessories-Adjustable/dp/B0CJRK7B8J/ref=pd_rhf_gw_s_pd_crcd_d_sccl_1_3/261-2157292-0625645",
  "prompt": "product name and 5 of its features",
  "use_llm": true,
  "webhook": "https://connect.pabbly.com/workflow/sendwebhookdata/IjU3NjYwNTZkMDYzNTA0M2M1MjZiNTUzNjUxMzYi_pc"
}'
```
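The same request can be assembled in Python with only the standard library. This is a minimal sketch, not an official client: the helper name `build_scrape_request` is hypothetical, and only the endpoint, header names, and body fields are taken from the sections above.

```python
import json
import urllib.request

BASE_URL = "https://api.yetanotherapi.com/v1/llm-web-scrapper"


def build_scrape_request(api_key, url, prompt, use_llm=True, webhook=None):
    """Build (but do not send) the POST request described above.

    `build_scrape_request` is a hypothetical helper, not part of any SDK.
    """
    body = {"url": url, "prompt": prompt, "use_llm": use_llm}
    if webhook is not None:
        # `webhook` is optional, so it is included only when provided.
        body["webhook"] = webhook
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(req, timeout=30)` and decode the body with `json.load`. Building the request separately from sending it keeps the payload easy to inspect before spending a credit.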

**Response**\
The API attempts to process the scraped content and return JSON within 20 seconds. If processing takes longer, it returns a request ID instead; append that ID to the base URL, as in the request shown below, to retrieve the output. If a `webhook` URL was provided, the API also sends the response to it.

**Example Request (Retrieve Result by Request ID)**

```bash
curl --location 'https://api.yetanotherapi.com/v1/llm-web-scrapper/c07ab203-1b4a-42f2-99dd-6628268668d2' \
--header 'x-api-key: YOUR_API_KEY_HERE'
```
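When only a request ID comes back, the result has to be fetched later. The sketch below polls the result URL a few times. It rests on an assumption: a finished result is recognized by the presence of `llm_json_structure`, because the shape of a still-pending response is not documented here. The function names, attempt count, and delay are illustrative.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.yetanotherapi.com/v1/llm-web-scrapper"


def fetch_result(request_id, api_key):
    """GET the result for a request ID, sending the API key header."""
    req = urllib.request.Request(
        f"{BASE_URL}/{request_id}", headers={"x-api-key": api_key}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_result(request_id, api_key, fetch=fetch_result,
                attempts=5, delay_seconds=5.0, sleep=time.sleep):
    """Poll until the payload contains `llm_json_structure` or attempts run out.

    Assumption: the presence of that key marks a finished result.
    """
    for attempt in range(attempts):
        payload = fetch(request_id, api_key)
        if "llm_json_structure" in payload:
            return payload
        if attempt < attempts - 1:
            # Wait between attempts, but not after the final one.
            sleep(delay_seconds)
    raise TimeoutError(
        f"No result for request {request_id} after {attempts} attempts"
    )
```

Passing `fetch` and `sleep` as parameters lets the loop be exercised without network access. Supplying a `webhook` in the original request avoids polling altogether.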

**Response Structure**\
The API returns a JSON object containing the `request_id` and the extracted data under `llm_json_structure`:

**Example Response**

```json
{
    "request_id": "c03995eb-e117-4eca-85c8-e6d398a968d9",
    "llm_json_structure": {
        "dowJonesIndexValue": 42313.0
    }
}
```

**Error Handling**\
If an error occurs, the API returns a JSON object containing an error message and the HTTP status code.

**Example Error Response**:

```json
{
  "error": "Invalid URL format",
  "status_code": 400
}
```
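A client can turn this error object into an exception so that failures are not mistaken for results. The following is a minimal sketch that assumes error payloads always carry an `error` key, as in the example above; `ScraperAPIError` and `raise_for_error` are hypothetical names.

```python
class ScraperAPIError(Exception):
    """Hypothetical exception type carrying the API's error fields."""

    def __init__(self, message, status_code):
        super().__init__(f"{status_code}: {message}")
        self.message = message
        self.status_code = status_code


def raise_for_error(payload):
    """Return the payload unchanged, or raise if it has an `error` field."""
    if isinstance(payload, dict) and "error" in payload:
        # `status_code` is read defensively in case it is absent.
        raise ScraperAPIError(payload["error"], payload.get("status_code"))
    return payload
```

Calling `raise_for_error` on every decoded response keeps the success path free of explicit error checks.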

**Rate Limiting**\
No rate limits are enforced during the beta phase. However, design your application to handle rate limiting in case it is introduced later.
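One way to prepare for future rate limiting is to retry with exponential backoff. The sketch below assumes a rate-limited call would fail with HTTP status 429, the conventional "Too Many Requests" code. This API does not document such behavior, so treat the status code, retry count, and delays as placeholders.

```python
import time
import urllib.error


def call_with_backoff(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `send()`; on HTTP 429, wait with exponential backoff and retry.

    Assumption: rate limiting, if introduced, surfaces as HTTP 429.
    """
    for attempt in range(max_retries + 1):
        try:
            return send()
        except urllib.error.HTTPError as exc:
            # Re-raise anything that is not a rate-limit response,
            # and give up once the retry budget is spent.
            if exc.code != 429 or attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

Here `send` is any zero-argument callable that performs the request, for example a `lambda` wrapping `urllib.request.urlopen`.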

**Beta Version Notice**\
This is a beta version of the API. Users may encounter occasional issues or bugs.

* The beta release has limited scope and features.
* Subscription plans will be introduced once the API is out of beta.
* Not all websites are supported for scraping.

**Support**\
For additional questions or to report issues, please contact support at <hey@manojlk.work>.

***