# Web Scraper \[deprecated]

## **Web Scraper API Documentation**

**Overview**\
The Web Scraper API allows you to extract specific information from web pages using natural language prompts. It combines web scraping with optional Large Language Model (LLM) processing to provide structured data based on your requirements.

**Cost**

* Cost per API call: 1 credit

**Base URL**\
`https://api.yetanotherapi.com/v1/llm-web-scrapper`

**Authentication**\
Authentication is required for all API requests. Use your API key in the `x-api-key` header.\
`x-api-key: YOUR_API_KEY_HERE`

**Endpoints**\
**Scrape Web Page**\
Scrapes a specified URL and extracts information based on a given prompt.

* **HTTP Method**: POST
* **Endpoint**: /

**Request Headers**

* `x-api-key`: YOUR\_API\_KEY\_HERE
* `Content-Type`: application/json

**Request Body**

* `url`: The URL of the web page to scrape.
* `prompt`: The natural language prompt specifying the information to extract.
* `use_llm`: A boolean value indicating whether to use LLM for processing.
* `webhook`: (Optional) A webhook URL to send the response to.
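
The four fields above map directly onto a JSON object. The following is a minimal Python sketch of assembling that body, not an official client. The helper names `build_scrape_payload` and `_require_http_url` are hypothetical, and the client-side checks are assumptions, since the API's own validation rules are not documented here.

```python
from typing import Optional
from urllib.parse import urlparse


def _require_http_url(name: str, value: str) -> None:
    """Reject values that are not absolute http(s) URLs (client-side check only)."""
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"{name} must be an absolute http(s) URL, got {value!r}")


def build_scrape_payload(url: str, prompt: str, use_llm: bool,
                         webhook: Optional[str] = None) -> dict:
    """Assemble the request body from the documented fields."""
    _require_http_url("url", url)
    if not prompt.strip():
        raise ValueError("prompt must not be empty")

    payload = {"url": url, "prompt": prompt, "use_llm": bool(use_llm)}
    if webhook is not None:
        # `webhook` is optional, so it is left out entirely when unused.
        _require_http_url("webhook", webhook)
        payload["webhook"] = webhook
    return payload
```

Omitting `webhook` rather than sending `null` keeps the body limited to the documented fields.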

**Example Request**

```bash
curl --location 'https://api.yetanotherapi.com/v1/llm-web-scrapper' \
--header 'x-api-key: YOUR_API_KEY_HERE' \
--header 'Content-Type: application/json' \
--data '{
  "url": "https://www.amazon.in/AMVR-Controller-Compatible-Accessories-Adjustable/dp/B0CJRK7B8J/ref=pd_rhf_gw_s_pd_crcd_d_sccl_1_3/261-2157292-0625645",
  "prompt": "product name and 5 of its features",
  "use_llm": true,
  "webhook": "https://connect.pabbly.com/workflow/sendwebhookdata/IjU3NjYwNTZkMDYzNTA0M2M1MjZiNTUzNjUxMzYi_pc"
}'
```
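
For callers who prefer Python over curl, the same POST can be sketched with the standard library alone. This is an illustrative equivalent under stated assumptions, not an official SDK: the function names `build_scrape_request` and `send` are hypothetical, and the 30-second client timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "https://api.yetanotherapi.com/v1/llm-web-scrapper"


def build_scrape_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request with the two documented headers."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `send(build_scrape_request("YOUR_API_KEY_HERE", {"url": "...", "prompt": "...", "use_llm": True}))`.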

**Response**\
The API attempts to process the scraped content and return JSON within 20 seconds. If processing takes longer, the API returns a request ID instead. To retrieve the output, append that request ID to the base URL and send a GET request, as in the example that follows. If a `webhook` URL was provided, the API also sends the response to that URL.

**Example Result Request**

```bash
curl --location 'https://api.yetanotherapi.com/v1/llm-web-scrapper/c07ab203-1b4a-42f2-99dd-6628268668d2' \
--header 'x-api-key: YOUR_API_KEY_HERE'
```
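
When only a request ID comes back, the result has to be fetched later. The following is a rough polling sketch in Python. It assumes that a pending response lacks the `llm_json_structure` key and that the pending state is returned with a successful status code. Neither is confirmed by this page, and the key used for results when `use_llm` is `false` is also not documented. The names `fetch_result` and `wait_for_result`, the 5-second interval, and the attempt limit are all hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.yetanotherapi.com/v1/llm-web-scrapper"


def fetch_result(request_id: str, api_key: str) -> dict:
    """GET the result for one request ID (performs network I/O)."""
    request = urllib.request.Request(
        f"{BASE_URL}/{request_id}", headers={"x-api-key": api_key}
    )
    with urllib.request.urlopen(request, timeout=30.0) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_result(request_id: str,
                    fetch: Callable[[str], dict],
                    interval: float = 5.0,
                    max_attempts: int = 12,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll `fetch` until a result appears or the attempts run out."""
    for attempt in range(max_attempts):
        result = fetch(request_id)
        # ASSUMPTION: a finished job includes `llm_json_structure`.
        if "llm_json_structure" in result:
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"no result for {request_id} after {max_attempts} attempts")
```

Usage might look like `wait_for_result(request_id, lambda rid: fetch_result(rid, "YOUR_API_KEY_HERE"))`. Passing `fetch` and `sleep` as parameters keeps the loop testable without network access.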

**Response Structure**\
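The API returns a JSON object containing the `request_id` and, under `llm_json_structure`, the extracted data. The keys inside `llm_json_structure` depend on the prompt: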

**Example Response**

```json
{
    "request_id": "c03995eb-e117-4eca-85c8-e6d398a968d9",
    "llm_json_structure": {
        "dowJonesIndexValue": 42313.0
    }
}
```
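
A response of this shape can be read with a few lines of Python. This is a small sketch under stated assumptions: the `ScrapeResult` type and `parse_scrape_response` function are hypothetical names, and treating a missing `llm_json_structure` as "not ready yet" is an assumption rather than documented behavior.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrapeResult:
    request_id: str
    data: Optional[dict]  # None when `llm_json_structure` is absent

    @property
    def is_ready(self) -> bool:
        # ASSUMPTION: extracted data present means processing has finished.
        return self.data is not None


def parse_scrape_response(raw: str) -> ScrapeResult:
    """Turn the raw JSON text of a response into a ScrapeResult."""
    body = json.loads(raw)
    return ScrapeResult(
        request_id=body["request_id"],
        data=body.get("llm_json_structure"),
    )
```

Applied to the example response above, `parse_scrape_response(...).data["dowJonesIndexValue"]` gives `42313.0`.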

**Error Handling**\
If a request fails, the API returns a JSON object containing an error message (`error`) and the HTTP status code (`status_code`).

**Example Error Response**

```json
{
  "error": "Invalid URL format",
  "status_code": 400
}
```
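
One way a client might surface such an error body is to convert it into an exception. The sketch below is illustrative only. `ScraperApiError` and `raise_for_error` are hypothetical names, and the check relies on the `error` and `status_code` keys shown in the example above.

```python
from typing import Optional


class ScraperApiError(Exception):
    """Raised when a decoded response body carries an `error` field."""

    def __init__(self, message: str, status_code: Optional[int]) -> None:
        super().__init__(f"{status_code}: {message}")
        self.message = message
        self.status_code = status_code


def raise_for_error(body: dict) -> dict:
    """Return the body unchanged, or raise if it describes an error."""
    if "error" in body:
        raise ScraperApiError(body["error"], body.get("status_code"))
    return body
```

Calling `raise_for_error` on every decoded response keeps the success path free of repeated `if "error" in body` checks.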

**Rate Limiting**\
No rate limits are enforced during the beta phase. However, design your application to handle rate limiting that may be introduced later.
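
One common way to prepare for future rate limiting is to retry with exponential backoff. The sketch below assumes that a rate-limited call would be signalled by HTTP status 429 ("Too Many Requests"). This page does not say how the API would report rate limiting, so that status, the function name `call_with_backoff`, and the delay values are all assumptions.

```python
import time
from typing import Callable, Tuple

# ASSUMPTION: the API would use the standard 429 status for rate limiting.
RATE_LIMIT_STATUS = 429


def call_with_backoff(call: Callable[[], Tuple[int, dict]],
                      max_retries: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, dict]:
    """Run `call`, retrying with doubling delays while it is rate limited.

    `call` is any zero-argument function returning (status_code, body).
    """
    for attempt in range(max_retries + 1):
        status, body = call()
        if status != RATE_LIMIT_STATUS:
            return status, body
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... by default
    # Retries exhausted: hand the last rate-limited response to the caller.
    return status, body
```

Doubling the delay after each attempt backs off quickly under sustained limiting while keeping the first retry short.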

**Beta Version Notice**\
This is a beta version of the API. Users may encounter occasional issues or bugs.

* The beta release has limited scope and features.
* Subscription plans will be introduced once the API is out of beta.
* Not all websites are supported for scraping.

**Support**\
For additional questions or to report issues, please contact support at <hey@manojlk.work>.

***


