Document Parser API Documentation
Base URL: https://api.yetanotherapi.com
Overview
The Document Parser API allows you to extract text content from various document formats including PDF, Word documents, and images. The service provides both synchronous and asynchronous processing with optional webhook notifications for completion.
Authentication
All API requests require an API key sent in a request header.
API Endpoints
Submit Document for Processing
Submit a document for text extraction.
Endpoint: POST /documents/parse
Headers:
Request Body:
Supported File Types:
pdf: PDF documents
doc: Microsoft Word documents (.doc)
docx: Microsoft Word documents (.docx)
jpg/jpeg: JPEG images
png: PNG images
txt: Plain text files
Output Formats:
plain: Plain text (default)
markdown: Formatted markdown text
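Checking these values on the client before submitting can avoid a round trip that ends in a 400 response. The following is a minimal sketch, not part of the API itself: the function name `validate_options` is hypothetical, case-insensitive matching is an assumption, and the returned strings are the error codes listed under Error Codes and Descriptions.

```python
# Values copied from the Supported File Types and Output Formats lists.
SUPPORTED_FILE_TYPES = {"pdf", "doc", "docx", "jpg", "jpeg", "png", "txt"}
OUTPUT_FORMATS = {"plain", "markdown"}


def validate_options(file_type, output_format="plain"):
    """Return the documented error codes this input would likely trigger."""
    errors = []
    # Assumption: values are compared case-insensitively.
    if file_type.lower() not in SUPPORTED_FILE_TYPES:
        errors.append("400-001")  # Invalid file type
    if output_format.lower() not in OUTPUT_FORMATS:
        errors.append("400-005")  # Invalid output format
    return errors


print(validate_options("pdf"))          # supported type, default format
print(validate_options("gif", "html"))  # neither value is supported
```

An empty list means neither check failed; the server may still reject the request for other reasons.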
Response:
Quick Processing (< 20 seconds):
Async Processing (> 20 seconds):
Error Response:
Status Codes:
200: Success (processing completed)
202: Accepted (processing continues asynchronously)
400: Bad Request (invalid input)
401: Unauthorized (invalid API key)
500: Internal Server Error
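A client typically branches on these status codes. The sketch below shows one possible mapping from status code to next action. The function name and the action labels are hypothetical, and treating a 500 as worth retrying is an assumption rather than documented behavior.

```python
def next_step(status_code):
    """Map a documented HTTP status code to a client-side action label."""
    if status_code == 200:
        return "read_result"       # processing completed in this response
    if status_code == 202:
        return "await_completion"  # processing continues asynchronously
    if status_code in (400, 401):
        return "fix_request"       # invalid input or invalid API key
    if status_code == 500:
        return "retry_later"       # assumption: server errors may be transient
    return "unknown"               # status code not listed in this document


for code in (200, 202, 401, 500):
    print(code, next_step(code))
```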
Example Usage
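A request to POST /documents/parse can be assembled in Python with the standard library, as in the rough sketch below. Several details are assumptions because this document does not state them: that the body is JSON, the name of the API key header, and the request field names. The header name and payload are therefore passed in by the caller, and the values used here (`X-API-Key`, `file_type`, `output_format`) are hypothetical placeholders.

```python
import json
import urllib.request

BASE_URL = "https://api.yetanotherapi.com"


def build_parse_request(api_key, api_key_header, payload):
    """Build (but do not send) a POST /documents/parse request.

    The header name and payload keys come from the caller because their
    exact names are not given in this document. A JSON body is assumed.
    """
    return urllib.request.Request(
        url=BASE_URL + "/documents/parse",
        data=json.dumps(payload).encode("utf-8"),
        headers={api_key_header: api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical header name and field names, for illustration only.
request = build_parse_request(
    api_key="YOUR_API_KEY",
    api_key_header="X-API-Key",
    payload={"file_type": "pdf", "output_format": "markdown"},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=30)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made.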
cURL Example
Error Codes and Descriptions
400-001: Invalid file type
400-002: Invalid URL format
400-003: Invalid webhook URL
400-004: Missing required field
400-005: Invalid output format
401-001: Invalid API key
429-001: Rate limit exceeded
500-001: Processing error
500-002: Storage error
500-003: Webhook delivery failed
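These codes can be kept in a lookup table on the client. The sketch below assumes that the part of each code before the hyphen is the HTTP status, which the list suggests but does not state. The function name `describe_error` is hypothetical.

```python
# Codes and descriptions copied from the list above.
ERROR_CODES = {
    "400-001": "Invalid file type",
    "400-002": "Invalid URL format",
    "400-003": "Invalid webhook URL",
    "400-004": "Missing required field",
    "400-005": "Invalid output format",
    "401-001": "Invalid API key",
    "429-001": "Rate limit exceeded",
    "500-001": "Processing error",
    "500-002": "Storage error",
    "500-003": "Webhook delivery failed",
}


def describe_error(code):
    """Split a code such as '400-001' into (HTTP status, description)."""
    # Assumption: the prefix before the hyphen is the HTTP status code.
    status_text, _, _ = code.partition("-")
    description = ERROR_CODES.get(code, "Unknown error code")
    return int(status_text), description


print(describe_error("429-001"))
```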
Notes
Processing time varies based on document size and complexity
Files are stored temporarily and deleted after 7 days
Webhook endpoints should respond within 30 seconds
All timestamps are in Unix epoch format
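The timestamp and retention notes can be combined to estimate when a stored file will be deleted. This is a small sketch under two assumptions: that timestamps are in seconds (not milliseconds), and that the 7-day period starts at the time the file was stored. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # files are deleted after 7 days


def from_epoch(timestamp):
    """Convert a Unix epoch timestamp (assumed to be in seconds) to UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def estimated_deletion(stored_at):
    """Estimate the deletion time for a file stored at `stored_at`."""
    # Assumption: the retention period is counted from the storage time.
    return from_epoch(stored_at) + RETENTION


print(from_epoch(1700000000).isoformat())
print(estimated_deletion(1700000000).isoformat())
```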