TXT Parser

Text File Processing Guide

The Document Parser API supports processing of plain text files.

Endpoint

POST /documents/parse

Text File Configuration

{
    "url": "string",             // URL of the text file
    "type": "txt",               // Specify "txt" for text files
    "output": "plain|markdown",  // Output format: "plain" or "markdown"
    "webhook": "string"          // Optional webhook URL
}

Text Processing Features

  • UTF-8 encoding (default)

  • Fallback to Latin-1 encoding

  • Form feed character (\f) recognition for page breaks

  • Maximum file size: 50MB
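The decoding and page-splitting behavior listed above can be approximated on the client side, for example to preview how a file would be split before submitting it. This is a minimal Python sketch, not the service's implementation: the function name `decode_and_split` is hypothetical, and reading "50MB" as 50 * 1024 * 1024 bytes is an assumption.

```python
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB


def decode_and_split(raw: bytes) -> list[str]:
    """Decode as UTF-8, fall back to Latin-1, then split pages on form feeds."""
    if len(raw) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 50MB limit")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this fallback cannot fail.
        text = raw.decode("latin-1")
    # Each form feed (\f) marks a page break.
    return text.split("\f")
```

Because Latin-1 accepts any byte sequence, a file in some other encoding would still decode under this sketch, but its non-ASCII characters would likely come out wrong.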

Example Request

curl --location 'https://api.yetanotherapi.com/documents/parse' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "url": "https://example.com/document.txt",
    "type": "txt",
    "output": "markdown"
}'
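The same request can be built from Python using only the standard library. This is a sketch that mirrors the curl command above. The helper name `build_parse_request` is hypothetical, and the actual network call is left commented out because it needs a valid API key.

```python
import json
import urllib.request

API_URL = "https://api.yetanotherapi.com/documents/parse"


def build_parse_request(api_key: str, file_url: str,
                        output: str = "markdown") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"url": file_url, "type": "txt", "output": output}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_parse_request("YOUR_API_KEY", "https://example.com/document.txt")
    # Uncomment to send the request with a real key:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
    print(req.get_method(), req.full_url)
```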

Markdown Output Features

When output is set to "markdown":

  1. Lines ending with ':' are converted to H3 headers

  2. Short lines (fewer than 50 characters) at the start of a paragraph become H2 headers

  3. Empty lines create paragraph breaks

  4. Form feeds create page breaks
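To get a feel for how these four rules interact, they can be sketched in Python. This is an illustration of the rules as listed, not the parser's actual code. Several details are assumptions: the function name `to_markdown`, rendering a page break as `---`, keeping the trailing colon in H3 headers, and letting the colon rule take precedence over the short-line rule.

```python
import re


def to_markdown(text: str) -> str:
    """Apply the four documented rules to plain text (illustrative only)."""
    blocks = []
    text = text.replace("\r\n", "\n")
    for page_no, page in enumerate(text.split("\f")):
        if page_no > 0:
            blocks.append("---")  # assumption: page break shown as a rule
        # Rule 3: empty lines separate paragraphs.
        for para in re.split(r"\n\s*\n", page):
            lines = [ln.strip() for ln in para.split("\n") if ln.strip()]
            if not lines:
                continue
            rendered = []
            for i, line in enumerate(lines):
                if line.endswith(":"):
                    rendered.append("### " + line)   # rule 1
                elif i == 0 and len(line) < 50:
                    rendered.append("## " + line)    # rule 2
                else:
                    rendered.append(line)
            blocks.append("\n".join(rendered))
    return "\n\n".join(blocks)
```

Under this reading, any short single-line paragraph is promoted to an H2 header, so it is worth checking the output for headers that were meant to be body text.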

Response Format

{
    "requestId": "string",
    "status": "COMPLETED",
    "data": "Processed text content with optional markdown formatting"
}
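A client would typically check `status` before reading `data`. This is a minimal Python sketch of that check. The function name `extract_text` is hypothetical. Only the "COMPLETED" status appears in this guide, so every other value is treated as not ready here, which is an assumption.

```python
import json


def extract_text(response_body: str) -> str:
    """Return the processed text, or raise if the request is not completed."""
    payload = json.loads(response_body)
    status = payload.get("status")
    if status != "COMPLETED":
        # Other status values are not documented here; treat them as not ready.
        raise RuntimeError(
            f"request {payload.get('requestId')} not completed: status={status!r}"
        )
    return payload["data"]
```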

Text-Specific Limitations

  1. Binary files are not supported

  2. Maximum line length: 1MB

  3. Maximum number of lines: 1,000,000

  4. Encodings other than UTF-8 and Latin-1 may be decoded incorrectly

  5. Control characters (except \f) are stripped
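These limits can be checked locally before submitting a file. The following Python sketch makes several assumptions: "1MB" is read as 1024 * 1024 bytes of UTF-8, newline, carriage return, and tab are kept alongside \f when control characters are stripped, and the names `check_limits` and `strip_control_chars` are hypothetical.

```python
import re

MAX_LINE_BYTES = 1024 * 1024  # assumption: "1MB" read as 1 MiB
MAX_LINES = 1_000_000

# Assumption: \n, \r, and \t are kept along with \f, since removing them
# would destroy the line structure.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0e-\x1f\x7f]")


def check_limits(text: str) -> list[str]:
    """Return a list of limit violations (empty if the text is within limits)."""
    problems = []
    lines = text.split("\n")
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines exceeds {MAX_LINES}")
    for number, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            problems.append(f"line {number} exceeds {MAX_LINE_BYTES} bytes")
    return problems


def strip_control_chars(text: str) -> str:
    """Remove control characters except form feed, newline, CR, and tab."""
    return _CONTROL_CHARS.sub("", text)
```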
