YetAnotherAPI Documentation

TXT Parser

Text File Processing Guide

The Document Parser API processes plain text (.txt) files.

Endpoint

POST /documents/parse

Text File Configuration

{
    "url": "string",              // URL of the text file
    "type": "txt",                // Specify "txt" for text files
    "output": "plain|markdown",   // Output format: "plain" or "markdown"
    "webhook": "string"           // Optional webhook URL
}
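As an illustration of the configuration above, the following Python sketch assembles the request body and rejects output values other than the two listed. The helper name build_txt_payload is hypothetical and not part of the API; the sketch assumes the webhook field can simply be omitted when it is not used.

```python
import json

# The two output values listed in the configuration above.
ALLOWED_OUTPUTS = {"plain", "markdown"}


def build_txt_payload(url, output, webhook=None):
    """Return the JSON body for a TXT parse request as a dict.

    Hypothetical helper for illustration; not part of the API itself.
    """
    if output not in ALLOWED_OUTPUTS:
        raise ValueError(f"output must be one of {sorted(ALLOWED_OUTPUTS)}")
    payload = {"url": url, "type": "txt", "output": output}
    if webhook is not None:
        # Assumption: the optional webhook is left out entirely when unused.
        payload["webhook"] = webhook
    return payload


if __name__ == "__main__":
    body = build_txt_payload("https://example.com/document.txt", "markdown")
    print(json.dumps(body, indent=4))
```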

Text Processing Features

  • Files are decoded as UTF-8 by default

  • Falls back to Latin-1 when a file cannot be decoded as UTF-8

  • Form feed characters (\f) are recognized as page breaks

  • Maximum file size: 50 MB
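The encoding and page-break behavior listed above can be approximated locally, for example to preview how a file is likely to be read before submitting it. This is a minimal sketch of the general technique and not the service's actual implementation; the function names decode_text and split_pages are hypothetical.

```python
def decode_text(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to Latin-1 on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte value to a character, so this cannot fail,
        # although the result may be wrong for other encodings.
        return raw.decode("latin-1")


def split_pages(text: str) -> list[str]:
    """Split text into pages at form feed characters."""
    return text.split("\f")


if __name__ == "__main__":
    raw = "Page one\fPage two".encode("utf-8")
    for number, page in enumerate(split_pages(decode_text(raw)), start=1):
        print(number, page)
```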

Example Request

curl --location 'https://api.yetanotherapi.com/documents/parse' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "url": "https://example.com/document.txt",
    "type": "txt",
    "output": "markdown"
}'
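The same request can be made from Python using only the standard library. The sketch below mirrors the curl command above; the function names and the 30-second timeout are assumptions made for illustration, and YOUR_API_KEY is a placeholder as in the curl example.

```python
import json
import urllib.request

API_URL = "https://api.yetanotherapi.com/documents/parse"


def build_parse_request(api_key: str, file_url: str, output: str = "markdown"):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"url": file_url, "type": "txt", "output": output})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response.

    The timeout value is an arbitrary choice, not an API requirement.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_parse_request("YOUR_API_KEY", "https://example.com/document.txt")
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment to call the API with a real key
```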

Markdown Output Features

When output is set to "markdown":

  1. Lines ending with ':' are converted to H3 headers

  2. Short lines (fewer than 50 characters) at the start of a paragraph become H2 headers

  3. Empty lines create paragraph breaks

  4. Form feeds create page breaks
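To make the four rules above concrete, here is a rough Python sketch that applies them literally to a string. It is an approximation for illustration, not the parser's real code. Several details are assumptions because the rules do not specify them: the colon rule is checked before the short-line rule, the trailing colon is kept in the header text, and a page break is rendered as a horizontal rule (---).

```python
def to_markdown(text: str) -> str:
    """Apply the four documented heuristics to plain text (illustrative only)."""
    pages = []
    for page in text.split("\f"):          # rule 4: form feeds separate pages
        out = []
        at_paragraph_start = True
        for line in page.splitlines():
            stripped = line.strip()
            if not stripped:               # rule 3: empty line ends a paragraph
                out.append("")
                at_paragraph_start = True
                continue
            if stripped.endswith(":"):     # rule 1 (assumed to take precedence)
                out.append(f"### {stripped}")
            elif at_paragraph_start and len(stripped) < 50:   # rule 2
                out.append(f"## {stripped}")
            else:
                out.append(stripped)
            at_paragraph_start = False
        pages.append("\n".join(out).strip("\n"))
    # Assumption: a page break is shown as a Markdown horizontal rule.
    return "\n\n---\n\n".join(pages)


if __name__ == "__main__":
    sample = "Title\nA longer line of body text follows the title here.\n\nNotes:\nitem\fNext page"
    print(to_markdown(sample))
```

Note that a literal reading of rule 2 turns any short single-line paragraph into a header, so the actual service output may differ from this sketch.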

Response Format

{
    "requestId": "string",
    "status": "COMPLETED",
    "data": "Processed text content with optional markdown formatting"
}
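A client would typically check the status field before reading data. The following sketch shows one way to do that in Python. Only the COMPLETED status appears on this page, so the code treats every other value as "not ready" without guessing what those values are; the function name extract_text is hypothetical.

```python
import json


def extract_text(response_body: str):
    """Return (request_id, text); text is None unless status is COMPLETED."""
    payload = json.loads(response_body)
    request_id = payload.get("requestId")
    if payload.get("status") == "COMPLETED":
        return request_id, payload.get("data")
    # Any other status is treated as "no result yet" in this sketch.
    return request_id, None


if __name__ == "__main__":
    body = '{"requestId": "abc", "status": "COMPLETED", "data": "Hello"}'
    print(extract_text(body))
```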

Text-Specific Limitations

  1. Binary files are not supported

  2. Maximum line length: 1 MB

  3. Maximum number of lines: 1,000,000

  4. Encodings other than UTF-8 and Latin-1 may not decode correctly

  5. Control characters (except \f) are stripped
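The limits above can be checked on the client side before a file is uploaded to the URL you submit. The sketch below is a hypothetical pre-flight check, not something the API provides. It assumes 1 MB means 1024 * 1024 bytes and uses the presence of NUL bytes as a rough signal that a file is binary; both are assumptions.

```python
MAX_FILE_BYTES = 50 * 1024 * 1024   # assumes 1 MB = 1024 * 1024 bytes
MAX_LINE_BYTES = 1024 * 1024
MAX_LINES = 1_000_000


def check_limits(raw: bytes) -> list[str]:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if len(raw) > MAX_FILE_BYTES:
        problems.append("file exceeds 50 MB")
    lines = raw.splitlines()
    if len(lines) > MAX_LINES:
        problems.append("more than 1,000,000 lines")
    if any(len(line) > MAX_LINE_BYTES for line in lines):
        problems.append("a line exceeds 1 MB")
    if b"\x00" in raw:
        # Heuristic only: NUL bytes rarely appear in plain text files.
        problems.append("contains NUL bytes; file may be binary")
    return problems


if __name__ == "__main__":
    print(check_limits(b"hello\nworld\n"))
```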

Last updated 5 months ago
