Getting Started with Suparse

Suparse is an AI-powered document processing platform designed to turn unstructured documents—like invoices, receipts, bank statements, and purchase orders—into clean, structured JSON data.

Whether you want to process a single folder of PDFs via terminal or build a high-scale automated pipeline, Suparse provides the tools to get it done in minutes.

1. Get Your API Key

To start using Suparse, you need an API Key:

Sign in at suparse.com.
Navigate to the API Keys tab.
Generate a new key, save it securely, and export it as an environment variable:

export SUPARSE_API_KEY="your_api_key_here"
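
With the variable exported, a script can read the key at startup and fail early if it is missing. The following is a minimal Python sketch; the helper name `load_api_key` is illustrative and not part of any Suparse SDK.

```python
import os


def load_api_key(env=os.environ):
    """Return the Suparse API key from the environment, or raise a clear error.

    `load_api_key` is a hypothetical helper for illustration only.
    """
    key = env.get("SUPARSE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SUPARSE_API_KEY is not set; export it before running.")
    return key
```

Reading the key from the environment keeps it out of source control and out of shell history for individual commands.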

2. Choose Your Path

Suparse offers four ways to integrate, depending on your workflow.

💻 Command Line (CLI)

Best for: Quick tests, batch processing of local folders, shell scripts, and AI agents (Claude Code, Codex, etc.).

Install: pip install suparse or npx suparse
Quick Start:

suparse process invoice.pdf -o result.json
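
To process a whole folder, the same command can be run once per file. The sketch below builds one `suparse process <file> -o <output>` invocation per PDF and runs them in sequence. It assumes the `suparse` executable is on your PATH, that it picks up `SUPARSE_API_KEY` from the environment, and that `--with-split` is accepted by the `process` subcommand; the function names are illustrative.

```python
import subprocess
from pathlib import Path


def build_commands(folder, out_dir, with_split=False):
    """Build one `suparse process` command per PDF in `folder` (sorted by name)."""
    commands = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        out_file = Path(out_dir) / f"{pdf.stem}.json"
        cmd = ["suparse", "process", str(pdf), "-o", str(out_file)]
        if with_split:
            # ASSUMPTION: the flag is passed directly to `suparse process`.
            cmd.append("--with-split")
        commands.append(cmd)
    return commands


def run_batch(folder, out_dir, with_split=False):
    """Run the commands one by one; stops at the first non-zero exit code."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(folder, out_dir, with_split):
        subprocess.run(cmd, check=True)
```

Calling `run_batch("invoices", "results")` would write one JSON file per PDF into `results`. Keeping command construction separate from execution makes it easy to print the commands for a dry run first.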

🐍 Python SDK

Best for: Backend services (Django/FastAPI) and data processing.

Install: pip install suparse
Guide: Python SDK Reference

📦 JavaScript / TypeScript SDK

Best for: Web apps (React/Next.js), Node.js services, and Edge functions.

Install: npm install @suparse/sdk
Guide: JS/TS SDK Reference

🔌 REST API

Best for: Languages without official SDKs or custom low-level integrations.

Endpoint: https://api.suparse.com/api/v1/
Guide: API Reference

3. Core Concepts

Understanding these three features will help you get the most out of the API:

📑 Templates

Templates define what data is extracted.

Auto-Detect: If you don't specify a template, Suparse uses AI to identify the document type automatically.
Custom Templates: Create your own schemas in the Suparse UI to extract niche fields.
Reference: Run suparse templates to see what's available to your account.

✂️ Auto-Splitting (`--with-split`)

If a single upload bundles several documents, for example a 50-page PDF containing 10 invoices and 5 receipts, enable splitting. Suparse divides the file into its sub-documents and processes each one with the matching template.

SDK/CLI: Pass split=True in the SDK or --with-split on the command line.

🧹 Automatic Cleanup

For privacy-sensitive workflows, use the cleanup flag. This instructs Suparse to delete the document from our servers immediately after you download the extraction results.

4. Quick Reference: The Extraction Lifecycle

If you are using the REST API directly, the process follows a 4-step "Direct-to-Storage" flow for maximum security and performance:

Request Upload URL: POST /documents/upload-url (Get a presigned S3 link).
Upload File: PUT the raw bytes to that presigned link.
Confirm & Process: POST /documents/confirm-upload to start the AI extraction.
Poll Task: GET /tasks/{task_id} until the status is completed.

Tip: The Python and JS SDKs handle all four steps automatically with a single client.extract() call.
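
For integrations that call the REST API without an SDK, the four steps above can be sketched in Python with only the standard library. The endpoint paths and the `completed` status come from this guide; everything else is an assumption to check against the API Reference: the `Authorization: Bearer` header, the request field `filename`, the response fields `upload_url`, `document_id`, and `task_id`, and the `failed` status.

```python
import json
import os
import time
import urllib.request

BASE_URL = "https://api.suparse.com/api/v1"  # endpoint listed in this guide


def http_json(method, url, api_key, payload=None):
    """Send a JSON request and decode the JSON response (HTTP errors raise)."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")  # ASSUMPTION: auth scheme
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def put_bytes(url, body):
    """PUT the raw file bytes to the presigned link."""
    req = urllib.request.Request(url, data=body, method="PUT")
    with urllib.request.urlopen(req) as resp:
        return resp.status


def extract(path, api_key, call=http_json, upload=put_bytes,
            sleep=time.sleep, poll_seconds=2.0, max_polls=60):
    """Run the upload, confirm, and poll flow for one file."""
    # Step 1: request a presigned upload URL (field names are hypothetical).
    ticket = call("POST", f"{BASE_URL}/documents/upload-url", api_key,
                  {"filename": os.path.basename(path)})
    # Step 2: upload the raw bytes to that URL.
    with open(path, "rb") as f:
        upload(ticket["upload_url"], f.read())
    # Step 3: confirm the upload to start extraction.
    task = call("POST", f"{BASE_URL}/documents/confirm-upload", api_key,
                {"document_id": ticket["document_id"]})
    # Step 4: poll the task until it reports "completed".
    for _ in range(max_polls):
        status = call("GET", f"{BASE_URL}/tasks/{task['task_id']}", api_key)
        if status["status"] == "completed":
            return status
        if status["status"] == "failed":  # ASSUMPTION: name of the failure status
            raise RuntimeError(f"Extraction failed: {status}")
        sleep(poll_seconds)
    raise TimeoutError("Task did not complete within the polling budget.")
```

The HTTP helpers are passed in as parameters so the flow can be exercised with stubs before pointing it at the live API; a production version would also add backoff and retry handling for rate limits.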

Next Steps

Browse Templates: See what data we extract by default using the Templates API.
Error Handling: Learn how to handle Rate Limits and Auth errors.
Exporting: Need CSV or Excel instead of JSON? Check the Exports API.