Classify Document

Use this endpoint to classify a document into one of your defined categories in a single API call. No saved classifier is required — you provide the labels inline with each request. This is ideal for one-off classifications or when your label set changes per request.

For recurring automation (e.g. auto-routing every incoming document by type), consider creating a saved Classifier instead and calling Run Classification.

Prerequisite

The document must already be uploaded and OCR-processed before it can be classified. Upload the document and run an extract first to prepare it.

API Endpoint

POST https://api.documentpro.ai/v1/classify

Headers

x-api-key (required): Your API key for authentication.
Content-Type: application/json

Request Body

Field	Type	Required	Description
`document_id`	string (UUID)	Yes	The ID of the document to classify.
`classification_schema`	array	Yes	List of category objects, each with a `label` and `description`. Minimum 2 categories.
`page_range`	string	No	Pages to use for classification (e.g. `"1-3"`, `"1,3,5"`). Defaults to all pages.
`query_model`	string	No	AI model to use. Options: `"gpt-4o-mini"` (default), `"gpt-4o"`.

Each item in classification_schema must have:

label (string): The category name that will be returned if the document matches (e.g. "invoice").
description (string): A plain-English description that helps the AI understand what this category means.
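
Checking the schema locally before sending a request can catch mistakes without a round trip. The sketch below is a hypothetical helper, not part of the DocumentPro API: `validate_schema` is an illustrative name, and the duplicate-label and non-empty-string checks are assumptions about sensible input rather than documented server rules. Only the two-category minimum and the `label`/`description` fields come from the reference above.

```python
def validate_schema(schema):
    """Return a list of problems found in a classification_schema (empty if none)."""
    problems = []
    # Documented rule: the schema is a list with at least 2 categories.
    if not isinstance(schema, list) or len(schema) < 2:
        problems.append("classification_schema must be a list of at least 2 categories")
        return problems

    seen_labels = set()
    for i, item in enumerate(schema):
        if not isinstance(item, dict):
            problems.append(f"item {i}: expected an object with label and description")
            continue
        # Documented rule: each item has a string label and a string description.
        # Assumption: empty or whitespace-only strings are treated as invalid.
        for key in ("label", "description"):
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {i}: '{key}' must be a non-empty string")
        # Assumption: repeated labels are a mistake, since scores are keyed by label.
        label = item.get("label")
        if isinstance(label, str):
            if label in seen_labels:
                problems.append(f"item {i}: duplicate label '{label}'")
            seen_labels.add(label)
    return problems
```

An empty list means the schema passed these local checks; the server may still apply rules that are not covered here.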

Example Implementation

Using cURL

curl --location 'https://api.documentpro.ai/v1/classify' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
  "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
  "classification_schema": [
    { "label": "invoice", "description": "A document requesting payment for goods or services rendered" },
    { "label": "purchase_order", "description": "A buyer-issued document authorizing a purchase from a supplier" },
    { "label": "contract", "description": "A legally binding agreement between two or more parties" },
    { "label": "other", "description": "Any document that does not fit the above categories" }
  ],
  "page_range": "1-2",
  "query_model": "gpt-4o-mini"
}'

Using Python

import requests

url = "https://api.documentpro.ai/v1/classify"

headers = {
    "x-api-key": "YOUR_API_KEY",
    "Content-Type": "application/json",
}

# Labels are supplied inline with the request; at least 2 categories are required.
payload = {
    "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
    "classification_schema": [
        { "label": "invoice", "description": "A document requesting payment for goods or services rendered" },
        { "label": "purchase_order", "description": "A buyer-issued document authorizing a purchase from a supplier" },
        { "label": "contract", "description": "A legally binding agreement between two or more parties" },
        { "label": "other", "description": "Any document that does not fit the above categories" }
    ],
    "page_range": "1-2",
    "query_model": "gpt-4o-mini"
}

# json= serializes the payload; the timeout stops a stalled request from hanging forever.
response = requests.post(url, headers=headers, json=payload, timeout=30)

if response.status_code == 200:
    result = response.json()
    print(f"Classification: {result['classification']}")
    print(f"Confidence scores: {result['confidence_scores']}")
else:
    # Error bodies carry "error" and "message" fields (see Error Response).
    print(f"Classification failed (HTTP {response.status_code})")
    print(response.text)

Response

Successful Response (Status Code: 200)

{
    "request_id": "a7813466-6f9a-4c33-8128-427e7a4df755",
    "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
    "classifier_id": null,
    "classification": "invoice",
    "confidence_scores": {
        "invoice": 0.9312,
        "purchase_order": 0.0421,
        "contract": 0.0198,
        "other": 0.0069
    },
    "request_status": "completed",
    "credits_used": 2,
    "num_pages": 2,
    "duration": 1.4
}

Error Response (Status Codes: 400, 403, 404, 500)

{
    "success": false,
    "error": "error_code",
    "message": "descriptive error message"
}
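
One way to surface the error shape above in client code is to turn it into an exception. This is a minimal sketch: `ClassifyError` and `parse_classify_response` are hypothetical names, and the `"unknown_error"` fallback is an assumption for bodies that lack the documented fields.

```python
class ClassifyError(Exception):
    """Raised when /v1/classify returns a non-200 status."""

    def __init__(self, status_code, error, message):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.error = error
        self.message = message


def parse_classify_response(status_code, body):
    """Return the body on HTTP 200, otherwise raise ClassifyError."""
    if status_code == 200:
        return body
    # Assumption: fall back to placeholders if the error body is not the documented shape.
    body = body if isinstance(body, dict) else {}
    raise ClassifyError(
        status_code,
        body.get("error", "unknown_error"),
        body.get("message", ""),
    )
```

With the `requests` example above, this would be called as `parse_classify_response(response.status_code, response.json())`. Note that `response.json()` itself raises if the body is not valid JSON, so callers may want to guard that call as well.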

Response Fields Explained

request_id: Unique identifier for this classification run. Store it if you want to audit classification history.
document_id: The document that was classified.
classifier_id: Always null for inline classification. Populated when using a saved classifier.
classification: The single best-matching label from your classification_schema.
confidence_scores: A score between 0.0 and 1.0 for every label you provided. All scores sum to approximately 1.0.
request_status: "completed" on success, "failed" or "exception" on error.
credits_used: Credits deducted from your plan for this classification.
num_pages: Number of pages in the document (or in the page_range if specified).
duration: Time in seconds the classification took.
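
Because `confidence_scores` includes a score for every label, a caller can decide when to trust `classification` and when to send the document for manual review. The sketch below is one possible policy, not a recommendation from the API: `pick_label`, the `0.8` threshold, and the `"needs_review"` fallback are all illustrative assumptions.

```python
def pick_label(result, min_confidence=0.8, fallback="needs_review"):
    """Return the predicted label, or the fallback if the run failed or confidence is low."""
    # Only a "completed" run carries a usable classification.
    if result.get("request_status") != "completed":
        return fallback
    label = result["classification"]
    # Look up the score of the winning label; treat a missing score as zero.
    score = result["confidence_scores"].get(label, 0.0)
    return label if score >= min_confidence else fallback


# Fields taken from the sample 200 response above.
sample = {
    "classification": "invoice",
    "confidence_scores": {
        "invoice": 0.9312,
        "purchase_order": 0.0421,
        "contract": 0.0198,
        "other": 0.0069,
    },
    "request_status": "completed",
}
print(pick_label(sample))
```

A suitable threshold depends on how costly a wrong route is for your workflow, so it is worth tuning against documents you have already labeled.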

Important Notes

The document must have been processed through the Extract pipeline before classifying. If the document has not been OCR'd, you will receive a 400 error.
You must provide at least 2 labels in classification_schema.
Write clear, distinct descriptions for each label. The more specific the description, the more accurate the classification.
Use page_range to limit classification to the most informative pages of long documents — this reduces credit usage and improves speed.
Each call to /v1/classify consumes credits based on the number of pages processed.
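
If the pages you want to classify are computed in code, a small helper can produce a `page_range` string in one of the two documented forms (`"1-3"` or `"1,3,5"`). This is a sketch under assumptions: `to_page_range` is a hypothetical name, page numbers are assumed to be 1-based as the examples suggest, and a single page is emitted as a bare number (e.g. `"2"`), which the reference does not explicitly show. Mixed forms such as `"1-3,5"` are deliberately not generated, since they are not documented.

```python
def to_page_range(pages):
    """Format page numbers as 'first-last' when contiguous, else as a comma-separated list."""
    unique_pages = sorted(set(pages))
    if not unique_pages or unique_pages[0] < 1:
        raise ValueError("pages must be a non-empty collection of 1-based page numbers")
    # Contiguous means there are no gaps between the first and last page.
    contiguous = unique_pages[-1] - unique_pages[0] + 1 == len(unique_pages)
    if contiguous and len(unique_pages) > 1:
        return f"{unique_pages[0]}-{unique_pages[-1]}"
    return ",".join(str(p) for p in unique_pages)
```

For example, `to_page_range([1, 2])` yields the `"1-2"` value used in the request examples above.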

Next Steps

Create a saved Classifier to reuse a label set across many documents without repeating it in every request.
Run Classification with a saved Classifier for automated document routing pipelines.
