Classify Document
Use this endpoint to classify a document into one of your defined categories in a single API call. No saved classifier is required — you provide the labels inline with each request. This is ideal for one-off classifications or when your label set changes per request.
For recurring automation (e.g. auto-routing every incoming document by type), consider creating a saved Classifier instead and calling Run Classification.
Prerequisite
The document must already be uploaded and OCR-processed before it can be classified. Upload the document and run an extract first to prepare it.
API Endpoint
POST https://api.documentpro.ai/v1/classify
Headers
- x-api-key (required): Your API key for authentication.
- Content-Type: application/json
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| document_id | string (UUID) | Yes | The ID of the document to classify. |
| classification_schema | array | Yes | List of category objects, each with a label and description. Minimum 2 categories. |
| page_range | string | No | Pages to use for classification (e.g. "1-3", "1,3,5"). Defaults to all pages. |
| query_model | string | No | AI model to use. Options: "gpt-4o-mini" (default), "gpt-4o". |
Each item in classification_schema must have:
- label (string): The category name that will be returned if the document matches (e.g. "invoice").
- description (string): A plain-English description that helps the AI understand what this category means.
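The schema rules above can be checked locally before a request is sent. The following is a minimal sketch, not part of the API: `validate_classification_schema` is a hypothetical helper, and the checks for non-empty strings and duplicate labels are assumptions added for illustration (the documentation only states that each item needs a label and a description, and that at least 2 categories are required).

```python
from typing import Any


def validate_classification_schema(schema: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems; an empty list means the schema looks valid."""
    problems: list[str] = []

    # Documented rule: at least 2 categories.
    if not isinstance(schema, list) or len(schema) < 2:
        problems.append("classification_schema must be a list of at least 2 categories")
        return problems

    seen_labels: set[str] = set()
    for index, item in enumerate(schema):
        if not isinstance(item, dict):
            problems.append(f"item {index} is not an object")
            continue

        # Documented rule: each item has a string label and a string description.
        # Assumption: empty or whitespace-only strings are treated as missing.
        for key in ("label", "description"):
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: '{key}' must be a non-empty string")

        # Assumption: duplicate labels are rejected because they would be ambiguous.
        label = item.get("label")
        if isinstance(label, str):
            if label in seen_labels:
                problems.append(f"item {index}: duplicate label '{label}'")
            seen_labels.add(label)

    return problems
```

Calling it on the schema from the examples returns an empty list; a single-category schema returns one problem describing the 2-category minimum.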
Example Implementation
Using cURL
curl --location 'https://api.documentpro.ai/v1/classify' \
  --header 'x-api-key: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
    "classification_schema": [
      { "label": "invoice", "description": "A document requesting payment for goods or services rendered" },
      { "label": "purchase_order", "description": "A buyer-issued document authorizing a purchase from a supplier" },
      { "label": "contract", "description": "A legally binding agreement between two or more parties" },
      { "label": "other", "description": "Any document that does not fit the above categories" }
    ],
    "page_range": "1-2",
    "query_model": "gpt-4o-mini"
  }'
Using Python
import json

import requests

url = "https://api.documentpro.ai/v1/classify"
headers = {
    'x-api-key': 'YOUR_API_KEY',
    'Content-Type': 'application/json'
}
payload = {
    "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
    "classification_schema": [
        { "label": "invoice", "description": "A document requesting payment for goods or services rendered" },
        { "label": "purchase_order", "description": "A buyer-issued document authorizing a purchase from a supplier" },
        { "label": "contract", "description": "A legally binding agreement between two or more parties" },
        { "label": "other", "description": "Any document that does not fit the above categories" }
    ],
    "page_range": "1-2",
    "query_model": "gpt-4o-mini"
}

# Send the classification request with the labels supplied inline.
response = requests.post(url, headers=headers, data=json.dumps(payload))

if response.status_code == 200:
    result = response.json()
    print(f"Classification: {result['classification']}")
    print(f"Confidence scores: {result['confidence_scores']}")
else:
    # Non-200 responses carry an error code and message in the body.
    print('Classification failed')
    print(response.text)
Response
Successful Response (Status Code: 200)
{
  "request_id": "a7813466-6f9a-4c33-8128-427e7a4df755",
  "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
  "classifier_id": null,
  "classification": "invoice",
  "confidence_scores": {
    "invoice": 0.9312,
    "purchase_order": 0.0421,
    "contract": 0.0198,
    "other": 0.0069
  },
  "request_status": "completed",
  "credits_used": 2,
  "num_pages": 2,
  "duration": 1.4
}
Error Response (Status Codes: 400, 403, 404, 500)
{
  "success": false,
  "error": "error_code",
  "message": "descriptive error message"
}
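The error body shown above can be turned into a single log-friendly line. This is a minimal sketch under stated assumptions: `summarize_error` is a hypothetical helper, and the fallback for non-JSON bodies is an assumption, since the documentation only shows the JSON error shape.

```python
import json


def summarize_error(status_code: int, body_text: str) -> str:
    """Build a one-line summary from an error response's status code and raw body."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = None

    # Assumption: some failures (e.g. from a proxy) may not return the JSON error shape.
    if not isinstance(body, dict):
        return f"HTTP {status_code}: {body_text.strip() or 'no response body'}"

    # Documented fields: "error" holds the error code, "message" the description.
    code = body.get("error", "unknown_error")
    message = body.get("message", "")
    return f"HTTP {status_code} [{code}]: {message}"
```

With the `requests` example, it could be called as `summarize_error(response.status_code, response.text)` in the failure branch.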
Response Fields Explained
- request_id: Unique identifier for this classification run. Store it if you want to audit classification history.
- document_id: The document that was classified.
- classifier_id: Always null for inline classification. Populated when using a saved classifier.
- classification: The single best-matching label from your classification_schema.
- confidence_scores: A score between 0.0 and 1.0 for every label you provided. All scores sum to approximately 1.0.
- request_status: "completed" on success, "failed" or "exception" on error.
- credits_used: Credits deducted from your plan for this classification.
- num_pages: Number of pages in the document (or in the page_range if specified).
- duration: Time in seconds the classification took.
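Because every label receives a confidence score, a caller can decide whether to trust the top label or send the document for manual review. The sketch below is one possible approach, not something the API prescribes: `pick_label`, the 0.8 threshold, and the "needs_review" fallback are all illustrative assumptions.

```python
def pick_label(result: dict, min_confidence: float = 0.8, fallback: str = "needs_review") -> str:
    """Return the classified label if it is confident enough, else a fallback."""
    # Only trust results from a completed run.
    if result.get("request_status") != "completed":
        return fallback

    label = result.get("classification")
    # Look up the score for the winning label; a missing score counts as 0.0.
    score = result.get("confidence_scores", {}).get(label, 0.0)

    if label is not None and score >= min_confidence:
        return label
    return fallback
```

Applied to the successful response shown earlier, the function returns the "invoice" label, since its score of 0.9312 exceeds the default threshold. Raising `min_confidence` to 0.95 would return the fallback instead.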
Important Notes
- The document must have been processed through the Extract pipeline before classifying. If the document has not been OCR'd, you will receive a 400 error.
- You must provide at least 2 labels in classification_schema.
- Write clear, distinct descriptions for each label. The more specific the description, the more accurate the classification.
- Use page_range to limit classification to the most informative pages of long documents; this reduces credit usage and improves speed.
- Each call to /v1/classify consumes credits based on the number of pages processed.
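Since credits depend on the number of pages processed, it can help to count how many pages a page_range string selects before sending it. The following is a rough sketch: `count_pages` is a hypothetical local helper, and it assumes that ranges and single pages can be mixed (for example "1-3,5"), which goes beyond the two forms shown in the request table. It does not compute credits, because the per-page rate is not stated here.

```python
def count_pages(page_range: str) -> int:
    """Count the distinct pages selected by a string such as "1-3" or "1,3,5"."""
    pages: set[int] = set()

    for part in page_range.split(","):
        part = part.strip()
        if not part:
            continue

        if "-" in part:
            # A range such as "1-3" selects pages 1, 2 and 3.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)

    return len(pages)
```

Overlapping entries are counted once, so "1-2, 2-4" selects four pages.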
Next Steps
- Create a saved Classifier to reuse a label set across many documents without repeating it in every request.
- Run Classification with a saved Classifier for automated document routing pipelines.