
Run Classification

Run a saved Classifier against a document. The Classifier's label set and settings are applied automatically — you only need to provide the document_id.

This is the recommended endpoint for automation pipelines where the same label set is used repeatedly (e.g. routing every incoming document to the right Workflow). For a one-off classification without a saved Classifier, use Classify Document instead.

Prerequisite

The document must already be uploaded and OCR-processed before it can be classified. Upload the document and run an extract first to prepare it.

API Endpoint

POST https://api.documentpro.ai/v1/classifiers/{classifier_id}/classify

Path Parameters

  • classifier_id (required): The unique identifier of the saved Classifier to use.

Headers

  • x-api-key (required): Your API key for authentication.
  • Content-Type: application/json

Request Body

Field | Type | Required | Description
document_id | string (UUID) | Yes | The ID of the document to classify.
page_range | string | No | Override the Classifier's default page range for this request (e.g. "1-3").
query_model | string | No | Override the Classifier's default model for this request ("gpt-4o-mini" or "gpt-4o").
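As an illustration of the optional overrides, the sketch below assembles the URL and request body for a call that sets both page_range and query_model. The helper name build_classify_request is hypothetical, and leaving optional fields out of the body (rather than sending null) is an assumption about how the API expects unset values.

```python
from typing import Optional, Tuple

BASE_URL = "https://api.documentpro.ai/v1"
# Model names taken from the request body table above.
ALLOWED_MODELS = {"gpt-4o-mini", "gpt-4o"}


def build_classify_request(
    classifier_id: str,
    document_id: str,
    page_range: Optional[str] = None,
    query_model: Optional[str] = None,
) -> Tuple[str, dict]:
    """Return (url, payload) for a Run Classification call (hypothetical helper)."""
    if query_model is not None and query_model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported query_model: {query_model!r}")

    payload = {"document_id": document_id}
    # Assumption: optional fields are omitted when unset so that the
    # Classifier's saved defaults apply.
    if page_range is not None:
        payload["page_range"] = page_range
    if query_model is not None:
        payload["query_model"] = query_model

    url = f"{BASE_URL}/classifiers/{classifier_id}/classify"
    return url, payload


url, payload = build_classify_request(
    "f47ac10b-58cc-4372-a567-0e02b2c3d479",
    "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
    page_range="1-3",
    query_model="gpt-4o",
)
print(url)
print(payload)
```

The returned pair can be passed to an HTTP client, for example requests.post(url, headers=headers, json=payload).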

Example Implementation

Using cURL

curl --location 'https://api.documentpro.ai/v1/classifiers/f47ac10b-58cc-4372-a567-0e02b2c3d479/classify' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8"
}'

Using Python

import requests
import json

classifier_id = "f47ac10b-58cc-4372-a567-0e02b2c3d479"
url = f"https://api.documentpro.ai/v1/classifiers/{classifier_id}/classify"

headers = {
    'x-api-key': 'YOUR_API_KEY',
    'Content-Type': 'application/json'
}

payload = {
    "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8"
}

response = requests.post(url, headers=headers, data=json.dumps(payload))

if response.status_code == 200:
    result = response.json()
    print(f"Classification: {result['classification']}")
    print(f"Confidence scores: {result['confidence_scores']}")
else:
    print('Classification failed')
    print(response.text)

Response

Successful Response (Status Code: 200)

{
    "request_id": "a7813466-6f9a-4c33-8128-427e7a4df755",
    "document_id": "0b13c9f2-5148-4ffb-bb7b-de03bb071ca8",
    "classifier_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
    "classification": "invoice",
    "confidence_scores": {
        "invoice": 0.9312,
        "purchase_order": 0.0421,
        "contract": 0.0198,
        "other": 0.0069
    },
    "request_status": "completed",
    "credits_used": 2,
    "num_pages": 2,
    "duration": 1.4
}

Error Response (Status Codes: 400, 403, 404, 500)

{
    "success": false,
    "error": "error_code",
    "message": "descriptive error message"
}
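A minimal sketch of turning an error response into a single log-friendly line, assuming the body follows the shape shown above. The helper name summarize_error is hypothetical, and the fallback for non-JSON bodies is a defensive assumption rather than documented API behavior.

```python
import json


def summarize_error(status_code: int, body_text: str) -> str:
    """Format an error response as 'HTTP <status> [<error>]: <message>'."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}

    # Fall back to generic values when the documented fields are absent.
    code = body.get("error", "unknown_error")
    message = body.get("message", body_text.strip() or "no message")
    return f"HTTP {status_code} [{code}]: {message}"


example_body = (
    '{"success": false, "error": "error_code", '
    '"message": "descriptive error message"}'
)
print(summarize_error(404, example_body))
```

With a requests response object, this could be called as summarize_error(response.status_code, response.text) in the failure branch.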

Response Fields Explained

  • request_id: Unique identifier for this classification run.
  • document_id: The document that was classified.
  • classifier_id: The saved Classifier that was applied.
  • classification: The single best-matching label from the Classifier's classes list.
  • confidence_scores: A score between 0.0 and 1.0 for every label in the Classifier. All scores sum to approximately 1.0.
  • request_status: "completed" on success, "failed" or "exception" on error.
  • credits_used: Credits deducted from your plan for this classification.
  • num_pages: Number of pages processed.
  • duration: Time in seconds the classification took.
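The fields above can be combined into a simple acceptance check before acting on a result. The sketch below uses the sample response from this page; the helper name accepted_label and the 0.8 threshold are illustrative assumptions, not values recommended by the API.

```python
from typing import Optional


def accepted_label(result: dict, min_confidence: float = 0.8) -> Optional[str]:
    """Return the predicted label only if the run completed and its
    confidence score meets the threshold; otherwise return None."""
    if result.get("request_status") != "completed":
        return None
    label = result["classification"]
    score = result["confidence_scores"].get(label, 0.0)
    return label if score >= min_confidence else None


# Sample response body from the "Successful Response" section.
sample = {
    "classification": "invoice",
    "confidence_scores": {
        "invoice": 0.9312,
        "purchase_order": 0.0421,
        "contract": 0.0198,
        "other": 0.0069,
    },
    "request_status": "completed",
}

print(accepted_label(sample))        # threshold 0.8
print(accepted_label(sample, 0.95))  # stricter threshold
print(round(sum(sample["confidence_scores"].values()), 4))
```

Results that fall below the threshold can be sent to manual review instead of being routed automatically.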

Important Notes

  1. The document must have been processed through the Extract pipeline before classifying. If the document has not been OCR'd, you will receive a 400 error.
  2. page_range and query_model in the request body override the Classifier's saved defaults for this call only — the Classifier itself is not modified.
  3. Each call consumes credits based on the number of pages processed.

Next Steps

  • Use the classification result to route the document to the right Workflow by passing the appropriate template_id.
  • Update the Classifier to refine labels based on results.
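The routing step can be sketched as a lookup from label to template_id. The labels come from the sample response on this page; the template_id values are placeholders and the helper name template_for is hypothetical. How the template_id is then submitted to a Workflow is not covered here.

```python
from typing import Dict, Optional

# Placeholder template IDs: replace with the IDs of your own Workflows.
ROUTES: Dict[str, str] = {
    "invoice": "TEMPLATE_ID_FOR_INVOICES",
    "purchase_order": "TEMPLATE_ID_FOR_PURCHASE_ORDERS",
    "contract": "TEMPLATE_ID_FOR_CONTRACTS",
}


def template_for(classification: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the template_id mapped to a label, or None if the label is unrouted."""
    return routes.get(classification)


print(template_for("invoice"))
print(template_for("other"))
```

Labels without a mapping return None, which leaves the decision about unrouted documents to the caller.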