
List Documents

This endpoint allows you to retrieve a list of documents in your account. You can use various query parameters to filter and paginate the results.

API Endpoint

GET https://api.documentpro.ai/v1/documents

Query Parameters

All parameters are optional:

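  • limit (optional): Maximum number of documents to return per request. Default is 100.
  • document_id (optional): Filter by a specific document ID. Also passed together with created_at to continue from a pagination_key.
  • created_at (optional): Filter by creation timestamp, in the format the API returns (e.g., 2024-07-25T10:00:30.677936).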
  • search_term (optional): Search for documents by name or content.
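
As a minimal sketch of how the optional parameters above combine into a request URL, the helper below keeps only the parameters that were actually set. The function name build_list_url is hypothetical (it is not part of any DocumentPro SDK); the parameter names and endpoint are the ones listed above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.documentpro.ai/v1/documents"


def build_list_url(limit=None, document_id=None, created_at=None, search_term=None):
    """Return the list-documents URL with only the parameters that were set.

    Hypothetical helper for illustration; not part of an official client.
    """
    candidates = {
        "limit": limit,
        "document_id": document_id,
        "created_at": created_at,
        "search_term": search_term,
    }
    # Drop unset parameters so the API applies its own defaults.
    params = {name: value for name, value in candidates.items() if value is not None}
    if not params:
        return BASE_URL
    # urlencode percent-escapes values such as the colons in timestamps.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_list_url(limit=10, search_term="invoice"))
```

Leaving unset parameters out of the query string, rather than sending empty values, avoids relying on how the API treats blank filters.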

Headers

  • x-api-key (required): Your API key for authentication.

Example Implementation

Using cURL

curl --location 'https://api.documentpro.ai/v1/documents?limit=10' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Accept: application/json'

Using Python

import requests

url = "https://api.documentpro.ai/v1/documents"
headers = {
    "x-api-key": "YOUR_API_KEY",
    "Accept": "application/json",
}
params = {
    "limit": 10,
    # Add other optional parameters as needed
}

response = requests.get(url, headers=headers, params=params)

if response.status_code == 200:
    result = response.json()
    print("Documents retrieved successfully:")
    for doc in result["files"]:
        print(f"Document ID: {doc['document_id']}, Name: {doc['file_name']}")
    # pagination_key is only populated when more results exist
    if result.get("pagination_key"):
        print("More documents available. Use pagination_key for next request.")
else:
    print("Failed to retrieve documents")
    print(response.text)

Response

The response contains a files array of document objects and, when more results are available, a pagination_key.

Successful Response (Status Code: 200)

{
  "files": [
    {
      "document_id": "0b33c9f2-5148-4ffb-bb7b-de03bb071ca8",
      "user_id": "24d9b25a-9bba-4de4-be67-c06233d2f305",
      "source_name": "api",
      "s3_key": "https://documentpro-user-accounts.s3.amazonaws.com/...",
      "file_name": "file_name.pdf",
      "file_extension": "pdf",
      "num_pages": 1,
      "meta_tags": {},
      "parser_runs": [
        {
          "template_id": "a62e2b5f-ccc2-4e2a-8fda-d7bd6579a692",
          "template_title": "Invoice",
          "request_id": "123e337c-0294-4cf8-b684-1310c8cd6a4e",
          "params": {
            "end_regex": null,
            "page_ranges": "5",
            "use_ocr": true,
            "detect_tables": true,
            "start_regex": null,
            "query_model": "gpt-4o-mini",
            "use_all_matches": false,
            "detect_layout": true,
            "split_regex": null
          },
          "datetime": "2024-07-25T10:01:06.536970"
        },
        // ... additional parser runs ...
      ],
      "created_at": "2024-07-25T14:16:44.540197",
      "updated_at": "2024-07-25T14:16:44.540223"
    },
    {
      "document_id": "8ed94104-cb56-42eb-b1f4-cbd6da0894ca",
      "user_id": "2s49b2ca-9bba-4de4-be67-c06233d2f305",
      "source_name": "web",
      "s3_key": "https://documentpro-user-accounts.s3.amazonaws.com/...",
      "file_name": "file_name.pdf",
      "file_extension": "pdf",
      "num_pages": 8,
      "meta_tags": {},
      "parser_runs": [],
      "created_at": "2024-07-25T10:00:30.677936",
      "updated_at": "2024-07-25T10:01:07.109837"
    }
  ],
  "pagination_key": {
    "document_id": "8ed94104-cb56-42eb-b1f4-cbd6da0894ca",
    "created_at": "2024-07-25T10:00:30.677936"
  }
}

Error Response (Status Codes: 400, 401, 403, 500)

{
  "success": false,
  "error": "error_code",
  "message": "descriptive error message"
}
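
The error body above can be turned into a readable one-line message. The following is a sketch only: describe_error is a hypothetical helper, and the fallback strings cover the assumption that a field could be missing or that the body might not be JSON at all, which the documentation does not specify.

```python
def describe_error(status_code, body):
    """Build a one-line summary from an error response body.

    `body` is the parsed JSON (a dict) when available; anything else is
    reported as an unexpected body instead of raising.
    """
    if not isinstance(body, dict):
        return f"HTTP {status_code}: unexpected response body"
    # Field names follow the documented error shape; defaults are assumptions.
    code = body.get("error", "unknown_error")
    message = body.get("message", "no message provided")
    return f"HTTP {status_code} [{code}]: {message}"


sample = {
    "success": False,
    "error": "error_code",
    "message": "descriptive error message",
}
print(describe_error(401, sample))
```

In a real client this would typically be called in the non-200 branch, with the body obtained from response.json() inside a try/except in case the server returns non-JSON text.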

Response Fields Explained

  • files: An array of document objects, each containing:

    • document_id: Unique identifier for the document.
    • user_id: ID of the user who owns the document.
    • source_name: Source of the document upload (e.g., "api", "web").
    • s3_key: S3 URL for accessing the document.
    • file_name: Name of the document file.
    • file_extension: File extension (e.g., "pdf").
    • num_pages: Number of pages in the document.
    • meta_tags: Any metadata associated with the document.
    • parser_runs: Array of parsing operations performed on the document.
    • created_at: Timestamp of document creation.
    • updated_at: Timestamp of last update.
  • pagination_key: If present, contains document_id and created_at of the last document in the current set. Use these values to fetch the next set of results.
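
To illustrate how the fields above might be consumed, the sketch below reduces a document object to a short summary and picks the most recent parser run by its datetime value. The helper names latest_parser_run and summarize are hypothetical, and the code assumes each parser run carries the datetime and template_title fields shown in the sample response.

```python
from datetime import datetime


def latest_parser_run(document):
    """Return the most recent entry in parser_runs, or None if there are none."""
    runs = document.get("parser_runs") or []
    if not runs:
        return None
    # Timestamps in the sample response are ISO 8601 without a timezone.
    return max(runs, key=lambda run: datetime.fromisoformat(run["datetime"]))


def summarize(document):
    """Reduce a document object to a few fields useful for display."""
    latest = latest_parser_run(document)
    return {
        "document_id": document["document_id"],
        "file_name": document["file_name"],
        "num_pages": document["num_pages"],
        "last_template": latest["template_title"] if latest else None,
    }


sample = {
    "document_id": "0b33c9f2-5148-4ffb-bb7b-de03bb071ca8",
    "file_name": "file_name.pdf",
    "num_pages": 1,
    "parser_runs": [
        {"template_title": "Invoice", "datetime": "2024-07-25T10:01:06.536970"},
    ],
}
print(summarize(sample))
```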

Pagination

To retrieve the next set of results, include the pagination_key values in your next request:

GET https://api.documentpro.ai/v1/documents?document_id=8ed94104-cb56-42eb-b1f4-cbd6da0894ca&created_at=2024-07-25T10:00:30.677936
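
The request above can be repeated in a loop until no pagination_key comes back. The following is a minimal sketch, not an official client: iter_documents and fetch_page_http are hypothetical names, and the loop assumes the last page either omits pagination_key or returns it as null. The page fetcher is passed in as a function so the loop can be exercised without network access.

```python
URL = "https://api.documentpro.ai/v1/documents"


def iter_documents(fetch_page, limit=100):
    """Yield every document, following pagination_key until it is absent."""
    params = {"limit": limit}
    while True:
        page = fetch_page(params)
        yield from page.get("files", [])
        key = page.get("pagination_key")
        if not key:
            break  # assumption: a missing or null key means the last page
        # Carry the key's values into the next request, as described above.
        params = {
            "limit": limit,
            "document_id": key["document_id"],
            "created_at": key["created_at"],
        }


def fetch_page_http(params, api_key="YOUR_API_KEY"):
    """Fetch one page over HTTP (requires the third-party requests package)."""
    import requests  # imported lazily so the offline demo below runs without it

    response = requests.get(URL, headers={"x-api-key": api_key}, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


# Offline demonstration with two canned pages standing in for the API.
_pages = [
    {
        "files": [{"document_id": "a"}, {"document_id": "b"}],
        "pagination_key": {"document_id": "b", "created_at": "2024-07-25T10:00:30.677936"},
    },
    {"files": [{"document_id": "c"}], "pagination_key": None},
]


def fake_fetch(params):
    """Return the second page once a pagination key has been supplied."""
    return _pages[1] if "document_id" in params else _pages[0]


print([doc["document_id"] for doc in iter_documents(fake_fetch, limit=2)])
```

Against the live API, the same loop would be driven with iter_documents(fetch_page_http) after substituting a real API key.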

Important Notes

  1. The default limit is 100 documents per request. Adjust using the limit parameter.
  2. The parser_runs array shows the history of parsing operations on each document.
  3. Use the pagination_key to fetch subsequent pages of results if available.
  4. The s3_key URLs are temporary and will expire after a certain period.

Next Steps