Get Parser Result

After a document has been submitted for parsing, you can retrieve its status and results with a GET request that includes the request_id of the parsing job. This guide explains how to use the API to fetch parsing results.

API Endpoint

GET https://api.documentpro.ai/files

Query Parameters

  • request_id (required): The unique identifier of the parsing job you want to retrieve results for.

Headers

  • x-api-key (required): Your API key for authentication.

Example Implementation

Using Python

import requests

api_key = 'YOUR_API_KEY'
request_id = 'YOUR_REQUEST_ID'

url = "https://api.documentpro.ai/files"
headers = {'x-api-key': api_key}
params = {'request_id': request_id}

# Fetch the parsing job identified by request_id
response = requests.get(url, headers=headers, params=params)

if response.status_code == 200:
    print('Results retrieved successfully')
    print(response.json())
else:
    print('Failed to retrieve results')
    print(response.text)

Response

The response will contain information about the parsing job and its results.

Successful Response (Status Code: 200)

{
    "request_id": "a7813466-6f9a-4c33-8128-427e7a4df755",
    "request_status": "completed",
    "response_body": {
        "file_name": "Q2_Financial_Report_2024.pdf",
        "file_presigned_url": "https://documentpro-parsed-files.s3.amazonaws.com/Q2_Financial_Report_2024_parsed.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=...",
        "user_error_msg": null,
        "template_id": "8e9beda9-5cba-42eb-a70a-b3e5eec9120a",
        "template_type": "financial_report",
        "template_title": "Quarterly Financial Report Parser",
        "num_pages": 15,
        "human_verification_status": "approved",
        "has_missing_required_fields": false,
        "result_json_data": {
            "company_name": "TechCorp Innovations Inc.",
            "report_period": "Q2 2024",
            "financial_highlights": {
                "total_revenue": 1250000,
                "net_income": 450000,
                "earnings_per_share": 2.25,
                "operating_cash_flow": 550000
            },
            "balance_sheet_summary": {
                "total_assets": 10000000,
                "total_liabilities": 4000000,
                "total_equity": 6000000
            },
            "key_ratios": {
                "gross_margin": 0.45,
                "operating_margin": 0.22,
                "return_on_equity": 0.075,
                "debt_to_equity": 0.67
            },
            "segment_performance": [
                {
                    "segment_name": "Software Solutions",
                    "revenue": 750000,
                    "operating_income": 225000
                },
                {
                    "segment_name": "Hardware Products",
                    "revenue": 500000,
                    "operating_income": 150000
                }
            ],
            "risk_factors": [
                "Intense market competition",
                "Rapid technological changes",
                "Global economic uncertainties"
            ]
        }
    },
    "created_at": "2024-07-25T14:30:10.696893",
    "updated_at": "2024-07-25T14:30:29.565249"
}

Error Response (Status Codes: 400, 401, 403, 404, 500)

{
    "success": false,
    "error": "error_code",
    "message": "descriptive error message"
}
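As a rough illustration of handling the error shape above, the sketch below turns a non-200 response into a single readable message. The helper name describe_error is hypothetical, and falling back to the raw text when the body is not the documented JSON shape is an assumption, not documented API behavior.

```python
import json


def describe_error(status_code: int, body_text: str) -> str:
    """Build a one-line description of a failed API response.

    Expects the documented error shape (success / error / message) and
    falls back to the raw body text if the body is something else.
    """
    try:
        body = json.loads(body_text)
    except ValueError:
        # Body was not valid JSON (assumption: this can happen, e.g. behind a proxy)
        body = None

    if isinstance(body, dict) and body.get("success") is False:
        return f"HTTP {status_code}: {body.get('error')}: {body.get('message')}"
    return f"HTTP {status_code}: {body_text}"
```

With the requests example from this guide, it could be called as describe_error(response.status_code, response.text) in the else branch.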

Response Fields Explained

  • request_id: Unique identifier for the parsing job.
  • request_status: Current status of the parsing job. Possible values are:
    • "pending": The document has not started the parsing process
    • "processing": The document is being parsed
    • "completed": The document has been parsed successfully
    • "failed": Parsing failed due to an application or document error
    • "exception": The parser itself failed; these requests can be retried
  • response_body: Object holding the job details and results. In the sample response, the fields from file_name through result_json_data are nested inside it.
  • file_name: Name of the original document file.
  • file_presigned_url: Temporary URL to download the parsed document (if available).
  • user_error_msg: Contains a human-readable error message if status is "failed" or "exception".
  • template_id: Unique identifier of the parser used.
  • template_type and template_title: Type and title of the parser used.
  • num_pages: Number of pages in the document.
  • human_verification_status: Can be "pending", "approved", or "rejected".
  • has_missing_required_fields: Indicates if any required fields were not extracted.
  • result_json_data: Contains the extracted data when parsing is completed.
  • created_at and updated_at: Timestamps for when the parsing job was created and last updated.
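Judging from the sample response, request_id and request_status sit at the top level while most other fields are nested under response_body. A minimal sketch of flattening a few of them into one object follows; the ParseJob class and its from_payload method are hypothetical names, and treating response_body as possibly null or absent is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ParseJob:
    """Flattened view of a few fields from the GET /files response."""
    request_id: str
    request_status: str
    file_name: Optional[str]
    num_pages: Optional[int]
    has_missing_required_fields: Optional[bool]
    result_json_data: Optional[Dict[str, Any]]

    @classmethod
    def from_payload(cls, payload: Dict[str, Any]) -> "ParseJob":
        # Assumption: response_body may be null or missing for unfinished jobs
        body = payload.get("response_body") or {}
        return cls(
            request_id=payload["request_id"],
            request_status=payload["request_status"],
            file_name=body.get("file_name"),
            num_pages=body.get("num_pages"),
            has_missing_required_fields=body.get("has_missing_required_fields"),
            result_json_data=body.get("result_json_data"),
        )
```

For example, ParseJob.from_payload(response.json()).num_pages would read the page count without indexing into response_body by hand.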

Important Notes

  1. The file_presigned_url is temporary and will expire after a certain period.
  2. If request_status is "pending" or "processing", result_json_data will be null.
  3. The structure of result_json_data depends on the parser used and the document type.
  4. Always check the request_status before attempting to use the parsed data.
  5. If status is "failed" or "exception", check the user_error_msg for more information.
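The notes above amount to a small decision rule on request_status. One way to sketch it, assuming the nesting shown in the sample response (result_json_data and user_error_msg under response_body); extract_result and ParsingFailed are hypothetical names, not part of the API:

```python
from typing import Any, Dict, Optional


class ParsingFailed(Exception):
    """Raised when request_status is "failed" or "exception"."""


def extract_result(payload: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return result_json_data for a completed job, None if still in progress.

    Raises ParsingFailed (carrying user_error_msg when present) for
    "failed" or "exception", and ValueError for any unlisted status.
    """
    status = payload.get("request_status")
    # Assumption: response_body may be null or missing before completion
    body = payload.get("response_body") or {}

    if status == "completed":
        return body.get("result_json_data")
    if status in ("pending", "processing"):
        return None
    if status in ("failed", "exception"):
        raise ParsingFailed(body.get("user_error_msg") or f"parsing {status}")
    raise ValueError(f"unexpected request_status: {status!r}")
```

Checking the status first means callers never read result_json_data from a job that has not finished.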

Next Steps

After retrieving the parsing results:

  1. If the status is "completed", you can use the extracted data in result_json_data for your application.
  2. If the status is "pending" or "processing", wait and retry the request after a short delay.
  3. If the status is "failed" or "exception", check the user_error_msg and consider resubmitting the document for parsing.
  4. You may want to download the parsed document using the file_presigned_url if available.
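The wait-and-retry step can be sketched as a simple polling loop. This is an illustration under stated assumptions: wait_for_result is a hypothetical helper, the 5-second interval and 12-attempt limit are arbitrary defaults rather than documented recommendations, and the fetch argument is any callable that returns the decoded JSON response.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 5.0,
    max_attempts: int = 12,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until request_status leaves "pending"/"processing".

    Returns the final payload (completed, failed, or exception) and raises
    TimeoutError if the job is still in progress after max_attempts calls.
    """
    for attempt in range(max_attempts):
        payload = fetch()
        if payload.get("request_status") not in ("pending", "processing"):
            return payload
        if attempt < max_attempts - 1:
            sleep(interval)  # short delay before the next request
    raise TimeoutError(f"job still in progress after {max_attempts} attempts")


# Usage with the requests example from this guide (not executed here):
#
#   def fetch():
#       r = requests.get(url, headers=headers, params=params, timeout=30)
#       r.raise_for_status()
#       return r.json()
#
#   payload = wait_for_result(fetch)
```

Passing fetch and sleep in as arguments keeps the loop independent of the HTTP client and makes it easy to exercise without network access.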