API Responses
Understanding API response formats and structures
The chat completion object
Represents a chat completion response returned by the model, based on the provided input.
Response Fields
choices
Type: array
A list of chat completion choices. Can be more than one if n is greater than 1.
- index (integer): The index of this choice in the list
- message (object): The chat message generated by the model
  - role (string): The role of the message author (e.g., "assistant")
  - content (string): The content of the message
  - refusal (string or null): The refusal message if the model refuses to answer
  - annotations (array): Additional annotations on the message
- logprobs (object or null): Log probability information
- finish_reason (string): The reason the model stopped generating (e.g., "stop", "length", "content_filter")
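As a sketch of how these fields fit together, the helper below pulls the assistant text out of each choice while skipping refusals. The `extract_texts` name and the abbreviated `sample` payload are illustrative, not part of the API.

```python
def extract_texts(response: dict) -> list:
    """Collect the assistant text from every choice, skipping refused answers."""
    texts = []
    for choice in response["choices"]:
        message = choice["message"]
        # A non-null refusal means the model declined to answer this choice.
        if message.get("refusal") is None:
            texts.append(message["content"])
    return texts

# Abbreviated illustrative payload following the schema above (not real API output).
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!", "refusal": None},
            "finish_reason": "stop",
        }
    ]
}
```

Because `choices` is an array, the same loop handles `n` greater than 1 without changes.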
created
Type: integer
The Unix timestamp (in seconds) of when the chat completion was created.
id
Type: string
A unique identifier for the chat completion.
model
Type: string
The model used for the chat completion.
object
Type: string
The object type, which is always chat.completion.
service_tier
Type: string or null
Specifies the latency tier to use for processing the request. This parameter is relevant for customers subscribed to the scale tier service:
- If set to 'auto', and the Project is Scale tier enabled, the system will utilize scale tier credits until they are exhausted.
- If set to 'auto', and the Project is not Scale tier enabled, the request will be processed using the default service tier with a lower uptime SLA and no latency guarantee.
- If set to 'default', the request will be processed using the default service tier with a lower uptime SLA and no latency guarantee.
- If set to 'flex', the request will be processed with the Flex Processing service tier.
- When not set, the default behavior is 'auto'.
When this parameter is set, the response body will include the service_tier utilized.
system_fingerprint
Type: string
This fingerprint represents the backend configuration that the model runs with.
Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
usage
Type: object
Usage statistics for the completion request.
- prompt_tokens (integer): Number of tokens in the prompt
- completion_tokens (integer): Number of tokens in the generated completion
- total_tokens (integer): Total number of tokens used in the request (prompt + completion)
- prompt_tokens_details (object): Detailed breakdown of prompt tokens
  - cached_tokens (integer): Number of tokens that were cached
  - audio_tokens (integer): Number of audio tokens in the prompt
- completion_tokens_details (object): Detailed breakdown of completion tokens
  - reasoning_tokens (integer): Number of tokens used for reasoning
  - audio_tokens (integer): Number of audio tokens in the completion
  - accepted_prediction_tokens (integer): Number of accepted prediction tokens
  - rejected_prediction_tokens (integer): Number of rejected prediction tokens
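The token accounting is additive: total_tokens is the sum of prompt_tokens and completion_tokens. A minimal sanity check might look like this (the `check_usage` helper is illustrative; the sample figures come from the example response below):

```python
def check_usage(usage: dict) -> bool:
    """Verify that total_tokens equals prompt_tokens + completion_tokens."""
    return usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# Figures taken from the example response in this section.
sample_usage = {"prompt_tokens": 1117, "completion_tokens": 46, "total_tokens": 1163}
```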
Example Response
{
  "id": "chatcmpl-8YMnDbsifkBeAs814beb0dFOJdPeG",
  "object": "chat.completion",
  "created": 1741570285,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The image shows a wooden boardwalk extending through a forest.",
        "refusal": null,
        "annotations": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "feedback_token": "fb_8YMnDbsifkBeAs814beb0dFOJdPeG_0",
      "duration_seconds": 2.34,
      "cost_usd": 0.001234
    }
  ],
  "usage": {
    "prompt_tokens": 1117,
    "completion_tokens": 46,
    "total_tokens": 1163,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": "fp_fc7f1d7035"
}
Extra fields in completion response
In the completion response, the choice object includes the following fields that are not defined in the OpenAI API:
- feedback_token: the feedback token for the completion, as described in the User Feedback documentation.
- duration_seconds: the duration of the completion in seconds.
- cost_usd: the cost of the generation in USD for that specific choice.
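Since these fields are specific to this API and absent from standard OpenAI responses, client code should read them defensively. A sketch, using `.get()` so missing fields yield `None` rather than a `KeyError` (the `choice_metadata` name is illustrative; the sample values come from the example response above):

```python
def choice_metadata(choice: dict) -> dict:
    """Collect the extra per-choice fields, tolerating their absence."""
    return {
        "feedback_token": choice.get("feedback_token"),
        "duration_seconds": choice.get("duration_seconds"),
        "cost_usd": choice.get("cost_usd"),
    }

# Values taken from the example response above.
sample_choice = {
    "index": 0,
    "feedback_token": "fb_8YMnDbsifkBeAs814beb0dFOJdPeG_0",
    "duration_seconds": 2.34,
    "cost_usd": 0.001234,
}
```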
Streaming Responses
When stream: true is set, the response will be sent as data-only server-sent events. See the Streaming guide for more details on handling streaming responses.
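A data-only SSE stream is a sequence of `data: <json>` lines terminated by a `data: [DONE]` sentinel. As a minimal parsing sketch, assuming the standard streaming chunk shape where each chunk carries a `delta` with partial content (the `parse_sse_chunks` helper and the hard-coded `stream` string are illustrative):

```python
import json

def parse_sse_chunks(raw: str):
    """Yield each JSON event from a data-only SSE body, stopping at [DONE]."""
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue  # skip the blank separator lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(payload)

# Illustrative stream body; real chunks are fuller chat.completion.chunk objects.
stream = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n\n'
    'data: [DONE]\n\n'
)
text = "".join(c["choices"][0]["delta"]["content"] for c in parse_sse_chunks(stream))
```

In practice an HTTP client that exposes the response as an iterator of lines (rather than one string) slots into the same loop.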