POST /v1/chat/completions
curl --request POST \
  --url https://api.powertokens.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "glm-5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise and professional assistant."
    },
    {
      "role": "user",
      "content": "Explain vector databases in three sentences."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}
'
{
  "id": "chatcmpl_zhipu_123",
  "object": "chat.completion",
  "created": 1775174400,
  "model": "glm-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A vector database is a system designed to store and retrieve vector embeddings, commonly used for semantic search, recommendations, and RAG.",
        "reasoning_content": "Define the concept first, then add the main use cases."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 31,
    "total_tokens": 73,
    "prompt_tokens_details": {
      "cached_tokens": 0
    }
  }
}

Authorizations

Authorization
string
header
required

Send Authorization: Bearer <token> in the request headers.
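
As a rough illustration of the header above, the following Python sketch builds the same POST request as the curl example using only the standard library. The helper name build_request is hypothetical, and the request is constructed but not sent.

```python
import json
import urllib.request

# URL taken from the curl example on this page.
API_URL = "https://api.powertokens.ai/v1/chat/completions"


def build_request(token, payload):
    """Build (but do not send) a POST request carrying the bearer token.

    build_request is an illustrative helper, not part of any SDK.
    """
    if not token:
        raise ValueError("token must be a non-empty string")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; error handling and timeouts are left out of this sketch.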

Body

application/json

Request body for Zhipu chat completions.

model
enum<string>
required

Model name. Supported models include glm-5-turbo, glm-5, glm-4.7, glm-4.7-flash, and glm-4.5-air.

Available options:
glm-5-turbo,
glm-5,
glm-4.7,
glm-4.7-flash,
glm-4.5-air
Example:

"glm-5-turbo"

messages
object[]
required

Message list. Each message carries plain text or, for image input, image_url content parts. At least one message must have a role other than system or assistant; requests that contain only system messages, only assistant messages, or a mix of the two are rejected upstream.

Minimum array length: 1
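
The role constraint on messages can be checked client-side before sending. The sketch below is one possible pre-flight check; the function name is hypothetical, and the shape of the image_url content part is an assumption modeled on the common OpenAI-compatible layout, which this page does not spell out.

```python
def validate_messages(messages):
    """Raise ValueError if the list would be rejected for role reasons.

    Hypothetical helper: it mirrors the documented rule that at least one
    message must have a role other than "system" or "assistant".
    """
    if len(messages) < 1:
        raise ValueError("messages must contain at least one entry")
    if all(m.get("role") in ("system", "assistant") for m in messages):
        raise ValueError(
            "messages must include at least one message that is not "
            "system or assistant"
        )


# Plain-text conversation, as in the curl example.
text_messages = [
    {"role": "system", "content": "You are a concise and professional assistant."},
    {"role": "user", "content": "Explain vector databases in three sentences."},
]

# Image input. ASSUMPTION: content parts use {"type": ..., "image_url": {"url": ...}};
# the URL is a placeholder.
image_messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this picture."},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }
]

validate_messages(text_messages)
validate_messages(image_messages)
```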
stream
boolean
default:false

Whether to enable streaming output. When true, the response content type is text/event-stream.
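
When stream is true, the client has to read a text/event-stream body instead of a single JSON document. This page does not document the chunk format, so the sketch below rests on assumptions borrowed from the common OpenAI-compatible convention: each event is a "data: <json>" line, the stream ends with "data: [DONE]", and text arrives in choices[].delta.content. The function names are hypothetical.

```python
import json


def iter_sse_json(lines):
    """Yield the JSON payload of each 'data:' line until the end marker.

    ASSUMPTION: events look like 'data: {...}' and the stream is closed by
    'data: [DONE]'. Verify against real responses before relying on this.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)


def collect_text(lines):
    """Concatenate the assumed delta.content fragments into one string."""
    parts = []
    for chunk in iter_sse_json(lines):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts)


# Hand-written sample lines for illustration, not captured from the API.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = collect_text(sample)
```

With a live connection, the lines argument would be the decoded lines of the HTTP response body.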

thinking
object

Thinking mode configuration. Applies to models that support the thinking parameter.

temperature
number
default:1

Sampling temperature, in the range [0, 1].

Required range: 0 <= x <= 1
top_p
number
default:0.95

Nucleus sampling threshold, in the range [0.01, 1].

Required range: 0.01 <= x <= 1
max_tokens
integer

Maximum number of output tokens.

Required range: x >= 1
stop
string[]

List of stop sequences. Only a single stop sequence is currently supported.

Maximum array length: 1
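
The documented ranges for temperature, top_p, max_tokens, and stop can be enforced before a request leaves the client. A minimal sketch, assuming a hypothetical helper named sampling_params that returns the fields to merge into the request body:

```python
from typing import List, Optional


def sampling_params(
    temperature: float = 1.0,   # documented default
    top_p: float = 0.95,        # documented default
    max_tokens: Optional[int] = None,
    stop: Optional[List[str]] = None,
) -> dict:
    """Validate sampling fields against the ranges listed on this page."""
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must satisfy 0 <= x <= 1")
    if not 0.01 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0.01 <= x <= 1")
    params = {"temperature": temperature, "top_p": top_p}
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must satisfy x >= 1")
        params["max_tokens"] = max_tokens
    if stop is not None:
        if len(stop) > 1:
            raise ValueError("only a single stop sequence is supported")
        params["stop"] = stop
    return params


# Same sampling settings as the curl example.
params = sampling_params(temperature=0.7, max_tokens=1024)
```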
tools
object[]

Tool definitions. Only the function tool shape is currently supported.

Maximum array length: 128
tool_choice
enum<string>
default:auto

Tool selection strategy. This Zhipu interface currently exposes only auto.

Available options:
auto
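
To make the tools and tool_choice fields concrete, here is a hedged sketch of a single function tool. This page does not show the function tool shape, so the layout below (type, function.name, function.description, function.parameters as JSON Schema) is an assumption based on the common OpenAI-compatible format; get_weather and tools_payload are hypothetical names.

```python
MAX_TOOLS = 128  # documented maximum array length


def tools_payload(tools):
    """Return the tool-related request fields after basic checks."""
    if len(tools) > MAX_TOOLS:
        raise ValueError("at most 128 tools are allowed")
    for tool in tools:
        if tool.get("type") != "function":
            raise ValueError("only the function tool shape is supported")
    # "auto" is the only tool_choice value this interface exposes.
    return {"tools": tools, "tool_choice": "auto"}


# ASSUMPTION: OpenAI-style function tool layout; get_weather is made up.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

fields = tools_payload([weather_tool])
```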

Response

Success. Non-streaming mode returns JSON, while streaming mode returns an SSE event stream.

Non-streaming chat completion response.

id
string

Response ID.

object
string

Object type.

Example:

"chat.completion"

created
integer<int64>

Unix timestamp in seconds.

model
string

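Model that actually served the request. It can differ from the requested name: in the example on this page, the request names glm-5-turbo and the response reports glm-5.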

choices
object[]

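Candidate outputs returned by the model. In the example response, each entry carries index, message (with role, content, and reasoning_content), and finish_reason.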

usage
object

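Token usage statistics. The example response reports prompt_tokens, completion_tokens, total_tokens, and prompt_tokens_details.cached_tokens.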
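
A non-streaming response can be unpacked with a few dictionary lookups. The sketch below reads the fields that appear in the example response on this page; the summarize helper is hypothetical, and it uses .get for fields such as reasoning_content that may not be present on every response.

```python
def summarize(response):
    """Pull the commonly needed fields out of a chat.completion response."""
    choice = response["choices"][0]
    message = choice["message"]
    usage = response.get("usage", {})
    details = usage.get("prompt_tokens_details") or {}
    return {
        "model": response.get("model"),
        "content": message.get("content"),
        "reasoning": message.get("reasoning_content"),
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
        "cached_tokens": details.get("cached_tokens", 0),
    }


# Trimmed copy of the example response shown on this page.
example = {
    "id": "chatcmpl_zhipu_123",
    "object": "chat.completion",
    "created": 1775174400,
    "model": "glm-5",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "A vector database is a system designed to store and retrieve vector embeddings.",
                "reasoning_content": "Define the concept first, then add the main use cases.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 42,
        "completion_tokens": 31,
        "total_tokens": 73,
        "prompt_tokens_details": {"cached_tokens": 0},
    },
}

summary = summarize(example)
```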