curl --request POST \
  --url https://api.powertokens.ai/v1/audio/speech \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data @- <<EOF
{
  "model": "speech-2.8-hd",
  "input": "Summarize today's system status in a natural tone.",
  "voice": "Chinese (Mandarin)_Lyrical_Voice",
  "response_format": "mp3",
  "metadata": {
    "output_format": "url"
  }
}
EOF

Call the historical MiniMax t2a_v2 capability through the unified /v1/audio/speech endpoint.
Stable models currently exposed by the gateway: speech-2.8-hd, speech-2.8-turbo, speech-2.6-hd, speech-2.6-turbo, speech-02-hd, and speech-02-turbo.
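For callers who are not using curl, the same request can be assembled with Python's standard library. This is a minimal sketch that only builds the request object and does not send it; `build_speech_request` and `STABLE_MODELS` are illustrative names rather than part of the gateway, and the token is a placeholder.

```python
import json
import urllib.request

API_URL = "https://api.powertokens.ai/v1/audio/speech"

# Stable models listed in this document.
STABLE_MODELS = {
    "speech-2.8-hd", "speech-2.8-turbo",
    "speech-2.6-hd", "speech-2.6-turbo",
    "speech-02-hd", "speech-02-turbo",
}


def build_speech_request(token, text, voice, model="speech-2.8-hd",
                         response_format="mp3", output_format="url"):
    """Build (but do not send) the POST request shown in the curl example."""
    if model not in STABLE_MODELS:
        raise ValueError(f"unknown model: {model}")
    body = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "metadata": {"output_format": output_format},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_speech_request(
        "<token>",
        "Summarize today's system status in a natural tone.",
        "Chinese (Mandarin)_Lyrical_Voice",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it. Checking the model name client-side is a convenience of this sketch, not something the gateway is documented to require.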
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Model to use. Stable options: speech-2.8-hd, speech-2.8-turbo, speech-2.6-hd, speech-2.6-turbo, speech-02-hd, speech-02-turbo.
Input text to synthesize.
Voice name. Mapped to upstream voice_setting.voice_id.
Speaking rate. Mapped to upstream voice_setting.speed.
Target audio encoding format, mapped to upstream audio_setting.format. Common values include mp3, wav, and pcm.
Streaming trigger field. Any non-empty value enables upstream stream=true; the current gateway only relies on its non-empty semantics.
MiniMax extension container. The only stable field currently documented here is output_format.
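The field mappings described above can be pictured as a small translation function. This is only a sketch of the documented mappings, not the gateway's actual code: `to_upstream_payload` and the `stream_trigger` parameter are hypothetical names, and the top-level `text` and `output_format` keys are assumptions about the upstream request shape.

```python
def to_upstream_payload(model, text, voice, speed=None,
                        response_format=None, stream_trigger="",
                        metadata=None):
    """Translate gateway-style fields into an upstream-style payload (sketch)."""
    # voice -> voice_setting.voice_id, speed -> voice_setting.speed
    voice_setting = {"voice_id": voice}
    if speed is not None:
        voice_setting["speed"] = speed

    payload = {
        "model": model,
        "text": text,  # assumed upstream name for the input text
        "voice_setting": voice_setting,
        # Only non-emptiness matters: even the string "false" enables streaming.
        "stream": bool(stream_trigger),
    }

    # response_format -> audio_setting.format
    if response_format:
        payload["audio_setting"] = {"format": response_format}

    # metadata.output_format is the only stable extension field documented here;
    # placing it at the top level of the upstream payload is an assumption.
    if metadata and "output_format" in metadata:
        payload["output_format"] = metadata["output_format"]

    return payload
```

Because the trigger is evaluated only for non-emptiness, callers who do not want streaming should omit the field rather than send a value such as "false".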
Streaming success. Returns MiniMax SSE payloads. For internal settlement, the gateway reads extra_info.usage_characters as the unified character count for the audio input.
SSE payload. Each data: event can be parsed as a MiniMax response chunk containing data.audio, trace_id, and base_resp.
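A client consuming the stream might parse it roughly as follows. This is a hedged sketch: `iter_sse_chunks` and `collect_stream` are illustrative names, treating `base_resp.status_code == 0` as success is an assumption, and the audio strings are returned as-is because their encoding and chunk semantics are not specified here.

```python
import json


def iter_sse_chunks(lines):
    """Yield the JSON object carried by each `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        body = line[len("data:"):].strip()
        if body:
            yield json.loads(body)


def collect_stream(lines):
    """Gather audio pieces, the last trace_id, and the reported usage."""
    audio_pieces = []
    usage_characters = None
    trace_id = None
    for chunk in iter_sse_chunks(lines):
        base_resp = chunk.get("base_resp") or {}
        # Assumption: a non-zero status_code signals an upstream error.
        if base_resp.get("status_code", 0) != 0:
            raise RuntimeError(f"upstream error: {base_resp}")
        trace_id = chunk.get("trace_id", trace_id)
        piece = (chunk.get("data") or {}).get("audio")
        if piece:
            audio_pieces.append(piece)  # kept undecoded on purpose
        extra = chunk.get("extra_info") or {}
        if "usage_characters" in extra:
            usage_characters = extra["usage_characters"]
    return audio_pieces, usage_characters, trace_id
```

Any iterable of text lines works as input, for example the decoded lines of an HTTP response body. How the pieces are decoded and assembled into a playable file is left to the caller.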