Speech Synthesis - Mint Starter Kit

curl --request POST \
  --url https://api.powertokens.ai/v1/audio/speech \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "speech-2.8-hd",
  "input": "请用自然语气介绍今天的运行情况。",
  "voice": "Chinese (Mandarin)_Lyrical_Voice",
  "response_format": "mp3",
  "metadata": {
    "output_format": "url"
  }
}
'

Authorization

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
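
As a minimal sketch of the header described above, the following Python builds (but does not send) a request that carries the Bearer token. The helper name `build_speech_request` and the placeholder token are illustrative assumptions; the URL is the one used in the curl example.

```python
import json
import urllib.request

API_URL = "https://api.powertokens.ai/v1/audio/speech"  # taken from the curl example


def build_speech_request(token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with Bearer authentication (hypothetical helper)."""
    if not token:
        raise ValueError("an auth token is required")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "example-token" is a placeholder, not a real credential.
req = build_speech_request(
    "example-token",
    {
        "model": "speech-2.8-hd",
        "input": "Hello",
        "voice": "Chinese (Mandarin)_Lyrical_Voice",
    },
)
```

Passing `req` to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can run without network access.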

Request body

application/json

model

enum<string>

required

Available options:

speech-2.8-hd,

speech-2.8-turbo,

speech-2.6-hd,

speech-2.6-turbo,

speech-02-hd,

speech-02-turbo

input

string

required

The text to synthesize.

voice

string

required

Voice name. Mapped to the upstream voice_setting.voice_id.

speed

number

Speech speed. Mapped to the upstream voice_setting.speed.

response_format

string

Target audio encoding format. Mapped to the upstream audio_setting.format. Common values include mp3, wav, and pcm.

stream_format

string

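Streaming trigger field. Any non-empty value enables stream=true upstream; the project currently relies only on whether the value is non-empty and places no further constraint on the value itself.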
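
The field descriptions for voice, speed, response_format, and stream_format each name an upstream target. The sketch below illustrates that mapping in Python. It is an assumption about shape only: the function name `map_documented_fields` is hypothetical, fields whose upstream mapping is not described here (such as model and input) are left out, and the real gateway may behave differently, for example when stream_format is absent.

```python
def map_documented_fields(request: dict) -> dict:
    """Map the request fields whose upstream targets are documented (illustrative only)."""
    upstream = {"voice_setting": {"voice_id": request["voice"]}}

    if "speed" in request:
        upstream["voice_setting"]["speed"] = request["speed"]

    if "response_format" in request:
        upstream["audio_setting"] = {"format": request["response_format"]}

    # Only the "non-empty" semantics matter; the concrete value is ignored.
    if request.get("stream_format"):
        upstream["stream"] = True

    return upstream


mapped = map_documented_fields(
    {
        "voice": "Chinese (Mandarin)_Lyrical_Voice",
        "speed": 1.2,
        "response_format": "mp3",
        "stream_format": "sse",  # arbitrary non-empty value
    }
)
```

An empty string for stream_format does not count as non-empty, so in this sketch it leaves the `stream` key unset.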

metadata

object

Container for MiniMax extension fields. The only publicly stable field at present is output_format.
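
To tie the request body fields together, here is a minimal Python sketch that assembles a payload and checks the required fields and the model enum before anything is sent. The function name `build_payload` and its keyword arguments are hypothetical; only the JSON keys and the model list come from this page.

```python
ALLOWED_MODELS = {
    "speech-2.8-hd",
    "speech-2.8-turbo",
    "speech-2.6-hd",
    "speech-2.6-turbo",
    "speech-02-hd",
    "speech-02-turbo",
}


def build_payload(model, text, voice, *, speed=None, response_format=None,
                  stream_format=None, output_format=None) -> dict:
    """Assemble a request body, rejecting obviously invalid input (hypothetical helper)."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if not text:
        raise ValueError("input text is required")
    if not voice:
        raise ValueError("voice is required")

    payload = {"model": model, "input": text, "voice": voice}

    # Optional top-level fields are included only when the caller sets them.
    optional = {
        "speed": speed,
        "response_format": response_format,
        "stream_format": stream_format,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})

    # output_format is the only metadata field this page describes as stable.
    if output_format is not None:
        payload["metadata"] = {"output_format": output_format}

    return payload


payload = build_payload(
    "speech-2.8-hd",
    "请用自然语气介绍今天的运行情况。",
    "Chinese (Mandarin)_Lyrical_Voice",
    response_format="mp3",
    output_format="url",
)
```

The resulting `payload` matches the body of the curl example at the top of the page.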


Response

The streaming call succeeded and returns MiniMax SSE data. For billing, the project reads extra_info.usage_characters internally as the unified count of input characters.

SSE data stream. Each data: event can be deserialized into a MiniMax response fragment containing data.audio, trace_id, and base_resp.
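
A client consuming this stream could look roughly like the Python sketch below. It assumes each event arrives as a single `data:` line holding one JSON object; the function name `iter_sse_fragments` is hypothetical and the sample lines are made up for illustration, not captured output. The encoding of data.audio is not stated here, so the sketch leaves the chunks as strings.

```python
import json
from typing import Iterable, Iterator


def iter_sse_fragments(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON fragment per non-empty `data:` line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        body = line[len("data:"):].strip()
        if body:
            yield json.loads(body)


# Made-up sample events that follow the documented fragment shape.
sample_lines = [
    'data: {"data": {"audio": "00ff"}, "trace_id": "t-1", "base_resp": {}}',
    "",
    'data: {"data": {"audio": "10aa"}, "trace_id": "t-1", "base_resp": {}}',
]

fragments = list(iter_sse_fragments(sample_lines))
audio_chunks = [fragment["data"]["audio"] for fragment in fragments]
```

In real use, `lines` would be the decoded lines of the HTTP response body rather than a hard-coded list.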
