OpenAPI · Live Playground

Explore every endpoint and run real requests in the browser

Scalar loads the live OpenAPI schema (mirrored at /api/openapi). Authenticate the same way as the dashboard (Bearer Firebase ID token) or with an X-API-Key from Account → API keys. HTML/admin routes are meant for operations; secure them behind an edge proxy or VPN.

Batch speech-to-text API for serious files

Stop proxying gigabytes through your edge. Create a slot, PUT bytes straight to object storage, then let workers chew through the media while you poll a single job row.

Audio becomes structured text

Why presigned uploads matter

Your API servers should not become tape drives. VoiceChangerAPI issues short-lived PUT URLs so browsers, mobile apps, or FFmpeg pipelines can stream directly to the bucket that also feeds inference.

Batch STT reuses the same Job model as TTS: explicit states, credit reservation at start, and a transcript JSON endpoint so you do not have to parse binary payloads blindly.
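As a rough illustration of that explicit-state lifecycle, here is a minimal Python sketch of the job state machine. Only waiting_upload, queued, and completed are named on this page; the processing and failed states and the exact transition rules are assumptions for illustration, not the documented model.

```python
# Hypothetical sketch of the STT job lifecycle. waiting_upload, queued, and
# completed appear on this page; "processing" and "failed" are assumed states.
from enum import Enum

class JobState(str, Enum):
    WAITING_UPLOAD = "waiting_upload"  # slot created, bytes not yet PUT
    QUEUED = "queued"                  # start called, credits reserved
    PROCESSING = "processing"          # assumed: a worker is transcribing
    COMPLETED = "completed"            # transcript JSON is available
    FAILED = "failed"                  # assumed: terminal error state

# Assumed allowed transitions, for illustration only.
TRANSITIONS = {
    JobState.WAITING_UPLOAD: {JobState.QUEUED},
    JobState.QUEUED: {JobState.PROCESSING, JobState.FAILED},
    JobState.PROCESSING: {JobState.COMPLETED, JobState.FAILED},
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
}

def advance(current: JobState, target: JobState) -> JobState:
    """Move a job to `target`, rejecting illegal transitions."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Explicit transitions like these are why a client can safely treat the job row as the single source of truth while polling.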

Content creators, legal tech, and podcast platforms all land here when Whisper-class accuracy needs to coexist with enterprise storage hygiene.

End-to-end flow

This is the same three-step wizard you see under Speech to text in the dashboard—just expressed as REST.

  1. Create an upload slot

    POST /v1/stt/uploads?content_type=audio/wav (or video/mp4, etc.). You receive job_id, upload_url, upload_method, headers, and expires_in while the job waits in waiting_upload.

  2. PUT the bytes

    Use the returned method (PUT) and include the Content-Type header echoed in headers; signatures are strict. Once the upload succeeds you can move on, even though the dashboard will keep showing waiting_upload until you call start.

  3. Start transcription

    POST /v1/stt/jobs/{job_id}/start with optional JSON { language, model }. Credits are reserved here and the job enters queued.

  4. Poll and read JSON

    GET /v1/jobs/{job_id} for lifecycle. When status is completed, GET /v1/stt/jobs/{job_id}/transcript returns structured JSON, while GET /v1/jobs/{job_id}/result still exposes the raw object for archival.
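The four steps above can be sketched as a small client. This is a hedged illustration, not official SDK code: the endpoint paths and response fields (job_id, upload_url, upload_method, headers, status) come from this page, while the `http` callable is an injected stand-in for whatever HTTP client you use, and the failed check is an assumed terminal state.

```python
# Minimal sketch of the batch STT flow. Endpoint paths and response fields
# follow this page; `http` is an injected stand-in for your HTTP client
# (requests, httpx, ...) that returns parsed JSON.
import time
from typing import Any, Callable, Dict

Http = Callable[..., Dict[str, Any]]  # http(method, url, **kwargs) -> JSON dict

def transcribe(http: Http, api: str, audio: bytes,
               content_type: str = "audio/wav",
               poll_seconds: float = 2.0) -> Dict[str, Any]:
    # 1) Create an upload slot; the job starts in waiting_upload.
    slot = http("POST", f"{api}/v1/stt/uploads",
                params={"content_type": content_type})

    # 2) PUT the bytes with the exact headers echoed back; signatures are strict.
    http(slot["upload_method"], slot["upload_url"],
         headers=slot["headers"], data=audio)

    # 3) Start transcription; credits are reserved and the job enters queued.
    job_id = slot["job_id"]
    http("POST", f"{api}/v1/stt/jobs/{job_id}/start",
         json={"language": "en", "model": "default"})

    # 4) Poll the job row, then fetch the structured transcript JSON.
    while True:
        job = http("GET", f"{api}/v1/jobs/{job_id}")
        if job["status"] == "completed":
            return http("GET", f"{api}/v1/stt/jobs/{job_id}/transcript")
        if job["status"] == "failed":  # assumed terminal state
            raise RuntimeError("transcription failed")
        time.sleep(poll_seconds)
```

Injecting the transport keeps the sketch testable and makes it easy to add retries or backoff around a single seam.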

Create slot + start (trimmed)

After the PUT upload succeeds, call start. Authentication matches other endpoints.

# 1) Create the upload slot
curl -X POST "$API/v1/stt/uploads?content_type=audio%2Fwav" \
  -H "Authorization: Bearer $FIREBASE_JWT"

# 2) Upload the bytes to the presigned URL
curl --upload-file ./episode.wav "$UPLOAD_URL" \
  -H "Content-Type: audio/wav"

# 3) Start transcription
curl -X POST "$API/v1/stt/jobs/$JOB_ID/start" \
  -H "Authorization: Bearer $FIREBASE_JWT" \
  -H "Content-Type: application/json" \
  -d '{"language":"en","model":"default"}'
Run batch STT from the UI

Upload a file, watch each state transition, and copy job IDs when you are ready to wire up automation. The dashboard is your living integration checklist.

Large files? Keep using presigned uploads—your origin never touches the full binary.
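For the upload itself, the PUT can stream from disk so the full binary never sits in memory either. A minimal sketch: the chunk generator below is generic Python, not part of the VoiceChangerAPI surface, and the commented client call is hypothetical.

```python
# Stream a local file in fixed-size chunks so the whole binary is never
# buffered in memory; most HTTP clients accept such an iterator as a PUT body.
from typing import BinaryIO, Iterator

def chunked(f: BinaryIO, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield `chunk_size`-byte pieces of `f` until EOF."""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            return
        yield chunk

# Usage sketch (hypothetical client call against the presigned URL):
# with open("episode.wav", "rb") as f:
#     http("PUT", upload_url, headers=slot_headers, data=chunked(f))
```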