OpenAPI · Live Playground

Explore every endpoint and run real requests in the browser

Scalar loads the live OpenAPI schema (mirrored at /api/openapi). Authenticate the same way as the dashboard (Bearer Firebase ID token) or with an X-API-Key from Account → API keys. HTML/admin routes are meant for operations; secure them behind an edge proxy or VPN.

Batch speech-to-text API for serious files

Stop proxying gigabytes through your edge. Create a slot, PUT bytes straight to object storage, then let workers chew through the media while you poll a single job row.

Audio becomes structured text

Why presigned uploads matter

Your API servers should not become tape drives. VoiceChangerAPI issues short-lived PUT URLs so browsers, mobile apps, or FFmpeg pipelines can stream directly to the bucket that also feeds inference.

Batch STT reuses the same Job model as TTS: explicit states, credit reservation at start, and a transcript JSON endpoint so you do not have to parse binary payloads blindly.
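As a rough illustration of that explicit-state lifecycle, here is a minimal Python sketch of the job state machine. Only waiting_upload, queued, and completed are named on this page; the processing and failed states and the exact transition rules are assumptions for illustration, not the documented model.

```python
# Hypothetical sketch of the STT job lifecycle. waiting_upload, queued, and
# completed appear on this page; "processing" and "failed" are assumed states.
from enum import Enum

class JobState(str, Enum):
    WAITING_UPLOAD = "waiting_upload"  # slot created, bytes not yet PUT
    QUEUED = "queued"                  # start called, credits reserved
    PROCESSING = "processing"          # assumed: a worker is transcribing
    COMPLETED = "completed"            # transcript JSON is available
    FAILED = "failed"                  # assumed: terminal error state

# Assumed allowed transitions, for illustration only.
TRANSITIONS = {
    JobState.WAITING_UPLOAD: {JobState.QUEUED},
    JobState.QUEUED: {JobState.PROCESSING, JobState.FAILED},
    JobState.PROCESSING: {JobState.COMPLETED, JobState.FAILED},
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
}

def advance(current: JobState, target: JobState) -> JobState:
    """Move a job to `target`, rejecting illegal transitions."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Explicit transitions like these are why a client can safely treat the job row as the single source of truth while polling.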

Content creators, legal tech, and podcast platforms all land here when Whisper-class accuracy needs to coexist with enterprise storage hygiene.

End-to-end flow

This is the same three-step wizard you see under Speech to text in the dashboard—just expressed as REST.

  1. Create an upload slot

    POST /v1/stt/uploads?content_type=audio/wav (or video/mp4, etc.). You receive job_id, upload_url, upload_method, headers, and expires_in while the job waits in waiting_upload.

  2. PUT the bytes

    Use the returned method (PUT) and include the Content-Type header echoed in headers; signatures are strict. Once the upload succeeds you can move on, even though the dashboard will keep showing waiting_upload until you call start.

  3. Start transcription

    POST /v1/stt/jobs/{job_id}/start with optional JSON { language, model }. Credits are reserved here and the job enters queued.

  4. Poll and read JSON

    GET /v1/jobs/{job_id} for lifecycle. When status is completed, GET /v1/stt/jobs/{job_id}/transcript returns structured JSON, while GET /v1/jobs/{job_id}/result still exposes the raw object for archival.
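The four steps above can be sketched as a small client. This is a hedged illustration, not official SDK code: the endpoint paths and response fields (job_id, upload_url, upload_method, headers, status) come from this page, while the `http` callable is an injected stand-in for whatever HTTP client you use, and the failed check is an assumed terminal state.

```python
# Minimal sketch of the batch STT flow. Endpoint paths and response fields
# follow this page; `http` is an injected stand-in for your HTTP client
# (requests, httpx, ...) that returns parsed JSON.
import time
from typing import Any, Callable, Dict

Http = Callable[..., Dict[str, Any]]  # http(method, url, **kwargs) -> JSON dict

def transcribe(http: Http, api: str, audio: bytes,
               content_type: str = "audio/wav",
               poll_seconds: float = 2.0) -> Dict[str, Any]:
    # 1) Create an upload slot; the job starts in waiting_upload.
    slot = http("POST", f"{api}/v1/stt/uploads",
                params={"content_type": content_type})

    # 2) PUT the bytes with the exact headers echoed back; signatures are strict.
    http(slot["upload_method"], slot["upload_url"],
         headers=slot["headers"], data=audio)

    # 3) Start transcription; credits are reserved and the job enters queued.
    job_id = slot["job_id"]
    http("POST", f"{api}/v1/stt/jobs/{job_id}/start",
         json={"language": "en", "model": "default"})

    # 4) Poll the job row, then fetch the structured transcript JSON.
    while True:
        job = http("GET", f"{api}/v1/jobs/{job_id}")
        if job["status"] == "completed":
            return http("GET", f"{api}/v1/stt/jobs/{job_id}/transcript")
        if job["status"] == "failed":  # assumed terminal state
            raise RuntimeError("transcription failed")
        time.sleep(poll_seconds)
```

Injecting the transport keeps the sketch testable and makes it easy to add retries or backoff around a single seam.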

Create slot + start (trimmed)

After the PUT upload succeeds, call start. Authentication matches other endpoints.

# 1) Create the upload slot
curl -X POST "$API/v1/stt/uploads?content_type=audio%2Fwav" \
  -H "Authorization: Bearer $FIREBASE_JWT"

# 2) Upload the bytes to the presigned URL
curl --upload-file ./episode.wav "$UPLOAD_URL" \
  -H "Content-Type: audio/wav"

# 3) Start transcription
curl -X POST "$API/v1/stt/jobs/$JOB_ID/start" \
  -H "Authorization: Bearer $FIREBASE_JWT" \
  -H "Content-Type: application/json" \
  -d '{"language":"en","model":"default"}'
Run batch STT from the UI

Upload a file, watch each state transition, and copy job IDs when you are ready to wire up automation. The dashboard is your living integration checklist.

Large files? Keep using presigned uploads—your origin never touches the full binary.
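For the upload itself, the PUT can stream from disk so the full binary never sits in memory either. A minimal sketch: the chunk generator below is generic Python, not part of the VoiceChangerAPI surface, and the commented client call is hypothetical.

```python
# Stream a local file in fixed-size chunks so the whole binary is never
# buffered in memory; most HTTP clients accept such an iterator as a PUT body.
from typing import BinaryIO, Iterator

def chunked(f: BinaryIO, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield `chunk_size`-byte pieces of `f` until EOF."""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            return
        yield chunk

# Usage sketch (hypothetical client call against the presigned URL):
# with open("episode.wav", "rb") as f:
#     http("PUT", upload_url, headers=slot_headers, data=chunked(f))
```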