Documentation
Overview
Package audio provides bindings for the audio endpoints: transcription and translation convert audio into text, and speech generation converts text into audio.
Index
Constants

const (
	BaseEndpoint         = common.BaseURL + "audio/"
	TransciptionEndpoint = BaseEndpoint + "transcriptions"
	TranslationEndpoint  = BaseEndpoint + "translations"
	SpeechEndpoint       = BaseEndpoint + "speech"
)
const (
	// TODO: Support non-json return formats.
	ResponseFormatJSON = "json"

	// Deprecated: Use ResponseFormatJSON instead.
	JSONResponseFormat = ResponseFormatJSON
)
const (
	VoiceAlloy   = "alloy"
	VoiceEcho    = "echo"
	VoiceFable   = "fable"
	VoiceOnyx    = "onyx"
	VoiceNova    = "nova"
	VoiceShimmer = "shimmer"

	SpeechFormatMp3  = "mp3"
	SpeechFormatOpus = "opus"
	SpeechFormatAac  = "aac"
	SpeechFormatFlac = "flac"
)
Variables
This section is empty.
Functions
func MakeSpeechRequest
func MakeSpeechRequest(request *SpeechRequest, organizationID *string) ([]byte, error)
Types
type Response
type Response struct {
	Text  string                `json:"text"`
	Usage common.ResponseUsage  `json:"usage"`
	Error *common.ResponseError `json:"error,omitempty"`
}
Response structure for both Transcription and Translation requests.
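Response bodies from both endpoints decode with encoding/json. The sketch below copies the Response shape locally so it runs standalone; the field sets of common.ResponseUsage and common.ResponseError are not shown on this page, so the stand-ins here (an empty usage struct and an error carrying a message field) are assumptions, not the package's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ResponseUsage and ResponseError stand in for the common package's types;
// their real field sets are not documented here, so these are assumptions.
type ResponseUsage struct{}

type ResponseError struct {
	Message string `json:"message"`
}

// Response mirrors the struct documented above.
type Response struct {
	Text  string         `json:"text"`
	Usage ResponseUsage  `json:"usage"`
	Error *ResponseError `json:"error,omitempty"`
}

// decodeResponse unmarshals a raw endpoint body into a Response.
func decodeResponse(data []byte) (*Response, error) {
	var r Response
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	r, err := decodeResponse([]byte(`{"text":"Hello from the transcript."}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Text) // Hello from the transcript.
}
```

Because Error uses a pointer with omitempty, a nil Error distinguishes "no error" from an empty error object after decoding.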
func MakeTranscriptionRequest
func MakeTranscriptionRequest(request *TranscriptionRequest, organizationID *string) (*Response, error)
func MakeTranslationRequest
func MakeTranslationRequest(request *TranslationRequest, organizationID *string) (*Response, error)
type ResponseFormat
type ResponseFormat = string
type SpeechRequest
type SpeechRequest struct {
	// One of the available TTS models.
	Model string `json:"model"`

	// The text to generate audio for. The maximum length is 4096 characters.
	Input string `json:"input"`

	// The voice to use when generating the audio.
	Voice string `json:"voice"`

	// The format to return the audio in.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.
	Speed float64 `json:"speed,omitempty"`
}
Request structure for the create speech endpoint.
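The request above serializes directly to the JSON body sent to SpeechEndpoint. A minimal sketch of that wire format, with the struct copied locally so it runs standalone; the model name "tts-1" is an assumption for illustration, not something this page documents.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpeechRequest copies the struct documented above so the sketch runs standalone.
type SpeechRequest struct {
	Model          string  `json:"model"`
	Input          string  `json:"input"`
	Voice          string  `json:"voice"`
	ResponseFormat string  `json:"response_format,omitempty"`
	Speed          float64 `json:"speed,omitempty"`
}

// encodeSpeechRequest marshals a request into the JSON body for the endpoint.
func encodeSpeechRequest(r SpeechRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	req := SpeechRequest{
		Model:          "tts-1", // assumed model name; consult your provider's model list
		Input:          "Hello, world!",
		Voice:          "alloy", // the VoiceAlloy constant
		ResponseFormat: "mp3",   // the SpeechFormatMp3 constant
	}
	// Speed is zero here, so omitempty drops it from the body
	// and the endpoint's default of 1.0 applies.
	fmt.Println(encodeSpeechRequest(req))
}
```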
type TranscriptionRequest
type TranscriptionRequest struct {
	// The audio file to transcribe, in one of these formats:
	// mp3, mp4, mpeg, mpga, m4a, wav, or webm.
	// This can be a file path or a URL.
	File string `json:"file"`

	// ID of the model to use. You can use the List models API
	// to see all of your available models, or see our Model
	// overview for descriptions of them.
	Model string `json:"model"`

	// An optional text to guide the model's style or continue a
	// previous audio segment. The prompt should match the audio language.
	Prompt string `json:"prompt,omitempty"`

	// The format of the transcript output, in one of these options:
	// json, text, srt, verbose_json, or vtt.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The sampling temperature, between 0 and 1. Higher values like 0.8 will
	// make the output more random, while lower values like 0.2 will make it
	// more focused and deterministic. If set to 0, the model will use log
	// probability to automatically increase the temperature until certain
	// thresholds are hit.
	Temperature *float64 `json:"temperature,omitempty"`

	// The language of the input audio. Supplying the input language in
	// ISO-639-1 format will improve accuracy and latency.
	Language string `json:"language,omitempty"`
}
Request structure for the transcription endpoint.
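Temperature is a pointer so that an explicit 0 survives omitempty: a plain float64 field tagged omitempty could never send "temperature":0. A runnable sketch of that distinction, using a trimmed local copy of the struct (ResponseFormat omitted for brevity; the model name "whisper-1" is an assumption for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TranscriptionRequest is a trimmed local copy of the struct documented
// above, so the sketch runs standalone; the real package provides it.
type TranscriptionRequest struct {
	File        string   `json:"file"`
	Model       string   `json:"model"`
	Prompt      string   `json:"prompt,omitempty"`
	Temperature *float64 `json:"temperature,omitempty"`
	Language    string   `json:"language,omitempty"`
}

// encodeTranscriptionRequest marshals a request to its JSON body.
func encodeTranscriptionRequest(r TranscriptionRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// A nil Temperature omits the field entirely, leaving the endpoint default.
	noTemp := TranscriptionRequest{File: "meeting.mp3", Model: "whisper-1"}
	fmt.Println(encodeTranscriptionRequest(noTemp))

	// A pointer to 0 sends an explicit "temperature":0, which triggers the
	// automatic temperature behavior described above.
	zero := 0.0
	withTemp := TranscriptionRequest{File: "meeting.mp3", Model: "whisper-1", Temperature: &zero}
	fmt.Println(encodeTranscriptionRequest(withTemp))
}
```

The same pointer design applies to TranslationRequest below.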
type TranslationRequest
type TranslationRequest struct {
	// The audio file to translate, in one of these formats:
	// mp3, mp4, mpeg, mpga, m4a, wav, or webm.
	// This can be a file path or a URL.
	File string `json:"file"`

	// ID of the model to use. You can use the List models API
	// to see all of your available models, or see our Model
	// overview for descriptions of them.
	Model string `json:"model"`

	// An optional text to guide the model's style or continue a
	// previous audio segment. The prompt should be in English.
	Prompt string `json:"prompt,omitempty"`

	// The format of the transcript output, in one of these options:
	// json, text, srt, verbose_json, or vtt.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The sampling temperature, between 0 and 1. Higher values like 0.8 will
	// make the output more random, while lower values like 0.2 will make it
	// more focused and deterministic. If set to 0, the model will use log
	// probability to automatically increase the temperature until certain
	// thresholds are hit.
	Temperature *float64 `json:"temperature,omitempty"`
}

Request structure for the translation endpoint.