Documentation
Overview
Package audio provides bindings for the audio endpoints: transcription and translation convert audio into text, and speech generation converts text into audio.
Index
Constants

const (
	BaseEndpoint         = common.BaseURL + "audio/"
	TransciptionEndpoint = BaseEndpoint + "transcriptions"
	TranslationEndpoint  = BaseEndpoint + "translations"
	SpeechEndpoint       = BaseEndpoint + "speech"
)
const (
	// TODO: Support non-json return formats.
	ResponseFormatJSON = "json"

	// Deprecated: Use ResponseFormatJSON instead.
	JSONResponseFormat = ResponseFormatJSON
)
const (
	VoiceAlloy   = "alloy"
	VoiceEcho    = "echo"
	VoiceFable   = "fable"
	VoiceOnyx    = "onyx"
	VoiceNova    = "nova"
	VoiceShimmer = "shimmer"

	SpeechFormatMp3  = "mp3"
	SpeechFormatOpus = "opus"
	SpeechFormatAac  = "aac"
	SpeechFormatFlac = "flac"
)
Variables
This section is empty.
Functions
func MakeSpeechRequest
func MakeSpeechRequest(request *SpeechRequest, organizationID *string) ([]byte, error)
Types
type Response
type Response struct {
	Text  string                `json:"text"`
	Usage common.ResponseUsage  `json:"usage"`
	Error *common.ResponseError `json:"error,omitempty"`
}
Response structure for both Transcription and Translation requests.
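Response bodies from both endpoints decode with encoding/json. The sketch below copies the Response shape locally so it runs standalone; the field sets of common.ResponseUsage and common.ResponseError are not shown on this page, so the stand-ins here (an empty usage struct and an error carrying a message field) are assumptions, not the package's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ResponseUsage and ResponseError stand in for the common package's types;
// their real field sets are not documented here, so these are assumptions.
type ResponseUsage struct{}

type ResponseError struct {
	Message string `json:"message"`
}

// Response mirrors the struct documented above.
type Response struct {
	Text  string         `json:"text"`
	Usage ResponseUsage  `json:"usage"`
	Error *ResponseError `json:"error,omitempty"`
}

// decodeResponse unmarshals a raw endpoint body into a Response.
func decodeResponse(data []byte) (*Response, error) {
	var r Response
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	r, err := decodeResponse([]byte(`{"text":"Hello from the transcript."}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Text) // Hello from the transcript.
}
```

Because Error uses a pointer with omitempty, a nil Error distinguishes "no error" from an empty error object after decoding.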
func MakeTranscriptionRequest
func MakeTranscriptionRequest(request *TranscriptionRequest, organizationID *string) (*Response, error)
func MakeTranslationRequest
func MakeTranslationRequest(request *TranslationRequest, organizationID *string) (*Response, error)
type ResponseFormat
type ResponseFormat = string
type SpeechRequest
type SpeechRequest struct {
	// One of the available TTS models.
	Model string `json:"model"`

	// The text to generate audio for. The maximum length is 4096 characters.
	Input string `json:"input"`

	// The voice to use when generating the audio.
	Voice string `json:"voice"`

	// The format to return the audio in.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.
	Speed float64 `json:"speed,omitempty"`
}
Request structure for the create speech endpoint.
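The request above serializes directly to the JSON body sent to SpeechEndpoint. A minimal sketch of that wire format, with the struct copied locally so it runs standalone; the model name "tts-1" is an assumption for illustration, not something this page documents.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpeechRequest copies the struct documented above so the sketch runs standalone.
type SpeechRequest struct {
	Model          string  `json:"model"`
	Input          string  `json:"input"`
	Voice          string  `json:"voice"`
	ResponseFormat string  `json:"response_format,omitempty"`
	Speed          float64 `json:"speed,omitempty"`
}

// encodeSpeechRequest marshals a request into the JSON body for the endpoint.
func encodeSpeechRequest(r SpeechRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	req := SpeechRequest{
		Model:          "tts-1", // assumed model name; consult your provider's model list
		Input:          "Hello, world!",
		Voice:          "alloy", // the VoiceAlloy constant
		ResponseFormat: "mp3",   // the SpeechFormatMp3 constant
	}
	// Speed is zero here, so omitempty drops it from the body
	// and the endpoint's default of 1.0 applies.
	fmt.Println(encodeSpeechRequest(req))
}
```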
type TranscriptionRequest
type TranscriptionRequest struct {
	// The audio file to transcribe, in one of these formats:
	// mp3, mp4, mpeg, mpga, m4a, wav, or webm.
	// This can be a file path or a URL.
	File string `json:"file"`

	// ID of the model to use. You can use the List models API
	// to see all of your available models, or see our Model
	// overview for descriptions of them.
	Model string `json:"model"`

	// An optional text to guide the model's style or continue a
	// previous audio segment. The prompt should match the audio language.
	Prompt string `json:"prompt,omitempty"`

	// The format of the transcript output, in one of these options:
	// json, text, srt, verbose_json, or vtt.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The sampling temperature, between 0 and 1. Higher values like 0.8 will
	// make the output more random, while lower values like 0.2 will make it
	// more focused and deterministic. If set to 0, the model will use log
	// probability to automatically increase the temperature until certain
	// thresholds are hit.
	Temperature *float64 `json:"temperature,omitempty"`

	// The language of the input audio. Supplying the input language in
	// ISO-639-1 format will improve accuracy and latency.
	Language string `json:"language,omitempty"`
}
Request structure for the transcription endpoint.
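Temperature is a pointer so that an explicit 0 survives omitempty: a plain float64 field tagged omitempty could never send "temperature":0. A runnable sketch of that distinction, using a trimmed local copy of the struct (ResponseFormat omitted for brevity; the model name "whisper-1" is an assumption for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TranscriptionRequest is a trimmed local copy of the struct documented
// above, so the sketch runs standalone; the real package provides it.
type TranscriptionRequest struct {
	File        string   `json:"file"`
	Model       string   `json:"model"`
	Prompt      string   `json:"prompt,omitempty"`
	Temperature *float64 `json:"temperature,omitempty"`
	Language    string   `json:"language,omitempty"`
}

// encodeTranscriptionRequest marshals a request to its JSON body.
func encodeTranscriptionRequest(r TranscriptionRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// A nil Temperature omits the field entirely, leaving the endpoint default.
	noTemp := TranscriptionRequest{File: "meeting.mp3", Model: "whisper-1"}
	fmt.Println(encodeTranscriptionRequest(noTemp))

	// A pointer to 0 sends an explicit "temperature":0, which triggers the
	// automatic temperature behavior described above.
	zero := 0.0
	withTemp := TranscriptionRequest{File: "meeting.mp3", Model: "whisper-1", Temperature: &zero}
	fmt.Println(encodeTranscriptionRequest(withTemp))
}
```

The same pointer design applies to TranslationRequest below.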
type TranslationRequest
type TranslationRequest struct {
	// The audio file to translate, in one of these formats:
	// mp3, mp4, mpeg, mpga, m4a, wav, or webm.
	// This can be a file path or a URL.
	File string `json:"file"`

	// ID of the model to use. You can use the List models API
	// to see all of your available models, or see our Model
	// overview for descriptions of them.
	Model string `json:"model"`

	// An optional text to guide the model's style or continue a
	// previous audio segment. The prompt should be in English.
	Prompt string `json:"prompt,omitempty"`

	// The format of the transcript output, in one of these options:
	// json, text, srt, verbose_json, or vtt.
	ResponseFormat ResponseFormat `json:"response_format,omitempty"`

	// The sampling temperature, between 0 and 1. Higher values like 0.8 will
	// make the output more random, while lower values like 0.2 will make it
	// more focused and deterministic. If set to 0, the model will use log
	// probability to automatically increase the temperature until certain
	// thresholds are hit.
	Temperature *float64 `json:"temperature,omitempty"`
}

Request structure for the translation endpoint.