vad

package
v0.0.0-...-a4649ec
Warning

This package is not in the latest version of its module.

Published: Feb 1, 2024 License: MIT Imports: 5 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func VAD

func VAD(frame []float32, energyThresh, silenceThresh float32) (bool, float32, float32)

VAD performs voice activity detection on a frame of audio data.

NOTE: This is a very rough implementation and should be improved.
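The page does not document how VAD decides that a frame contains speech, nor what its three return values mean. As a rough illustration of the general technique only, the sketch below shows a simple energy-threshold detector. The function name `energyVAD`, its single threshold, and its two return values are assumptions made for this example and are not the package's API.

```go
package main

import (
	"fmt"
	"math"
)

// energyVAD is a hypothetical stand-in, NOT the package's VAD function.
// It computes the RMS energy of the frame and reports whether that energy
// exceeds energyThresh. An empty frame is treated as silence.
func energyVAD(frame []float32, energyThresh float32) (bool, float32) {
	if len(frame) == 0 {
		return false, 0
	}
	var sum float64
	for _, s := range frame {
		sum += float64(s) * float64(s)
	}
	rms := float32(math.Sqrt(sum / float64(len(frame))))
	return rms > energyThresh, rms
}

func main() {
	// 10 ms of audio at 16 kHz is 160 samples.
	silence := make([]float32, 160)
	loud := make([]float32, 160)
	for i := range loud {
		loud[i] = 0.5
	}

	active, rms := energyVAD(silence, 0.1)
	fmt.Println(active, rms)

	active, rms = energyVAD(loud, 0.1)
	fmt.Println(active, rms)
}
```

A real detector would typically smooth the decision over several frames so that short pauses inside an utterance are not reported as silence.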

Types

type CapturedAudio

type CapturedAudio struct {
	ID string `json:"id"`

	PCM     []float32     `json:"-"`
	Packets []*rtp.Packet `json:"-"`

	Final bool `json:"final"`

	StartTimestamp uint64 `json:"start"`
	EndTimestamp   uint64 `json:"end"`
}
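The struct tags above mean that PCM and Packets are skipped during JSON encoding, while the remaining fields are written under the keys `id`, `final`, `start`, and `end`. The sketch below demonstrates this with a local mirror of the struct. The type `capturedAudio` and the helper `encode` are hypothetical names for this example, and the Packets field is left out so that the example depends only on the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// capturedAudio is a local mirror of CapturedAudio for illustration only.
// The Packets field is omitted here to avoid a third-party import.
type capturedAudio struct {
	ID string `json:"id"`

	PCM []float32 `json:"-"` // never serialized

	Final bool `json:"final"`

	StartTimestamp uint64 `json:"start"`
	EndTimestamp   uint64 `json:"end"`
}

// encode returns the JSON form of a, or an empty string on error.
func encode(a capturedAudio) string {
	b, err := json.Marshal(a)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	a := capturedAudio{
		ID:             "seg-1",
		PCM:            []float32{0.1, 0.2},
		Final:          true,
		StartTimestamp: 0,
		EndTimestamp:   1500,
	}
	fmt.Println(encode(a))
	// prints {"id":"seg-1","final":true,"start":0,"end":1500}
}
```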

type CapturedSample

type CapturedSample struct {
	PCM          []float32
	EndTimestamp uint32
	Packet       *rtp.Packet
}

type Config

type Config struct {
	// SampleRate is the audio sample rate in Hz, for example 16000 (16 kHz).
	// This is determined by the hyperparameter configuration that whisper was trained on.
	// See more here: https://github.com/ggerganov/whisper.cpp/issues/909
	SampleRate int

	// SampleWindow determines how much audio is passed to whisper inference,
	// for example a 24 second sample window.
	// Up to (whisperSampleWindowMs - pcmSampleRateMs) of old audio is buffered, and
	// audioSampleRateMs of new audio is then added onto the end of the buffer for inference.
	SampleWindow time.Duration
}

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

func New

func New(config Config) *Engine

func (*Engine) Push

func (e *Engine) Push(captured *CapturedSample) *CapturedAudio
