vad

package

Version: v0.0.0-...-a4649ec | Published: Feb 1, 2024 | License: MIT | Imports: 5 | Imported by: 0

Repository

github.com/progrium/webrtc-sessions

Documentation ¶

Index ¶

func VAD(frame []float32, energyThresh, silenceThresh float32) (bool, float32, float32)
type CapturedAudio
type CapturedSample
type Config
type Engine
- func New(config Config) *Engine
- func (e *Engine) Push(captured *CapturedSample) *CapturedAudio

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func VAD ¶

func VAD(frame []float32, energyThresh, silenceThresh float32) (bool, float32, float32)

VAD performs voice activity detection on a frame of audio data.

NOTE: This is a very rough implementation and should be improved.
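The documentation does not say how VAD decides that a frame contains speech, nor what the three return values mean. As a rough illustration of what an energy-threshold check can look like, the sketch below computes the mean squared amplitude of a frame and compares it to a threshold. The helpers `frameEnergy` and `isSpeech` are hypothetical and are not part of this package; the sketch also ignores the `silenceThresh` parameter, because its role is not documented.

```go
package main

import "fmt"

// frameEnergy returns the mean squared amplitude of a PCM frame.
// Hypothetical helper for illustration; not part of the vad package.
func frameEnergy(frame []float32) float32 {
	if len(frame) == 0 {
		return 0
	}
	var sum float32
	for _, s := range frame {
		sum += s * s
	}
	return sum / float32(len(frame))
}

// isSpeech reports whether the frame's mean energy exceeds energyThresh,
// and returns the computed energy alongside the decision.
// This is an assumed, simplified stand-in for an energy-based detector.
func isSpeech(frame []float32, energyThresh float32) (bool, float32) {
	e := frameEnergy(frame)
	return e > energyThresh, e
}

func main() {
	quiet := []float32{0.001, -0.002, 0.001, 0.0}
	loud := []float32{0.5, -0.4, 0.6, -0.5}
	for _, f := range [][]float32{quiet, loud} {
		active, energy := isSpeech(f, 0.01)
		fmt.Printf("active=%v energy=%.4f\n", active, energy)
	}
}
```

The threshold of 0.01 is an arbitrary value chosen for the example; a real threshold would depend on the input level and noise floor.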

Types ¶

type CapturedAudio ¶

type CapturedAudio struct {
	ID string `json:"id"`

	PCM     []float32     `json:"-"`
	Packets []*rtp.Packet `json:"-"`

	Final bool `json:"final"`

	StartTimestamp uint64 `json:"start"`
	EndTimestamp   uint64 `json:"end"`
}

type CapturedSample ¶

type CapturedSample struct {
	PCM          []float32
	EndTimestamp uint32
	Packet       *rtp.Packet
}

type Config ¶

type Config struct {
	// SampleRate is the audio sample rate in Hz. It is determined by the
	// hyperparameter configuration that whisper was trained on (16000, i.e. 16 kHz).
	// See more here: https://github.com/ggerganov/whisper.cpp/issues/909
	SampleRate int
	// SampleWindow determines how much audio is passed to whisper inference.
	// Old audio is buffered up to the window size minus the newly captured audio,
	// and the new audio is appended to the end of the buffer for inference.
	// The source comment suggests a 24 second sample window.
	SampleWindow time.Duration
}

type Engine ¶

type Engine struct {
	// contains filtered or unexported fields
}

func New ¶

func New(config Config) *Engine

func (*Engine) Push ¶

func (e *Engine) Push(captured *CapturedSample) *CapturedAudio

Source Files ¶

vad.go