auditory

package module

v0.9.1 Latest Latest Go to latest Published: Aug 14, 2021 License: BSD-3-Clause Imports: 0 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/emer/auditory

Links

Open Source Insights

README ¶

auditory

Auditory is the our repository for audition processing code in Go (golang) focused on filtering speech wav files via mel filters. A further step using gabors provides filtering for input to neural networks. The processing code is split into 4 packages, sound, mel, dft and agabor, that can be used independently. A fifth package, trm, is a work in progress port of Gnuspeech. Example code is in examples/processspeech.

The sound package contains code for loading a wav file into a buffer and then converting to a floating point tensor. There are functions for trimming and padding.

The 'dft' package does a fourier transform and computes the power spectrum on the sound samples passed in.

The 'mel' package creates a set of mel filter banks and applies them to the power data to create a spectrogram.

The 'agabor' package produces an edge detector that detects oriented contrast transitions between light and dark which can be convolved with the output of the mel processing.

11/13/20 - Important fix to example code (processspeech.go). The mel InitFilters call was passing WindowSamples/2 rather than WindowSamples. The effect was to cut off the upper half of frequencies in the mel output.

Documentation ¶

There is no documentation for this package.

Source Files ¶

View all Source files

doc.go

Directories ¶

Path	Synopsis
agabor
dft
examples
processspeech
mel
sound

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL