vfilter

package
v2.0.0-dev0.1.0
Published: Jul 16, 2024 License: BSD-3-Clause Imports: 13 Imported by: 3

Documentation

Overview

Package vfilter provides filtering methods for the vision package. These apply tensor.Tensor filters to a 2D visual input via the Conv (convolution) function, using a filter-parallel approach: each goroutine processes a different filter in a set of filters, e.g., different angles of Gabor filters. This is coarse-grained and strictly parallel, and thus very efficient.
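The filter-parallel idea can be sketched with plain slices. This is a minimal illustration and not the package's Conv: convOne and convParallel are hypothetical names, the image and filters are flat row-major []float32 values rather than tensor.Float32, and the filter is applied without flipping.

```go
package main

import (
	"fmt"
	"sync"
)

// convOne applies one square filter (fsz x fsz, row-major) to a row-major
// image of width w and height h, producing one value per position where
// the filter fits entirely inside the image. Hypothetical stand-in.
func convOne(img []float32, w, h int, flt []float32, fsz int) []float32 {
	ow, oh := w-fsz+1, h-fsz+1
	out := make([]float32, ow*oh)
	for y := 0; y < oh; y++ {
		for x := 0; x < ow; x++ {
			var sum float32
			for fy := 0; fy < fsz; fy++ {
				for fx := 0; fx < fsz; fx++ {
					sum += img[(y+fy)*w+x+fx] * flt[fy*fsz+fx]
				}
			}
			out[y*ow+x] = sum
		}
	}
	return out
}

// convParallel runs one goroutine per filter. Each goroutine writes only
// to its own output slot, so no locking is needed.
func convParallel(img []float32, w, h int, flts [][]float32, fsz int) [][]float32 {
	outs := make([][]float32, len(flts))
	var wg sync.WaitGroup
	for i := range flts {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			outs[i] = convOne(img, w, h, flts[i], fsz)
		}(i)
	}
	wg.Wait()
	return outs
}

func main() {
	img := []float32{
		1, 2, 3,
		4, 5, 6,
		7, 8, 9,
	}
	flts := [][]float32{
		{1, 0, 0, 0}, // picks the top-left pixel of each 2x2 window
		{0, 0, 0, 1}, // picks the bottom-right pixel of each 2x2 window
	}
	outs := convParallel(img, 3, 3, flts, 2)
	fmt.Println(outs[0], outs[1]) // prints "[1 2 4 5] [5 6 8 9]"
}
```

Because every goroutine owns a whole filter, the work units share only read-only input, which is what makes this kind of parallelism coarse-grained.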

image.go contains routines for converting an image into the float32 tensor.Float32 that is required for doing the convolution. For example, RGBToGrey converts an RGB image to a greyscale float32 tensor.

MaxPool function does Max-pooling over filtered results to reduce dimensionality, consistent with standard DCNN approaches.

Geom manages the geometry for going from an input image to the filtered output of that image.

Unlike the C++ version, no wrapping or clipping is supported directly: all input images must be padded with a border wide enough that the filters can be applied with no bounds issues. See WrapPad for wrapping-based padding.

Constants

const (
	// Version is the version of this package being used
	Version = "v2.0.0-dev0.0.8"
	// GitCommit is the commit just before the latest version commit
	GitCommit = "cc4e0c3"
	// VersionDate is the date-time of the latest version commit in UTC (in the format 'YYYY-MM-DD HH:MM', which is the Go format '2006-01-02 15:04')
	VersionDate = "2024-02-05 22:52"
)

Variables

This section is empty.

Functions

func Conv

func Conv(geom *Geom, flt *tensor.Float32, img, out *tensor.Float32, gain float32)

Conv performs convolution of filter over img into out. img *must* have a border (padding) so that filters are applied without any bounds checking; wrapping etc. is all done in the padding process, which is much more efficient. Computation is parallel over the number of different filter types (outer dim of flt), as that will be most memory efficient. img must be a 2D tensor of image values (convert RGB to grey first). Everything must be organized row major, as is the tensor default. Out shape dims are: Y, X, Polarity (2), Angle, where the 2 polarities (on, off) are for positive and negative filter values, respectively.
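The Y, X, Polarity (2), Angle output layout can be illustrated with a flat row-major slice. This is a sketch under assumptions: outIndex and storePolarity are hypothetical helpers, and storing a negative response as a positive magnitude in the off channel is an assumption about the convention, not something the description above states.

```go
package main

import "fmt"

// outIndex returns the flat row-major offset into a
// [Y, X, Polarity(2), Angle] layout. Hypothetical helper.
func outIndex(y, x, pol, ang, nx, nAng int) int {
	return ((y*nx+x)*2+pol)*nAng + ang
}

// storePolarity splits a signed filter response into on (pol 0) and
// off (pol 1) channels. Assumption: the off channel holds the magnitude
// of a negative response, and the unused channel is zeroed.
func storePolarity(out []float32, y, x, ang, nx, nAng int, resp, gain float32) {
	resp *= gain
	var on, off float32
	if resp > 0 {
		on = resp
	} else {
		off = -resp
	}
	out[outIndex(y, x, 0, ang, nx, nAng)] = on
	out[outIndex(y, x, 1, ang, nx, nAng)] = off
}

func main() {
	const ny, nx, nAng = 1, 2, 4
	out := make([]float32, ny*nx*2*nAng)
	storePolarity(out, 0, 0, 1, nx, nAng, 0.5, 2)   // positive response
	storePolarity(out, 0, 1, 3, nx, nAng, -0.25, 2) // negative response
	fmt.Println(out[outIndex(0, 0, 0, 1, nx, nAng)], out[outIndex(0, 1, 1, 3, nx, nAng)]) // prints "1 0.5"
}
```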

func Conv1

func Conv1(geom *Geom, flt *tensor.Float32, img, out *tensor.Float32, gain float32)

Conv1 performs convolution of a single filter over img into out. img *must* have a border (padding) so that filters are applied without any bounds checking; wrapping etc. is all done in the padding process, which is much more efficient. Computation is parallel in image lines. img must be a 2D tensor of image values (convert RGB to grey first). Everything must be organized row major, as is the tensor default. Output has 2 outer dims for positive vs. negative values; inner is Y, X. TODO: add option to interleave polarity as inner-most dim.

func ConvDiff

func ConvDiff(geom *Geom, fltOn, fltOff *tensor.Float32, imgOn, imgOff, out *tensor.Float32, gain, gainOn float32)

ConvDiff computes the difference of two separate filter convolutions (fltOn - fltOff) over two images into out. There are separate gain multipliers for On (gainOn) and overall (gain). The images *must* have a border (padding) so that filters are applied without any bounds checking; wrapping etc. is all done in the padding process, which is much more efficient. Computation is parallel in image lines. Each image must be a 2D tensor of image values (grey or single components). Everything must be organized row major, as is the tensor default. Output has 2 outer dims for positive vs. negative values; inner is Y, X.

func Deconv

func Deconv(geom *Geom, flt *tensor.Float32, img, out *tensor.Float32, gain float32)

Deconv performs reverse convolution of filter: given the output of the filter, it accumulates an input image as the sum of filter * output activation. img *must* have a border (padding) so that filters are applied without any bounds checking; wrapping etc. is all done in the padding process, which is much more efficient. img must be a 2D tensor of image values (convert RGB to grey first). Everything must be organized row major, as is the tensor default. Out shape dims are: Y, X, Polarity (2), Angle, where the 2 polarities (on, off) are for positive and negative filter values, respectively.

func EdgeAvg

func EdgeAvg(tsr *tensor.Float32, padWidth int) float32

EdgeAvg returns the average value around the effective edge of the image, at padWidth in from each side.
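One way to read "effective edge" is the one-cell-wide ring that sits padWidth cells in from each side. The sketch below averages that ring on a flat row-major slice; edgeAvg is a hypothetical helper, and counting each corner cell once is an assumption that may differ from the package's behaviour.

```go
package main

import "fmt"

// edgeAvg averages the ring of cells located pad cells in from each side
// of a row-major w x h image. Hypothetical stand-in for EdgeAvg; each
// ring cell, including corners, is counted exactly once (an assumption).
func edgeAvg(vals []float32, w, h, pad int) float32 {
	var sum float32
	n := 0
	for y := pad; y < h-pad; y++ {
		for x := pad; x < w-pad; x++ {
			if y == pad || y == h-1-pad || x == pad || x == w-1-pad {
				sum += vals[y*w+x]
				n++
			}
		}
	}
	if n == 0 {
		return 0
	}
	return sum / float32(n)
}

func main() {
	// 4x4 image with a 1-cell zero border around a 2x2 interior.
	vals := []float32{
		0, 0, 0, 0,
		0, 1, 2, 0,
		0, 3, 4, 0,
		0, 0, 0, 0,
	}
	fmt.Println(edgeAvg(vals, 4, 4, 1)) // prints "2.5"
}
```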

func FadePad

func FadePad(tsr *tensor.Float32, padWidth int)

FadePad fades the given padding width of a float32 image around its sides, gradually fading the edge value toward a mean edge value.

func FadePadRGB

func FadePadRGB(tsr *tensor.Float32, padWidth int)

FadePadRGB fades the given padding width of a float32 image around its sides, gradually fading the edge value toward a mean edge value. The RGB version iterates over the outer-most dimension of components.

func FeatAgg

func FeatAgg(srcRows []int, trgStart int, src, out *tensor.Float32)

FeatAgg does simple aggregation of feature rows from one feature map to another. One row (inner-most of 4D dimensions) is assumed to be an angle, common across feature rows. srcRows is the list of rows in the source to copy. trgStart is the starting row in the output at which to start the copy; srcRows will be contiguous in the output from that row up. No bounds checking is done on the output, so it will just fail if there isn't enough room: allocate the output size before calling!

func GreyTensorToImage

func GreyTensorToImage(img *image.Gray, tsr *tensor.Float32, padWidth int, topZero bool) *image.Gray

GreyTensorToImage converts a greyscale tensor to image -- uses existing img if it is of correct size, otherwise makes a new one. padWidth is the amount of padding to subtract from all sides. topZero retains the Y=0 value at the top of the tensor -- otherwise it is flipped with Y=0 at the bottom to be consistent with the emergent / OpenGL standard coordinate system

func LeftHalf

func LeftHalf(x int) int

LeftHalf returns the left / top half of a filter

func MaxPool

func MaxPool(psize, spc image.Point, in, out *tensor.Float32)

MaxPool performs max-pooling over the given pool size and spacing. Size must equal spacing or 2 * spacing. Pooling is sensitive to the feature structure of the input, which must have shape: Y, X, Polarities, Angles.
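The effect of pool size versus spacing can be seen in a small plain-slice sketch. This is not the package's MaxPool: maxPool is a hypothetical function, the two feature dimensions are collapsed into a single Feat dimension pooled independently, and the output size formula (in - psize)/spc + 1 is the standard one, assumed here rather than taken from the source.

```go
package main

import "fmt"

// maxPool pools a row-major [Y, X, Feat] slice with a square psize window
// stepped by spc, taking the max separately for each feature.
// Hypothetical stand-in; the output size formula is an assumption.
func maxPool(in []float32, ny, nx, nf, psize, spc int) (out []float32, oy, ox int) {
	oy = (ny-psize)/spc + 1
	ox = (nx-psize)/spc + 1
	out = make([]float32, oy*ox*nf)
	for y := 0; y < oy; y++ {
		for x := 0; x < ox; x++ {
			for f := 0; f < nf; f++ {
				mx := in[((y*spc)*nx+x*spc)*nf+f]
				for py := 0; py < psize; py++ {
					for px := 0; px < psize; px++ {
						v := in[((y*spc+py)*nx+x*spc+px)*nf+f]
						if v > mx {
							mx = v
						}
					}
				}
				out[(y*ox+x)*nf+f] = mx
			}
		}
	}
	return out, oy, ox
}

func main() {
	in := []float32{
		1, 2, 3, 4,
		5, 6, 7, 8,
		9, 10, 11, 12,
		13, 14, 15, 16,
	}
	out, oy, ox := maxPool(in, 4, 4, 1, 2, 2) // size == spacing: no overlap
	fmt.Println(out, oy, ox)                  // prints "[6 8 14 16] 2 2"
}
```

With psize = 2 and spc = 1 (size = 2 * spacing), adjacent windows overlap by half, and the same input yields a 3 x 3 output.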

func MaxReduceFilterY

func MaxReduceFilterY(in, out *tensor.Float32)

MaxReduceFilterY performs a max-pooling reduce over the inner Filter Y dimension (polarities, colors). The input must have shape: Y, X, Polarities, Angles.

func OuterAgg

func OuterAgg(innerPos, rowOff int, src, out *tensor.Float32)

OuterAgg does simple aggregation of the outer-most dimension from a tensor into another 4D tensor, with Y, X as the outer-most two dimensions, starting at the given inner-most feature offset and inner row-wise offset. The inner row-wise dimension maps the outer-most dimension of the source tensor. No bounds checking is done on the output, so it will just fail if there isn't enough room: allocate the output size before calling!

func RGBTensorToImage

func RGBTensorToImage(img *image.RGBA, tsr *tensor.Float32, padWidth int, topZero bool) *image.RGBA

RGBTensorToImage converts an RGB tensor to image -- uses existing image if it is of correct size, otherwise makes a new one. tensor must have outer dimension as RGB components. padWidth is the amount of padding to subtract from all sides. topZero retains the Y=0 value at the top of the tensor -- otherwise it is flipped with Y=0 at the bottom to be consistent with the emergent / OpenGL standard coordinate system

func RGBToGrey

func RGBToGrey(img image.Image, tsr *tensor.Float32, padWidth int, topZero bool)

RGBToGrey converts an RGB input image to a greyscale tensor in preparation for processing. padWidth is the amount of padding to add on all sides. topZero retains the Y=0 value at the top of the tensor -- otherwise it is flipped with Y=0 at the bottom to be consistent with the emergent / OpenGL standard coordinate system

func RGBToTensor

func RGBToTensor(img image.Image, tsr *tensor.Float32, padWidth int, topZero bool)

RGBToTensor converts an RGB input image to an RGB tensor with outer dimension as RGB components. padWidth is the amount of padding to add on all sides. topZero retains the Y=0 value at the top of the tensor -- otherwise it is flipped with Y=0 at the bottom to be consistent with the emergent / OpenGL standard coordinate system

func TensorLogNorm

func TensorLogNorm(tsr tensor.Tensor, ndim int)

TensorLogNorm computes 1 + log of all the numbers and then does Max Div renorm so that the result is normalized in the 0-1 range. It is computed on the first ndim dims of the tensor, where 0 = all values, 1 = norm each of the sub-dimensions under the first outer-most dimension, etc. ndim must be < NumDims() if not 0 (panics otherwise).

func UnPool

func UnPool(psize, spc image.Point, in, out *tensor.Float32, rnd bool)

UnPool performs inverse max-pooling over the given pool size and spacing. This is very dumb and either uses a random number if rnd = true, or just copies the max pooled value over all of the individual elements that were pooled. A smarter solution would require maintaining the index of the max item, but that requires more infrastructure. Size must equal spacing or 2 * spacing. Pooling is sensitive to the feature structure of the input, which must have shape: Y, X, Polarities, Angles.

func WrapPad

func WrapPad(tsr *tensor.Float32, padWidth int)

WrapPad wraps the given padding width of a float32 image around its sides, i.e., padding for the left side of the image is the (mirrored) bits from the right side of the image, etc.
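Wrap-around padding can be sketched on a flat row-major slice. The version below is a plain toroidal wrap, where each padding cell copies the interior cell one interior width (or height) away; wrapPad is a hypothetical function, and whether this matches the "(mirrored)" wording above exactly is an assumption.

```go
package main

import "fmt"

// wrapPad fills the pad-wide border of a row-major w x h image by wrapping
// the interior around toroidally: left padding comes from the right side of
// the interior, top padding from the bottom, and so on.
// Hypothetical stand-in; requires w > 2*pad and h > 2*pad.
func wrapPad(vals []float32, w, h, pad int) {
	iw, ih := w-2*pad, h-2*pad // interior size
	for y := 0; y < h; y++ {
		sy := ((y-pad)%ih+ih)%ih + pad // source row, always in the interior
		for x := 0; x < w; x++ {
			sx := ((x-pad)%iw+iw)%iw + pad // source column, always in the interior
			if sx != x || sy != y {
				vals[y*w+x] = vals[sy*w+sx]
			}
		}
	}
}

func main() {
	vals := []float32{
		0, 0, 0, 0,
		0, 1, 2, 0,
		0, 3, 4, 0,
		0, 0, 0, 0,
	}
	wrapPad(vals, 4, 4, 1)
	for y := 0; y < 4; y++ {
		fmt.Println(vals[y*4 : y*4+4])
	}
	// prints:
	// [4 3 4 3]
	// [2 1 2 1]
	// [4 3 4 3]
	// [2 1 2 1]
}
```

Only interior cells are ever read as sources, so the border can be filled in place in a single pass.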

func WrapPadRGB

func WrapPadRGB(tsr *tensor.Float32, padWidth int)

WrapPadRGB wraps the given padding width of a float32 image around its sides, i.e., padding for the left side of the image is the (mirrored) bits from the right side of the image, etc. The RGB version iterates over the outer-most dimension of components.

Types

type Geom

type Geom struct {

	// size of input -- computed from image or set
	In image.Point

	// size of output -- computed
	Out image.Point

	// starting border into image -- must be >= FiltRt
	Border image.Point

	// spacing -- number of pixels to skip in each direction
	Spacing image.Point

	// full size of filter
	FiltSz image.Point

	// computed size of left/top size of filter
	FiltLt image.Point

	// computed size of right/bottom size of filter (FiltSz - FiltLt)
	FiltRt image.Point
}

Geom contains the filtering geometry info for a given filter pass.

func (*Geom) Set

func (ge *Geom) Set(border, spacing, filtSz image.Point)

Set sets the basic geometry params

func (*Geom) SetSize

func (ge *Geom) SetSize(inSize image.Point)

SetSize sets the input size, and computes output from that.

func (*Geom) UpdtFilt

func (ge *Geom) UpdtFilt()

UpdtFilt updates filter sizes, and ensures that Border >= FiltRt
