imghash

Version: v0.0.0-...-6afea89
Published: Oct 18, 2015 License: BSD-1-Clause Imports: 10 Imported by: 0

README

imghash

imghash computes the Perceptual Hash for a given input image. The hash is returned as a 64-bit integer. The package comes with two command-line tools: img-index and img-find. Refer to their respective READMEs for information on what they do.

Note that this toolset is mainly for educational purposes on my part. It is a partial implementation of the techniques described in an article on hackerfactor.com.

The package supports these hashing modes:

  • Average: Average computes a Perceptual Hash using a naive but very fast method. It tolerates minor colour changes and adjustments to brightness and contrast, and it is indifferent to differences in aspect ratio and image size.

    Average Hash is a great algorithm if you are looking for something specific. For example, if we have a small thumbnail of an image and want to know whether the full-size original exists somewhere in our collection, Average Hash will find it very quickly. However, if the image has been modified, for instance if text was added or a head was spliced into place, Average Hash probably won't do the job.

    The Average Hash is quick and easy, but it can produce false misses if gamma correction or a color histogram adjustment has been applied to the image. This is because the colors move along a non-linear scale, which changes where the "average" is located and therefore which bits are above or below it.

More may come at some point.

Usage

go get github.com/jteeuwen/imghash

License

Unless otherwise stated, all of the work in this project is subject to a 1-clause BSD license. Its contents can be found in the enclosed LICENSE file.

Documentation

Overview

imghash computes the Perceptual Hash for a given input image. The Perceptual Hash is returned as a 64-bit integer.

Two images can be compared by computing the hash of each and counting the number of bit positions that differ. This is the Hamming distance. A distance of zero indicates that the pictures are likely very similar (or variations of the same picture). A distance of 5 means a few things may be different, but the pictures are probably still close enough to be considered similar. A distance of 10 or more probably means a very different picture.
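The comparison described above can be sketched with the standard library alone. This is a minimal illustration, not the package's own code: hammingDistance and classify are hypothetical helper names, and the label used for distances between 5 and 10 is an assumption, since the overview only gives rough guidance for 0, 5 and 10.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hammingDistance counts the bit positions in which a and b differ.
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

// classify maps a distance onto the rough guidance from the overview.
// The cut-offs are rules of thumb; the "uncertain" band is an assumption.
func classify(d int) string {
	switch {
	case d == 0:
		return "very similar"
	case d <= 5:
		return "probably similar"
	case d < 10:
		return "uncertain"
	default:
		return "probably different"
	}
}

func main() {
	a := uint64(0xFF00FF00FF00FF00)
	b := uint64(0xFF00FF00FF00FF07) // differs from a in the three lowest bits
	d := hammingDistance(a, b)
	fmt.Println(d, classify(d)) // prints: 3 probably similar
}
```

XOR leaves a 1 in every position where the two hashes disagree, so counting the set bits of the result gives the distance directly.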

Functions

func Average

func Average(img image.Image) uint64

Average computes a Perceptual Hash using a naive but very fast method. It tolerates minor colour changes and adjustments to brightness and contrast, and it is indifferent to differences in aspect ratio and image size.

Average Hash is a great algorithm if you are looking for something specific. For example, if we have a small thumbnail of an image and want to know whether the full-size original exists somewhere in our collection, Average Hash will find it very quickly. However, if the image has been modified, for instance if text was added or a head was spliced into place, Average Hash probably won't do the job.

The Average Hash is quick and easy, but it can produce false misses if gamma correction or a color histogram adjustment has been applied to the image. This is because the colors move along a non-linear scale, which changes where the "average" is located and therefore which bits are above or below it.
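To make the idea concrete, here is a rough sketch of an average hash using only the standard library. It is not the implementation used by this package: the function names, the 8x8 box-averaging step and the bit order are all assumptions chosen for illustration. It reduces the image to 64 grayscale cells, takes their mean, and sets one bit per cell that is brighter than the mean.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// averageHashSketch illustrates the average-hash idea. It is a hypothetical
// stand-in, not the code behind Average.
func averageHashSketch(img image.Image) uint64 {
	b := img.Bounds()
	if b.Empty() {
		return 0
	}
	var cells [64]uint64 // mean gray value of each cell in an 8x8 grid
	var total uint64
	for cy := 0; cy < 8; cy++ {
		for cx := 0; cx < 8; cx++ {
			x0, x1 := span(b.Min.X, b.Dx(), cx)
			y0, y1 := span(b.Min.Y, b.Dy(), cy)
			var sum, n uint64
			for y := y0; y < y1; y++ {
				for x := x0; x < x1; x++ {
					g := color.GrayModel.Convert(img.At(x, y)).(color.Gray)
					sum += uint64(g.Y)
					n++
				}
			}
			cells[cy*8+cx] = sum / n
			total += sum / n
		}
	}
	mean := total / 64
	var hash uint64
	for i, v := range cells {
		if v > mean {
			hash |= 1 << uint(i) // bit order is an arbitrary choice here
		}
	}
	return hash
}

// span returns the half-open pixel range of cell i along one axis,
// always at least one pixel wide.
func span(origin, size, i int) (int, int) {
	lo := origin + i*size/8
	hi := origin + (i+1)*size/8
	if hi <= lo {
		hi = lo + 1
	}
	return lo, hi
}

// splitImage builds a test image whose left half has gray value left and
// whose right half has gray value right.
func splitImage(w, h int, left, right uint8) image.Image {
	img := image.NewGray(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			v := right
			if x < w/2 {
				v = left
			}
			img.SetGray(x, y, color.Gray{Y: v})
		}
	}
	return img
}

func main() {
	small := averageHashSketch(splitImage(16, 16, 255, 0))
	large := averageHashSketch(splitImage(64, 32, 200, 50))
	fmt.Printf("%016x %016x\n", small, large)
}
```

The two test images differ in size, aspect ratio, brightness and contrast, yet both reduce to the same bit pattern, which is the robustness the description above refers to.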

func Distance

func Distance(a, b uint64) uint64

Distance calculates the Hamming Distance between the two input hashes.

Types

type Database

type Database struct {
	Root string // Database root path.
	// contains filtered or unexported fields
}

A Database holds a listing of Perceptual hashes, mapped to image file paths.

Note: This is a very naive implementation that can benefit a great deal from optimization.

func NewDatabase

func NewDatabase() *Database

NewDatabase creates a new, empty database.

func (*Database) AddEntry

func (d *Database) AddEntry(entry *Entry)

func (*Database) DeleteEntry

func (d *Database) DeleteEntry(index int)

DeleteEntry removes the entry at the given index without reshuffling the whole database. Note that this means the underlying array may contain nil elements.
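The consequence for callers is that any loop over the entries has to skip empty slots. The following is a small sketch of that pattern under assumed names (entry, deleteAt and livePaths are hypothetical and are not part of the package):

```go
package main

import "fmt"

// entry is a hypothetical stand-in for the package's Entry type.
type entry struct {
	Path string
	Hash uint64
}

// deleteAt clears the slot at index instead of shifting later elements,
// so the indices of all other entries stay valid.
func deleteAt(entries []*entry, index int) {
	if index < 0 || index >= len(entries) {
		return
	}
	entries[index] = nil
}

// livePaths collects the paths of all remaining entries, skipping nil slots.
func livePaths(entries []*entry) []string {
	var paths []string
	for _, e := range entries {
		if e == nil {
			continue
		}
		paths = append(paths, e.Path)
	}
	return paths
}

func main() {
	entries := []*entry{
		{Path: "a.png", Hash: 1},
		{Path: "b.png", Hash: 2},
		{Path: "c.png", Hash: 3},
	}
	deleteAt(entries, 1)
	fmt.Println(len(entries), livePaths(entries)) // prints: 3 [a.png c.png]
}
```

Leaving a nil slot makes deletion constant-time and keeps previously returned indices stable, at the cost of a nil check on every traversal.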

func (*Database) Find

func (d *Database) Find(hash, distance uint64) ResultSet

Find returns all entries whose Hamming Distance to the given hash is less than or equal to the specified distance. The list is sorted by relevance.
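A search of this kind can be sketched as a linear scan followed by a sort. The code below is an illustration only: entry, result and find are hypothetical names, and it assumes that "sorted by relevance" means ascending Hamming distance.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// entry and result are hypothetical stand-ins for Entry and SearchResult.
type entry struct {
	Path string
	Hash uint64
}

type result struct {
	Path     string
	Hash     uint64
	Distance uint64
}

// find scans all entries and keeps those within maxDistance of hash,
// ordered from closest to furthest (assumed meaning of "relevance").
func find(entries []*entry, hash, maxDistance uint64) []result {
	var out []result
	for _, e := range entries {
		if e == nil { // slots may be empty after a delete
			continue
		}
		d := uint64(bits.OnesCount64(e.Hash ^ hash))
		if d <= maxDistance {
			out = append(out, result{Path: e.Path, Hash: e.Hash, Distance: d})
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Distance < out[j].Distance
	})
	return out
}

func main() {
	entries := []*entry{
		{Path: "far.png", Hash: 0xFFFF},
		{Path: "near.png", Hash: 0x3},
		nil,
		{Path: "exact.png", Hash: 0x1},
	}
	for _, r := range find(entries, 0x1, 5) {
		fmt.Println(r.Path, r.Distance)
	}
	// prints:
	// exact.png 0
	// near.png 1
}
```

A full scan costs one XOR and one popcount per entry, which fits the note that the database is a naive implementation with room for optimization.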

func (*Database) IndexFile

func (d *Database) IndexFile(file string) int

IndexFile returns the index for the given file.

func (*Database) IndexHash

func (d *Database) IndexHash(hash uint64) []int

IndexHash returns the indices for files with the given hash. There can be more than one of them.

func (*Database) IsNew

func (d *Database) IsNew(file string, modtime int64) bool

IsNew returns true if the given file has been updated since it was last stored in the database.

func (*Database) Load

func (d *Database) Load(file string) (err error)

Load loads a database from the given file. Leave the filename empty to use the default file.

func (*Database) Save

func (d *Database) Save(file string) (err error)

Save saves the database to the given file. Leave the filename empty to use the default file.

func (*Database) Set

func (d *Database) Set(file string, modtime int64, hash uint64)

Set adds the given file if it doesn't already exist. Otherwise it overwrites the existing one.
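The add-or-overwrite behaviour of Set and the freshness check of IsNew can be sketched together. Everything here is hypothetical: the db type and its methods are illustrative names, and two details are assumptions rather than documented behaviour, namely that an unknown file counts as new and that a missing file is reported as index -1.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for the package's Entry type.
type entry struct {
	Path    string
	Hash    uint64
	ModTime int64
}

// db is a minimal illustrative database, not the package's Database.
type db struct {
	entries []*entry
}

// indexFile returns the slot holding path, or -1 if absent (assumption).
func (d *db) indexFile(path string) int {
	for i, e := range d.entries {
		if e != nil && e.Path == path {
			return i
		}
	}
	return -1
}

// set overwrites the entry for path if it exists and appends it otherwise.
func (d *db) set(path string, modtime int64, hash uint64) {
	e := &entry{Path: path, Hash: hash, ModTime: modtime}
	if i := d.indexFile(path); i >= 0 {
		d.entries[i] = e
		return
	}
	d.entries = append(d.entries, e)
}

// isNew reports whether path was modified after it was last stored.
// Treating an unknown file as new is an assumption of this sketch.
func (d *db) isNew(path string, modtime int64) bool {
	i := d.indexFile(path)
	if i < 0 {
		return true
	}
	return modtime > d.entries[i].ModTime
}

func main() {
	d := &db{}
	d.set("cat.png", 100, 0xABCD)
	fmt.Println(d.isNew("cat.png", 100)) // prints: false
	fmt.Println(d.isNew("cat.png", 200)) // prints: true
	d.set("cat.png", 200, 0x1234)
	fmt.Println(len(d.entries), d.isNew("cat.png", 200)) // prints: 1 false
}
```

An indexer would typically call isNew before hashing a file, so that unchanged images are not decoded and hashed again.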

type Entry

type Entry struct {
	Path    string // Image path, relative to Database.Root
	Hash    uint64 // Perceptual Image hash.
	ModTime int64  // Last-Modified timestamp for this file.
}

Entry represents a single database entry.

type HashFunc

type HashFunc func(image.Image) uint64

A HashFunc computes a Perceptual Hash for a given image.

type ResultSet

type ResultSet []*SearchResult

ResultSet holds search results, sortable by Hamming Distance.

func (ResultSet) Len

func (r ResultSet) Len() int

func (ResultSet) Less

func (r ResultSet) Less(i, j int) bool

func (ResultSet) Swap

func (r ResultSet) Swap(i, j int)
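Len, Less and Swap together satisfy sort.Interface from the standard library, which is what makes a result set sortable with sort.Sort. The sketch below shows the pattern with hypothetical lower-case types; ordering by ascending distance is an assumption based on the description of ResultSet.

```go
package main

import (
	"fmt"
	"sort"
)

// searchResult and resultSet are hypothetical stand-ins for the
// package's SearchResult and ResultSet types.
type searchResult struct {
	Path     string
	Distance uint64
}

type resultSet []*searchResult

// The three methods below implement sort.Interface.
func (r resultSet) Len() int           { return len(r) }
func (r resultSet) Less(i, j int) bool { return r[i].Distance < r[j].Distance }
func (r resultSet) Swap(i, j int)      { r[i], r[j] = r[j], r[i] }

// sortedPaths sorts rs in place by distance and returns the paths in order.
func sortedPaths(rs resultSet) []string {
	sort.Sort(rs)
	paths := make([]string, 0, len(rs))
	for _, r := range rs {
		paths = append(paths, r.Path)
	}
	return paths
}

func main() {
	rs := resultSet{
		{Path: "c.png", Distance: 7},
		{Path: "a.png", Distance: 0},
		{Path: "b.png", Distance: 3},
	}
	fmt.Println(sortedPaths(rs)) // prints: [a.png b.png c.png]
}
```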

type SearchResult

type SearchResult struct {
	Path     string // Image path, relative to Database.Root
	Hash     uint64 // Perceptual Image hash.
	Distance uint64 // Hamming Distance to search term.
}

SearchResult is returned by Database.Find.

Directories

Path Synopsis
img-find accepts the path to a single image.
img-index accepts the path to a given directory.
