imagehash2

v1.0.3
Published: Aug 30, 2024 License: MIT Imports: 3 Imported by: 0

README

Fast similar image search with Go (LATEST version)

Resized and near-duplicate image search for large image collections (thousands, millions, and more). The package generates 'real' hashes to be used in hash tables, and consumes very little memory. It is recommended to cross-check the similarity result with the more precise (but slower) images4 package.

Demo (a usage scenario for image similarity search).

Algorithm for nearest neighbour vector search by vector quantization.

Go doc.

Major (semantic) versions have their own repositories and are mutually incompatible:

Major version   Repository          Comment
2               imagehash2 (this)   recommended, with improved precision
1               imagehash           as fast, but has a minor generalization issue

Parameters

The most important parameter is numBuckets. It defines the granularity of hyper-space quantization: the higher the value, the more restrictive the comparison. When used together with the images4 package, a higher numBuckets also considerably accelerates the search, because fewer image ids fall into a single quantization cell.

The second parameter is epsilon, which can be safely set to 0.25.

Example of comparison for 2 photos

The demo shows only the hash-based similarity comparison (without building an actual hash table). A full implementation, however, implies a hash table, typically a Go map.

package main

import (
	"fmt"
	"github.com/vitali-fedulov/imagehash2"
	"github.com/vitali-fedulov/images4"
)

const (
	// Recommended initial parameters.

	// Increase this value to get higher precision.
	numBuckets = 4

	// No need to change epsilon value.
	epsilon = 0.25
)

func main() {

	// Open and decode photos (skipping error handling for clarity).
	img1, _ := images4.Open("1.jpg")
	img2, _ := images4.Open("2.jpg")

	// Icons are compact image representations needed for comparison.
	icon1 := images4.Icon(img1)
	icon2 := images4.Icon(img2)

	// Hash table values.

	// Value to save to the hash table as a KEY with corresponding
	// image ids. Table structure: map[centralHash][]imageId.
	// imageId is simply an image number in a directory tree.
	// centralHash is uint64.
	centralHash := imagehash2.CentralHash9(icon1, epsilon, numBuckets)

	// Hash set to be used as a QUERY to the hash table.
	// Each hash from the hashSet must be checked against the hash table.
	// The length of hashSet is different for each image.
	// The most frequent length is 1.
	hashSet := imagehash2.HashSet9(icon2, epsilon, numBuckets)

	// As if we are searching in the table.

	foundSimilarImage := false

	// Checking hash matches. In full implementation this will be done
	// on the map mentioned above.
	for _, hash := range hashSet { // Query. Check full hashSet.

		if centralHash == hash { // Sub-query hash found in the table.
			foundSimilarImage = true
			break
		}

	}

	// Comparison result.
	if foundSimilarImage {

		fmt.Println("Images are *approximately* similar.")

		// It is recommended to cross-check the result with
		// the higher-precision func Similar from package images4.
		if images4.Similar(icon1, icon2) {
			fmt.Println("Images are similar")
		}

	} else {
		fmt.Println("Images are distinct.")
	}

}

Documentation

Functions

func CentralHash9

func CentralHash9(icon images4.IconT, epsilon float64, numBuckets int) uint64

CentralHash9 generates a central hash for an icon. This hash can then be used for a record or a query. When it is used for a record, the query needs a hash set made with func HashSet9, and vice versa. The two roles are interchangeable, which leaves room for optimization. The hash is generated from the 9 average luma values of nine 3x3 pixel blocks in the central 9x9 area of the icon.

func HashSet9

func HashSet9(icon images4.IconT, epsilon float64, numBuckets int) []uint64

HashSet9 generates a hash set for an icon. This hash set can then be used for a record or a query. When it is used for a query, the record needs a hash made with func CentralHash9, and vice versa.
