images4

v1.1.2
Warning

This package is not in the latest version of its module.

Published: Aug 10, 2022 | License: MIT | Imports: 7 | Imported by: 8

README

Find similar images with Go (LATEST VERSION)

Near duplicates and resized images can be found with the package. No dependencies.

Demo: similar image clustering based on a similar algorithm.

Versions: There are 3 version repositories in total: images (oldest), images3 (older), and this major version images4 (latest, recommended).

Changes in images4 vs images3 are: simplified func Icon, more than 2x reduction of icon memory footprint, removal of dependencies, removal of hashes (a separate new package imagehash can be used for fast large-scale preliminary search), fixed GIF support, new func IconNN. The overall goal of v4 is simplification and memory footprint reduction.

Key functions

Open supports JPEG, PNG and GIF. Other image types can be used through third-party decoders, because the input for func Icon is a Go image.Image.

Icon produces "image hashes" called "icons", which will be used for comparison.

Similar gives a verdict on whether 2 images are similar, using well-tested default thresholds.

EucMetric can be used instead, when you need different precision or want to sort by similarity. Func PropMetric can be used for customization of the image proportion threshold.

Go doc for code reference.

Example of comparing 2 images

package main

import (
	"fmt"
	"github.com/vitali-fedulov/images4"
)

func main() {

	// Photos to compare.
	path1 := "1.jpg"
	path2 := "2.jpg"

	// Open files (discarding errors here).
	img1, _ := images4.Open(path1)
	img2, _ := images4.Open(path2)

	// Icons are compact image representations (image "hashes").
	// The name "hash" is intentionally not used.
	icon1 := images4.Icon(img1)
	icon2 := images4.Icon(img2)

	// Comparison.
	// Images are not used directly. Icons are used instead,
	// because they have a tiny memory footprint and are fast to compare.
	if images4.Similar(icon1, icon2) {
		fmt.Println("Images are similar.")
	} else {
		fmt.Println("Images are distinct.")
	}
}

Algorithm

Detailed explanation, also as a PDF.

Summary: Images are resized in a special way to squares of fixed size called "icons". Euclidean distance between the icons is used to give the similarity verdict. Image proportions are also used, to avoid matching images of distinct shape.

Customization suggestions

To increase precision you can either use your own thresholds in func EucMetric (and PropMetric) OR generate icons for image sub-regions and compare those icons.

To speed up file processing you may want to generate icons for available image thumbnails. Specifically, many JPEG images contain EXIF thumbnails, so you could considerably speed up the reads by feeding decoded thumbnails into func Icon. External packages to read thumbnails: 1 and 2. A note of caution: in rare cases there could be issues with thumbnails not matching image content. EXIF standard specification: 1 and 2.

To search in very large image collections (billions or more), use preliminary hash-table-based filtering with package imagehash.

Documentation

Index

Constants

const (

	// Image resolution of the icon is very small
	// (11x11 pixels), therefore original image details
	// are lost in downsampling, except when source images
	// have very low resolution (e.g. favicons or simple
	// logos). This is useful from the privacy perspective
	// if you are to use generated icons in a large searchable
	// database.
	IconSize = 11 // Exported to be used in package imagehash.

)

Variables

This section is empty.

Functions

func EucMetric

func EucMetric(iconA, iconB IconT) (m1, m2, m3 float64)

EucMetric returns Euclidean distances between 2 icons. These are 3 metrics corresponding to each color channel. Distances are squared, to avoid wasting CPU on square root calculations. Note: color channels of icons are YCbCr (not RGB).
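
A minimal sketch of how the three returned values might be used for sorting by similarity or for custom thresholds. It does not call the package: the candidate type, the numbers in main, summing the channels, and the threshold values are all assumptions made for illustration. In real use, M1, M2 and M3 would come from EucMetric.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate holds the three squared distances that func EucMetric
// returns for one compared image. The values in main are made up.
type candidate struct {
	Path       string
	M1, M2, M3 float64
}

// total combines the channels by simple addition (an assumption;
// the channels could also be weighted or compared separately).
func (c candidate) total() float64 { return c.M1 + c.M2 + c.M3 }

// sortBySimilarity orders candidates from most to least similar,
// that is, by ascending combined distance.
func sortBySimilarity(cs []candidate) {
	sort.Slice(cs, func(i, j int) bool { return cs[i].total() < cs[j].total() })
}

// within reports whether every channel is below its own threshold.
// Threshold values are application-specific and hypothetical here.
func within(c candidate, t1, t2, t3 float64) bool {
	return c.M1 < t1 && c.M2 < t2 && c.M3 < t3
}

func main() {
	cs := []candidate{
		{"a.jpg", 900, 50, 50},
		{"b.jpg", 10, 5, 5},
		{"c.jpg", 200, 20, 20},
	}
	sortBySimilarity(cs)
	for _, c := range cs {
		fmt.Println(c.Path, c.total(), within(c, 100, 10, 10))
	}
}
```

Because the distances are already squared, any threshold chosen this way must also be expressed in squared units.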

func Get (added in v1.1.0)

func Get(icon IconT, size int, p image.Point) (c1, c2, c3 float64)

Get reads pixel values in an icon at a point. c1, c2, c3 are color values for each channel (RGB for example). Size is icon size. Exported to be used in package imagehash.

func Open

func Open(path string) (img image.Image, err error)

Open opens and decodes an image file for a given path.

func PropMetric

func PropMetric(iconA, iconB IconT) (m float64)

PropMetric gives an image proportion similarity metric for images A and B. The smaller the metric, the more similar the images are by their x-y size.

func SaveToJPG

func SaveToJPG(img *image.RGBA, path string, quality int)

SaveToJPG encodes and saves image.RGBA to a file.

func SaveToPNG

func SaveToPNG(img *image.RGBA, path string)

SaveToPNG encodes and saves image.RGBA to a file.

func Set (added in v1.1.0)

func Set(icon IconT, size int, p image.Point, c1, c2, c3 float64)

Set places pixel values in an icon at a point. c1, c2, c3 are color values for each channel (RGB for example). Size is icon size. Exported to be used in package imagehash.

func Similar

func Similar(iconA, iconB IconT) bool

Similar returns similarity verdict based on Euclidean and proportion similarity.

Types

type IconT

type IconT struct {
	Pixels  []uint16    // Visual signature.
	ImgSize image.Point // Original image size.
}

An icon has a square shape. Its pixels are uint16 values in 3 channels. uint16 is intentional, to preserve color relationships from the full-size image. Each pixel is a 255-premultiplied color value in the [0, 255] range.

func EmptyIcon

func EmptyIcon() (icon IconT)

EmptyIcon is an icon constructor for cases where you need an icon with nil values, for example for convenient error handling. You can then test the condition icon.Pixels == nil.

func Icon

func Icon(img image.Image) IconT

Icon generates a normalized image signature ("icon"). Generated icons can then be stored in a database and used for comparison. Icon is the recommended function, versus the less robust func IconNN.

func IconNN

func IconNN(img image.Image) IconT

IconNN generates a NON-normalized image signature (icon). Icons made with IconNN can be used instead of icons made with func Icon, but mostly for experimental purposes, such as better understanding how the algorithm works or performing less aggressive customized normalization. Not for general use.

func (IconT) ToRGBA

func (icon IconT) ToRGBA(size int) *image.RGBA

ToRGBA transforms a sized icon to image.RGBA. This is an auxiliary function to visually evaluate an icon.
