search

Version: v0.0.0-...-fb6cc8f (latest) | Published: Jan 20, 2023 | License: Apache-2.0, MIT | Imports: 3 | Imported by: 0

Repository

github.com/healeycodes/crane-search

README ¶

Crane 🐦

My blog post: WebAssembly Search Tools for Static Sites

Crane is a technical demo inspired by Stork, and it uses a near-identical configuration file setup. So it had to be named after a bird too.

I wrote it to help me understand how WebAssembly search tools work. Please use Stork instead.

Crane is two programs. The first program scans a group of documents and builds an efficient index. 1MB of text and metadata is turned into a 25KB index (14KB gzipped). The second program is a Wasm module that is sent to the browser along with a little bit of JavaScript glue code and the index. The result is an instant search engine that helps users find web pages as they type.

Visit the demo

The full-text search engine is powered in part by code from Artem Krylysov's blog post Let's build a Full-Text Search engine.

No effort has been made to shrink the Wasm binary. See Reducing the size of Wasm files.

Use it

Describe your document files and their metadata.

[input]
files = [
    { path = "docs/essays/essay01.txt", url = "essays/essay01.txt", title = "Introduction" },
    # etc.
]

[output]
filename = "dist/federalist.crane"

Pass the configuration file to the index build script. Rebuild the index whenever your documents change; the Wasm module only needs to be built once.

./build-index.sh federalist.toml
./build-search.sh

Host the files from /dist on your website (e.g. wasm_exec.js, crane.js, crane.wasm, federalist.crane). And away you go!

const crane = new Crane("crane.wasm", "federalist.crane");
await crane.load();

const results = crane.query('some keywords');
console.log(results);

See the demo inside /docs for a basic UI.

Build demo page

./gh-pages.sh

Documentation ¶

Index ¶

func Analyze(text string) []string
func Intersection(a []int, b []int) []int
func LowercaseFilter(tokens []string) []string
func StemmerFilter(tokens []string) []string
func StopwordFilter(tokens []string) []string
func Tokenize(text string) []string
type Document
type Index
- func (index Index) Add(docs []Document)
- func (index Index) Search(text string) []int
type Result
type Store

Functions ¶

func Analyze ¶

func Analyze(text string) []string

Analyze analyzes the text and returns a slice of tokens.
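
Judging by the exported function names, Analyze plausibly chains Tokenize with the filters listed in the index. The sketch below shows one such pipeline under that assumption. The lower-case helper names, the order of the steps, and the five-word stop list are illustrative and not taken from the package, and stemming is left out because it needs a dedicated stemming library.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize splits text on every rune that is neither a letter nor a digit.
func tokenize(text string) []string {
	return strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
}

// lowercase returns a new slice with every token lower-cased.
func lowercase(tokens []string) []string {
	out := make([]string, len(tokens))
	for i, t := range tokens {
		out[i] = strings.ToLower(t)
	}
	return out
}

// stopwords is a tiny illustrative list, not the package's own list.
var stopwords = map[string]struct{}{
	"a": {}, "and": {}, "of": {}, "the": {}, "to": {},
}

// removeStopwords drops every token found in stopwords.
func removeStopwords(tokens []string) []string {
	out := make([]string, 0, len(tokens))
	for _, t := range tokens {
		if _, found := stopwords[t]; !found {
			out = append(out, t)
		}
	}
	return out
}

// analyze chains the steps: tokenize, then lower-case, then stop-word removal.
func analyze(text string) []string {
	return removeStopwords(lowercase(tokenize(text)))
}

func main() {
	fmt.Println(analyze("The Powers of the Union!")) // [powers union]
}
```

Lower-casing runs before stop-word removal so that "The" and "the" are treated alike. The same pipeline has to be applied to both documents and queries, otherwise query tokens would not match the indexed tokens.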

func Intersection ¶

func Intersection(a []int, b []int) []int

Intersection returns the set intersection of a and b. Both a and b must be sorted in ascending order and contain no duplicates.
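
The sorted, duplicate-free precondition is what makes a single linear pass possible. As a minimal sketch of that idea (the name intersectSorted is hypothetical, and this is not necessarily how the package implements it), a two-pointer walk looks like this:

```go
package main

import "fmt"

// intersectSorted returns the values present in both a and b.
// Both inputs must be sorted ascending and contain no duplicates.
func intersectSorted(a, b []int) []int {
	// The result can never be longer than the shorter input.
	capacity := len(a)
	if len(b) < capacity {
		capacity = len(b)
	}
	result := make([]int, 0, capacity)

	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++ // a[i] is too small to appear later in b
		case a[i] > b[j]:
			j++ // b[j] is too small to appear later in a
		default:
			result = append(result, a[i])
			i++
			j++
		}
	}
	return result
}

func main() {
	fmt.Println(intersectSorted([]int{1, 3, 5, 8}, []int{2, 3, 8, 9})) // [3 8]
}
```

Each step advances at least one pointer, so the walk takes at most len(a)+len(b) comparisons and needs no auxiliary set.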

func LowercaseFilter ¶

func LowercaseFilter(tokens []string) []string

LowercaseFilter returns a slice of tokens normalized to lower case.

func StemmerFilter ¶

func StemmerFilter(tokens []string) []string

StemmerFilter returns a slice of stemmed tokens.

func StopwordFilter ¶

func StopwordFilter(tokens []string) []string

StopwordFilter returns a slice of tokens with stop words removed.

func Tokenize ¶

func Tokenize(text string) []string

Tokenize returns a slice of tokens for the given text.

Types ¶

type Document ¶

type Document struct {
	Title string
	URL   string
	Text  string
	ID    int
}

Document represents a text file.

type Index ¶

type Index map[string][]int

Index is an inverted index. It maps tokens to document IDs.

func (Index) Add ¶

func (index Index) Add(docs []Document)

Add adds documents to the index.
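
To illustrate how an inverted index of this shape can be filled, here is a minimal sketch. The types document and invertedIndex and the method add are hypothetical stand-ins, and tokenization is reduced to lower-casing and splitting on whitespace rather than the package's Analyze.

```go
package main

import (
	"fmt"
	"strings"
)

// document is a cut-down stand-in for the package's Document type.
type document struct {
	ID   int
	Text string
}

// invertedIndex maps a token to the IDs of the documents containing it.
type invertedIndex map[string][]int

// add records, for every token of every document, the ID of that document.
// An ID is skipped when it is already the last entry for the token, so a
// word repeated within one document does not produce duplicate IDs.
func (idx invertedIndex) add(docs []document) {
	for _, doc := range docs {
		for _, token := range strings.Fields(strings.ToLower(doc.Text)) {
			ids := idx[token]
			if len(ids) > 0 && ids[len(ids)-1] == doc.ID {
				continue
			}
			idx[token] = append(ids, doc.ID)
		}
	}
}

func main() {
	idx := make(invertedIndex)
	idx.add([]document{
		{ID: 0, Text: "the union and the states"},
		{ID: 1, Text: "powers of the union"},
	})
	fmt.Println(idx["union"], idx["powers"], idx["the"]) // [0 1] [1] [0 1]
}
```

A value receiver is enough here because a Go map value refers to shared underlying storage, so writes made inside the method are visible to the caller. If documents are added in ascending ID order, each ID list stays sorted and duplicate-free, which is the precondition Intersection states.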

func (Index) Search ¶

func (index Index) Search(text string) []int

Search queries the index for the given text.
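
One common way to answer a multi-word query against an inverted index is to look up the ID list of each token and intersect the lists, so that only documents containing every token remain. The sketch below assumes those AND semantics, which this page does not confirm; the names invertedIndex, search, and intersect are hypothetical, and query analysis is reduced to lower-casing and whitespace splitting.

```go
package main

import (
	"fmt"
	"strings"
)

// invertedIndex maps a token to a sorted, duplicate-free list of document IDs.
type invertedIndex map[string][]int

// intersect returns the IDs present in both sorted lists.
func intersect(a, b []int) []int {
	var result []int
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			result = append(result, a[i])
			i++
			j++
		}
	}
	return result
}

// search returns the IDs of documents that contain every token of the query.
func (idx invertedIndex) search(query string) []int {
	var result []int
	for i, token := range strings.Fields(strings.ToLower(query)) {
		ids, found := idx[token]
		if !found {
			return nil // an unknown token means no document can match all terms
		}
		if i == 0 {
			result = ids
		} else {
			result = intersect(result, ids)
		}
	}
	return result
}

func main() {
	idx := invertedIndex{"union": {0, 1}, "powers": {1}, "states": {0}}
	fmt.Println(idx.search("Union powers")) // [1]
	fmt.Println(idx.search("union senate")) // []
}
```

For a single-token query this sketch returns the index's own slice, so callers should treat the result as read-only.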

type Result ¶

type Result struct {
	Title string
	URL   string
	ID    int
}

Result is a search result item.

type Store ¶

type Store struct {
	Index   Index
	Results []Result
}

Store contains results and their index.
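
Since Search returns bare document IDs while Result carries the title and URL, a Store holds what is needed to turn one into the other. The following sketch shows that lookup step under the assumption that IDs are matched against Result.ID; the method resolve, the lower-case type names, and the second sample entry are hypothetical.

```go
package main

import "fmt"

// result and store mirror the shape of the package's Result and Store types.
type result struct {
	Title string
	URL   string
	ID    int
}

type store struct {
	Index   map[string][]int
	Results []result
}

// resolve returns the results whose ID appears in ids, in the order of ids.
// IDs without a matching result are skipped.
func (s store) resolve(ids []int) []result {
	byID := make(map[int]result, len(s.Results))
	for _, r := range s.Results {
		byID[r.ID] = r
	}
	out := make([]result, 0, len(ids))
	for _, id := range ids {
		if r, found := byID[id]; found {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	s := store{
		Index: map[string][]int{"union": {0, 1}, "powers": {1}},
		Results: []result{
			{Title: "Introduction", URL: "essays/essay01.txt", ID: 0},
			{Title: "Second essay", URL: "essays/essay02.txt", ID: 1}, // made-up entry
		},
	}
	for _, r := range s.resolve(s.Index["powers"]) {
		fmt.Println(r.Title, r.URL) // Second essay essays/essay02.txt
	}
}
```

Keeping only titles and URLs next to the index, rather than the full document text, is consistent with the small index size the README reports.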

Source Files ¶

search.go

Directories ¶

Path	Synopsis
browser
build