Documentation ¶
Overview ¶
A simple cross-platform, full-text search engine, backed by sqlite. Intended for use on small- to medium-sized websites.
See README.md for usage.
Index ¶
- Constants
- Variables
- func HTMLExtractDescription(html string) string
- func HTMLExtractTitle(html string) string
- func HTMLStripTags(s string) (output string)
- func IndexizeWord(w string) string
- func Wordize(t string) []string
- type IndexDoc
- type Indexer
- type SearchResultItem
- type SearchResultItems
- type SearchResults
- type Searcher
- type StopWordChecker
- type WordCleaner
- type WordSplitter
Constants ¶
const HEADER_SIZE = 4096
Size of the header block to prepend. It is 4k (4096 bytes) so that disk reads stay aligned.
Variables ¶
var EnglishStopWordChecker = func(s string) bool { return STOPWORDS_EN[s] }
var STOPWORDS_EN = map[string]bool{}/* 173 elements not displayed */
English stop words
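The checker above is a plain map lookup, so a word missing from the map is reported as "not a stop word". A minimal sketch of that pattern follows. The names stopWords, isStopWord and filterStopWords are hypothetical, and the four-word set is a stand-in for the real 173-entry STOPWORDS_EN.

```go
package main

import (
	"fmt"
	"strings"
)

// stopWords is a tiny illustrative set, NOT the package's STOPWORDS_EN.
var stopWords = map[string]bool{"a": true, "and": true, "of": true, "the": true}

// isStopWord has the same shape as EnglishStopWordChecker: a missing key
// yields the zero value false, so unknown words are never stop words.
var isStopWord = func(s string) bool { return stopWords[s] }

// filterStopWords (hypothetical helper) drops every word for which check
// returns true and keeps the rest in their original order.
func filterStopWords(words []string, check func(string) bool) []string {
	kept := make([]string, 0, len(words))
	for _, w := range words {
		if !check(w) {
			kept = append(kept, w)
		}
	}
	return kept
}

func main() {
	words := strings.Fields("the history of the gopher")
	fmt.Println(filterStopWords(words, isStopWord)) // [history gopher]
}
```

Because the checker is an ordinary func(string) bool, a checker for another language can presumably be built the same way around a different map.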
Functions ¶
func HTMLExtractDescription ¶
Helper to extract an HTML description from the meta[name=description] tag
func HTMLExtractTitle ¶
Helper to extract an HTML title from the title tag
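To illustrate the kind of extraction this helper performs, here is a minimal regexp-based sketch. It is not the package's implementation, which is not shown on this page. The names extractTitle and titleRe are hypothetical, and a regular expression is only a rough approximation of real HTML parsing.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// titleRe matches the first <title> element, case-insensitively (i) and
// across newlines (s), capturing its contents non-greedily.
var titleRe = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)

// extractTitle is a hypothetical stand-in with the same signature as
// HTMLExtractTitle. It returns "" when no title element is found.
func extractTitle(html string) string {
	m := titleRe.FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(m[1])
}

func main() {
	page := "<html><head><TITLE> Hello, search </TITLE></head><body></body></html>"
	fmt.Println(extractTitle(page)) // Hello, search
}
```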
func HTMLStripTags ¶
This function is copied from https://github.com/kennygrant/sanitize/blob/master/sanitize.go (license: https://github.com/kennygrant/sanitize/blob/master/License-BSD.txt).
Strips HTML tags, replaces common entities, and escapes <>&;'" in the result. Note that the returned text may contain entities, because it is escaped by HTMLEscapeString and most entities are not translated.
Types ¶
type IndexDoc ¶
type IndexDoc struct {
    Id         []byte // the id, this is usually the path to the document
    IndexValue []byte // index this data
    StoreValue []byte // store this data
}
Contents of a single document to be indexed
type Indexer ¶
type Indexer struct {
    WordSplit     WordSplitter
    WordClean     WordCleaner
    StopWordCheck StopWordChecker
    // contains filtered or unexported fields
}
Produces a set of cdb files from a series of AddDoc() calls
func NewIndexer ¶
Creates a new indexer, using the given temp dir while building the index.
func (*Indexer) AddDoc ¶
Add a document to the index - writes to temporary files and stores some data in memory while building the index.
func (*Indexer) DumpStatus ¶
Dump some human readable status information
type SearchResultItem ¶
type SearchResultItem struct {
    Id         []byte // id of this item (document)
    StoreValue []byte // the stored value of this document
    Score      int64  // the total score
}
A single item in a search result
type SearchResultItems ¶
type SearchResultItems []SearchResultItem
SearchResultItems implements sort.Interface, so a result list can be passed to sort.Sort.
func (SearchResultItems) Len ¶
func (s SearchResultItems) Len() int
func (SearchResultItems) Less ¶
func (s SearchResultItems) Less(i, j int) bool
func (SearchResultItems) Swap ¶
func (s SearchResultItems) Swap(i, j int)
type SearchResults ¶
type SearchResults struct {
Items SearchResultItems
}
The outcome of a search, holding the matched items.
type Searcher ¶
type Searcher struct {
// contains filtered or unexported fields
}
The entry point for searching an index. Not thread-safe, but its overhead is low, so giving each thread its own Searcher should be workable.
func NewSearcher ¶
Makes a new searcher using the file at the specified path.
TODO: Make a variation that accepts a ReaderAt.
func (*Searcher) SimpleSearch ¶
func (s *Searcher) SimpleSearch(search string, maxn int) (SearchResults, error)
Perform a search
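A calling pattern for SimpleSearch might look like the sketch below. To stay self-contained it declares local mirrors of SearchResultItem and SearchResults, and it uses a hypothetical in-memory fakeSearcher in place of a real *Searcher, because the NewSearcher signature is not shown on this page. Treating maxn as the maximum number of items returned is an assumption, as is the substring-count scoring.

```go
package main

import (
	"fmt"
	"strings"
)

// Local mirrors of the documented types, so the sketch compiles on its own.
type SearchResultItem struct {
	Id         []byte
	StoreValue []byte
	Score      int64
}

type SearchResults struct{ Items []SearchResultItem }

// simpleSearcher captures only the documented SimpleSearch signature.
type simpleSearcher interface {
	SimpleSearch(search string, maxn int) (SearchResults, error)
}

// fakeSearcher is a hypothetical in-memory stand-in for *Searcher.
// It scores a document by how often the query occurs in it.
type fakeSearcher struct{ docs map[string]string }

func (f fakeSearcher) SimpleSearch(search string, maxn int) (SearchResults, error) {
	var res SearchResults
	for id, body := range f.docs {
		n := strings.Count(body, search)
		if n > 0 && len(res.Items) < maxn { // maxn as a result cap is an assumption
			res.Items = append(res.Items, SearchResultItem{
				Id: []byte(id), StoreValue: []byte(body), Score: int64(n),
			})
		}
	}
	return res, nil
}

// printResults shows the calling pattern: check the error, then walk Items.
func printResults(s simpleSearcher, query string) error {
	res, err := s.SimpleSearch(query, 10)
	if err != nil {
		return err
	}
	for _, it := range res.Items {
		fmt.Printf("%s score=%d\n", it.Id, it.Score)
	}
	return nil
}

func main() {
	s := fakeSearcher{docs: map[string]string{"/a": "go go gopher", "/b": "rust"}}
	if err := printResults(s, "go"); err != nil {
		fmt.Println("search failed:", err)
	}
}
```

Since a Searcher is documented as not thread-safe, a concurrent program would presumably create one Searcher per thread rather than share a single value.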