riot

package module
Version: v0.0.0-...-32d62a6
Published: May 14, 2023 | License: Apache-2.0 | Imports: 24 | Imported by: 0

README

Riot is an open-source, distributed, simple and efficient full-text search engine written in Go.

Features

Requirements

Go version >= 1.8

Dependencies

Riot uses Go modules or dep to manage dependencies.

Installation/Update

go get -u github.com/xiechuxi/riot

Build-tools

go get -u github.com/go-ego/re 
re riot

To create a new riot application

$ re riot my-riotapp
re run

To run the application we just created, you can navigate to the application folder and execute:

$ cd my-riotapp && re run

Usage:

Look at an example
package main

import (
	"log"

	"github.com/xiechuxi/riot"
	"github.com/xiechuxi/riot/types"
)

var (
	// searcher is safe for concurrent use by multiple goroutines
	searcher = riot.Engine{}
)

func main() {
	// Init
	searcher.Init(types.EngineOpts{
		// Using:             4,
		NotUseGse: true,
	})
	defer searcher.Close()

	text := "Google Is Experimenting With Virtual Reality Advertising"
	text1 := `Google accidentally pushed Bluetooth update for Home
	speaker early`
	text2 := `Google is testing another Search results layout with 
	rounded cards, new colors, and the 4 mysterious colored dots again`
	
	// Add the document to the index, docId starts at 1
	searcher.Index("1", types.DocData{Content: text})
	searcher.Index("2", types.DocData{Content: text1}, false)
	searcher.IndexDoc("3", types.DocData{Content: text2}, true)

	// Wait for the index to refresh
	searcher.Flush()
	// searcher.FlushIndex()

	// The search output format is defined by the types.SearchResp structure
	log.Print(searcher.Search(types.SearchReq{Text: "google testing"}))
}

It is very simple!

Use default engine:
package main

import (
	"log"

	"github.com/xiechuxi/riot"
	"github.com/xiechuxi/riot/types"
)

var (
	searcher = riot.New("zh")
)

func main() {
	data := types.DocData{Content: `I wonder how, I wonder why
		, I wonder where they are`}
	data1 := types.DocData{Content: "所以, 你好, 再见"}
	data2 := types.DocData{Content: "没有理由"}

	searcher.Index("1", data)
	searcher.Index("2", data1)
	searcher.Index("3", data2)
	searcher.Flush()

	req := types.SearchReq{Text: "你好"}
	search := searcher.Search(req)
	log.Println("search...", search)
}
Look at more Examples
Look at Store example
Look at Logic search example
Look at Pinyin search example
Look at different dict and language search example
Look at benchmark example
Riot search engine templates, client and dictionaries

Authors

License

Riot is primarily distributed under the terms of the Apache License (Version 2.0), and is based on wukong.

Documentation

Overview

Package riot implements the riot full-text search engine.

Index

Constants

const (
	// Version is the riot version
	Version string = "v0.10.0.425, Danube River!"

	// NumNanosecondsInAMillisecond is the number of nanoseconds in one millisecond
	NumNanosecondsInAMillisecond = 1000000
	// StoreFilePrefix is the persistent store file prefix
	StoreFilePrefix = "riot"

	// DefaultPath is the default database path
	DefaultPath = "./riot-index"
)
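
A constant like NumNanosecondsInAMillisecond is typically used to turn nanosecond timings into milliseconds. The following is a minimal sketch of that conversion; the helper elapsedMillis and the lower-case copy of the constant are hypothetical and are not part of riot.

```go
package main

import (
	"fmt"
	"time"
)

// numNanosecondsInAMillisecond copies the documented constant value.
const numNanosecondsInAMillisecond = 1000000

// elapsedMillis is a hypothetical helper: it converts a duration to
// whole milliseconds using integer division (fractions are dropped).
func elapsedMillis(d time.Duration) int64 {
	return d.Nanoseconds() / numNanosecondsInAMillisecond
}

func main() {
	fmt.Println(elapsedMillis(2500 * time.Microsecond)) // prints 2
}
```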

Variables

var (

	// InitMemUsed is the memory used at initialization
	InitMemUsed uint64
	// InitDiskUsed is the disk space used at initialization
	InitDiskUsed uint64
)
var (
	Req1 = types.SearchReq{Text: reqText}
)
var (
	TestIndexOpts = rankEngineOpts(rankOptsMax10)
)

Functions

func AddDocs

func AddDocs(engine *Engine)

func AddDocsWithLabels

func AddDocsWithLabels(engine *Engine)

func CPUInfo

func CPUInfo(args ...int) (string, error)

CPUInfo returns the CPU information.

func CPUPercent

func CPUPercent() ([]float64, error)

CPUPercent returns the CPU usage as a percentage.

func DiskFree

func DiskFree() (uint64, error)

DiskFree returns the amount of free disk in bytes.

func DiskPercent

func DiskPercent() (string, error)

DiskPercent returns the disk usage as a percentage.

func DiskTotal

func DiskTotal() (uint64, error)

DiskTotal returns the amount of total disk in bytes.

func DiskUsed

func DiskUsed() (uint64, error)

DiskUsed returns the amount of used disk space in bytes.

func GetVersion

func GetVersion() string

GetVersion returns the riot version.

func KernelVer

func KernelVer() (string, error)

KernelVer returns the kernel version as a string.

func MemFree

func MemFree() (uint64, error)

MemFree returns the amount of free memory in bytes.

func MemPercent

func MemPercent() (string, error)

MemPercent returns the memory usage as a percentage.

func MemTotal

func MemTotal() (uint64, error)

MemTotal returns the amount of total memory in bytes.

func MemUsed

func MemUsed() (uint64, error)

MemUsed returns the amount of used memory in bytes.

func OrderlessOpts

func OrderlessOpts(idOnly bool) types.EngineOpts

func Platform

func Platform() (string, error)

Platform returns the platform name and OS Version.

func PlatformInfo

func PlatformInfo() (platform, family, osVersion string, err error)

PlatformInfo fetches system platform information.

func ToGB

func ToGB(data uint64) uint64

ToGB converts bytes to gigabytes.

func ToKB

func ToKB(data uint64) uint64

ToKB converts bytes to kilobytes.

func ToMB

func ToMB(data uint64) uint64

ToMB converts bytes to megabytes.
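
The three conversion helpers take and return uint64, so the result is a whole number of units. The sketch below shows one way such helpers could work, assuming binary units (a divisor of 1024) and truncating division; the page does not state which divisor riot uses, and toKB, toMB and toGB are hypothetical local functions.

```go
package main

import "fmt"

// Hypothetical stand-ins for ToKB, ToMB and ToGB.
// Assumption: 1 KB = 1024 bytes; the remainder is discarded.
func toKB(b uint64) uint64 { return b / 1024 }
func toMB(b uint64) uint64 { return b / (1024 * 1024) }
func toGB(b uint64) uint64 { return b / (1024 * 1024 * 1024) }

func main() {
	const total uint64 = 8 * 1024 * 1024 * 1024 // 8 GiB expressed in bytes
	fmt.Println(toKB(total), toMB(total), toGB(total)) // prints 8388608 8192 8
}
```

With the real package, the same pattern would apply to the value returned by MemTotal or DiskTotal after checking the returned error.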

func Try

func Try(fun func(), handler func(interface{}))

Try calls fun and, if fun panics, passes the recovered value to handler.
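
The signature of Try suggests the usual defer/recover idiom: run a function and hand any panic value to a callback. The following is a minimal sketch of that idiom with the same parameter shapes; try is a local, hypothetical function and may differ from riot's actual implementation.

```go
package main

import "fmt"

// try runs fun. If fun panics, the recovered value is passed to handler
// and the panic does not propagate to the caller.
func try(fun func(), handler func(interface{})) {
	defer func() {
		if err := recover(); err != nil {
			handler(err)
		}
	}()
	fun()
}

func main() {
	try(func() {
		panic("index out of sync")
	}, func(err interface{}) {
		fmt.Println("recovered:", err) // prints "recovered: index out of sync"
	})
}
```

When fun returns normally, handler is never called.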

func Uptime

func Uptime() (uptime uint64, err error)

Uptime returns the system uptime in seconds.

Types

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

Engine is the riot search engine.

func New

func New(conf ...interface{}) *Engine

New creates a new engine with the given mode.

func NewEngine

func NewEngine(conf ...interface{}) *Engine

NewEngine creates a new engine.

func (*Engine) CheckMem

func (engine *Engine) CheckMem()

CheckMem checks memory usage and uses the store when usage exceeds 99.99%.

func (*Engine) Close

func (engine *Engine) Close()

Close closes the engine.

func (*Engine) Flush

func (engine *Engine) Flush()

Flush blocks until all indexes have been added.

func (*Engine) FlushIndex

func (engine *Engine) FlushIndex()

FlushIndex blocks until all indexes have been added.

func (*Engine) ForSplitData

func (engine *Engine) ForSplitData(strData []string, num int) (TMap, int)

ForSplitData splits the segment's data (segspl).

func (*Engine) GetAllDocIds

func (engine *Engine) GetAllDocIds() []string

GetAllDocIds iterates over the storage database and returns all DocIds.

func (*Engine) GetDBAllDocs

func (engine *Engine) GetDBAllDocs() (docsId []string, docsData []types.DocData)

GetDBAllDocs returns all documents in the database.

func (*Engine) GetDBAllIds

func (engine *Engine) GetDBAllIds() []string

GetDBAllIds iterates over the storage database and returns all DocIds.

func (*Engine) HasDoc

func (engine *Engine) HasDoc(docId string) bool

HasDoc returns true if the document exists.

func (*Engine) HasDocDB

func (engine *Engine) HasDocDB(docId string) bool

HasDocDB returns true if the document exists in the database.

func (*Engine) Index

func (engine *Engine) Index(docId string, data types.DocData,
	forceUpdate ...bool)

Index adds the document to the index.
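
Index and IndexDoc take forceUpdate as a variadic bool, which is a common Go way to make an argument optional. The sketch below shows how such a parameter can be read; shouldForce is a hypothetical helper, and treating an omitted argument as false is an assumption rather than documented riot behavior.

```go
package main

import "fmt"

// shouldForce reports whether the caller asked for a forced update.
// Assumption: omitting the argument means false, and only the first
// value is considered if several are passed.
func shouldForce(forceUpdate ...bool) bool {
	return len(forceUpdate) > 0 && forceUpdate[0]
}

func main() {
	fmt.Println(shouldForce())      // prints false
	fmt.Println(shouldForce(false)) // prints false
	fmt.Println(shouldForce(true))  // prints true
}
```

This mirrors the three call shapes used in the README example: no flag, an explicit false, and an explicit true.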

func (*Engine) IndexDoc

func (engine *Engine) IndexDoc(docId string, data types.DocData,
	forceUpdate ...bool)

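IndexDoc adds the document to the index.

Parameters:

docId	      the document identifier; must be unique. docId == 0 denotes an invalid document (used to force an index refresh); [1, +oo) denotes valid documents
data	      see the DocIndexData comment
forceUpdate whether to force a cache refresh; if true, the document is added to the index as soon as possible, otherwise it is added in bulk once the cache is full

Notes:

  1. This function is thread safe; call it concurrently where possible to speed up indexing.
  2. This function is asynchronous: the document may not yet be in the index when the function returns, so an immediate call to Search may not find it. To force an index refresh, call FlushIndex.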
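
The asynchronous behavior described in the notes above can be modeled with a queue and a blocking flush. The following is a toy sketch only: toyIndex and its methods are hypothetical, use just the standard library, and do not reflect how riot implements indexing internally.

```go
package main

import (
	"fmt"
	"sync"
)

// toyIndex is a hypothetical stand-in for an asynchronous index.
type toyIndex struct {
	mu      sync.Mutex
	docs    map[string]string
	pending sync.WaitGroup
	queue   chan [2]string
}

// newToyIndex starts one background worker that applies queued documents.
func newToyIndex() *toyIndex {
	ix := &toyIndex{docs: map[string]string{}, queue: make(chan [2]string, 16)}
	go func() {
		for item := range ix.queue {
			ix.mu.Lock()
			ix.docs[item[0]] = item[1]
			ix.mu.Unlock()
			ix.pending.Done()
		}
	}()
	return ix
}

// add queues a document; it may return before the document is searchable.
func (ix *toyIndex) add(id, content string) {
	ix.pending.Add(1)
	ix.queue <- [2]string{id, content}
}

// flush blocks until every queued document has been applied.
func (ix *toyIndex) flush() { ix.pending.Wait() }

// has reports whether the document has been applied to the index.
func (ix *toyIndex) has(id string) bool {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	_, ok := ix.docs[id]
	return ok
}

func main() {
	ix := newToyIndex()
	ix.add("1", "Google Is Experimenting With Virtual Reality Advertising")
	ix.flush()               // without this, has("1") could still be false
	fmt.Println(ix.has("1")) // prints true
}
```

The point of the sketch is the ordering: a lookup is only guaranteed to see a document after the flush call returns, which is why the README example calls Flush before Search.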

func (*Engine) Indexer

func (engine *Engine) Indexer(options types.EngineOpts)

Indexer initializes the indexer channel.

func (*Engine) Init

func (engine *Engine) Init(options types.EngineOpts)

Init initializes the engine.

func (*Engine) InitStore

func (engine *Engine) InitStore()

InitStore initializes the persistent store channel.

func (*Engine) NotTimeOut

func (engine *Engine) NotTimeOut(request types.SearchReq,
	rankerReturnChan chan rankerReturnReq) (
	rankOutArr interface{}, numDocs int)

NotTimeOut handles a search request when no engine timeout is set.

func (*Engine) NumDocsIndexed

func (engine *Engine) NumDocsIndexed() uint64

NumDocsIndexed returns the number of indexed documents. Deprecated.

func (*Engine) NumDocsRemoved

func (engine *Engine) NumDocsRemoved() uint64

NumDocsRemoved returns the number of removed documents. Deprecated.

func (*Engine) NumIndexed

func (engine *Engine) NumIndexed() uint64

NumIndexed returns the number of indexed documents.

func (*Engine) NumRemoved

func (engine *Engine) NumRemoved() uint64

NumRemoved returns the number of removed documents.

func (*Engine) NumTokenAdded

func (engine *Engine) NumTokenAdded() uint64

NumTokenAdded returns the number of token indexes added.

func (*Engine) NumTokenIndexAdded

func (engine *Engine) NumTokenIndexAdded() uint64

NumTokenIndexAdded returns the number of token indexes added. Deprecated.

func (*Engine) PinYin

func (engine *Engine) PinYin(hans string) []string

PinYin returns the pinyin and its abbreviations for the given Chinese text.

func (*Engine) RankID

func (engine *Engine) RankID(request types.SearchReq, rankOpts types.RankOpts,
	tokens []string, rankerReturnChan chan rankerReturnReq) (output types.SearchResp)

RankID ranks documents by types.ScoredIDs.

func (*Engine) Ranker

func (engine *Engine) Ranker(options types.EngineOpts)

Ranker initializes the ranker channel.

func (*Engine) Ranks

func (engine *Engine) Ranks(request types.SearchReq, rankOpts types.RankOpts,
	tokens []string, rankerReturnChan chan rankerReturnReq) (output types.SearchResp)

Ranks ranks documents by types.ScoredDocs.

func (*Engine) RemoveDoc

func (engine *Engine) RemoveDoc(docId string, forceUpdate ...bool)

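RemoveDoc removes the document from the index.

Parameters:

docId	      the document identifier; must be unique. docId == 0 denotes an invalid document (used to force an index refresh); [1, +oo) denotes valid documents
forceUpdate whether to force a cache refresh; if true, the document is removed from the index as soon as possible, otherwise removals are applied in bulk once the cache is full

Notes:

  1. This function is thread safe; call it concurrently where possible to speed up indexing.
  2. This function is asynchronous: the removal may not yet have taken effect when the function returns, so an immediate call to Search may still return the document. To force an index refresh, call FlushIndex.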

func (*Engine) Search

func (engine *Engine) Search(request types.SearchReq) (output types.SearchResp)

Search finds the documents that satisfy the search criteria. This function is thread safe.

func (*Engine) SearchDoc

func (engine *Engine) SearchDoc(request types.SearchReq) (output types.SearchDoc)

SearchDoc finds the documents that satisfy the search criteria. This function is thread safe and returns full documents rather than IDs only.

func (*Engine) SearchID

func (engine *Engine) SearchID(request types.SearchReq) (output types.SearchID)

SearchID finds the documents that satisfy the search criteria. This function is thread safe and returns IDs only.

func (*Engine) Segment

func (engine *Engine) Segment(content string) (keywords []string)

Segment returns the word segmentation result of the text. It only segments the text and filters out stop words.

func (*Engine) Store

func (engine *Engine) Store()

Store starts the persistent store work connection.

func (*Engine) TimeOut

func (engine *Engine) TimeOut(request types.SearchReq,
	rankerReturnChan chan rankerReturnReq) (
	rankOutArr interface{}, numDocs int, isTimeout bool)

TimeOut handles a search request when an engine timeout is set.
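
TimeOut receives results over a channel and reports an isTimeout flag. A common Go pattern for that shape is a select loop with a deadline. The sketch below illustrates the pattern in isolation; collect is a hypothetical function over plain ints and is not riot's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// collect waits for `want` results on ch and gives up once timeout has
// elapsed. isTimeout is true when fewer than `want` results arrived in time.
func collect(ch <-chan int, want int, timeout time.Duration) (results []int, isTimeout bool) {
	deadline := time.After(timeout)
	for len(results) < want {
		select {
		case r := <-ch:
			results = append(results, r)
		case <-deadline:
			return results, true
		}
	}
	return results, false
}

func main() {
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2
	fmt.Println(collect(ch, 2, time.Second)) // prints [1 2] false

	// Nothing is left on the channel, so this call times out.
	fmt.Println(collect(ch, 1, 10*time.Millisecond)) // prints [] true
}
```

On timeout the caller still gets whatever partial results arrived, which matches the idea of returning both an output and an isTimeout flag.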

func (*Engine) Tokens

func (engine *Engine) Tokens(request types.SearchReq) (tokens []string)

Tokens returns the engine tokens for the request.

func (*Engine) UsedDisk

func (engine *Engine) UsedDisk() (uint64, error)

UsedDisk returns the amount of disk space used, in bytes, after the init() func.

func (*Engine) UsedMem

func (engine *Engine) UsedMem() (uint64, error)

UsedMem returns the amount of memory used by riot, in bytes, after the init() func.

func (*Engine) WithGse

func (engine *Engine) WithGse(segmenter gse.Segmenter) *Engine

WithGse sets a user-defined segmenter. If the segmenter is not nil and its dictionary is loaded, `opt.GseDict` is ignored.

type RankByTokenProximity

type RankByTokenProximity struct {
}

func (RankByTokenProximity) Score

func (rule RankByTokenProximity) Score(
	doc types.IndexedDoc, fields interface{}) []float32

type ScoringFields

type ScoringFields struct {
	A, B, C float32
}

type StopTokens

type StopTokens struct {
	// contains filtered or unexported fields
}

StopTokens is a map of stop tokens.

func (*StopTokens) Init

func (st *StopTokens) Init(stopTokenFile string)

Init reads stop tokens from stopTokenFile, one token per line. These stop tokens are skipped when the document index is built.

func (*StopTokens) IsStopToken

func (st *StopTokens) IsStopToken(token string) bool

IsStopToken reports whether token is a stop token.

type TMap

type TMap map[string][]int

TMap defines the tokens map type map[string][]int
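
A map[string][]int is a natural shape for mapping each token to a list of integers such as positions. The sketch below builds one from whitespace-separated text; reading the ints as token positions is an assumption made for illustration, since the page does not say what TMap's values hold, and tMap and positions are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// tMap has the same underlying type as the documented TMap.
type tMap map[string][]int

// positions maps each lower-cased, whitespace-separated token to the
// indexes at which it occurs. Punctuation is left attached to tokens.
func positions(text string) tMap {
	m := tMap{}
	for i, tok := range strings.Fields(strings.ToLower(text)) {
		m[tok] = append(m[tok], i)
	}
	return m
}

func main() {
	m := positions("I wonder how, I wonder why")
	fmt.Println(m["wonder"]) // prints [1 4]
	fmt.Println(m["i"])      // prints [0 3]
}
```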

Directories

Path                    Synopsis
core                    Package core is riot core
engine                  Package engine is riot engine
examples
examples/benchmark      riot performance test
examples/codelab        A Weibo search example.
examples/new
examples/pinyin_weibo   A Weibo pinyin search example.
net                     Package net is riot net
net/com
types                   Package types is riot types
