Documentation ¶
Index ¶
- Constants
- Variables
- func CheckText(content []byte) error
- func ReadMetadata(inf IndexFile) (*Repository, *IndexMetadata, error)
- func SortFilesByScore(ms []FileMatch)
- type Document
- type DocumentSection
- type FileMatch
- type IndexBuilder
- type IndexFile
- type IndexMetadata
- type LineFragmentMatch
- type LineMatch
- type RepoList
- type RepoListEntry
- type RepoStats
- type Repository
- type RepositoryBranch
- type SearchOptions
- type SearchResult
- type Searcher
- type Stats
Constants ¶
const FeatureVersion = 7
FeatureVersion is increased if a feature is added that requires reindexing data without changing the format version.
2: Rank field for shards.
3: Rank documents within shards.
4: Dedup file bugfix.
5: Remove max line size limit.
6: Include '#' into the LineFragment template.
7: Record skip reasons in the index.
const IndexFormatVersion = 15
IndexFormatVersion is a version number. It is increased every time the on-disk index format is changed.
5: subrepositories.
6: remove size prefix for posting varint list.
7: move subrepos into Repository struct.
8: move repoMetaData out of indexMetadata.
9: use bigendian uint64 for trigrams.
10: sections for rune offsets.
11: file ends in rune offsets.
12: 64-bit branchmasks.
13: content checksums.
14: languages.
15: rune based symbol sections.
Variables ¶
var DebugScore = false
DebugScore controls whether we collect data on how match scores are constructed. Intended for use in tests.
var Version string
Version is filled in by the linker (see ./shared/scripts/build-deploy.sh).
Functions ¶
func ReadMetadata ¶
func ReadMetadata(inf IndexFile) (*Repository, *IndexMetadata, error)
ReadMetadata returns the metadata of an index shard without reading the index data. The IndexFile is not closed.
Types ¶
type Document ¶
type Document struct {
    Name              string
    Content           []byte
    Branches          []string
    SubRepositoryPath string
    Language          string

    // If set, something is wrong with the file contents, and this
    // is the reason it wasn't indexed.
    SkipReason string

    // Document sections for symbols. Offsets should use bytes.
    Symbols []DocumentSection
}
Document holds a document (file) to index.
type DocumentSection ¶
type DocumentSection struct {
Start, End uint32
}
type FileMatch ¶
type FileMatch struct {
    // Ranking; the higher, the better.
    Score float64 // TODO - hide this field?

    // For debugging. Needs DebugScore set, but public so tests in
    // other packages can print some diagnostics.
    Debug string

    FileName string

    // Repository is the globally unique name of the repo of the
    // match
    Repository  string
    Branches    []string
    LineMatches []LineMatch

    // Only set if requested
    Content []byte

    // Checksum of the content.
    Checksum []byte

    // Detected language of the result.
    Language string

    // SubRepositoryName is the globally unique name of the repo,
    // if it came from a subrepository
    SubRepositoryName string

    // SubRepositoryPath holds the prefix where the subrepository
    // was mounted.
    SubRepositoryPath string

    // Commit SHA1 (hex) of the (sub)repo holding the file.
    Version string
}
FileMatch contains all the matches within a file.
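The index lists SortFilesByScore(ms []FileMatch), and the Score comment says that higher is better. The following is a minimal sketch of such an ordering, not the package's implementation. The fileMatch type and the sortByScore helper are hypothetical stand-ins, and using a stable sort for equal scores is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// fileMatch is a trimmed, hypothetical stand-in for FileMatch.
type fileMatch struct {
	FileName string
	Score    float64
}

// sortByScore orders matches from highest to lowest Score.
// The stable sort keeps the input order for equal scores (an assumption).
func sortByScore(ms []fileMatch) {
	sort.SliceStable(ms, func(i, j int) bool {
		return ms[i].Score > ms[j].Score
	})
}

func main() {
	ms := []fileMatch{
		{FileName: "a.go", Score: 1.5},
		{FileName: "b.go", Score: 9.0},
		{FileName: "c.go", Score: 4.2},
	}
	sortByScore(ms)
	for _, m := range ms {
		fmt.Println(m.FileName, m.Score)
	}
}
```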
type IndexBuilder ¶
type IndexBuilder struct {
// contains filtered or unexported fields
}
IndexBuilder builds a single index shard.
func NewIndexBuilder ¶
func NewIndexBuilder(r *Repository) (*IndexBuilder, error)
NewIndexBuilder creates a fresh IndexBuilder. The passed-in Repository contains repo metadata, and may be set to nil.
func (*IndexBuilder) Add ¶
func (b *IndexBuilder) Add(doc Document) error
Add a file which only occurs in certain branches.
func (*IndexBuilder) AddFile ¶
func (b *IndexBuilder) AddFile(name string, content []byte) error
AddFile is a convenience wrapper for Add.
func (*IndexBuilder) ContentSize ¶
func (b *IndexBuilder) ContentSize() uint32
ContentSize returns the number of content bytes so far ingested.
type IndexFile ¶
type IndexFile interface {
    Read(off uint32, sz uint32) ([]byte, error)
    Size() (uint32, error)
    Close()
    Name() string
}
IndexFile is a file suitable for concurrent read access. For performance reasons, it allows a mmap'd implementation.
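To illustrate the shape of the interface, here is a minimal in-memory sketch. The memIndexFile type is hypothetical and not part of the package; the interface is repeated locally only so that the example compiles on its own. Like an mmap'd file, Read returns a slice that aliases the backing data instead of copying it.

```go
package main

import "fmt"

// IndexFile repeats the documented interface so the sketch is self-contained.
type IndexFile interface {
	Read(off uint32, sz uint32) ([]byte, error)
	Size() (uint32, error)
	Close()
	Name() string
}

// memIndexFile is a hypothetical in-memory implementation.
// The data is never modified, so concurrent reads are safe.
type memIndexFile struct {
	name string
	data []byte
}

func (f *memIndexFile) Read(off, sz uint32) ([]byte, error) {
	// Use 64-bit arithmetic so off+sz cannot overflow uint32.
	end := uint64(off) + uint64(sz)
	if end > uint64(len(f.data)) {
		return nil, fmt.Errorf("read [%d,%d) out of range for %d bytes", off, end, len(f.data))
	}
	return f.data[off:end], nil
}

func (f *memIndexFile) Size() (uint32, error) { return uint32(len(f.data)), nil }

// Close is a no-op because nothing is held open.
func (f *memIndexFile) Close() {}

func (f *memIndexFile) Name() string { return f.name }

func main() {
	var f IndexFile = &memIndexFile{name: "demo.shard", data: []byte("hello zoekt")}
	defer f.Close()

	b, err := f.Read(6, 5)
	fmt.Println(string(b), err)
}
```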
type IndexMetadata ¶
type IndexMetadata struct {
    IndexFormatVersion  int
    IndexFeatureVersion int
    IndexTime           time.Time
    PlainASCII          bool
    LanguageMap         map[string]byte
    ZoektVersion        string
}
IndexMetadata holds metadata stored in the index file.
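The version fields in IndexMetadata can be compared against the FeatureVersion and IndexFormatVersion constants to decide whether a shard is stale. The sketch below assumes that any mismatch means the shard should be rebuilt; that policy and the needsReindex helper are hypothetical, and indexMetadata is a trimmed stand-in.

```go
package main

import "fmt"

// Values copied from the Constants section.
const (
	FeatureVersion     = 7
	IndexFormatVersion = 15
)

// indexMetadata is a trimmed stand-in for IndexMetadata.
type indexMetadata struct {
	IndexFormatVersion  int
	IndexFeatureVersion int
}

// needsReindex reports whether a shard was written with a different
// format or feature version than the current constants.
// Treating any mismatch as stale is an assumption of this sketch.
func needsReindex(md indexMetadata) bool {
	return md.IndexFormatVersion != IndexFormatVersion ||
		md.IndexFeatureVersion != FeatureVersion
}

func main() {
	current := indexMetadata{IndexFormatVersion: 15, IndexFeatureVersion: 7}
	older := indexMetadata{IndexFormatVersion: 15, IndexFeatureVersion: 6}
	fmt.Println(needsReindex(current), needsReindex(older))
}
```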
type LineFragmentMatch ¶
type LineFragmentMatch struct {
    // Offset within the line, in bytes.
    LineOffset int

    // Offset from file start, in bytes.
    Offset uint32

    // Number of bytes that match.
    MatchLength int
}
LineFragmentMatch is a segment of matching text within a line.
type LineMatch ¶
type LineMatch struct {
    // The line in which a match was found.
    Line       []byte
    LineStart  int
    LineEnd    int
    LineNumber int

    // If set, this was a match on the filename.
    FileName bool

    // The higher the better. Only ranks the quality of the match
    // within the file, does not take rank of file into account
    Score         float64
    LineFragments []LineFragmentMatch
}
LineMatch holds the matches within a single line in a file.
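Because LineOffset and MatchLength are both byte counts relative to the line, the matched text can be sliced directly out of Line. A minimal sketch follows, using trimmed stand-in types; the matchedText helper is hypothetical, and skipping out-of-range fragments is a defensive choice of this example.

```go
package main

import "fmt"

// Trimmed stand-ins for LineFragmentMatch and LineMatch.
type lineFragmentMatch struct {
	LineOffset  int // offset within the line, in bytes
	MatchLength int // number of bytes that match
}

type lineMatch struct {
	Line          []byte
	LineFragments []lineFragmentMatch
}

// matchedText returns the text of every fragment in the line.
// Fragments that fall outside the line are skipped.
func matchedText(m lineMatch) []string {
	var out []string
	for _, f := range m.LineFragments {
		end := f.LineOffset + f.MatchLength
		if f.LineOffset < 0 || f.MatchLength < 0 || end > len(m.Line) {
			continue
		}
		out = append(out, string(m.Line[f.LineOffset:end]))
	}
	return out
}

func main() {
	m := lineMatch{
		Line: []byte("foo bar foo"),
		LineFragments: []lineFragmentMatch{
			{LineOffset: 0, MatchLength: 3},
			{LineOffset: 8, MatchLength: 3},
		},
	}
	fmt.Println(matchedText(m))
}
```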
type RepoList ¶
type RepoList struct {
    Repos   []*RepoListEntry
    Crashes int
}
RepoList holds a set of Repository metadata.
type RepoListEntry ¶
type RepoListEntry struct {
    Repository    Repository
    IndexMetadata IndexMetadata
    Stats         RepoStats
}
type RepoStats ¶
type RepoStats struct {
    // Repos is used for aggregating the number of repositories.
    Repos int

    // Shards is the total number of search shards.
    Shards int

    // Documents holds the number of documents or files.
    Documents int

    // IndexBytes is the amount of RAM used for index overhead.
    IndexBytes int64

    // ContentBytes is the amount of RAM used for raw content.
    ContentBytes int64
}
Statistics of a (collection of) repositories.
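Since RepoStats can describe a collection of repositories, per-repository values can be summed field by field. The add method below is a hypothetical helper on a stand-in type, shown only to illustrate that kind of aggregation.

```go
package main

import "fmt"

// repoStats is a stand-in with the same fields as RepoStats.
type repoStats struct {
	Repos        int
	Shards       int
	Documents    int
	IndexBytes   int64
	ContentBytes int64
}

// add accumulates o into s field by field.
func (s *repoStats) add(o repoStats) {
	s.Repos += o.Repos
	s.Shards += o.Shards
	s.Documents += o.Documents
	s.IndexBytes += o.IndexBytes
	s.ContentBytes += o.ContentBytes
}

func main() {
	perRepo := []repoStats{
		{Repos: 1, Shards: 2, Documents: 100, IndexBytes: 4096, ContentBytes: 8192},
		{Repos: 1, Shards: 1, Documents: 40, IndexBytes: 1024, ContentBytes: 2048},
	}
	var total repoStats
	for _, st := range perRepo {
		total.add(st)
	}
	fmt.Printf("%+v\n", total)
}
```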
type Repository ¶
type Repository struct {
    // The repository name
    Name string

    // The repository URL.
    URL string

    // The branches indexed in this repo.
    Branches []RepositoryBranch

    // Nil if this is not the super project.
    SubRepoMap map[string]*Repository

    // URL template to link to the commit of a branch
    CommitURLTemplate string

    // The repository URL for getting to a file. Has access to
    // {{Branch}}, {{Path}}
    FileURLTemplate string

    // The URL fragment to add to a file URL for line numbers. Has
    // access to {{LineNumber}}. The fragment should include the
    // separator, generally '#' or ';'.
    LineFragmentTemplate string

    // All zoekt.* configuration settings.
    RawConfig map[string]string

    // Importance of the repository, bigger is more important
    Rank uint16
}
Repository holds repository metadata.
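The comments on FileURLTemplate and LineFragmentTemplate name the placeholders {{Branch}}, {{Path}} and {{LineNumber}}, but this page does not say how the package expands them. The sketch below simply substitutes those placeholders as literal strings; that mechanism, the fileURL helper, and the example.com URL are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fileURL builds a link to a line in a file by literal placeholder
// substitution. This is an assumed mechanism, not the package's own.
func fileURL(fileTmpl, lineTmpl, branch, path string, line int) string {
	r := strings.NewReplacer(
		"{{Branch}}", branch,
		"{{Path}}", path,
		"{{LineNumber}}", strconv.Itoa(line),
	)
	// The line fragment template already carries its separator ('#' or ';').
	return r.Replace(fileTmpl) + r.Replace(lineTmpl)
}

func main() {
	url := fileURL(
		"https://example.com/repo/+/{{Branch}}/{{Path}}", // hypothetical template
		"#{{LineNumber}}",
		"main", "cmd/main.go", 42,
	)
	fmt.Println(url)
}
```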
type RepositoryBranch ¶
RepositoryBranch describes an indexed branch, which is a name combined with a version.
type SearchOptions ¶
type SearchOptions struct {
    // Return an upper-bound estimate of eligible documents in
    // stats.ShardFilesConsidered.
    EstimateDocCount bool

    // Return the whole file.
    Whole bool

    // Maximum number of matches: skip all processing of an index
    // shard after we have found this many non-overlapping matches.
    ShardMaxMatchCount int

    // Maximum number of matches: stop looking for more matches
    // once we have this many matches across shards.
    TotalMaxMatchCount int

    // Maximum number of important matches: skip processing a
    // shard after we have found this many important matches.
    ShardMaxImportantMatch int

    // Maximum number of important matches across shards.
    TotalMaxImportantMatch int

    // Abort the search after this much time has passed.
    MaxWallTime time.Duration

    // Trim the number of results after collating and sorting the
    // results
    MaxDocDisplayCount int
}
func (*SearchOptions) SetDefaults ¶
func (o *SearchOptions) SetDefaults()
func (*SearchOptions) String ¶
func (s *SearchOptions) String() string
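MaxDocDisplayCount is described as trimming the result list after collating and sorting. A minimal sketch of that last step is shown here with file names standing in for FileMatch values. The trimFiles helper is hypothetical, and treating a non-positive limit as "no limit" is an assumption.

```go
package main

import "fmt"

// trimFiles keeps at most maxDocDisplayCount entries of an already
// sorted result list. A limit of zero or less is treated as unlimited,
// which is an assumption of this sketch.
func trimFiles(files []string, maxDocDisplayCount int) []string {
	if maxDocDisplayCount > 0 && len(files) > maxDocDisplayCount {
		return files[:maxDocDisplayCount]
	}
	return files
}

func main() {
	sorted := []string{"b.go", "c.go", "a.go"}
	fmt.Println(trimFiles(sorted, 2))
	fmt.Println(trimFiles(sorted, 0))
}
```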
type SearchResult ¶
type SearchResult struct {
    Stats
    Files []FileMatch

    // RepoURLs holds a repo => template string map.
    RepoURLs map[string]string

    // LineFragments holds a repo => template string map, for
    // the line number fragment.
    LineFragments map[string]string
}
SearchResult contains search matches and extra data.
type Searcher ¶
type Searcher interface {
    Search(ctx context.Context, q query.Q, opts *SearchOptions) (*SearchResult, error)

    // List lists repositories. The query `q` can only contain
    // query.Repo atoms.
    List(ctx context.Context, q query.Q) (*RepoList, error)
    Close()

    // Describe the searcher for debug messages.
    String() string
}
func NewSearcher ¶
NewSearcher creates a Searcher for a single index file. Search results coming from this searcher are valid only for the lifetime of the Searcher itself, i.e., []byte members should be copied into fresh buffers if the result is to survive closing the shard.
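The lifetime rule above matters because a []byte in a result may alias the shard's memory. The sketch below simulates that with an ordinary byte slice: a copied buffer keeps its contents after the backing memory is invalidated, while the aliasing slice does not. The cloneBytes helper is hypothetical, and zeroing the buffer only stands in for closing the shard.

```go
package main

import "fmt"

// cloneBytes copies b into a fresh buffer that no longer shares
// memory with the original slice.
func cloneBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

func main() {
	shard := []byte("needle in haystack") // stands in for shard memory
	line := shard[0:6]                    // aliases the shard, like a result field
	kept := cloneBytes(line)              // safe to keep after Close

	// Simulate the shard memory going away.
	for i := range shard {
		shard[i] = 0
	}

	fmt.Printf("aliased=%q copied=%q\n", line, kept)
}
```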
type Stats ¶
type Stats struct {
    // Amount of I/O for reading contents.
    ContentBytesLoaded int64

    // Amount of I/O for reading from index.
    IndexBytesLoaded int64

    // Number of search shards that had a crash.
    Crashes int

    // Wall clock time for this search
    Duration time.Duration

    // Number of files containing a match.
    FileCount int

    // Number of files in shards that we considered.
    ShardFilesConsidered int

    // Files that we evaluated. Equivalent to files for which all
    // atom matches (including negations) evaluated to true.
    FilesConsidered int

    // Files for which we loaded file content to verify substring matches
    FilesLoaded int

    // Candidate files whose contents weren't examined because we
    // gathered enough matches.
    FilesSkipped int

    // Shards that we did not process because a query was canceled.
    ShardsSkipped int

    // Number of non-overlapping matches
    MatchCount int

    // Number of candidate matches as a result of searching ngrams.
    NgramMatches int

    // Wall clock time for queued search.
    Wait time.Duration
}
Stats contains interesting numbers on the search.
Source Files ¶
Directories ¶
Path | Synopsis |
---|---|
 | Package build implements a more convenient interface for building zoekt indices. |
 | Package gitindex provides functions for indexing Git repositories. |