index

package
v0.0.0-...-1b7ac2d
Published: Jan 21, 2014 License: Apache-2.0 Imports: 34 Imported by: 0

Documentation

Overview

Package index provides a generic indexing system on top of the abstract Storage interface.

The following keys & values are populated by receiving blobs and queried for search operations:

  • Recent Permanodes:
    "recpn|<pgp-keyid>|<reverse-modtime>|<claim-blobref>" == "<permanode-blobref>"
    where reverse-modtime flips each digit to '9'-<digit> and prepends "rt" (for reverse time), e.g.
    "2011-11-27T01:23:45Z" ==> "rt7988-88-72T98:76:54Z"

  • Signer blobref of ASCII public key -> GPG key id:
    "signerkeyid:sha1-ad87ca5c78bd0ce1195c46f7c98e6025abbaf007" = "2931A67C26F5ABDA"

  • PermanodeOfSignerAttrValue:
    "signerattrvalue|<keyid>|<URLEscape(attr)>|<URLEscape(value)>|<reverse-claimtime>|<claim-blobref>" == "<permanode>"
    e.g.
    "signerattrvalue|2931A67C26F5ABDA|camliRoot|rootval|"+
    "rt7988-88-71T98:67:60.999876543Z|sha1-bf115940641f1aae2e007edcf36b3b18c17256d9" => "sha1-7a14cce982aa73ab519e63050f82e2a2adfcf039"

  • Other:
    "meta:<blobref>" == "<size>|<mimetype>"
    "have:<blobref>" == "<size>" (used for enumeration, which doesn't need mime type)

  • For GetOwnerClaims(permanode, signer):
    "claim|<permanode-blobref>|<keyid>|<date>|<claim-blobref>" => "<URL:type>|<URL:attr>|<URL:value>"
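
The reverse-time encoding used in the "recpn" and "signerattrvalue" keys can be sketched as follows. This is a minimal illustration based only on the rule stated above (flip each digit to '9'-<digit>, prepend "rt"); the helper name reverseTimeString is an assumption for this example and is not necessarily what the package uses internally.

```go
package main

import "fmt"

// reverseTimeString is a hypothetical helper: it flips every ASCII digit d
// to 9-d, leaves all other bytes untouched, and prepends "rt".
func reverseTimeString(s string) string {
	out := make([]byte, 0, len(s)+2)
	out = append(out, "rt"...)
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c >= '0' && c <= '9' {
			c = '9' - (c - '0') // '0' -> '9', '1' -> '8', ..., '9' -> '0'
		}
		out = append(out, c)
	}
	return string(out)
}

func main() {
	// Reproduces the example from the overview.
	fmt.Println(reverseTimeString("2011-11-27T01:23:45Z"))
}
```

Because every digit is inverted, a later timestamp produces a lexicographically smaller string, so a forward scan over a sorted key/value store visits the most recent entries first.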

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsBlobReferenceAttribute

func IsBlobReferenceAttribute(attr string) bool

IsBlobReferenceAttribute returns whether attr is an attribute whose value is a blob reference (e.g. camliMember) and thus something the indexers should keep inverted indexes on for parent/child-type relationships.

func IsFulltextAttribute

func IsFulltextAttribute(attr string) bool

func IsIndexedAttribute

func IsIndexedAttribute(attr string) bool

TODO(bradfitz): rename this? This is really about signer-attr-value (PermanodeOfSignerAttrValue), and not about indexed attributes in general.

func SetImpendingReindex

func SetImpendingReindex()

SetImpendingReindex notes that the user ran the camlistored binary with the --reindex flag. Because the index is about to be wiped, schema version checks should be suppressed.

func SetVerboseCorpusLogging

func SetVerboseCorpusLogging(v bool)

SetVerboseCorpusLogging controls corpus setup verbosity. Verbose logging is on by default; this function is used to disable it in tests.

Types

type BlobSniffer

type BlobSniffer struct {
	// contains filtered or unexported fields
}

func NewBlobSniffer

func NewBlobSniffer(ref blob.Ref) *BlobSniffer

func (*BlobSniffer) Body

func (sn *BlobSniffer) Body() ([]byte, error)

func (*BlobSniffer) CamliType

func (sn *BlobSniffer) CamliType() string

func (*BlobSniffer) IsTruncated

func (sn *BlobSniffer) IsTruncated() bool

func (*BlobSniffer) MIMEType

func (sn *BlobSniffer) MIMEType() string

MIMEType returns the sniffed blob's content-type or the empty string if unknown. If the blob is a Camlistore schema metadata blob, the MIME type will be of the form "application/json; camliType=foo".
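
A caller that receives a MIME type of the form described above could recover the camliType with the standard library's mime package. The sketch below is an illustration only; camliTypeFromMIME is a hypothetical helper, not part of this package (BlobSniffer already exposes CamliType directly).

```go
package main

import (
	"fmt"
	"mime"
)

// camliTypeFromMIME is a hypothetical helper that extracts the camliType
// parameter from a string such as "application/json; camliType=foo".
func camliTypeFromMIME(mimeType string) (string, bool) {
	mediaType, params, err := mime.ParseMediaType(mimeType)
	if err != nil || mediaType != "application/json" {
		return "", false
	}
	// ParseMediaType lowercases parameter names, so the key is "camlitype".
	t, ok := params["camlitype"]
	return t, ok
}

func main() {
	t, ok := camliTypeFromMIME("application/json; camliType=foo")
	fmt.Println(t, ok)
}
```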

func (*BlobSniffer) Parse

func (sn *BlobSniffer) Parse()

func (*BlobSniffer) SchemaBlob

func (sn *BlobSniffer) SchemaBlob() (meta *schema.Blob, ok bool)

func (*BlobSniffer) Size

func (sn *BlobSniffer) Size() int64

func (*BlobSniffer) Write

func (sn *BlobSniffer) Write(d []byte) (int, error)

type Corpus

type Corpus struct {
	// contains filtered or unexported fields
}

Corpus is an in-memory summary of all of a user's blobs' metadata.

func NewCorpusFromStorage

func NewCorpusFromStorage(s sorted.KeyValue) (*Corpus, error)

func (*Corpus) AppendClaims

func (c *Corpus) AppendClaims(dst []camtypes.Claim, permaNode blob.Ref,
	signerFilter blob.Ref,
	attrFilter string) ([]camtypes.Claim, error)

func (*Corpus) AppendPermanodeAttrValues

func (c *Corpus) AppendPermanodeAttrValues(dst []string,
	permaNode blob.Ref,
	attr string,
	at time.Time,
	signerFilter blob.Ref) []string

signerFilter is optional. dst must start with length 0 (laziness, mostly).

func (*Corpus) AppendPermanodeAttrValuesLocked

func (c *Corpus) AppendPermanodeAttrValuesLocked(dst []string,
	permaNode blob.Ref,
	attr string,
	at time.Time,
	signerFilter blob.Ref) []string

AppendPermanodeAttrValuesLocked is the version of AppendPermanodeAttrValues that assumes the Corpus is already locked with RLock.

func (*Corpus) EnumerateBlobMetaLocked

func (c *Corpus) EnumerateBlobMetaLocked(ctx *context.Context, ch chan<- camtypes.BlobMeta) error

EnumerateBlobMetaLocked sends all known blobs to ch, stopping early if the context is canceled.

The Corpus must already be locked with RLock.

func (*Corpus) EnumerateCamliBlobsLocked

func (c *Corpus) EnumerateCamliBlobsLocked(ctx *context.Context, camType string, ch chan<- camtypes.BlobMeta) error

EnumerateCamliBlobsLocked sends just camlistore meta blobs to ch.

The Corpus must already be locked with RLock.

If camType is empty, all camlistore blobs are sent; otherwise it specifies the camliType to send. ch is closed at the end. The returned error is either nil or context.ErrCanceled.

func (*Corpus) EnumeratePermanodesLastModifiedLocked

func (c *Corpus) EnumeratePermanodesLastModifiedLocked(ctx *context.Context, ch chan<- camtypes.BlobMeta) error

EnumeratePermanodesLastModifiedLocked sends all permanodes to ch, sorted by most recently modified first, until all are sent or ctx is done.

The Corpus must already be locked with RLock.

func (*Corpus) FileLatLongLocked

func (c *Corpus) FileLatLongLocked(fileRef blob.Ref) (lat, long float64, ok bool)

func (*Corpus) GetBlobMeta

func (c *Corpus) GetBlobMeta(br blob.Ref) (camtypes.BlobMeta, error)

func (*Corpus) GetBlobMetaLocked

func (c *Corpus) GetBlobMetaLocked(br blob.Ref) (camtypes.BlobMeta, error)

func (*Corpus) GetFileInfo

func (c *Corpus) GetFileInfo(fileRef blob.Ref) (fi camtypes.FileInfo, err error)

func (*Corpus) GetFileInfoLocked

func (c *Corpus) GetFileInfoLocked(fileRef blob.Ref) (fi camtypes.FileInfo, err error)

func (*Corpus) GetImageInfo

func (c *Corpus) GetImageInfo(fileRef blob.Ref) (ii camtypes.ImageInfo, err error)

func (*Corpus) GetImageInfoLocked

func (c *Corpus) GetImageInfoLocked(fileRef blob.Ref) (ii camtypes.ImageInfo, err error)

func (*Corpus) KeyId

func (c *Corpus) KeyId(signer blob.Ref) (string, error)

func (*Corpus) MediaTagLocked

func (c *Corpus) MediaTagLocked(fileRef blob.Ref) map[string]string

func (*Corpus) PermanodeModtime

func (c *Corpus) PermanodeModtime(pn blob.Ref) (t time.Time, ok bool)

PermanodeModtime returns the latest modification time of the given permanode.

The ok value is true only if the permanode is known and has any non-deleted claims. A deleted claim is ignored, and neither its claim date nor the date of the delete claim affects the modtime of the permanode.

func (*Corpus) PermanodeModtimeLocked

func (c *Corpus) PermanodeModtimeLocked(pn blob.Ref) (t time.Time, ok bool)

PermanodeModtimeLocked is like PermanodeModtime but for when the Corpus is already locked via RLock.

func (*Corpus) PermanodeTimeLocked

func (c *Corpus) PermanodeTimeLocked(pn blob.Ref) (t time.Time, ok bool)

PermanodeTimeLocked returns the time of the content in permanode.

func (*Corpus) RLock

func (c *Corpus) RLock()

RLock locks the Corpus for reads. It must be used for any "Locked" methods.

func (*Corpus) RUnlock

func (c *Corpus) RUnlock()

RUnlock unlocks the Corpus for reads.
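
The "Locked" naming convention can be illustrated with a small stand-in type. miniCorpus, its fields, and its methods below are hypothetical and exist only to show the calling pattern; the use of sync.RWMutex is an assumption about how such a read lock might be implemented, not a statement about Corpus internals.

```go
package main

import (
	"fmt"
	"sync"
)

// miniCorpus is a hypothetical stand-in for an in-memory index.
type miniCorpus struct {
	mu    sync.RWMutex
	sizes map[string]int64
}

func (c *miniCorpus) RLock()   { c.mu.RLock() }
func (c *miniCorpus) RUnlock() { c.mu.RUnlock() }

// SizeLocked assumes the caller already holds the read lock.
func (c *miniCorpus) SizeLocked(ref string) (int64, bool) {
	n, ok := c.sizes[ref]
	return n, ok
}

// Size takes the read lock itself and delegates to SizeLocked.
func (c *miniCorpus) Size(ref string) (int64, bool) {
	c.RLock()
	defer c.RUnlock()
	return c.SizeLocked(ref)
}

// totalSize takes the lock once and makes several Locked calls under it.
func totalSize(c *miniCorpus, refs []string) int64 {
	c.RLock()
	defer c.RUnlock()
	var total int64
	for _, r := range refs {
		if n, ok := c.SizeLocked(r); ok {
			total += n
		}
	}
	return total
}

func main() {
	c := &miniCorpus{sizes: map[string]int64{"a": 10, "b": 32}}
	fmt.Println(totalSize(c, []string{"a", "b", "missing"}))
}
```

Holding the lock across a batch of Locked calls gives the caller a consistent view and avoids re-acquiring the lock for each lookup.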

type Index

type Index struct {
	*blobserver.NoImplStorage

	KeyFetcher blob.StreamingFetcher // for verifying claims

	// BlobSource is used for fetching blobs when indexing files and other
	// blobs types that reference other objects.
	BlobSource blobserver.FetcherEnumerator
	// contains filtered or unexported fields
}

func New

func New(s sorted.KeyValue) *Index

New returns a new index using the provided key/value storage implementation.

func NewMemoryIndex

func NewMemoryIndex() *Index

NewMemoryIndex returns an Index backed only by memory, for use in tests.

func (*Index) AppendClaims

func (x *Index) AppendClaims(dst []camtypes.Claim, permaNode blob.Ref,
	signerFilter blob.Ref,
	attrFilter string) ([]camtypes.Claim, error)

func (*Index) Close

func (x *Index) Close() error

Close closes the underlying sorted.KeyValue, if the storage has a Close method. The return value is the return value of the underlying Close, or nil otherwise.
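
The behavior described for Close matches Go's optional-interface idiom: test whether a value also implements io.Closer and call Close only if it does. The sketch below uses hypothetical stand-in types (memStore, fileStore) and a hypothetical closeStorage function; it is not the package's actual code.

```go
package main

import (
	"fmt"
	"io"
)

// memStore is a hypothetical storage with no Close method.
type memStore struct{}

// fileStore is a hypothetical storage that needs closing.
type fileStore struct{ closed bool }

func (f *fileStore) Close() error {
	f.closed = true
	return nil
}

// closeStorage calls Close if s implements io.Closer, and returns nil otherwise.
func closeStorage(s interface{}) error {
	if cl, ok := s.(io.Closer); ok {
		return cl.Close()
	}
	return nil
}

func main() {
	fs := &fileStore{}
	err1 := closeStorage(memStore{})
	err2 := closeStorage(fs)
	fmt.Println(err1, err2, fs.closed)
}
```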

func (*Index) EdgesTo

func (x *Index) EdgesTo(ref blob.Ref, opts *camtypes.EdgesToOpts) (edges []*camtypes.Edge, err error)

func (*Index) EnumerateBlobMeta

func (x *Index) EnumerateBlobMeta(ctx *context.Context, ch chan<- camtypes.BlobMeta) (err error)

EnumerateBlobMeta sends all metadata about all known blobs to ch and then closes ch.

func (*Index) EnumerateBlobs

func (ix *Index) EnumerateBlobs(ctx *context.Context, dest chan<- blob.SizedRef, after string, limit int) (err error)

func (*Index) ExistingFileSchemas

func (x *Index) ExistingFileSchemas(wholeRef blob.Ref) (schemaRefs []blob.Ref, err error)

func (*Index) GetBlobMeta

func (x *Index) GetBlobMeta(br blob.Ref) (camtypes.BlobMeta, error)

func (*Index) GetDirMembers

func (x *Index) GetDirMembers(dir blob.Ref, dest chan<- blob.Ref, limit int) (err error)

GetDirMembers sends on dest the children of the static directory dir.

func (*Index) GetFileInfo

func (x *Index) GetFileInfo(fileRef blob.Ref) (camtypes.FileInfo, error)

func (*Index) GetImageInfo

func (x *Index) GetImageInfo(fileRef blob.Ref) (camtypes.ImageInfo, error)

func (*Index) GetRecentPermanodes

func (x *Index) GetRecentPermanodes(dest chan<- camtypes.RecentPermanode, owner blob.Ref, limit int, before time.Time) (err error)

GetRecentPermanodes sends results to dest filtered by owner, limit, and before. A zero value for before will default to the current time. The results will have duplicates suppressed, with the most recent permanode returned. Note that permanodes more recent than before will still be fetched from the index and then skipped. This means runtime scales linearly with the number of nodes more recent than before.

func (*Index) IsDeleted

func (x *Index) IsDeleted(br blob.Ref) bool

IsDeleted reports whether the provided blobref (of a permanode or claim) should be considered deleted.

func (*Index) KeepInMemory

func (x *Index) KeepInMemory() (*Corpus, error)

func (*Index) KeyId

func (x *Index) KeyId(signer blob.Ref) (string, error)

func (*Index) PathLookup

func (x *Index) PathLookup(signer, base blob.Ref, suffix string, at time.Time) (*camtypes.Path, error)

func (*Index) PathsLookup

func (x *Index) PathsLookup(signer, base blob.Ref, suffix string) (paths []*camtypes.Path, err error)

func (*Index) PathsOfSignerTarget

func (x *Index) PathsOfSignerTarget(signer, target blob.Ref) (paths []*camtypes.Path, err error)

func (*Index) PermanodeOfSignerAttrValue

func (x *Index) PermanodeOfSignerAttrValue(signer blob.Ref, attr, val string) (permaNode blob.Ref, err error)

func (*Index) PreventStorageAccessForTesting

func (x *Index) PreventStorageAccessForTesting()

PreventStorageAccessForTesting causes any access to the index's underlying Storage interface to panic.

func (*Index) ReceiveBlob

func (ix *Index) ReceiveBlob(blobRef blob.Ref, source io.Reader) (retsb blob.SizedRef, err error)

func (*Index) Reindex

func (x *Index) Reindex() error

func (*Index) SearchPermanodesWithAttr

func (x *Index) SearchPermanodesWithAttr(dest chan<- blob.Ref, request *camtypes.PermanodeByAttrRequest) (err error)

SearchPermanodesWithAttr is like PermanodeOfSignerAttrValue, except that it returns multiple permanodes and suppresses duplicates. If request.Query is "", it is not used in the prefix search.

func (*Index) StatBlobs

func (ix *Index) StatBlobs(dest chan<- blob.SizedRef, blobs []blob.Ref) error

func (*Index) Storage

func (x *Index) Storage() sorted.KeyValue

Storage returns the index's underlying Storage implementation.

type Interface

type Interface interface {
	// os.ErrNotExist should be returned if the blob isn't known
	GetBlobMeta(blob.Ref) (camtypes.BlobMeta, error)

	// Should return os.ErrNotExist if not found.
	GetFileInfo(fileRef blob.Ref) (camtypes.FileInfo, error)

	// Should return os.ErrNotExist if not found.
	GetImageInfo(fileRef blob.Ref) (camtypes.ImageInfo, error)

	// KeyId returns the GPG keyid (e.g. "2931A67C26F5ABDA")
	// given the blobref of its ASCII-armored public key.
	// The error is ErrNotFound if not found.
	KeyId(blob.Ref) (string, error)

	// AppendClaims appends to dst claims on the given permanode.
	// The signerFilter and attrFilter are both optional.  If non-zero,
	// they filter the return items to only claims made by the given signer
	// or claims about the given attribute, respectively.
	// Deleted claims are never returned.
	// The items may be appended in any order.
	//
	// TODO: this should take a context and a callback func
	// instead of a dst, then it can append to a channel instead,
	// and the context lets it be interrupted. The callback should
	// take the context too, so the channel send's select can read
	// from the Done channel.
	AppendClaims(dst []camtypes.Claim, permaNode blob.Ref,
		signerFilter blob.Ref,
		attrFilter string) ([]camtypes.Claim, error)

	// dest must be closed, even when returning an error.
	// limit <= 0 means unlimited.
	GetRecentPermanodes(dest chan<- camtypes.RecentPermanode,
		owner blob.Ref,
		limit int,
		before time.Time) error

	// SearchPermanodesWithAttr finds permanodes matching the provided
	// request and sends unique permanode blobrefs to dest.
	// In particular, if request.FuzzyMatch is true, a fulltext
	// search is performed (if supported by the attribute(s))
	// instead of an exact match search.
	// If request.Query is blank, the permanodes which have
	// request.Attribute as an attribute (regardless of its value)
	// are searched.
	// Additionally, if request.Attribute is blank, all attributes
	// are searched (as fulltext), otherwise the search is
	// restricted to the named attribute.
	//
	// dest is always closed, regardless of the error return value.
	SearchPermanodesWithAttr(dest chan<- blob.Ref,
		request *camtypes.PermanodeByAttrRequest) error

	// ExistingFileSchemas returns 0 or more blobrefs of "bytes"
	// (TODO(bradfitz): or file?) schema blobs that represent the
	// bytes of a file given in wholeFileRef.  The file schema blobs
	// returned are not guaranteed to reference chunks that still
	// exist on the blobservers, though.  It's purely a hint for
	// clients to avoid uploads if possible.  Before re-using any
	// returned blobref they should be checked.
	//
	// Use case: a user drag & drops a large file onto their
	// browser to upload.  (imagine that "large" means anything
	// larger than a blobserver's max blob size) JavaScript can
	// first SHA-1 the large file locally, then send the
	// wholeFileRef to this call and see if they'd previously
	// uploaded the same file in the past.  If so, the upload
	// can be avoided if at least one of the returned schemaRefs
	// can be validated (with a validating HEAD request) to all
	// still exist on the blob server.
	ExistingFileSchemas(wholeFileRef blob.Ref) (schemaRefs []blob.Ref, err error)

	// GetDirMembers sends on dest the children of the static
	// directory dirRef. It returns os.ErrNotExist if dirRef
	// is nil.
	// dest must be closed, even when returning an error.
	// limit <= 0 means unlimited.
	GetDirMembers(dirRef blob.Ref, dest chan<- blob.Ref, limit int) error

	// Given an owner key, a camliType 'claim', 'attribute' name,
	// and specific 'value', find the most recent permanode that has
	// a corresponding 'set-attribute' claim attached.
	// Returns os.ErrNotExist if none is found.
	// Only attributes white-listed by IsIndexedAttribute are valid.
	// TODO(bradfitz): ErrNotExist here is a weird error message ("file" not found). change.
	// TODO(bradfitz): use keyId instead of signer?
	PermanodeOfSignerAttrValue(signer blob.Ref, attr, val string) (blob.Ref, error)

	// PathsOfSignerTarget queries the index about "camliPath:"
	// URL-dispatch attributes.
	//
	// It returns a list of all the path claims that have been signed
	// by the provided signer and point at the given target.
	//
	// This is used when editing a permanode, to work up
	// the name resolution tree backwards ultimately to a
	// camliRoot permanode (which should know its base URL), and
	// then the complete URL(s) of a target can be found.
	PathsOfSignerTarget(signer, target blob.Ref) ([]*camtypes.Path, error)

	// All Path claims for (signer, base, suffix)
	PathsLookup(signer, base blob.Ref, suffix string) ([]*camtypes.Path, error)

	// Most recent Path claim for (signer, base, suffix) as of
	// provided time 'at', or most recent if 'at' is the zero time.
	PathLookup(signer, base blob.Ref, suffix string, at time.Time) (*camtypes.Path, error)

	// EdgesTo finds references to the provided ref.
	//
	// For instance, if ref is a permanode, it might find the parent permanodes
	// that have ref as a member.
	// Or, if ref is a static file, it might find static directories which contain
	// that file.
	// This is a way to go "up" or "back" in a hierarchy.
	//
	// opts may be nil to accept the defaults.
	EdgesTo(ref blob.Ref, opts *camtypes.EdgesToOpts) ([]*camtypes.Edge, error)

	// EnumerateBlobMeta sends ch information about all blobs
	// known to the indexer (which may be a subset of all total
	// blobs, since the indexer is typically configured to not see
	// non-metadata blobs) and then closes ch.  When it returns an
	// error, it also closes ch. The blobs may be sent in any order.
	// If the context finishes, the return error is context.ErrCanceled.
	EnumerateBlobMeta(*context.Context, chan<- camtypes.BlobMeta) error
}

type PermanodeMeta

type PermanodeMeta struct {
	// TODO: OwnerKeyId string
	Claims []*camtypes.Claim // sorted by camtypes.ClaimsByDate
}

Directories

Path       Synopsis
indextest  Package indextest contains the unit tests for the indexer so they can be re-used for each specific implementation of the index Storage interface.
kvfile     Package kvfile implements the Camlistore index storage abstraction on top of a single mutable database file on disk using github.com/cznic/kv.
mongo      Package mongo implements the Camlistore index storage abstraction on top of MongoDB.
mysql      Package mysql implements the Camlistore index storage abstraction on top of MySQL.
postgres   Package postgres implements the Camlistore index storage abstraction on top of PostgreSQL.
sqlindex   Package sqlindex implements the sorted.KeyValue interface using an *sql.DB.
sqlite     Package sqlite implements the Camlistore index storage abstraction using an SQLite database file.
