maintner

package
v0.0.0-...-529bf19
Published: Jul 18, 2017 License: BSD-3-Clause Imports: 35 Imported by: 0

Documentation

Overview

Package maintner mirrors, searches, syncs, and serves Git, Github, and Gerrit metadata.

Maintner is short for "Maintainer". This package is intended for use by many tools. The name of the daemon that serves the maintner data to other tools is "maintnerd".

Constants

This section is empty.

Variables

var ErrSplit = errors.New("maintner: leader server's history split, process out of sync")

ErrSplit is returned when the client notices the leader's mutation log has changed. This can happen if the leader restarts with uncommitted transactions. (The leader only commits mutations periodically.)

Functions

This section is empty.

Types

type Corpus

type Corpus struct {
	// contains filtered or unexported fields
}

Corpus holds all of a project's metadata.

Many public accessor methods are missing. File bugs at golang.org/issues/new.

func (*Corpus) Check

func (c *Corpus) Check() error

Check verifies the internal structure of the Corpus data structures. It is intended for tests and debugging.

func (*Corpus) EnableLeaderMode

func (c *Corpus) EnableLeaderMode(logger MutationLogger, scratchDir string)

EnableLeaderMode prepares c to be the leader. This should only be called by the maintnerd process.

The provided scratchDir will store git checkouts.

func (*Corpus) Gerrit

func (c *Corpus) Gerrit() *Gerrit

Gerrit returns the corpus's Gerrit data.

func (*Corpus) GitCommit

func (c *Corpus) GitCommit(hash string) *GitCommit

GitCommit returns the provided git commit, or nil if it's unknown.

func (*Corpus) GitHub

func (c *Corpus) GitHub() *GitHub

GitHub returns the corpus's github data.

func (*Corpus) Initialize

func (c *Corpus) Initialize(ctx context.Context, src MutationSource) error

Initialize populates the Corpus using the data from the MutationSource. It returns once it's up-to-date. To incrementally update it later, use the Update method.

func (*Corpus) RLock

func (c *Corpus) RLock()

RLock grabs the corpus's read lock. Grabbing the read lock prevents any concurrent writes from mutating the corpus. This is only necessary if the application is querying the corpus and calling its Update method concurrently.

func (*Corpus) RUnlock

func (c *Corpus) RUnlock()

RUnlock unlocks the corpus's read lock.
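
The locking pattern described for RLock and RUnlock can be sketched with a plain sync.RWMutex. The corpus type, its titles map, and the applyMutation and countIssues helpers below are hypothetical stand-ins, not maintner's implementation; the sketch only shows readers holding the read lock while a concurrent writer mutates the data.

```go
package main

import (
	"fmt"
	"sync"
)

// corpus is a hypothetical stand-in for maintner.Corpus. Only the locking
// behavior described above is modeled.
type corpus struct {
	mu     sync.RWMutex
	titles map[int32]string // issue number => title
}

func (c *corpus) RLock()   { c.mu.RLock() }
func (c *corpus) RUnlock() { c.mu.RUnlock() }

// applyMutation plays the role of an update: it takes the write lock.
func (c *corpus) applyMutation(n int32, title string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.titles[n] = title
}

// countIssues is a query. It holds the read lock while reading.
func (c *corpus) countIssues() int {
	c.RLock()
	defer c.RUnlock()
	return len(c.titles)
}

// runLockDemo runs 100 writers and 100 readers concurrently and returns
// the final number of issues.
func runLockDemo() int {
	c := &corpus{titles: make(map[int32]string)}
	var wg sync.WaitGroup
	for i := int32(1); i <= 100; i++ {
		wg.Add(2)
		go func(n int32) {
			defer wg.Done()
			c.applyMutation(n, "issue")
		}(i)
		go func() {
			defer wg.Done()
			_ = c.countIssues()
		}()
	}
	wg.Wait()
	return c.countIssues()
}

func main() {
	fmt.Println(runLockDemo())
}
```

If the application never calls Update while querying, the read lock is unnecessary, as noted above.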

func (*Corpus) SetDebug

func (c *Corpus) SetDebug()

func (*Corpus) SetVerbose

func (c *Corpus) SetVerbose(v bool)

SetVerbose enables or disables verbose logging.

func (*Corpus) StartPubSubHelperSubscribe

func (c *Corpus) StartPubSubHelperSubscribe(urlBase string)

StartPubSubHelperSubscribe starts subscribing to a golang.org/x/build/cmd/pubsubhelper server, such as https://pubsubhelper.golang.org

func (*Corpus) Sync

func (c *Corpus) Sync(ctx context.Context) error

Sync updates the corpus from its tracked sources.

func (*Corpus) SyncLoop

func (c *Corpus) SyncLoop(ctx context.Context) error

SyncLoop runs forever (until an error or context expiration) and updates the corpus as the tracked sources change.

func (*Corpus) TrackGerrit

func (c *Corpus) TrackGerrit(gerritProj string)

TrackGerrit registers the given Gerrit project as a project to watch and append to the mutation log. Only valid in leader mode. The provided string should be of the form "hostname/project", without a scheme or trailing slash.

func (*Corpus) TrackGithub

func (c *Corpus) TrackGithub(owner, repo, token string)

TrackGithub registers the named Github repo as a repo to watch and append to the mutation log. Only valid in leader mode. The token is the auth token to use to make API calls.

func (*Corpus) TrackGoGitRepo

func (c *Corpus) TrackGoGitRepo(goRepo, dir string)

TrackGoGitRepo registers a git directory to have its metadata slurped into the corpus. The goRepo is a name like "go" or "net". The dir is a path on disk.

func (*Corpus) Update

func (c *Corpus) Update(ctx context.Context) error

Update incrementally updates the corpus from its current state to the latest state from the MutationSource passed earlier to Initialize. It does not return until there's either a new change or the context expires. If Update returns ErrSplit, the corpus can no longer be updated.

Update must not be called concurrently with any other method or access of the corpus, including other Update calls.
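
A caller that follows a corpus typically loops on Update and treats ErrSplit as a signal to discard the corpus and start over. The sketch below is self-contained and makes several assumptions: errSplit is a local stand-in for maintner.ErrSplit, and the updater interface, scriptedCorpus type, and follow helpers are hypothetical names introduced only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errSplit is a local stand-in for maintner.ErrSplit.
var errSplit = errors.New("maintner: leader server's history split, process out of sync")

// updater is a hypothetical interface with the documented Update signature.
type updater interface {
	Update(ctx context.Context) error
}

// scriptedCorpus returns one pre-arranged result per Update call.
type scriptedCorpus struct{ results []error }

func (s *scriptedCorpus) Update(ctx context.Context) error {
	if len(s.results) == 0 {
		return errors.New("scripted corpus: no more results")
	}
	err := s.results[0]
	s.results = s.results[1:]
	return err
}

// follow calls Update until it fails. It reports how many updates succeeded
// and whether the failure was a history split, in which case the caller
// would need to build a fresh corpus.
func follow(ctx context.Context, u updater) (applied int, needReload bool) {
	for {
		err := u.Update(ctx)
		if err == nil {
			applied++
			continue
		}
		return applied, errors.Is(err, errSplit)
	}
}

// followScripted runs follow against a scripted sequence of results.
func followScripted(results ...error) (int, bool) {
	return follow(context.Background(), &scriptedCorpus{results: results})
}

func main() {
	applied, needReload := followScripted(nil, nil, errSplit)
	fmt.Println(applied, needReload)
}
```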

func (*Corpus) UpdateWithLocker

func (c *Corpus) UpdateWithLocker(ctx context.Context, lk sync.Locker) error

UpdateWithLocker behaves just like Update, but holds lk when processing mutation events.

type DiskMutationLogger

type DiskMutationLogger struct {
	// contains filtered or unexported fields
}

DiskMutationLogger logs mutations to disk.

func NewDiskMutationLogger

func NewDiskMutationLogger(directory string) *DiskMutationLogger

NewDiskMutationLogger creates a new DiskMutationLogger, which will create mutations in the given directory.

func (*DiskMutationLogger) ForeachFile

func (d *DiskMutationLogger) ForeachFile(fn func(fullPath string, fi os.FileInfo) error) error

func (*DiskMutationLogger) GetMutations

func (d *DiskMutationLogger) GetMutations(ctx context.Context) <-chan MutationStreamEvent

func (*DiskMutationLogger) Log

func (d *DiskMutationLogger) Log(m *maintpb.Mutation) error

Log will write m to disk. If a mutation file does not exist for the current day, it will be created.

type Gerrit

type Gerrit struct {
	// contains filtered or unexported fields
}

Gerrit holds information about a number of Gerrit projects.

func (*Gerrit) ForeachProjectUnsorted

func (g *Gerrit) ForeachProjectUnsorted(fn func(*GerritProject) error) error

ForeachProjectUnsorted calls fn for each known Gerrit project. Iteration ends if fn returns a non-nil value.

func (*Gerrit) Project

func (g *Gerrit) Project(server, project string) *GerritProject

Project returns the specified Gerrit project if it's known, otherwise it returns nil. Server is the Gerrit server's hostname, such as "go.googlesource.com".

type GerritCL

type GerritCL struct {
	// Project is the project this CL is part of.
	Project *GerritProject

	// Number is the CL number on the Gerrit
	// server. (e.g. 1, 2, 3)
	Number int32

	Created time.Time

	// Version is the number of versions of the patchset for this
	// CL seen so far. It starts at 1.
	Version int32

	// Commit is the git commit of the latest version of this CL.
	// Previous versions are available via GerritProject.remote.
	Commit *GitCommit

	// Meta is the head of the most recent Gerrit "meta" commit
	// for this CL. This is guaranteed to be a linear history
	// back to a CL-specific root commit for this meta branch.
	Meta *GitCommit

	// Status will be "merged", "abandoned", "new", or "draft".
	Status string

	// GitHubIssueRefs are parsed references to GitHub issues.
	GitHubIssueRefs []GitHubIssueRef

	// Messages contains all of the messages for this CL, in sorted order.
	Messages []*GerritMessage
}

A GerritCL represents a single change in Gerrit.

func (*GerritCL) OwnerID

func (cl *GerritCL) OwnerID() int

OwnerID returns the ID of the CL’s owner. It will return -1 on error.
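
One way to derive such an ID is from the Gerrit-style author string described under GerritMessage.Author, where the number before the '@' sign is the Gerrit user ID. The following is a minimal sketch under that assumption; gerritUserID is a hypothetical helper, not maintner's implementation of OwnerID.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// gerritUserID extracts the numeric ID from a Gerrit-style author string
// such as "Name <12345@server-uuid>". It returns -1 if the string does not
// have that shape.
func gerritUserID(author string) int {
	lt := strings.LastIndex(author, "<")
	if lt < 0 {
		return -1
	}
	at := strings.Index(author[lt:], "@")
	if at < 0 {
		return -1
	}
	id, err := strconv.Atoi(author[lt+1 : lt+at])
	if err != nil {
		return -1
	}
	return id
}

func main() {
	fmt.Println(gerritUserID("Kevin Burke <13437@62eb7196-b449-3ce5-99f1-c037f21e1705>"))
	fmt.Println(gerritUserID("Foo Bar <foo@bar.com>"))
}
```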

func (*GerritCL) References

func (cl *GerritCL) References(ref GitHubIssueRef) bool

References reports whether cl includes a commit message reference to the provided Github issue ref.

type GerritMessage

type GerritMessage struct {
	// Version is the patch set version this message was sent on.
	Version int32

	// Message is the raw message contents from Gerrit (a subset
	// of the raw git commit message), starting with "Patch Set
	// nnnn".
	Message string

	// Date is when this message was stored (the commit time of
	// the git commit).
	Date time.Time

	// Author returns the author of the commit. This takes the form "Kevin Burke
	// <13437@62eb7196-b449-3ce5-99f1-c037f21e1705>", where the number before
	// the '@' sign is your Gerrit user ID, and the UUID after the '@' sign
	// seems to be the same for all commits for the same Gerrit server, across
	// projects.
	//
	// TODO: Merge the *GitPerson object here and for a person's Git commits
	// (which use their real email) via the user ID, so they point to the same
	// object.
	Author *GitPerson
}

GerritMessage is a Gerrit reply that is attached to the CL as a whole, and not to a file or line of a patch set.

Maintner does very little parsing or formatting of a Message body. Messages are stored the same way they are stored in the API.

type GerritProject

type GerritProject struct {
	// contains filtered or unexported fields
}

GerritProject represents a single Gerrit project.

func (*GerritProject) CL

func (gp *GerritProject) CL(number int32) *GerritCL

CL returns the GerritCL with the given number, or nil if it is not present.

CL numbers are shared across all projects on a Gerrit server, so you can get nil unless you have the GerritProject containing that CL.

func (*GerritProject) ForeachCLUnsorted

func (gp *GerritProject) ForeachCLUnsorted(fn func(*GerritCL) error) error

ForeachCLUnsorted calls fn for each CL in the repo, in any order.

If fn returns an error, iteration ends and ForeachCLUnsorted returns with that error.

func (*GerritProject) ForeachNonChangeRef

func (gp *GerritProject) ForeachNonChangeRef(fn func(ref string, hash GitHash) error) error

ForeachNonChangeRef calls fn for each git ref on the server that is not a change (code review) ref. In general, these correspond to submitted changes. fn is called serially with sorted ref names. Iteration stops with the first non-nil error returned by fn.

func (*GerritProject) ForeachOpenCL

func (gp *GerritProject) ForeachOpenCL(fn func(*GerritCL) error) error

ForeachOpenCL calls fn for each open CL in the repo.

If fn returns an error, iteration ends and ForeachOpenCL returns with that error.

The fn function is called serially, with increasingly numbered CLs.
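
Because iteration stops at the first error, a sentinel error can be used to end a Foreach-style loop early. The sketch below models that contract with hypothetical stand-in types (cl, project) and a hypothetical errStop sentinel. It assumes that an "open" CL is one whose Status is "new", which may differ from maintner's own definition.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// cl and project are hypothetical stand-ins for GerritCL and GerritProject.
type cl struct {
	Number int32
	Status string
}

type project struct {
	cls map[int32]*cl
}

// foreachOpenCL calls fn serially for each open CL in increasing CL number
// order and stops at the first error. Assumption: "open" means Status "new".
func (p *project) foreachOpenCL(fn func(*cl) error) error {
	var open []*cl
	for _, c := range p.cls {
		if c.Status == "new" {
			open = append(open, c)
		}
	}
	sort.Slice(open, func(i, j int) bool { return open[i].Number < open[j].Number })
	for _, c := range open {
		if err := fn(c); err != nil {
			return err
		}
	}
	return nil
}

// errStop is a sentinel used by callers to end iteration early.
var errStop = errors.New("stop iteration")

// firstOpenAbove returns the lowest-numbered open CL above floor, if any.
func firstOpenAbove(p *project, floor int32) (int32, bool) {
	var found int32
	err := p.foreachOpenCL(func(c *cl) error {
		if c.Number > floor {
			found = c.Number
			return errStop
		}
		return nil
	})
	return found, errors.Is(err, errStop)
}

// demoProject builds a small project with made-up CLs.
func demoProject() *project {
	return &project{cls: map[int32]*cl{
		30: {Number: 30, Status: "new"},
		10: {Number: 10, Status: "merged"},
		20: {Number: 20, Status: "new"},
		40: {Number: 40, Status: "abandoned"},
	}}
}

func main() {
	fmt.Println(firstOpenAbove(demoProject(), 10))
}
```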

func (*GerritProject) Project

func (gp *GerritProject) Project() string

Project returns the Gerrit project on the server, such as "go" or "crypto".

func (*GerritProject) Server

func (gp *GerritProject) Server() string

Server returns the Gerrit server, such as "go.googlesource.com".

func (*GerritProject) ServerSlashProject

func (gp *GerritProject) ServerSlashProject() string

ServerSlashProject returns the server and project together, such as "go.googlesource.com/build".

type GitCommit

type GitCommit struct {
	Hash       GitHash
	Tree       GitHash
	Parents    []*GitCommit
	Author     *GitPerson
	AuthorTime time.Time
	Committer  *GitPerson
	CommitTime time.Time
	Msg        string // Commit message subject and body
	Files      []*maintpb.GitDiffTreeFile
}

GitCommit represents a single commit in a git repository.

func (*GitCommit) HasAncestor

func (gc *GitCommit) HasAncestor(ancestor *GitCommit) bool

HasAncestor reports whether gc contains the provided ancestor commit in gc's history.
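
An ancestry check of this kind amounts to a graph walk over Parents. The following sketch uses a hypothetical commit type with only the fields it needs, and it assumes that a commit is not counted as its own ancestor; the source does not say how maintner treats that case.

```go
package main

import "fmt"

// commit is a hypothetical stand-in for GitCommit.
type commit struct {
	Hash    string
	Parents []*commit
}

// hasAncestor reports whether ancestor is reachable from c by following
// Parents. Visited commits are remembered so that merge-heavy histories are
// not walked more than once.
func hasAncestor(c, ancestor *commit) bool {
	seen := make(map[*commit]bool)
	stack := append([]*commit(nil), c.Parents...)
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if cur == ancestor {
			return true
		}
		if seen[cur] {
			continue
		}
		seen[cur] = true
		stack = append(stack, cur.Parents...)
	}
	return false
}

// demoGraph builds root <- a, root <- b, and a merge of a and b, plus a
// commit that is not connected to the others.
func demoGraph() (root, merge, unrelated *commit) {
	root = &commit{Hash: "root"}
	a := &commit{Hash: "a", Parents: []*commit{root}}
	b := &commit{Hash: "b", Parents: []*commit{root}}
	merge = &commit{Hash: "merge", Parents: []*commit{a, b}}
	unrelated = &commit{Hash: "unrelated"}
	return root, merge, unrelated
}

func main() {
	root, merge, unrelated := demoGraph()
	fmt.Println(hasAncestor(merge, root), hasAncestor(merge, unrelated))
}
```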

type GitHash

type GitHash string

GitHash is a git commit hash in binary form (NOT hex form). Hashes are currently always 20 bytes long (for SHA-1 refs). That may change in the future.

func (GitHash) String

func (h GitHash) String() string
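
Since the hash is stored as raw bytes inside a string, converting to and from the familiar 40-character hex form takes one encoding step. The sketch below uses a hypothetical gitHash stand-in type; it assumes String renders lowercase hex, and gitHashFromHex is an illustrative helper that is not part of the documented API.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// gitHash is a hypothetical stand-in for GitHash: raw hash bytes in a string.
type gitHash string

// String renders the binary hash as lowercase hex (an assumption about the
// real method's output).
func (h gitHash) String() string {
	return hex.EncodeToString([]byte(h))
}

// gitHashFromHex converts a 40-character hex SHA-1 into binary form.
func gitHashFromHex(s string) (gitHash, error) {
	b, err := hex.DecodeString(s)
	if err != nil {
		return "", err
	}
	if len(b) != 20 {
		return "", fmt.Errorf("got %d bytes, want 20", len(b))
	}
	return gitHash(b), nil
}

func main() {
	h, err := gitHashFromHex("0123456789abcdef0123456789abcdef01234567")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(h), h)
}
```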

type GitHub

type GitHub struct {
	// contains filtered or unexported fields
}

GitHub holds data about a GitHub repo.

func (*GitHub) ForeachRepo

func (g *GitHub) ForeachRepo(fn func(*GitHubRepo) error) error

ForeachRepo calls fn serially for each GitHubRepo, stopping if fn returns an error. The function is called with lexically increasing repo IDs.

func (*GitHub) Repo

func (g *GitHub) Repo(owner, repo string) *GitHubRepo

Repo returns the repo if it's known. Otherwise it returns nil.

type GitHubComment

type GitHubComment struct {
	ID      int64
	User    *GitHubUser
	Created time.Time
	Updated time.Time
	Body    string
}

type GitHubIssue

type GitHubIssue struct {
	ID        int64
	Number    int32
	NotExist  bool // if true, rest of fields should be ignored.
	Closed    bool
	Locked    bool
	User      *GitHubUser
	Assignees []*GitHubUser
	Created   time.Time
	Updated   time.Time
	ClosedAt  time.Time
	ClosedBy  *GitHubUser
	Title     string
	Body      string
	Milestone *GitHubMilestone       // nil for unknown, noMilestone for none
	Labels    map[int64]*GitHubLabel // label ID => label
	// contains filtered or unexported fields
}

GitHubIssue represents a github issue. This is maintner's in-memory representation. It differs slightly from the API's *github.Issue type, notably in the lack of pointers for all fields. See https://developer.github.com/v3/issues/#get-a-single-issue

func (*GitHubIssue) ForeachComment

func (gi *GitHubIssue) ForeachComment(fn func(*GitHubComment) error) error

ForeachComment calls fn for each comment on the issue.

If fn returns an error, iteration ends and ForeachComment returns with that error.

The fn function is called serially, in order of the comment's time.

func (*GitHubIssue) ForeachEvent

func (gi *GitHubIssue) ForeachEvent(fn func(*GitHubIssueEvent) error) error

ForeachEvent calls fn for each event on the issue.

If fn returns an error, iteration ends and ForeachEvent returns with that error.

The fn function is called serially, in order of the event's time.

func (*GitHubIssue) HasEvent

func (gi *GitHubIssue) HasEvent(eventType string) bool

HasEvent reports whether there's any GitHubIssueEvent in this issue's history of the given type.

func (*GitHubIssue) HasLabel

func (gi *GitHubIssue) HasLabel(label string) bool

HasLabel reports whether the issue is labeled with the given label.

func (*GitHubIssue) LastModified

func (gi *GitHubIssue) LastModified() time.Time

LastModified reports the most recent time that any known metadata was updated. In contrast to the Updated field, LastModified includes comments and events.

TODO(bradfitz): this seems to not be working, at least events aren't updating it. Investigate.

type GitHubIssueEvent

type GitHubIssueEvent struct {

	// ID is the ID of the event.
	ID int64

	// Type is one of:
	// * labeled, unlabeled
	// * milestoned, demilestoned
	// * assigned, unassigned
	// * locked, unlocked
	// * closed
	// * referenced
	// * renamed
	Type string

	// OtherJSON optionally contains a JSON object of Github's API
	// response for any fields maintner was unable to extract at
	// the time. It is empty if maintner supported all the fields
	// when the mutation was created.
	OtherJSON string

	Created time.Time
	Actor   *GitHubUser

	Label               string      // for type: "unlabeled", "labeled"
	Assignee            *GitHubUser // for type "assigned", "unassigned"
	Assigner            *GitHubUser // for type "assigned", "unassigned"
	Milestone           string      // for type: "milestoned", "demilestoned"
	From, To            string      // for type: "renamed"
	CommitID, CommitURL string      // for type: "closed", "referenced" ... ?
}

func (*GitHubIssueEvent) Proto

type GitHubIssueRef

type GitHubIssueRef struct {
	Repo   *GitHubRepo // must be non-nil
	Number int32       // GitHubIssue.Number
}

GitHubIssueRef is a reference to an issue (or pull request) number in a repo. These are parsed from text making references such as "golang/go#1234" or just "#1234" (with an implicit Repo).

func (GitHubIssueRef) String

func (r GitHubIssueRef) String() string
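
Parsing references such as "golang/go#1234" or a bare "#1234" can be sketched with a regular expression. The issueRef type, the pattern, and parseIssueRefs below are hypothetical and simplified; maintner's own parser may accept a different set of forms. The default owner and repo supply the implicit Repo for bare references.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// issueRef is a hypothetical, simplified stand-in for GitHubIssueRef.
type issueRef struct {
	Owner, Repo string
	Number      int32
}

// refRx matches "owner/repo#123" and bare "#123". Illustrative only.
var refRx = regexp.MustCompile(`(?:([\w.-]+)/([\w.-]+))?#(\d+)`)

// parseIssueRefs returns the issue references found in text. Bare "#N"
// references are resolved against defOwner and defRepo.
func parseIssueRefs(text, defOwner, defRepo string) []issueRef {
	var refs []issueRef
	for _, m := range refRx.FindAllStringSubmatch(text, -1) {
		n, err := strconv.ParseInt(m[3], 10, 32)
		if err != nil {
			continue // number does not fit in int32
		}
		ref := issueRef{Owner: defOwner, Repo: defRepo, Number: int32(n)}
		if m[1] != "" {
			ref.Owner, ref.Repo = m[1], m[2]
		}
		refs = append(refs, ref)
	}
	return refs
}

func main() {
	msg := "Fixes #1234\nUpdates golang/net#56"
	for _, r := range parseIssueRefs(msg, "golang", "go") {
		fmt.Printf("%s/%s#%d\n", r.Owner, r.Repo, r.Number)
	}
}
```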

type GitHubLabel

type GitHubLabel struct {
	ID   int64
	Name string
}

func (*GitHubLabel) GenMutationDiff

func (a *GitHubLabel) GenMutationDiff(b *github.Label) *maintpb.GithubLabel

GenMutationDiff generates a diff from in-memory state 'a' (which may be nil) to the current (non-nil) state b from GitHub. It returns nil if there's no difference.

type GitHubMilestone

type GitHubMilestone struct {
	ID     int64
	Title  string
	Number int32
	Closed bool
}

func (*GitHubMilestone) GenMutationDiff

func (a *GitHubMilestone) GenMutationDiff(b *github.Milestone) *maintpb.GithubMilestone

GenMutationDiff generates a diff from in-memory state 'a' (which may be nil) to the current (non-nil) state b from GitHub. It returns nil if there's no difference.

func (*GitHubMilestone) IsNone

func (ms *GitHubMilestone) IsNone() bool

IsNone reports whether ms represents the sentinel "no milestone" milestone.

func (*GitHubMilestone) IsUnknown

func (ms *GitHubMilestone) IsUnknown() bool

IsUnknown reports whether ms is nil, which represents the unknown state. Milestones should never be in this state, though.

type GitHubRepo

type GitHubRepo struct {
	// contains filtered or unexported fields
}

func (*GitHubRepo) ForeachIssue

func (gr *GitHubRepo) ForeachIssue(fn func(*GitHubIssue) error) error

ForeachIssue calls fn for each issue in the repo.

If fn returns an error, iteration ends and ForeachIssue returns with that error.

The fn function is called serially, with increasingly numbered issues.

func (*GitHubRepo) ForeachLabel

func (gr *GitHubRepo) ForeachLabel(fn func(*GitHubLabel) error) error

ForeachLabel calls fn for each label in the repo, in unsorted order.

If fn returns an error, iteration ends and ForeachLabel returns with that error.

func (*GitHubRepo) ID

func (gr *GitHubRepo) ID() GithubRepoID

func (*GitHubRepo) Issue

func (gr *GitHubRepo) Issue(n int32) *GitHubIssue

Issue returns the issue with the provided number, or nil if it's not known.

type GitHubUser

type GitHubUser struct {
	ID    int64
	Login string
}

GitHubUser represents a github user. It is a subset of https://developer.github.com/v3/users/#get-a-single-user

type GitPerson

type GitPerson struct {
	Str string // "Foo Bar <foo@bar.com>"
}

GitPerson is a person in a git commit.

func (*GitPerson) Email

func (p *GitPerson) Email() string

Email returns the GitPerson's email address only, without the name or angle brackets.

func (*GitPerson) Name

func (p *GitPerson) Name() string
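
Splitting a string of the form "Foo Bar <foo@bar.com>" into its name and email parts can be sketched as follows. The gitPerson type here is a hypothetical stand-in, and its handling of malformed input (empty email, whole string as name) is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// gitPerson is a hypothetical stand-in for GitPerson.
type gitPerson struct {
	Str string // "Foo Bar <foo@bar.com>"
}

// Email returns the text between the last '<' and the last '>', or "" if
// the string has no such pair.
func (p gitPerson) Email() string {
	lt := strings.LastIndex(p.Str, "<")
	gt := strings.LastIndex(p.Str, ">")
	if lt < 0 || gt < lt {
		return ""
	}
	return p.Str[lt+1 : gt]
}

// Name returns the text before the last '<', with surrounding space trimmed.
func (p gitPerson) Name() string {
	lt := strings.LastIndex(p.Str, "<")
	if lt < 0 {
		return strings.TrimSpace(p.Str)
	}
	return strings.TrimSpace(p.Str[:lt])
}

func main() {
	p := gitPerson{Str: "Foo Bar <foo@bar.com>"}
	fmt.Println(p.Name())
	fmt.Println(p.Email())
}
```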

type GithubRepoID

type GithubRepoID struct {
	Owner, Repo string
}

GithubRepoID is a github org & repo, lowercase.

func (GithubRepoID) String

func (id GithubRepoID) String() string

type LogSegmentJSON

type LogSegmentJSON struct {
	Number int    `json:"number"`
	Size   int64  `json:"size"`
	SHA224 string `json:"sha224"`
	URL    string `json:"url"`
}
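
The Size and SHA224 fields suggest that a client can check a downloaded log segment against its index entry. The sketch below reuses the struct as documented and assumes that SHA224 holds the lowercase hex SHA-224 digest of the segment bytes. The describeSegment, matches, and roundTrip helpers and the sample URL are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// LogSegmentJSON mirrors the documented struct.
type LogSegmentJSON struct {
	Number int    `json:"number"`
	Size   int64  `json:"size"`
	SHA224 string `json:"sha224"`
	URL    string `json:"url"`
}

// describeSegment builds an index entry for data. Assumption: SHA224 is
// the lowercase hex digest of the segment contents.
func describeSegment(number int, data []byte, url string) LogSegmentJSON {
	return LogSegmentJSON{
		Number: number,
		Size:   int64(len(data)),
		SHA224: fmt.Sprintf("%x", sha256.Sum224(data)),
		URL:    url,
	}
}

// matches reports whether data has the size and digest recorded in seg.
func matches(seg LogSegmentJSON, data []byte) bool {
	return seg.Size == int64(len(data)) &&
		seg.SHA224 == fmt.Sprintf("%x", sha256.Sum224(data))
}

// roundTrip encodes seg as JSON and decodes it again.
func roundTrip(seg LogSegmentJSON) (LogSegmentJSON, error) {
	var out LogSegmentJSON
	b, err := json.Marshal(seg)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(b, &out)
	return out, err
}

func main() {
	data := []byte("example segment contents")
	seg := describeSegment(1, data, "/logs/1") // hypothetical URL
	decoded, err := roundTrip(seg)
	fmt.Println(err, matches(decoded, data))
}
```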

type MutationLogger

type MutationLogger interface {
	Log(*maintpb.Mutation) error
}

A MutationLogger logs mutations.

type MutationSource

type MutationSource interface {
	// GetMutations returns a channel of mutations or related events.
	// The channel will never be closed.
	// All sends on the returned channel should select
	// on the provided context.
	GetMutations(context.Context) <-chan MutationStreamEvent
}

A MutationSource yields a log of mutations that will catch a corpus back up to the present.
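
A minimal in-memory source illustrates the contract in the interface comments: the channel is never closed, and every send selects on the context. The mutation and streamEvent types are hypothetical stand-ins for *maintpb.Mutation and MutationStreamEvent, and sliceSource is an illustrative name, not part of the package.

```go
package main

import (
	"context"
	"fmt"
)

// mutation and streamEvent are hypothetical stand-ins for
// *maintpb.Mutation and MutationStreamEvent.
type mutation struct{ Desc string }

type streamEvent struct {
	Mutation *mutation
	Err      error
	End      bool
}

// sliceSource serves a fixed list of mutations and then an End event.
type sliceSource struct{ muts []*mutation }

// GetMutations never closes the returned channel, and every send selects
// on ctx so the goroutine exits once the context is done.
func (s sliceSource) GetMutations(ctx context.Context) <-chan streamEvent {
	ch := make(chan streamEvent)
	go func() {
		send := func(e streamEvent) bool {
			select {
			case ch <- e:
				return true
			case <-ctx.Done():
				return false
			}
		}
		for _, m := range s.muts {
			if !send(streamEvent{Mutation: m}) {
				return
			}
		}
		send(streamEvent{End: true})
	}()
	return ch
}

// countUntilEnd builds a source with n mutations and counts the events
// received before the first End event.
func countUntilEnd(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	var src sliceSource
	for i := 0; i < n; i++ {
		src.muts = append(src.muts, &mutation{Desc: fmt.Sprintf("mutation %d", i)})
	}
	count := 0
	for e := range src.GetMutations(ctx) {
		if e.End {
			break
		}
		count++
	}
	return count
}

func main() {
	fmt.Println(countUntilEnd(3))
}
```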

func NewNetworkMutationSource

func NewNetworkMutationSource(server, cacheDir string) MutationSource

NewNetworkMutationSource returns a mutation source from a master server. The server argument should be a URL to the JSON logs index.

type MutationStreamEvent

type MutationStreamEvent struct {
	Mutation *maintpb.Mutation

	// Err is a fatal error reading the log. No other events will
	// follow an Err.
	Err error

	// End, if true, means that all mutations have been sent and
	// the next event might take some time to arrive (it might not
	// have occurred yet). The End event is not a terminal state
	// like Err. There may be multiple Ends.
	End bool
}

MutationStreamEvent represents one of three possible events while reading mutations from disk. An event is either a mutation, an error, or reaching the current end of the log. Only one of the fields will be non-zero.
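
On the consuming side, the three event kinds map naturally onto a switch. The sketch below uses hypothetical stand-in types and stops at the first End event, roughly the way an initial load might; a long-running consumer would instead keep reading after End. The consume and consumeAll helpers and the errBroken value are illustrative only.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// mutation and streamEvent are hypothetical stand-ins for
// *maintpb.Mutation and MutationStreamEvent.
type mutation struct{ Desc string }

type streamEvent struct {
	Mutation *mutation
	Err      error
	End      bool
}

// errBroken is a made-up fatal error used by the demo.
var errBroken = errors.New("log read failed")

// consume counts mutations until an End event, a fatal Err event, or
// context expiration, whichever comes first.
func consume(ctx context.Context, ch <-chan streamEvent) (int, error) {
	applied := 0
	for {
		select {
		case <-ctx.Done():
			return applied, ctx.Err()
		case e := <-ch:
			switch {
			case e.Err != nil:
				return applied, e.Err // terminal: no events follow an Err
			case e.End:
				return applied, nil // caught up for now
			default:
				applied++
			}
		}
	}
}

// consumeAll feeds the given events through a buffered channel. The timeout
// guards against event lists that contain no End or Err.
func consumeAll(events ...streamEvent) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	ch := make(chan streamEvent, len(events))
	for _, e := range events {
		ch <- e
	}
	return consume(ctx, ch)
}

func main() {
	m := streamEvent{Mutation: &mutation{Desc: "example"}}
	fmt.Println(consumeAll(m, m, streamEvent{End: true}))
	fmt.Println(consumeAll(m, streamEvent{Err: errBroken}, m))
}
```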

Directories

Package godata loads the Go project's corpus of Git, Github, and Gerrit activity into memory to allow easy analysis without worrying about APIs and their pagination, quotas, and other nuisances and limitations.
The gostats command computes stats about the Go project.
The maintnerd command serves project maintainer data from Git, Github, and/or Gerrit.
Package apipb is a generated protocol buffer package.
Package maintpb is a generated protocol buffer package.
The maintq command queries a maintnerd gRPC server.
Package reclog contains readers and writers for a record wrapper format used by maintner.
