Documentation ¶
Overview ¶
Package hercules contains the functions which are needed to gather various statistics from a Git repository.
The analysis is expressed in the form of a tree: there are nodes, called "pipeline items", which require some other nodes to be executed before themselves and in turn provide data for dependent nodes. Several service items do not produce any useful statistics themselves but supply the requirements of other items. The top-level items include:
- BurndownAnalysis - line burndown statistics for project, files and developers.
- CouplesAnalysis - coupling statistics for files and developers.
- ShotnessAnalysis - structural hotness and couples, by any Babelfish UAST XPath (functions by default).
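The "requires" and "provides" relationship described above can be pictured with a toy model. The sketch below is not how Hercules resolves its items; the `item` type, the `resolve` function and the dependency names are hypothetical and only illustrate how an execution order can be derived from such declarations.

```go
package main

import "fmt"

// item is a hypothetical stand-in for a pipeline item: it names the
// data it provides and the data it requires from other items.
type item struct {
	name     string
	provides []string
	requires []string
}

// allAvailable reports whether every key is already provided.
func allAvailable(keys []string, available map[string]bool) bool {
	for _, k := range keys {
		if !available[k] {
			return false
		}
	}
	return true
}

// resolve returns the item names in an order where every requirement
// is provided by an earlier item, or an error if that is impossible.
func resolve(items []item) ([]string, error) {
	available := map[string]bool{}
	done := make([]bool, len(items))
	var order []string
	for len(order) < len(items) {
		progressed := false
		for i, it := range items {
			if done[i] || !allAvailable(it.requires, available) {
				continue
			}
			for _, p := range it.provides {
				available[p] = true
			}
			done[i] = true
			order = append(order, it.name)
			progressed = true
		}
		if !progressed {
			return nil, fmt.Errorf("unsatisfied or cyclic dependencies")
		}
	}
	return order, nil
}

func main() {
	items := []item{
		{name: "Burndown", requires: []string{"file_diff", "day"}},
		{name: "FileDiff", provides: []string{"file_diff"}, requires: []string{"changes"}},
		{name: "TreeDiff", provides: []string{"changes"}},
		{name: "DaysSinceStart", provides: []string{"day"}},
	}
	order, err := resolve(items)
	fmt.Println(order, err)
}
```

In this model the leaf item (Burndown) always ends up after the service items it depends on, which mirrors the role of the intermediate items mentioned above.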
The typical API usage is to initialize the Pipeline class:
import "gopkg.in/src-d/go-git.v4"

var repository *git.Repository
// ...initialize repository...
pipeline := hercules.NewPipeline(repository)
Then add the required analysis:
ba := pipeline.DeployItem(&hercules.BurndownAnalysis{}).(hercules.LeafPipelineItem)
This call will add all the needed intermediate pipeline items. Then link and execute the analysis tree:
pipeline.Initialize(nil)
result, err := pipeline.Run(pipeline.Commits(false))
Finally extract the result:
burndown := result[ba].(hercules.BurndownResult)
For a real usage example, see cmd/hercules/root.go, the command line tool's code.
Hercules depends heavily on https://github.com/src-d/go-git and leverages the diff algorithm through https://github.com/sergi/go-diff.
In addition, BurndownAnalysis relies on File and RBTree, low-level data structures which enable incremental blaming. File carries an instance of RBTree and the current line burndown state. RBTree implements a red-black balanced binary tree and is based on https://github.com/yasushi-saito/rbtree.
Coupling statistics are meant to be processed further rather than inspected directly. labours.py uses Swivel embeddings and visualises them in TensorFlow Projector.
Shotness analysis as well as other UAST-featured items relies on [Babelfish](https://doc.bblf.sh) and requires the server to be running.
Index ¶
- Constants
- Variables
- func LoadCommitsFromFile(path string, repository *git.Repository) ([]*object.Commit, error)
- func SafeYamlString(str string) string
- type CachedBlob
- type CommonAnalysisResult
- type ConfigurationOption
- type ConfigurationOptionType
- type FeaturedPipelineItem
- type FileDiffData
- type LeafPipelineItem
- type NoopMerger
- type OneShotMergeProcessor
- type Pipeline
- type PipelineItem
- type PipelineItemRegistry
- type ResultMergeablePipelineItem
Constants ¶
const (
	// BoolConfigurationOption reflects the boolean value type.
	BoolConfigurationOption = core.BoolConfigurationOption
	// IntConfigurationOption reflects the integer value type.
	IntConfigurationOption = core.IntConfigurationOption
	// StringConfigurationOption reflects the string value type.
	StringConfigurationOption = core.StringConfigurationOption
	// FloatConfigurationOption reflects a floating point value type.
	FloatConfigurationOption = core.FloatConfigurationOption
	// StringsConfigurationOption reflects the array of strings value type.
	StringsConfigurationOption = core.StringsConfigurationOption
)
const (
	// ConfigPipelineDumpPath is the name of the Pipeline configuration option (Pipeline.Initialize())
	// which enables saving the items DAG to the specified file.
	ConfigPipelineDumpPath = core.ConfigPipelineDumpPath
	// ConfigPipelineDryRun is the name of the Pipeline configuration option (Pipeline.Initialize())
	// which disables Configure() and Initialize() invocation on each PipelineItem during the
	// Pipeline initialization.
	// Subsequent Run() calls are going to fail. Useful with ConfigPipelineDumpPath=true.
	ConfigPipelineDryRun = core.ConfigPipelineDryRun
	// ConfigPipelineCommits is the name of the Pipeline configuration option (Pipeline.Initialize())
	// which allows to specify the custom commit sequence. By default, Pipeline.Commits() is used.
	ConfigPipelineCommits = core.ConfigPipelineCommits
)
const (
	// DependencyCommit is the name of one of the three items in `deps` supplied to PipelineItem.Consume()
	// which always exists. It corresponds to the currently analyzed commit.
	DependencyCommit = core.DependencyCommit
	// DependencyIndex is the name of one of the three items in `deps` supplied to PipelineItem.Consume()
	// which always exists. It corresponds to the currently analyzed commit's index.
	DependencyIndex = core.DependencyIndex
	// DependencyIsMerge is the name of one of the three items in `deps` supplied to PipelineItem.Consume()
	// which always exists. It indicates whether the analyzed commit is a merge commit.
	// Checking the number of parents is not correct - we remove the back edges during the DAG simplification.
	DependencyIsMerge = core.DependencyIsMerge
	// DependencyAuthor is the name of the dependency provided by identity.Detector.
	DependencyAuthor = identity.DependencyAuthor
	// DependencyBlobCache identifies the dependency provided by BlobCache.
	DependencyBlobCache = plumbing.DependencyBlobCache
	// DependencyDay is the name of the dependency which DaysSinceStart provides - the number
	// of days since the first commit in the analysed sequence.
	DependencyDay = plumbing.DependencyDay
	// DependencyFileDiff is the name of the dependency provided by FileDiff.
	DependencyFileDiff = plumbing.DependencyFileDiff
	// DependencyTreeChanges is the name of the dependency provided by TreeDiff.
	DependencyTreeChanges = plumbing.DependencyTreeChanges
	// DependencyUastChanges is the name of the dependency provided by Changes.
	DependencyUastChanges = uast.DependencyUastChanges
	// DependencyUasts is the name of the dependency provided by Extractor.
	DependencyUasts = uast.DependencyUasts
	// FactCommitsByDay contains the mapping between day indices and the corresponding commits.
	FactCommitsByDay = plumbing.FactCommitsByDay
	// FactIdentityDetectorPeopleCount is the name of the fact which is inserted in
	// identity.Detector.Configure(). It is equal to the overall number of unique authors
	// (the length of ReversedPeopleDict).
	FactIdentityDetectorPeopleCount = identity.FactIdentityDetectorPeopleCount
	// FactIdentityDetectorPeopleDict is the name of the fact which is inserted in
	// identity.Detector.Configure(). It corresponds to identity.Detector.PeopleDict - the mapping
	// from the signatures to the author indices.
	FactIdentityDetectorPeopleDict = identity.FactIdentityDetectorPeopleDict
	// FactIdentityDetectorReversedPeopleDict is the name of the fact which is inserted in
	// identity.Detector.Configure(). It corresponds to identity.Detector.ReversedPeopleDict -
	// the mapping from the author indices to the main signature.
	FactIdentityDetectorReversedPeopleDict = identity.FactIdentityDetectorReversedPeopleDict
)
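The Dependency* names above are keys into the `deps` map handed to PipelineItem.Consume(). As a rough illustration of how such a map might be read, the sketch below uses a plain `map[string]interface{}` and type assertions. The key strings, the `describe` function and the value types (an int index and a bool merge flag) are assumptions made for this example; real code should use the exported constants instead of string literals.

```go
package main

import "fmt"

// Hypothetical key names; the real values are defined in the core package.
const (
	depIndex   = "index"
	depIsMerge = "is_merge"
)

// describe mimics what a Consume() implementation might do first:
// pull the always-present entries out of deps with type assertions.
func describe(deps map[string]interface{}) (string, error) {
	index, ok := deps[depIndex].(int)
	if !ok {
		return "", fmt.Errorf("%q is missing or not an int", depIndex)
	}
	isMerge, ok := deps[depIsMerge].(bool)
	if !ok {
		return "", fmt.Errorf("%q is missing or not a bool", depIsMerge)
	}
	return fmt.Sprintf("commit #%d merge=%t", index, isMerge), nil
}

func main() {
	deps := map[string]interface{}{depIndex: 3, depIsMerge: false}
	summary, err := describe(deps)
	fmt.Println(summary, err)
}
```

The two-value form of the type assertion avoids a panic when an expected key is absent or carries an unexpected type.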
Variables ¶
var BinaryGitHash = "<unknown>"
BinaryGitHash is the Git hash of the Hercules binary file which is executing.
var BinaryVersion = 0
BinaryVersion is Hercules' API version. It matches the package name.
var Registry = core.Registry
Registry contains all known pipeline item types.
Functions ¶
func LoadCommitsFromFile ¶
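func LoadCommitsFromFile(path string, repository *git.Repository) ([]*object.Commit, error)
LoadCommitsFromFile reads the file at the specified filesystem path and generates the sequence of commits by interpreting each line as a Git commit hash.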
func SafeYamlString ¶
func SafeYamlString(str string) string
SafeYamlString escapes the string so that it can be reliably used in YAML.
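One common way to make an arbitrary string safe for YAML is to emit it as a double-quoted scalar with the special characters escaped. The sketch below shows that general idea only; `quoteYAML` is a hypothetical helper, it covers just backslashes, double quotes, newlines and tabs, and it may well differ from what SafeYamlString actually does.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteYAML is a hypothetical helper, not the package's SafeYamlString.
// It wraps s in double quotes and escapes a few special characters.
func quoteYAML(s string) string {
	replacer := strings.NewReplacer(
		`\`, `\\`,
		`"`, `\"`,
		"\n", `\n`,
		"\t", `\t`,
	)
	return `"` + replacer.Replace(s) + `"`
}

func main() {
	fmt.Println(quoteYAML(`name: "x"`))
}
```

strings.NewReplacer scans the input once, so a backslash that is part of the input is escaped exactly once and never re-escaped.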
Types ¶
type CachedBlob ¶
type CachedBlob = plumbing.CachedBlob
CachedBlob makes it possible to explicitly cache the binary data associated with the Blob object. Such structs are returned by DependencyBlobCache.
type CommonAnalysisResult ¶
type CommonAnalysisResult = core.CommonAnalysisResult
CommonAnalysisResult holds the information which is always extracted at Pipeline.Run().
func MetadataToCommonAnalysisResult ¶
func MetadataToCommonAnalysisResult(meta *core.Metadata) *CommonAnalysisResult
MetadataToCommonAnalysisResult copies the data from a Protobuf message.
type ConfigurationOption ¶
type ConfigurationOption = core.ConfigurationOption
ConfigurationOption allows for the unified, retrospective way to set up PipelineItem-s.
type ConfigurationOptionType ¶
type ConfigurationOptionType = core.ConfigurationOptionType
ConfigurationOptionType represents the possible types of a ConfigurationOption's value.
type FeaturedPipelineItem ¶
type FeaturedPipelineItem = core.FeaturedPipelineItem
FeaturedPipelineItem enables switching the automatic insertion of pipeline items on or off.
type FileDiffData ¶
type FileDiffData = plumbing.FileDiffData
FileDiffData is the type of the dependency provided by plumbing.FileDiff.
type LeafPipelineItem ¶
type LeafPipelineItem = core.LeafPipelineItem
LeafPipelineItem corresponds to the top level pipeline items which produce the end results.
type NoopMerger ¶
type NoopMerger = core.NoopMerger
NoopMerger provides an empty Merge() method suitable for PipelineItem.
type OneShotMergeProcessor ¶
type OneShotMergeProcessor = core.OneShotMergeProcessor
OneShotMergeProcessor provides the convenience method to consume merges only once.
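The "consume merges only once" idea can be sketched as a small filter that remembers which merge commits it has already let through. This is an illustration under assumptions, not the package's code: `onceFilter` and `shouldConsume` are hypothetical names, commits are identified by plain strings, and the sketch assumes a merge commit may be offered to an item more than once.

```go
package main

import "fmt"

// onceFilter is a hypothetical stand-in for the idea behind
// OneShotMergeProcessor: remember which merge commits were seen.
type onceFilter struct {
	seen map[string]bool
}

// shouldConsume reports whether the commit should be processed.
// Regular commits always pass; a merge passes only the first time.
func (f *onceFilter) shouldConsume(hash string, isMerge bool) bool {
	if !isMerge {
		return true
	}
	if f.seen == nil {
		f.seen = map[string]bool{}
	}
	if f.seen[hash] {
		return false
	}
	f.seen[hash] = true
	return true
}

func main() {
	f := &onceFilter{}
	fmt.Println(f.shouldConsume("c1", false)) // true
	fmt.Println(f.shouldConsume("m1", true))  // true
	fmt.Println(f.shouldConsume("m1", true))  // false
}
```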
type Pipeline ¶
Pipeline is the core Hercules entity which carries several PipelineItems and executes them. See the extended example of how a Pipeline works in doc.go.
func NewPipeline ¶
func NewPipeline(repository *git.Repository) *Pipeline
NewPipeline initializes a new instance of Pipeline struct.
type PipelineItem ¶
type PipelineItem = core.PipelineItem
PipelineItem is the interface for all the units in the Git commits analysis pipeline.
func ForkCopyPipelineItem ¶
func ForkCopyPipelineItem(origin PipelineItem, n int) []PipelineItem
ForkCopyPipelineItem clones items by copying them by value from the origin.
func ForkSamePipelineItem ¶
func ForkSamePipelineItem(origin PipelineItem, n int) []PipelineItem
ForkSamePipelineItem clones items by referencing the same origin.
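The difference between the two forking strategies comes down to whether the clones share state with the origin. The following sketch shows this with a hypothetical `counter` type and hypothetical `forkCopy` and `forkSame` helpers; it illustrates copy-by-value versus shared references in Go and is not the package's implementation.

```go
package main

import "fmt"

// counter is a hypothetical item with mutable state.
type counter struct{ n int }

// forkCopy returns n independent clones made by copying origin by value.
func forkCopy(origin *counter, n int) []*counter {
	clones := make([]*counter, n)
	for i := range clones {
		c := *origin // value copy: later changes do not affect origin
		clones[i] = &c
	}
	return clones
}

// forkSame returns n references to the very same origin.
func forkSame(origin *counter, n int) []*counter {
	clones := make([]*counter, n)
	for i := range clones {
		clones[i] = origin
	}
	return clones
}

func main() {
	a := &counter{}
	copies := forkCopy(a, 2)
	copies[0].n++
	fmt.Println(a.n, copies[0].n, copies[1].n) // 0 1 0

	b := &counter{}
	shared := forkSame(b, 2)
	shared[0].n++
	fmt.Println(b.n, shared[0].n, shared[1].n) // 1 1 1
}
```

Copying suits items whose state must diverge between the forks, while sharing suits items that are stateless or that are meant to accumulate a single combined state.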
type PipelineItemRegistry ¶
type PipelineItemRegistry = core.PipelineItemRegistry
PipelineItemRegistry contains all the known PipelineItem-s.
type ResultMergeablePipelineItem ¶
type ResultMergeablePipelineItem = core.ResultMergeablePipelineItem
ResultMergeablePipelineItem specifies the methods to combine several analysis results together.