Documentation ¶
Index ¶
- Constants
- Variables
- func Chunker(originalChunk *Chunk) chan *Chunk
- func DecodeResumeInfo(resumeInfo string) []string
- func EncodeResumeInfo(resumeInfoSlice []string) string
- func FilterReposToResume(repos []string, resumeInfo string) (reposToScan []string, progressOffsetCount int)
- func HandleTestChannel(chunksCh chan *Chunk, cf ChunkFunc) error
- func RemoveRepoFromResumeInfo(resumeRepos []string, repoURL string) []string
- func WithAPI(api apiClient) func(*SourceManager)
- func WithConcurrency(concurrency int) func(*SourceManager)
- type Chunk
- type ChunkFunc
- type ChunkReporter
- type CommonSourceUnit
- type CommonSourceUnitUnmarshaller
- type FilesystemConfig
- type GCSConfig
- type GitConfig
- type GithubConfig
- type GitlabConfig
- type Progress
- type S3Config
- type ScanErrors
- type Source
- type SourceInitFunc
- type SourceManager
- type SourceUnit
- type SourceUnitChunker
- type SourceUnitEnumerator
- type SourceUnitUnmarshaller
- type SyslogConfig
- type UnitReporter
- type Validator
Constants ¶
const (
	// ChunkSize is the maximum size of a chunk.
	ChunkSize = 10 * 1024
	// PeekSize is the size of the peek into the previous chunk.
	PeekSize = 3 * 1024
	// TotalChunkSize is the total size of a chunk with peek data.
	TotalChunkSize = ChunkSize + PeekSize
)
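To illustrate how the three sizes relate, here is a minimal sketch of cutting a buffer into pieces that advance ChunkSize bytes at a time while carrying up to PeekSize bytes of overlap. The helper `splitWithPeek` is hypothetical and not part of this package, and it assumes the overlap is taken from the data that follows each chunk; the package's own Chunker may behave differently.

```go
package main

import "fmt"

const (
	ChunkSize      = 10 * 1024
	PeekSize       = 3 * 1024
	TotalChunkSize = ChunkSize + PeekSize
)

// splitWithPeek is a hypothetical helper (not part of the package). Each
// returned piece starts ChunkSize bytes after the previous one and holds at
// most TotalChunkSize bytes, so neighbouring pieces overlap by up to PeekSize.
func splitWithPeek(data []byte) [][]byte {
	var out [][]byte
	for start := 0; start < len(data); start += ChunkSize {
		end := start + TotalChunkSize
		if end > len(data) {
			end = len(data)
		}
		out = append(out, data[start:end])
	}
	return out
}

func main() {
	data := make([]byte, 25*1024)
	for i, c := range splitWithPeek(data) {
		fmt.Printf("chunk %d: %d bytes\n", i, len(c))
	}
}
```

The overlap means a secret that straddles a chunk boundary still appears whole in at least one piece, at the cost of scanning some bytes twice.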
Variables ¶
var MatchError = errors.New("chunk doesn't match")
Functions ¶
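func DecodeResumeInfo ¶ added in v3.6.6
func DecodeResumeInfo(resumeInfo string) []string
func EncodeResumeInfo ¶ added in v3.6.6
func EncodeResumeInfo(resumeInfoSlice []string) string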
func FilterReposToResume ¶ added in v3.6.6
func FilterReposToResume(repos []string, resumeInfo string) (reposToScan []string, progressOffsetCount int)
FilterReposToResume filters the existing repos down to those that are included in the encoded resume info. It returns the new slice of repos to be scanned. It also returns the difference between the original length of the repos and the new length to use for progress reporting. It is required that both the resumeInfo repos and the existing repos are sorted.
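The filtering described above can be sketched roughly as follows. The function `filterToResume` is a hypothetical stand-in, and the tab-separated encoding of the resume info is an assumption made only for this illustration; the sketch also ignores the sorted-input requirement and any extra handling the real function may perform.

```go
package main

import (
	"fmt"
	"strings"
)

// filterToResume is a hypothetical stand-in for FilterReposToResume. It
// assumes resumeInfo is a tab-separated list of repo URLs (an assumption for
// this sketch) and keeps only the repos named in it.
func filterToResume(repos []string, resumeInfo string) (reposToScan []string, progressOffsetCount int) {
	if resumeInfo == "" {
		// Nothing to resume from, so every repo still needs scanning.
		return repos, 0
	}
	wanted := make(map[string]struct{})
	for _, r := range strings.Split(resumeInfo, "\t") {
		wanted[r] = struct{}{}
	}
	for _, r := range repos {
		if _, ok := wanted[r]; ok {
			reposToScan = append(reposToScan, r)
		}
	}
	// The offset is how many repos were dropped, for progress reporting.
	return reposToScan, len(repos) - len(reposToScan)
}

func main() {
	repos := []string{"a", "b", "c", "d"}
	toScan, offset := filterToResume(repos, "c\td")
	fmt.Println(toScan, offset)
}
```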
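func HandleTestChannel ¶ added in v3.8.0
func HandleTestChannel(chunksCh chan *Chunk, cf ChunkFunc) error
func RemoveRepoFromResumeInfo ¶ added in v3.6.6
func RemoveRepoFromResumeInfo(resumeRepos []string, repoURL string) []string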
RemoveRepoFromResumeInfo removes the repoURL from the resume info.
func WithAPI ¶ added in v3.45.0
func WithAPI(api apiClient) func(*SourceManager)
WithAPI adds an API client to the manager for tracking jobs and progress.
func WithConcurrency ¶ added in v3.45.0
func WithConcurrency(concurrency int) func(*SourceManager)
WithConcurrency limits the concurrent number of sources a manager can run.
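WithAPI and WithConcurrency both return a func(*SourceManager), which is the functional options pattern. The sketch below shows that pattern with local, hypothetical types (`manager`, `withConcurrency`, `newManager`); the default of 1 is an assumption for illustration and says nothing about the real manager's defaults.

```go
package main

import "fmt"

// manager is a hypothetical stand-in for SourceManager.
type manager struct {
	concurrency int
}

// withConcurrency returns an option that sets the concurrency limit.
func withConcurrency(n int) func(*manager) {
	return func(m *manager) { m.concurrency = n }
}

// newManager applies each option in order on top of assumed defaults.
func newManager(opts ...func(*manager)) *manager {
	m := &manager{concurrency: 1} // assumed default, for illustration only
	for _, opt := range opts {
		opt(m)
	}
	return m
}

func main() {
	fmt.Println(newManager().concurrency)
	fmt.Println(newManager(withConcurrency(8)).concurrency)
}
```

Options keep the constructor signature stable as new settings are added, since callers only pass the ones they care about.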
Types ¶
type Chunk ¶
type Chunk struct {
	// SourceName is the name of the Source that produced the chunk.
	SourceName string
	// SourceID is the ID of the source that the Chunk originated from.
	SourceID int64
	// SourceType is the type of Source that produced the chunk.
	SourceType sourcespb.SourceType
	// SourceMetadata holds the context of where the Chunk was found.
	SourceMetadata *source_metadatapb.MetaData
	// Data is the data to decode and scan.
	Data []byte
	// Verify specifies whether any secrets in the Chunk should be verified.
	Verify bool
}
Chunk contains data to be decoded and scanned along with context on where it came from.
type ChunkReporter ¶ added in v3.45.0
type ChunkReporter interface {
	ChunkOk(ctx context.Context, chunk Chunk) error
	ChunkErr(ctx context.Context, err error) error
}
ChunkReporter defines the interface a source will use to report whether a chunk was found during unit chunking. Either method may be called any number of times. Implementors of this interface should allow for concurrent calls.
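Because either method may be called concurrently, an implementation needs some form of synchronization. The following is a minimal sketch using a mutex; `countingReporter` and `reportConcurrently` are hypothetical names, and `Chunk` here is a trimmed-down local stand-in rather than the package type.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Chunk is a trimmed-down local stand-in for the package's Chunk type.
type Chunk struct {
	Data []byte
}

// ChunkReporter mirrors the interface documented above.
type ChunkReporter interface {
	ChunkOk(ctx context.Context, chunk Chunk) error
	ChunkErr(ctx context.Context, err error) error
}

// countingReporter is a hypothetical implementation that guards its state
// with a mutex so both methods are safe to call from many goroutines.
type countingReporter struct {
	mu     sync.Mutex
	chunks int
	errs   []error
}

var _ ChunkReporter = (*countingReporter)(nil)

func (r *countingReporter) ChunkOk(_ context.Context, _ Chunk) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.chunks++
	return nil
}

func (r *countingReporter) ChunkErr(_ context.Context, err error) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.errs = append(r.errs, err)
	return nil
}

// reportConcurrently calls ChunkOk from n goroutines and waits for all of them.
func reportConcurrently(r ChunkReporter, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = r.ChunkOk(context.Background(), Chunk{Data: []byte("data")})
		}()
	}
	wg.Wait()
}

func main() {
	r := &countingReporter{}
	reportConcurrently(r, 50)
	fmt.Println("chunks reported:", r.chunks)
}
```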
type CommonSourceUnit ¶ added in v3.41.0
type CommonSourceUnit struct {
ID string `json:"source_unit_id"`
}
CommonSourceUnit is a common implementation of SourceUnit that Sources can use instead of implementing their own types.
func (CommonSourceUnit) SourceUnitID ¶ added in v3.41.0
func (c CommonSourceUnit) SourceUnitID() string
SourceUnitID implements the SourceUnit interface.
type CommonSourceUnitUnmarshaller ¶ added in v3.41.1
type CommonSourceUnitUnmarshaller struct{}
CommonSourceUnitUnmarshaller is an implementation of SourceUnitUnmarshaller for the CommonSourceUnit. A source can embed this struct to gain the functionality of converting []byte to a CommonSourceUnit.
func (CommonSourceUnitUnmarshaller) UnmarshalSourceUnit ¶ added in v3.41.1
func (c CommonSourceUnitUnmarshaller) UnmarshalSourceUnit(data []byte) (SourceUnit, error)
UnmarshalSourceUnit implements the SourceUnitUnmarshaller interface.
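The `json:"source_unit_id"` tag on CommonSourceUnit suggests a JSON encoding. Under that assumption, converting []byte into a unit might look like the sketch below; `unmarshalUnit` is a hypothetical helper, and rejecting an empty ID is a design choice of this sketch, not documented behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// SourceUnit mirrors the interface documented in this package.
type SourceUnit interface {
	SourceUnitID() string
}

// CommonSourceUnit is a local copy of the documented struct.
type CommonSourceUnit struct {
	ID string `json:"source_unit_id"`
}

func (c CommonSourceUnit) SourceUnitID() string { return c.ID }

// unmarshalUnit is a hypothetical helper that assumes the data is JSON.
// Rejecting an empty ID is a choice made for this sketch.
func unmarshalUnit(data []byte) (SourceUnit, error) {
	var unit CommonSourceUnit
	if err := json.Unmarshal(data, &unit); err != nil {
		return nil, err
	}
	if unit.ID == "" {
		return nil, errors.New("missing source_unit_id")
	}
	return unit, nil
}

func main() {
	unit, err := unmarshalUnit([]byte(`{"source_unit_id":"repo-1"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(unit.SourceUnitID())
}
```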
type FilesystemConfig ¶ added in v3.27.0
type FilesystemConfig struct {
	// Paths is the list of files and directories to scan.
	Paths []string
	// Filter is the filter to use to scan the source.
	Filter *common.Filter
}
FilesystemConfig defines the optional configuration for a filesystem source.
type GCSConfig ¶ added in v3.29.0
type GCSConfig struct {
	// CloudCred determines whether to use cloud credentials.
	// This can NOT be used with a secret.
	CloudCred, WithoutAuth bool
	// ApiKey is the API key to use to authenticate with the source.
	ApiKey, ProjectID, ServiceAccount string
	// MaxObjectSize is the maximum object size to scan.
	MaxObjectSize int64
	// Concurrency is the number of concurrent workers to use to scan the source.
	Concurrency int
	// IncludeBuckets is a list of buckets to include in the scan.
	IncludeBuckets, ExcludeBuckets, IncludeObjects, ExcludeObjects []string
}
GCSConfig defines the optional configuration for a GCS source.
type GitConfig ¶ added in v3.27.0
type GitConfig struct {
	// RepoPath is the path to the repository to scan.
	RepoPath, HeadRef, BaseRef string
	// MaxDepth is the maximum depth to scan the source.
	MaxDepth int
	// Filter is the filter to use to scan the source.
	Filter *common.Filter
	// ExcludeGlobs is a list of globs to exclude from the scan.
	// This differs from the Filter exclusions as ExcludeGlobs is applied at the `git log -p` level
	ExcludeGlobs []string
}
GitConfig defines the optional configuration for a git source.
type GithubConfig ¶ added in v3.27.0
type GithubConfig struct {
	// Endpoint is the endpoint of the source.
	Endpoint, Token string
	// IncludeForks indicates whether to include forks in the scan.
	IncludeForks, IncludeMembers bool
	// Concurrency is the number of concurrent workers to use to scan the source.
	Concurrency int
	// Repos is the list of repositories to scan.
	Repos, Orgs, ExcludeRepos, IncludeRepos []string
	// Filter is the filter to use to scan the source.
	Filter *common.Filter
}
GithubConfig defines the optional configuration for a github source.
type GitlabConfig ¶ added in v3.27.0
type GitlabConfig struct {
	// Endpoint is the endpoint of the source.
	Endpoint, Token string
	// Repos is the list of repositories to scan.
	Repos []string
	// Filter is the filter to use to scan the source.
	Filter *common.Filter
}
GitlabConfig defines the optional configuration for a gitlab source.
type Progress ¶
type Progress struct {
	PercentComplete   int64
	Message           string
	EncodedResumeInfo string
	SectionsCompleted int32
	SectionsRemaining int32
	// contains filtered or unexported fields
}
Progress is used to update job completion progress across sources.
func (*Progress) GetProgress ¶
GetProgress gets job completion percentage for metrics reporting.
func (*Progress) SetProgressComplete ¶
SetProgressComplete sets job progress information for a running job based on the highest level objects in the source.
i is the current iteration in the loop of target scope.
scope should be the len(scopedItems).
message is the public facing user information about the current progress.
encodedResumeInfo is an optional string representing any information necessary to resume the job if interrupted.
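As a rough illustration of how these parameters might be used inside a scan loop, here is a sketch with a local `progress` type. The method `setProgressComplete`, the percentage formula, and the way the section counters are derived are all assumptions of this sketch rather than the package's documented behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// progress is a hypothetical local stand-in for Progress.
type progress struct {
	mu                sync.Mutex
	PercentComplete   int64
	Message           string
	EncodedResumeInfo string
	SectionsCompleted int32
	SectionsRemaining int32
}

// setProgressComplete records progress for iteration i out of scope items.
// The formulas below are assumptions made for this sketch.
func (p *progress) setProgressComplete(i, scope int, message, encodedResumeInfo string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.Message = message
	p.EncodedResumeInfo = encodedResumeInfo
	p.SectionsCompleted = int32(i)
	p.SectionsRemaining = int32(scope - i)
	if scope > 0 { // avoid dividing by zero for an empty scope
		p.PercentComplete = int64(float64(i) / float64(scope) * 100)
	}
}

func main() {
	repos := []string{"a", "b", "c", "d"}
	p := &progress{}
	for i, repo := range repos {
		p.setProgressComplete(i, len(repos), "scanning "+repo, "")
		fmt.Printf("%d%% %s\n", p.PercentComplete, p.Message)
	}
}
```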
type S3Config ¶ added in v3.27.0
type S3Config struct {
	// CloudCred determines whether to use cloud credentials.
	// This can NOT be used with a secret.
	CloudCred bool
	// Key is any key to use to authenticate with the source.
	Key, Secret, SessionToken string
	// Buckets is the list of buckets to scan.
	Buckets []string
	// MaxObjectSize is the maximum object size to scan.
	MaxObjectSize int64
}
S3Config defines the optional configuration for an S3 source.
type ScanErrors ¶ added in v3.27.0
type ScanErrors struct {
// contains filtered or unexported fields
}
ScanErrors is used to collect errors encountered while scanning. It ensures that errors are collected in a thread-safe manner.
func NewScanErrors ¶ added in v3.27.0
func NewScanErrors() *ScanErrors
NewScanErrors creates a new thread safe error collector.
func (*ScanErrors) Add ¶ added in v3.27.0
func (s *ScanErrors) Add(err error)
Add an error to the collection in a thread-safe manner.
func (*ScanErrors) Count ¶ added in v3.27.0
func (s *ScanErrors) Count() uint64
Count returns the number of errors collected.
func (*ScanErrors) String ¶ added in v3.28.3
func (s *ScanErrors) String() string
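A thread-safe error collector of this shape can be sketched with a read/write mutex around a slice. The type `scanErrors` and the helper `addConcurrently` below are hypothetical; the real ScanErrors has unexported fields that may be organized differently.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// scanErrors is a hypothetical stand-in for ScanErrors.
type scanErrors struct {
	mu   sync.RWMutex
	errs []error
}

func newScanErrors() *scanErrors { return &scanErrors{} }

// Add appends an error while holding the write lock.
func (s *scanErrors) Add(err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.errs = append(s.errs, err)
}

// Count returns the number of collected errors under the read lock.
func (s *scanErrors) Count() uint64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return uint64(len(s.errs))
}

// addConcurrently adds one error from each of n goroutines.
func addConcurrently(s *scanErrors, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Add(errors.New("scan failed"))
		}()
	}
	wg.Wait()
}

func main() {
	s := newScanErrors()
	addConcurrently(s, 20)
	fmt.Println("errors collected:", s.Count())
}
```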
type Source ¶
type Source interface {
	// Type returns the source type, used for matching against configuration and jobs.
	Type() sourcespb.SourceType
	// SourceID returns the initialized source ID used for tracking relationships in the DB.
	SourceID() int64
	// JobID returns the initialized job ID used for tracking relationships in the DB.
	JobID() int64
	// Init initializes the source.
	Init(aCtx context.Context, name string, jobId, sourceId int64, verify bool, connection *anypb.Any, concurrency int) error
	// Chunks emits data over a channel that is decoded and scanned for secrets.
	Chunks(ctx context.Context, chunksChan chan *Chunk) error
	// GetProgress is the completion progress (percentage) for Scanned Source.
	GetProgress() *Progress
}
Source defines the interface required to implement a source chunker.
type SourceInitFunc ¶ added in v3.45.0
SourceInitFunc is a function that takes a source and job ID and returns an initialized Source.
type SourceManager ¶ added in v3.45.0
type SourceManager struct {
// contains filtered or unexported fields
}
func NewManager ¶ added in v3.45.0
func NewManager(outputChunks chan *Chunk, opts ...func(*SourceManager)) *SourceManager
NewManager creates a new manager with the provided options.
func (*SourceManager) Enroll ¶ added in v3.45.0
func (s *SourceManager) Enroll(ctx context.Context, name string, kind sourcespb.SourceType, f SourceInitFunc) (handle, error)
Enroll informs the SourceManager to track and manage a Source.
func (*SourceManager) Run ¶ added in v3.45.0
func (s *SourceManager) Run(ctx context.Context, handle handle) error
Run blocks until a resource is available to run the source, then synchronously runs it.
func (*SourceManager) ScheduleRun ¶ added in v3.45.0
func (s *SourceManager) ScheduleRun(ctx context.Context, handle handle) error
ScheduleRun blocks until a resource is available to run the source, then asynchronously runs it. Error information is lost in this case.
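The difference between Run and ScheduleRun can be sketched with a buffered channel used as a semaphore. Everything in this block (`pool`, `run`, `scheduleRun`, `demo`) is hypothetical and only illustrates the blocking and error-handling behavior described above, not how SourceManager is actually implemented.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// pool is a hypothetical stand-in that limits concurrent jobs.
type pool struct {
	sem chan struct{}
	wg  sync.WaitGroup
}

func newPool(concurrency int) *pool {
	return &pool{sem: make(chan struct{}, concurrency)}
}

// acquire blocks until a slot is free or the context is cancelled.
func (p *pool) acquire(ctx context.Context) error {
	select {
	case p.sem <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// run waits for a slot, runs the job synchronously, and returns its error.
func (p *pool) run(ctx context.Context, job func(context.Context) error) error {
	if err := p.acquire(ctx); err != nil {
		return err
	}
	defer func() { <-p.sem }()
	return job(ctx)
}

// scheduleRun waits for a slot, then runs the job in a goroutine.
// The job's own error is dropped, so the caller never sees it.
func (p *pool) scheduleRun(ctx context.Context, job func(context.Context) error) error {
	if err := p.acquire(ctx); err != nil {
		return err
	}
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		defer func() { <-p.sem }()
		_ = job(ctx)
	}()
	return nil
}

// demo runs the same failing job both ways and returns what each call reports.
func demo() (runErr, scheduleErr error) {
	p := newPool(1)
	failing := func(context.Context) error { return errors.New("source failed") }
	ctx := context.Background()
	runErr = p.run(ctx, failing)
	scheduleErr = p.scheduleRun(ctx, failing)
	p.wg.Wait()
	return runErr, scheduleErr
}

func main() {
	runErr, scheduleErr := demo()
	fmt.Println("run:", runErr)
	fmt.Println("scheduleRun:", scheduleErr)
}
```

In this sketch the synchronous call surfaces the job's failure, while the scheduled call only reports whether a slot could be acquired.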
type SourceUnit ¶ added in v3.41.0
type SourceUnit interface {
	// SourceUnitID uniquely identifies a source unit.
	SourceUnitID() string
}
SourceUnit is an object that represents a Source's unit of work. This is used as the output of enumeration, progress reporting, and job distribution.
type SourceUnitChunker ¶ added in v3.45.0
type SourceUnitChunker interface {
	// ChunkUnit creates 0 or more chunks from a unit, reporting them or
	// any errors to the ChunkReporter. An error should only be returned
	// from this method in the case of context cancellation, fatal source
	// errors, or errors returned by the reporter. All other errors related
	// to unit chunking are tracked by the ChunkReporter.
	ChunkUnit(ctx context.Context, unit SourceUnit, reporter ChunkReporter) error
}
SourceUnitChunker defines an optional interface a Source can implement to support chunking a single SourceUnit.
type SourceUnitEnumerator ¶ added in v3.44.0
type SourceUnitEnumerator interface {
	// Enumerate creates 0 or more units from an initialized source,
	// reporting them or any errors to the UnitReporter. This method is
	// synchronous but can be called in a goroutine to support concurrent
	// enumeration and chunking. An error should only be returned from this
	// method in the case of context cancellation, fatal source errors, or
	// errors returned by the reporter. All other errors related to unit
	// enumeration are tracked by the UnitReporter.
	Enumerate(ctx context.Context, reporter UnitReporter) error
}
SourceUnitEnumerator defines an optional interface a Source can implement to support enumerating an initialized Source into SourceUnits.
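The error-handling contract in the Enumerate comment can be sketched as follows: per-unit problems go to the reporter, and only context cancellation or reporter errors are returned. The types `nameUnit`, `listSource`, and `collector`, and the helper `enumerateAll`, are hypothetical; the interfaces are local copies of the documented ones.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

type SourceUnit interface {
	SourceUnitID() string
}

type UnitReporter interface {
	UnitOk(ctx context.Context, unit SourceUnit) error
	UnitErr(ctx context.Context, err error) error
}

// nameUnit is a hypothetical unit identified by a plain string.
type nameUnit string

func (u nameUnit) SourceUnitID() string { return string(u) }

// listSource is a hypothetical source that enumerates a fixed list of names.
type listSource struct {
	names []string
}

func (s listSource) Enumerate(ctx context.Context, reporter UnitReporter) error {
	for _, name := range s.names {
		if err := ctx.Err(); err != nil {
			return err // context cancellation is returned directly
		}
		if name == "" {
			// A bad unit is reported, not returned, so enumeration continues.
			if err := reporter.UnitErr(ctx, errors.New("empty unit name")); err != nil {
				return err
			}
			continue
		}
		if err := reporter.UnitOk(ctx, nameUnit(name)); err != nil {
			return err
		}
	}
	return nil
}

// collector is a hypothetical reporter that is safe for concurrent calls.
type collector struct {
	mu       sync.Mutex
	ids      []string
	errCount int
}

func (c *collector) UnitOk(_ context.Context, unit SourceUnit) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.ids = append(c.ids, unit.SourceUnitID())
	return nil
}

func (c *collector) UnitErr(_ context.Context, _ error) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.errCount++
	return nil
}

// enumerateAll runs the sketch end to end and returns what was reported.
func enumerateAll(names []string) ([]string, int, error) {
	c := &collector{}
	err := listSource{names: names}.Enumerate(context.Background(), c)
	return c.ids, c.errCount, err
}

func main() {
	ids, errCount, err := enumerateAll([]string{"repo-a", "", "repo-b"})
	fmt.Println(ids, errCount, err)
}
```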
type SourceUnitUnmarshaller ¶ added in v3.41.0
type SourceUnitUnmarshaller interface {
UnmarshalSourceUnit(data []byte) (SourceUnit, error)
}
SourceUnitUnmarshaller defines an optional interface a Source can implement to support units coming from an external source.
type SyslogConfig ¶ added in v3.27.0
type SyslogConfig struct {
	// Address used to connect to the source.
	Address, Protocol, CertPath, Format, KeyPath string
	// Concurrency is the number of concurrent workers to use to scan the source.
	Concurrency int
}
SyslogConfig defines the optional configuration for a syslog source.
type UnitReporter ¶ added in v3.45.0
type UnitReporter interface {
	UnitOk(ctx context.Context, unit SourceUnit) error
	UnitErr(ctx context.Context, err error) error
}
UnitReporter defines the interface a source will use to report whether a unit was found during enumeration. Either method may be called any number of times. Implementors of this interface should allow for concurrent calls.