data

package
v5.0.0-rc1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 12, 2024 License: Apache-2.0 Imports: 21 Imported by: 0

README

Generating proto files

Installation

Follow instructions here to install necessary binaries.

Compile

Run make build-proto to compile proto.

Documentation

Overview

Package data contains util fns for reading and writing data handled by the cron job.

Index

Constants

This section is empty.

Variables

View Source
var File_cron_data_metadata_proto protoreflect.FileDescriptor
View Source
var File_cron_data_request_proto protoreflect.FileDescriptor

Functions

func BlobExists

func BlobExists(ctx context.Context, bucketURL, key string) (bool, error)

BlobExists checks whether a given `bucketURL/key` blob exists.

func GetBlobContent

func GetBlobContent(ctx context.Context, bucketURL, key string) ([]byte, error)

GetBlobContent returns the file content given a bucketURL and object key.

func GetBlobFilename

func GetBlobFilename(filename string, datetime time.Time) string

GetBlobFilename returns a blob key for a shard. Takes Time object and filename as input.

func GetBlobKeys

func GetBlobKeys(ctx context.Context, bucketURL string) ([]string, error)

GetBlobKeys returns all object keys for a given bucketURL.

func GetBlobKeysWithPrefix

func GetBlobKeysWithPrefix(ctx context.Context, bucketURL, filePrefix string) ([]string, error)

GetBlobKeysWithPrefix returns all object keys for a given bucketURL which start with prefix. The prefix can be used to specify directories.

func GetShardMetadataFilename

func GetShardMetadataFilename(datetime time.Time) string

GetShardMetadataFilename returns shard_metadata filename for a shard.

func GetShardNumFilename

func GetShardNumFilename(datetime time.Time) string

GetShardNumFilename returns shard_num filename for a shard.

func GetTransferStatusFilename

func GetTransferStatusFilename(datetime time.Time) string

GetTransferStatusFilename returns transfer_status filename for a shard.

func ParseBlobFilename

func ParseBlobFilename(key string) (time.Time, string, error)

ParseBlobFilename parses a blob key into a Time object.

func SortAndAppendFrom

func SortAndAppendFrom(in io.Reader, out io.Writer, newRepos []RepoFormat) error

SortAndAppendFrom reads from `in`, appends to newRepos and writes the sorted output to `out`.

func SortAndAppendTo

func SortAndAppendTo(out io.Writer, oldRepos, newRepos []RepoFormat) error

SortAndAppendTo appends `oldRepos` and `newRepos` before sorting and writing out the result to `out`.

func WriteTo

func WriteTo(out io.Writer, repos []RepoFormat) error

WriteTo writes `repos` to `out`.

func WriteToBlobStore

func WriteToBlobStore(ctx context.Context, bucketURL, filename string, data []byte) error

WriteToBlobStore creates and writes data to filename in bucketURL.

Types

type BucketSummary

type BucketSummary struct {
	// contains filtered or unexported fields
}

BucketSummary contains details about all the shards in a bucket grouped by their creation time.

func GetBucketSummary

func GetBucketSummary(ctx context.Context, bucketURL string) (*BucketSummary, error)

GetBucketSummary iterates through all files in a bucket and returns a BucketSummary with details on each set of shards grouped by creation time.

func (*BucketSummary) Shards

func (summary *BucketSummary) Shards() []*ShardSummary

Shards returns a slice of ShardSummary instances for each shard creation time.

type CSVStrings

type CSVStrings []string

CSVStrings is []string with support for CSV formatting.

func (CSVStrings) MarshalCSV

func (s CSVStrings) MarshalCSV() ([]byte, error)

MarshalCSV implements []string -> []byte serialization.

func (CSVStrings) ToString

func (s CSVStrings) ToString() []string

ToString converts CSVStrings -> []string.

func (*CSVStrings) UnmarshalCSV

func (s *CSVStrings) UnmarshalCSV(input []byte) error

UnmarshalCSV implements []byte -> []string de-serialization.

type Iterator

type Iterator interface {
	HasNext() bool
	Next() (RepoFormat, error)
}

Iterator interface is used to iterate through list of input repos for the cron job.

func MakeIteratorFrom

func MakeIteratorFrom(reader io.Reader) (Iterator, error)

MakeIteratorFrom returns an implementation of Iterator interface. Currently returns an instance of csvIterator.

func MakeNestedIterator

func MakeNestedIterator(iterators []Iterator) (Iterator, error)

type Repo

type Repo struct {
	Url      *string  `protobuf:"bytes,1,opt,name=url,proto3,oneof" json:"url,omitempty"`
	Commit   *string  `protobuf:"bytes,3,opt,name=commit,proto3,oneof" json:"commit,omitempty"`
	Metadata []string `protobuf:"bytes,2,rep,name=metadata,proto3" json:"metadata,omitempty"`
	// contains filtered or unexported fields
}

func (*Repo) Descriptor deprecated

func (*Repo) Descriptor() ([]byte, []int)

Deprecated: Use Repo.ProtoReflect.Descriptor instead.

func (*Repo) GetCommit

func (x *Repo) GetCommit() string

func (*Repo) GetMetadata

func (x *Repo) GetMetadata() []string

func (*Repo) GetUrl

func (x *Repo) GetUrl() string

func (*Repo) ProtoMessage

func (*Repo) ProtoMessage()

func (*Repo) ProtoReflect

func (x *Repo) ProtoReflect() protoreflect.Message

func (*Repo) Reset

func (x *Repo) Reset()

func (*Repo) String

func (x *Repo) String() string

type RepoFormat

type RepoFormat struct {
	Repo     string     `csv:"repo"`
	Metadata CSVStrings `csv:"metadata"`
}

RepoFormat is used to read input repos.

type ScorecardBatchRequest

type ScorecardBatchRequest struct {
	Repos    []*Repo                `protobuf:"bytes,4,rep,name=repos,proto3" json:"repos,omitempty"`
	ShardNum *int32                 `protobuf:"varint,2,opt,name=shard_num,json=shardNum,proto3,oneof" json:"shard_num,omitempty"`
	JobTime  *timestamppb.Timestamp `protobuf:"bytes,3,opt,name=job_time,json=jobTime,proto3,oneof" json:"job_time,omitempty"`
	// contains filtered or unexported fields
}

func (*ScorecardBatchRequest) Descriptor deprecated

func (*ScorecardBatchRequest) Descriptor() ([]byte, []int)

Deprecated: Use ScorecardBatchRequest.ProtoReflect.Descriptor instead.

func (*ScorecardBatchRequest) GetJobTime

func (x *ScorecardBatchRequest) GetJobTime() *timestamppb.Timestamp

func (*ScorecardBatchRequest) GetRepos

func (x *ScorecardBatchRequest) GetRepos() []*Repo

func (*ScorecardBatchRequest) GetShardNum

func (x *ScorecardBatchRequest) GetShardNum() int32

func (*ScorecardBatchRequest) ProtoMessage

func (*ScorecardBatchRequest) ProtoMessage()

func (*ScorecardBatchRequest) ProtoReflect

func (x *ScorecardBatchRequest) ProtoReflect() protoreflect.Message

func (*ScorecardBatchRequest) Reset

func (x *ScorecardBatchRequest) Reset()

func (*ScorecardBatchRequest) String

func (x *ScorecardBatchRequest) String() string

type ShardMetadata

type ShardMetadata struct {
	ShardLoc  *string `protobuf:"bytes,1,opt,name=shard_loc,json=shardLoc,proto3,oneof" json:"shard_loc,omitempty"`
	NumShard  *int32  `protobuf:"varint,2,opt,name=num_shard,json=numShard,proto3,oneof" json:"num_shard,omitempty"`
	CommitSha *string `protobuf:"bytes,3,opt,name=commit_sha,json=commitSha,proto3,oneof" json:"commit_sha,omitempty"`
	// contains filtered or unexported fields
}

func (*ShardMetadata) Descriptor deprecated

func (*ShardMetadata) Descriptor() ([]byte, []int)

Deprecated: Use ShardMetadata.ProtoReflect.Descriptor instead.

func (*ShardMetadata) GetCommitSha

func (x *ShardMetadata) GetCommitSha() string

func (*ShardMetadata) GetNumShard

func (x *ShardMetadata) GetNumShard() int32

func (*ShardMetadata) GetShardLoc

func (x *ShardMetadata) GetShardLoc() string

func (*ShardMetadata) ProtoMessage

func (*ShardMetadata) ProtoMessage()

func (*ShardMetadata) ProtoReflect

func (x *ShardMetadata) ProtoReflect() protoreflect.Message

func (*ShardMetadata) Reset

func (x *ShardMetadata) Reset()

func (*ShardMetadata) String

func (x *ShardMetadata) String() string

type ShardSummary

type ShardSummary struct {
	// contains filtered or unexported fields
}

ShardSummary is a summary of information about a set of shards with the same creation time.

func (*ShardSummary) CreationTime

func (s *ShardSummary) CreationTime() time.Time

CreationTime returns the time the shards were created. This corresponds to the job time generated by the controller.

func (*ShardSummary) IsCompleted

func (s *ShardSummary) IsCompleted(completionThreshold float64) bool

IsCompleted checks if the percentage of completed shards is over the desired completion threshold. It also returns false to prevent transfers in cases where the expected number of shards is 0, as either the .shard_metadata file is missing, or there is nothing to transfer anyway.

func (*ShardSummary) IsTransferred

func (s *ShardSummary) IsTransferred() bool

IsTransferred returns true if the shards have already been transferred. A true value indicates that a transfer should not occur, a false value indicates that a transfer should occur if IsCompleted() also returns true.

func (*ShardSummary) MarkTransferred

func (s *ShardSummary) MarkTransferred(ctx context.Context, bucketURL string) error

func (*ShardSummary) Metadata

func (s *ShardSummary) Metadata() []byte

Metadata returns the raw metadata about the bucket.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL