embargo

package module
v0.0.0-...-8f1fad7 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 27, 2020 License: Apache-2.0 Imports: 22 Imported by: 0

README

M-Lab no longer needs to embargo things. This repository is now archived.

This is the package for embargo process on cloud platform.

Documentation

Overview

Package embargo applies the embargo policy to all sidestream data. A sidestream test is published immediately if it is more than one year old, or if its server IP is in the list of M-Lab server IPs, excluding the SamKnows sites. Otherwise the test is embargoed and saved in a private bucket, and it is published later, once it is more than one year old.
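
The publish-or-embargo rule above can be sketched as a single predicate. This is a minimal sketch, not the package's code: shouldPublish and its parameters are hypothetical names, dates are assumed to be yyyymmdd integers as used elsewhere in this package, and "more than one year old" is assumed to mean strictly earlier than a cutoff date.

```go
package main

import "fmt"

// shouldPublish is a hypothetical helper. It reports whether a sidestream
// test may be published right away. whitelist is assumed to be the set of
// M-Lab server IPs with the SamKnows sites already removed.
func shouldPublish(date, cutoffDate int, serverIP string, whitelist map[string]struct{}) bool {
	// Assumption: anything dated before the cutoff is more than one year old.
	if date < cutoffDate {
		return true
	}
	// Recent data is published only when the server IP is whitelisted.
	_, ok := whitelist[serverIP]
	return ok
}

func main() {
	whitelist := map[string]struct{}{"4.34.58.34": {}}
	cutoff := 20170601

	fmt.Println(shouldPublish(20180501, cutoff, "4.34.58.34", whitelist))   // recent, whitelisted IP
	fmt.Println(shouldPublish(20180501, cutoff, "173.205.3.39", whitelist)) // recent, IP not whitelisted
	fmt.Println(shouldPublish(20170315, cutoff, "173.205.3.39", whitelist)) // older than the cutoff
}
```

Only the second call describes a test that would be embargoed under these assumptions.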

Package embargo implements loading of site IPs from a public URL or a local file, and checks whether an IP is in the whitelist, which is the list of all sites except the SamKnows sites.

It parses file names and returns components such as the log time and the IP. Filename example: 20170315T01:00:00Z_173.205.3.39_0.web100
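
As an illustration of the file name layout, the sketch below splits the example name on underscores. parseName is a hypothetical helper, not the package's FileName type, and it assumes the name always has exactly three underscore-separated fields with a timestamp that starts with yyyymmdd.

```go
package main

import (
	"fmt"
	"strings"
)

// parseName is a hypothetical helper. It returns the yyyymmdd date and the
// IP from a name shaped like 20170315T01:00:00Z_173.205.3.39_0.web100.
// ok is false when the name does not match that assumed layout.
func parseName(name string) (date, ip string, ok bool) {
	parts := strings.Split(name, "_")
	if len(parts) != 3 || len(parts[0]) < 8 {
		return "", "", false
	}
	// The first eight characters of the timestamp are the date.
	return parts[0][:8], parts[1], true
}

func main() {
	date, ip, ok := parseName("20170315T01:00:00Z_173.205.3.39_0.web100")
	fmt.Println(date, ip, ok)
}
```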

Package gcs implements a simple library for basic operations given bucket names and file name/prefix, such as ls, cp, rm, etc. on Google Cloud Storage.

It implements the unembargo process for previously embargoed files once they are more than one year old.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CompareBuckets

func CompareBuckets(sourceBucket string, destBucket string) bool

CompareBuckets checks whether two buckets contain exactly the same files. It returns true if they do.

func CopyOneFile

func CopyOneFile(sourceBucket string, destBucket string, fileName string) bool

CopyOneFile copies one file from one bucket to another. It returns true on success. ("cp")

func CreateBucket

func CreateBucket(projectID string, bucketName string) bool

CreateBucket creates a new bucket. It returns true if the bucket already exists or is created successfully.

func CreateService

func CreateService() *storage.Service

CreateService creates GCS service used by the following functions.

func DeleteBucket

func DeleteBucket(bucketName string) bool

DeleteBucket deletes the bucket if it is empty. ("rmdir")

func DeleteFiles

func DeleteFiles(bucketName string, prefixFileName string) bool

DeleteFiles deletes all files with the specified prefix from the bucket. ("rm")

func FilterSiteIPs

func FilterSiteIPs(body []byte) (map[string]struct{}, error)

FilterSiteIPs parses the bytes and returns the set of site IPs (a map[string]struct{}), filtering out all SamKnows sites. TODO: make the filter use positive checks, including the list of things other than samknows, rather than excluding samknows.

func FormatDateAsInt

func FormatDateAsInt(t time.Time) int

FormatDateAsInt returns a date as an integer in the format yyyymmdd.
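
One way to produce a yyyymmdd integer from a time.Time is plain arithmetic on the year, month and day. This is a sketch of the idea only; the lowercase names are hypothetical and the package's actual implementation may differ.

```go
package main

import (
	"fmt"
	"time"
)

// ymdAsInt packs a calendar date into a yyyymmdd integer.
func ymdAsInt(year, month, day int) int {
	return year*10000 + month*100 + day
}

// formatDateAsInt is a hypothetical stand-in for FormatDateAsInt.
func formatDateAsInt(t time.Time) int {
	return ymdAsInt(t.Year(), int(t.Month()), t.Day())
}

func main() {
	t := time.Date(2017, time.March, 15, 1, 0, 0, 0, time.UTC)
	fmt.Println(formatDateAsInt(t)) // prints 20170315
}
```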

func GetFileNamesFromBucket

func GetFileNamesFromBucket(bucketName string) []string

GetFileNamesFromBucket returns the names of the files in the bucket with the given name. ("ls")

func GetFileNamesWithPrefix

func GetFileNamesWithPrefix(service *storage_v1.Service, bucketName string, prefixFileName string) (map[string]bool, error)

GetFileNamesWithPrefix gets the file names in the given bucket that have the given prefix, using the given service.

func SyncTwoBuckets

func SyncTwoBuckets(sourceBucket string, destBucket string, prefixFileName string) bool

SyncTwoBuckets copies all files with prefixFileName from sourceBucket to destBucket if they are not there yet. It returns true on success.
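
The "copy only what is not there yet" step amounts to a set difference over file names. The sketch below works on in-memory name sets shaped like the map[string]bool that GetFileNamesWithPrefix returns; missingFiles is a hypothetical helper and makes no GCS calls.

```go
package main

import (
	"fmt"
	"sort"
)

// missingFiles is a hypothetical helper. It returns the names present in
// source but absent from dest, sorted so the result is deterministic.
func missingFiles(source, dest map[string]bool) []string {
	var missing []string
	for name := range source {
		if !dest[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	source := map[string]bool{
		"sidestream/2017/05/16/a.tgz": true,
		"sidestream/2017/05/16/b.tgz": true,
	}
	dest := map[string]bool{
		"sidestream/2017/05/16/a.tgz": true,
	}
	// Only b.tgz still needs to be copied.
	fmt.Println(missingFiles(source, dest))
}
```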

func UnEmbargoOneDayLegacyFiles

func UnEmbargoOneDayLegacyFiles(sourceBucket string, destBucket string, prefixFileName string) error

UnEmbargoOneDayLegacyFiles unembargoes one day of data in sourceBucket and writes the output to destBucket. The date is passed as prefixFileName in the format sidestream/yyyy/mm/dd.
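
For callers that hold the date as a yyyymmdd integer, the prefix can be derived with integer division. prefixForDate is a hypothetical helper shown only to illustrate the sidestream/yyyy/mm/dd layout.

```go
package main

import "fmt"

// prefixForDate is a hypothetical helper. It turns a yyyymmdd integer into
// a prefix of the form sidestream/yyyy/mm/dd.
func prefixForDate(date int) string {
	year := date / 10000
	month := date / 100 % 100
	day := date % 100
	return fmt.Sprintf("sidestream/%04d/%02d/%02d", year, month, day)
}

func main() {
	fmt.Println(prefixForDate(20170516)) // prints sidestream/2017/05/16
}
```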

func UnembargoCron

func UnembargoCron(date int) error

func UpdateWhitelist

func UpdateWhitelist() error

UpdateWhitelist loads the site IP json file again and updates the whitelist in memory.

func UploadFile

func UploadFile(bucketName string, fileName string, targetdir string) bool

UploadFile uploads one file from local path to bucket. ("cp")

Types

type EmbargoConfig

type EmbargoConfig struct {
	// contains filtered or unexported fields
}

EmbargoConfig is a struct that performs all embargo procedures.

var EmbargoSingleton *EmbargoConfig

EmbargoSingleton is the singleton pointer to the EmbargoConfig object.

func GetEmbargoConfig

func GetEmbargoConfig(siteIPFile string) (*EmbargoConfig, error)

GetEmbargoConfig creates a new EmbargoConfig and returns it.

func (*EmbargoConfig) EmbargoOneDayData

func (ec *EmbargoConfig) EmbargoOneDayData(date string, cutoffDate int) error

EmbargoOneDayData embargoes one day of files. The input date is a string in the format yyyymmdd. The cutoffDate is an integer in the format yyyymmdd. TODO: handle midway crash. Since the source bucket is unchanged, if it fails in the middle, we just rerun it for that specific day.

func (*EmbargoConfig) EmbargoOneTar

func (ec *EmbargoConfig) EmbargoOneTar(content io.Reader, tarfileName string, moreThanOneYear bool) error

EmbargoOneTar processes one tar file and splits it into 2 files. The embargoed files are saved in a private bucket, and the unembargoed part is saved in a public bucket. The private file has a different name, so it can be copied to the public bucket directly when it becomes one year old. The tarfileName is like 20170516T000000Z-mlab1-atl06-sidestream-0000.tgz

func (*EmbargoConfig) EmbargoSingleFile

func (ec *EmbargoConfig) EmbargoSingleFile(filename string) error

EmbargoSingleFile embargoes the input file.

func (*EmbargoConfig) SplitFile

func (ec *EmbargoConfig) SplitFile(content io.Reader, moreThanOneYear bool) (bytes.Buffer, bytes.Buffer, error)

SplitFile splits one tar file into 2 buffers.
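
Splitting one tar stream into two can be done by routing each entry to one of two tar writers. The sketch below uses only the standard archive/tar package. It is not the package's implementation: splitTar, makeTar and tarNames are hypothetical names, the input is assumed to be an uncompressed tar stream (the real input is a .tgz), and the isPublic callback stands in for the whitelist and age checks.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// splitTar copies every entry of an uncompressed tar stream into either the
// embargo archive or the public archive, as decided by isPublic.
func splitTar(content io.Reader, isPublic func(name string) bool) (*bytes.Buffer, *bytes.Buffer, error) {
	embargoBuf, publicBuf := new(bytes.Buffer), new(bytes.Buffer)
	embargoTar, publicTar := tar.NewWriter(embargoBuf), tar.NewWriter(publicBuf)
	r := tar.NewReader(content)
	for {
		hdr, err := r.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, nil, err
		}
		w := embargoTar
		if isPublic(hdr.Name) {
			w = publicTar
		}
		if err := w.WriteHeader(hdr); err != nil {
			return nil, nil, err
		}
		if _, err := io.Copy(w, r); err != nil {
			return nil, nil, err
		}
	}
	// Close writes the tar footers; both buffers are complete afterwards.
	if err := embargoTar.Close(); err != nil {
		return nil, nil, err
	}
	if err := publicTar.Close(); err != nil {
		return nil, nil, err
	}
	return embargoBuf, publicBuf, nil
}

// makeTar builds a small in-memory tar with one dummy file per name.
func makeTar(names []string) (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	w := tar.NewWriter(buf)
	for _, name := range names {
		body := []byte("data")
		hdr := &tar.Header{Name: name, Mode: 0600, Size: int64(len(body))}
		if err := w.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := w.Write(body); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}

// tarNames lists the entry names of an in-memory tar.
func tarNames(buf *bytes.Buffer) ([]string, error) {
	var names []string
	r := tar.NewReader(buf)
	for {
		hdr, err := r.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

func main() {
	const publicName = "20170225T23:00:00Z_4.34.58.34_0.web100"
	src, err := makeTar([]string{publicName, "20170315T01:00:00Z_173.205.3.39_0.web100"})
	if err != nil {
		panic(err)
	}
	embargoed, public, err := splitTar(src, func(name string) bool { return name == publicName })
	if err != nil {
		panic(err)
	}
	e, _ := tarNames(embargoed)
	p, _ := tarNames(public)
	fmt.Println("embargoed:", e)
	fmt.Println("public:", p)
}
```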

func (*EmbargoConfig) WriteResults

func (ec *EmbargoConfig) WriteResults(tarfileName string, embargoBuf, publicBuf bytes.Buffer) error

WriteResults writes results to GCS.

type FileName

type FileName struct {
	Name string
}

func (*FileName) GetDate

func (f *FileName) GetDate() string

func (*FileName) GetLocalIP

func (f *FileName) GetLocalIP() string

GetLocalIP parses the filename and returns the IP. For the old format, it returns an empty string.

type FileNameParser

type FileNameParser interface {
	GetLocalIP()
	GetDate()
}

type Site

type Site struct {
	Hostname string `json:"hostname"`
	Ipv4     string `json:"ipv4"`
	Ipv6     string `json:"ipv6"`
}

Site is a struct for parsing the JSON file.
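
To show how Site and the filtering done by FilterSiteIPs might fit together, the sketch below decodes a JSON array of sites and collects the IPs into a set. Several things here are assumptions rather than documented behavior: that the file is a JSON array of Site objects, that SamKnows sites can be recognized by "samknows" appearing in the hostname, and that both IPv4 and IPv6 addresses are kept. The hostnames and addresses in main are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Site mirrors the struct documented above.
type Site struct {
	Hostname string `json:"hostname"`
	Ipv4     string `json:"ipv4"`
	Ipv6     string `json:"ipv6"`
}

// filterSiteIPs is a hypothetical stand-in for FilterSiteIPs.
func filterSiteIPs(body []byte) (map[string]struct{}, error) {
	var sites []Site
	if err := json.Unmarshal(body, &sites); err != nil {
		return nil, err
	}
	ips := make(map[string]struct{})
	for _, s := range sites {
		// Assumption: SamKnows sites carry "samknows" in their hostname.
		if strings.Contains(s.Hostname, "samknows") {
			continue
		}
		if s.Ipv4 != "" {
			ips[s.Ipv4] = struct{}{}
		}
		if s.Ipv6 != "" {
			ips[s.Ipv6] = struct{}{}
		}
	}
	return ips, nil
}

func main() {
	body := []byte(`[
		{"hostname": "mlab1.example", "ipv4": "192.0.2.1", "ipv6": "2001:db8::1"},
		{"hostname": "samknows1.example", "ipv4": "192.0.2.2", "ipv6": ""}
	]`)
	ips, err := filterSiteIPs(body)
	if err != nil {
		panic(err)
	}
	_, kept := ips["192.0.2.1"]
	_, dropped := ips["192.0.2.2"]
	fmt.Println(len(ips), kept, dropped)
}
```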

type UnembargoConfig

type UnembargoConfig struct {
	// contains filtered or unexported fields
}

func NewUnembargoConfig

func NewUnembargoConfig(privateBucketName, publicBucketName string) *UnembargoConfig

func (*UnembargoConfig) Unembargo

func (nc *UnembargoConfig) Unembargo(date int) error

Unembargo unembargoes the data for the input date, given in the format yyyymmdd. TODO(dev): add more validity check for input date.

type WhitelistChecker

type WhitelistChecker struct {
	EmbargoWhiteList map[string]struct{}
}

WhitelistChecker is a struct containing the map EmbargoWhiteList, which holds the M-Lab site IPs EXCEPT those of the SamKnows sites.

func (*WhitelistChecker) CheckInWhiteList

func (wc *WhitelistChecker) CheckInWhiteList(fileName string) bool

CheckInWhiteList checks whether the IP in fileName is in the embargo whitelist. The filename is like: 20170225T23:00:00Z_4.34.58.34_0.web100. It returns true if the file's IP is in the site IP list, and false if it is not.

func (*WhitelistChecker) LoadFromLocalWhitelist

func (wc *WhitelistChecker) LoadFromLocalWhitelist(path string) error

LoadFromLocalWhitelist loads embargo IP whitelist from a local file.

func (*WhitelistChecker) LoadFromURL

func (wc *WhitelistChecker) LoadFromURL(jsonURL string) error

LoadFromURL loads the embargo IP whitelist from a public URL. TODO: add unittest for this func.

Directories

Path Synopsis
The metrics package defines prometheus metric types and provides convenience methods to add accounting to embargo service..