Documentation ¶
Overview ¶
Package extract provides functions for working with compressed files
- Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
msgp -file <path to dsort/extract/record_gen.go> -tests=false -marshal=false -unexported Code generated by the command above; see docs/msgp.md. DO NOT EDIT.
msgp -file <path to dsort/extract/shard.go> -tests=false -marshal=false -unexported Code generated by the command above; see docs/msgp.md. DO NOT EDIT.
Index ¶
- Constants
- func Ext(path string) string
- func ValidateAlgorithmFormatType(ty string) error
- type Creator
- type KeyExtractor
- type LoadContentFunc
- type Record
- type RecordExtractor
- type RecordManager
- func (rm *RecordManager) ChangeStoreType(fullContentPath, newStoreType string, value any, buf []byte) (n int64)
- func (rm *RecordManager) Cleanup()
- func (rm *RecordManager) EnqueueRecords(records *Records)
- func (rm *RecordManager) ExtractRecordWithBuffer(args extractRecordArgs) (size int64, err error)
- func (rm *RecordManager) ExtractionPaths() *sync.Map
- func (rm *RecordManager) FullContentPath(obj *RecordObj) string
- func (rm *RecordManager) MergeEnqueuedRecords()
- func (rm *RecordManager) RecordContents() *sync.Map
- type RecordObj
- type Records
- func (r *Records) All() []*Record
- func (z *Records) DecodeMsg(dc *msgp.Reader) (err error)
- func (r *Records) DeleteDup(name, ext string)
- func (r *Records) Drain()
- func (z *Records) EncodeMsg(en *msgp.Writer) (err error)
- func (r *Records) Exists(name, ext string) (exists bool)
- func (r *Records) Find(name string) (record *Record, exists bool)
- func (r *Records) Insert(records ...*Record)
- func (r *Records) Len() int
- func (r *Records) Less(i, j int, formatType string) (bool, error)
- func (*Records) MarshalJSON() ([]byte, error)
- func (z *Records) Msgsize() (s int)
- func (r *Records) RecordMemorySize() (size uint64)
- func (r *Records) Slice(start, end int) *Records
- func (r *Records) Swap(i, j int)
- func (r *Records) TotalObjectCount() int
- func (*Records) UnmarshalJSON([]byte) error
- type Shard
- type SingleKeyExtractor
Constants ¶
const (
	FormatTypeInt    = "int"
	FormatTypeFloat  = "float"
	FormatTypeString = "string"
)
const (
	// Extract methods
	ExtractToMem cos.Bits = 1 << iota
	ExtractToDisk
	ExtractToWriter
)
const (
	// Values are small to save memory.
	OffsetStoreType = "o"
	SGLStoreType    = "s"
	DiskStoreType   = "d"
)
Variables ¶
This section is empty.
Functions ¶
func Ext ¶
Ext returns the file name extension used by path. The extension is the suffix beginning at the FIRST (not final) dot in the final element of path; it is empty if there is no dot.
NOTE: This function should be used instead of `filepath.Ext` in dSort.
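The difference from `filepath.Ext` matters for multi-part extensions such as `.tar.gz`. The following is a minimal sketch of the documented behavior, not the package's source: `firstDotExt` is a hypothetical name, and the sketch assumes `/` is the only path separator.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// firstDotExt returns the suffix that starts at the first dot of the final
// path element, or "" if that element has no dot. Hypothetical helper that
// mirrors the behavior described for Ext; it assumes '/' separates elements.
func firstDotExt(path string) string {
	// LastIndexByte returns -1 when there is no '/', so base is the whole path.
	base := path[strings.LastIndexByte(path, '/')+1:]
	if i := strings.IndexByte(base, '.'); i >= 0 {
		return base[i:]
	}
	return ""
}

func main() {
	p := "shards/shard-001.tar.gz"
	fmt.Printf("%q\n", firstDotExt(p))           // ".tar.gz"
	fmt.Printf("%q\n", filepath.Ext(p))          // ".gz"
	fmt.Printf("%q\n", firstDotExt("a.b/shard")) // ""
}
```

Dots in parent directories are ignored because only the final element is inspected.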
Types ¶
type Creator ¶
type Creator interface {
	ExtractShard(lom *cluster.LOM, r cos.ReadReaderAt, extractor RecordExtractor, toDisk bool) (int64, int, error)
	CreateShard(s *Shard, w io.Writer, loadContent LoadContentFunc) (int64, error)
	UsingCompression() bool
	SupportsOffset() bool
	MetadataSize() int64
}
Creator is an interface that describes the set of functions each shard creator should implement.
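To make the `CreateShard` contract more concrete (write a shard to an `io.Writer` and report a byte count), here is a rough sketch of producing a tar shard with the standard `archive/tar` package. It is not the package's implementation: `entry`, `countingWriter`, `writeTarShard`, `listTarShard`, and `roundTrip` are hypothetical names, and the sketch ignores `Shard`, `LoadContentFunc`, and `cluster.LOM` entirely.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// entry is a hypothetical stand-in for the content of one record object.
type entry struct {
	name string
	data []byte
}

// countingWriter counts the bytes that pass through to w.
type countingWriter struct {
	w io.Writer
	n int64
}

func (cw *countingWriter) Write(p []byte) (int, error) {
	n, err := cw.w.Write(p)
	cw.n += int64(n)
	return n, err
}

// writeTarShard writes entries to w as a tar archive and returns bytes written.
func writeTarShard(w io.Writer, entries []entry) (int64, error) {
	cw := &countingWriter{w: w}
	tw := tar.NewWriter(cw)
	for _, e := range entries {
		hdr := &tar.Header{Name: e.name, Mode: 0o644, Size: int64(len(e.data))}
		if err := tw.WriteHeader(hdr); err != nil {
			return cw.n, err
		}
		if _, err := tw.Write(e.data); err != nil {
			return cw.n, err
		}
	}
	// Close flushes padding and the end-of-archive trailer.
	err := tw.Close()
	return cw.n, err
}

// listTarShard reads an archive and returns its entry names in order.
func listTarShard(r io.Reader) ([]string, error) {
	var names []string
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// roundTrip writes entries to an in-memory shard and lists it back.
func roundTrip(entries []entry) (int64, []string, error) {
	var buf bytes.Buffer
	written, err := writeTarShard(&buf, entries)
	if err != nil {
		return written, nil, err
	}
	names, err := listTarShard(&buf)
	return written, names, err
}

func main() {
	written, names, err := roundTrip([]entry{
		{"img-0001.jpg", []byte("fake image bytes")},
		{"img-0001.cls", []byte("7")},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(written, names)
}
```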
func NewTarExtractCreator ¶
func NewTargzExtractCreator ¶
func NewZipExtractCreator ¶
func NopExtractCreator ¶
type KeyExtractor ¶
type KeyExtractor interface {
	PrepareExtractor(name string, r cos.ReadSizer, ext string) (cos.ReadSizer, *SingleKeyExtractor, bool)

	// ExtractKey extracts key from either name or reader (file/sgl)
	ExtractKey(ske *SingleKeyExtractor) (any, error)
}
func NewContentKeyExtractor ¶
func NewContentKeyExtractor(ty, ext string) (KeyExtractor, error)
func NewMD5KeyExtractor ¶
func NewMD5KeyExtractor() (KeyExtractor, error)
func NewNameKeyExtractor ¶
func NewNameKeyExtractor() (KeyExtractor, error)
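The three constructors suggest three sources for a sort key: the record name, an MD5 digest, and the record content interpreted according to a format type. The sketch below shows one way such keys could be derived. It rests on assumptions not confirmed by this page: that the MD5 key is the hex digest of the record name, and that content keys are parsed as `int64`, `float64`, or `string` depending on the format type. `nameKey`, `md5Key`, and `contentKey` are hypothetical helpers.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// Local copies of the documented format-type values.
const (
	formatTypeInt    = "int"
	formatTypeFloat  = "float"
	formatTypeString = "string"
)

// nameKey uses the record name itself as the key.
func nameKey(name string) string { return name }

// md5Key returns the hex MD5 digest of the name (assumed input).
func md5Key(name string) string {
	sum := md5.Sum([]byte(name))
	return hex.EncodeToString(sum[:])
}

// contentKey parses raw content into a key according to the format type.
func contentKey(raw, ty string) (any, error) {
	s := strings.TrimSpace(raw)
	switch ty {
	case formatTypeInt:
		v, err := strconv.ParseInt(s, 10, 64)
		if err != nil {
			return nil, err
		}
		return v, nil
	case formatTypeFloat:
		v, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, err
		}
		return v, nil
	case formatTypeString:
		return s, nil
	default:
		return nil, fmt.Errorf("invalid format type: %q", ty)
	}
}

func main() {
	fmt.Println(nameKey("img-0001"))
	fmt.Println(md5Key("img-0001"))
	k, err := contentKey(" 42\n", formatTypeInt)
	fmt.Println(k, err)
}
```

Parsing content into typed keys lets integer and float keys be ordered numerically rather than lexicographically.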
type LoadContentFunc ¶
LoadContentFunc is the type of the function that loads content from either a remote or the local target.
type Record ¶
type Record struct {
	Key      any    `msg:"k" json:"k"` // Used to determine the sorting order.
	Name     string `msg:"n" json:"n"` // Name which uniquely identifies record across all shards.
	DaemonID string `msg:"d" json:"d"` // ID of the target which maintains the contents for this record.
	// All objects associated with given record. Record can be composed of
	// multiple objects which have the same name but different extension.
	Objects []*RecordObj `msg:"o" json:"o"`
}
Record represents the metadata corresponding to a single file from an archive file.
func (*Record) MakeUniqueName ¶
type RecordExtractor ¶
type RecordManager ¶
type RecordManager struct {
	Records *Records
	// contains filtered or unexported fields
}
func NewRecordManager ¶
func NewRecordManager(t cluster.Target, bck cmn.Bck, extension string, extractCreator Creator, keyExtractor KeyExtractor, onDuplicatedRecords func(string) error) *RecordManager
func (*RecordManager) ChangeStoreType ¶
func (rm *RecordManager) ChangeStoreType(fullContentPath, newStoreType string, value any, buf []byte) (n int64)
func (*RecordManager) Cleanup ¶
func (rm *RecordManager) Cleanup()
func (*RecordManager) EnqueueRecords ¶
func (rm *RecordManager) EnqueueRecords(records *Records)
func (*RecordManager) ExtractRecordWithBuffer ¶
func (rm *RecordManager) ExtractRecordWithBuffer(args extractRecordArgs) (size int64, err error)
func (*RecordManager) ExtractionPaths ¶
func (rm *RecordManager) ExtractionPaths() *sync.Map
func (*RecordManager) FullContentPath ¶
func (rm *RecordManager) FullContentPath(obj *RecordObj) string
func (*RecordManager) MergeEnqueuedRecords ¶
func (rm *RecordManager) MergeEnqueuedRecords()
func (*RecordManager) RecordContents ¶
func (rm *RecordManager) RecordContents() *sync.Map
type RecordObj ¶
type RecordObj struct {
	// Can represent, one of the following:
	//  * Shard name - in case offset is used.
	//  * Key for extractCreator's RecordContents - records stored in SGLs.
	//  * Location (full path) on disk where extracted record has been placed.
	//
	// To get path for given object you need to use `FullContentPath` method.
	ContentPath string `msg:"p" json:"p"`
	// Filesystem file type where the shard is stored - used to determine
	// location for content path when asking filesystem.
	ObjectFileType string `msg:"ft" json:"ft"`
	// Determines where the record has been stored, can be either: OffsetStoreType,
	// SGLStoreType, DiskStoreType.
	StoreType string `msg:"st" json:"st"`
	// If set, determines the offset in shard file where the record begins.
	Offset       int64  `msg:"f,omitempty" json:"f,string,omitempty"`
	MetadataSize int64  `msg:"ms" json:"ms,string"`
	Size         int64  `msg:"s" json:"s,string"`
	Extension    string `msg:"e" json:"e"`
}
RecordObj describes a single object of a record. Objects inside a single record differ by extension.
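The short JSON tags keep the serialized metadata compact, and the `,string` option makes `encoding/json` write the `int64` fields as quoted strings. The sketch below uses local copies of the two structs (JSON tags only) to show the resulting shape for one record made of two objects that share a name but differ by extension. The sample values, including the `"obj"` file type and the `encode` and `sample` helpers, are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// recordObj is a local copy of the documented RecordObj fields and JSON tags.
type recordObj struct {
	ContentPath    string `json:"p"`
	ObjectFileType string `json:"ft"`
	StoreType      string `json:"st"`
	Offset         int64  `json:"f,string,omitempty"`
	MetadataSize   int64  `json:"ms,string"`
	Size           int64  `json:"s,string"`
	Extension      string `json:"e"`
}

// record is a local copy of the documented Record fields and JSON tags.
type record struct {
	Key      any          `json:"k"`
	Name     string       `json:"n"`
	DaemonID string       `json:"d"`
	Objects  []*recordObj `json:"o"`
}

// encode marshals a record with the standard library encoder.
func encode(r *record) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

// sample builds one record with two objects stored by offset ("o").
func sample() *record {
	return &record{
		Key:      "img-0001",
		Name:     "img-0001",
		DaemonID: "t1",
		Objects: []*recordObj{
			{ContentPath: "shard-0.tar", ObjectFileType: "obj", StoreType: "o",
				Offset: 512, MetadataSize: 512, Size: 1000, Extension: ".jpg"},
			{ContentPath: "shard-0.tar", ObjectFileType: "obj", StoreType: "o",
				Offset: 2048, MetadataSize: 512, Size: 1, Extension: ".cls"},
		},
	}
}

func main() {
	s, err := encode(sample())
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Because `Offset` carries `omitempty`, an object with a zero offset is serialized without the `f` field.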
type Records ¶
Records abstracts an array of records. It is safe for concurrent use.
func NewRecords ¶
NewRecords creates a new instance of the Records struct and allocates n places for the actual Record values.
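This page states that `Records` is safe for concurrent use but does not show how. One common way to get that property is a slice guarded by a mutex, sketched below with hypothetical names (`rec`, `safeRecords`, `newSafeRecords`, `insertConcurrently`). The sketch also assumes that "allocates n places" means preallocated capacity. The real type may be built differently.

```go
package main

import (
	"fmt"
	"sync"
)

// rec is a minimal stand-in for Record.
type rec struct{ name string }

// safeRecords guards a slice of records with a read-write mutex.
type safeRecords struct {
	mu  sync.RWMutex
	arr []*rec
}

// newSafeRecords preallocates capacity for n records (assumed semantics).
func newSafeRecords(n int) *safeRecords {
	return &safeRecords{arr: make([]*rec, 0, n)}
}

// insert appends records under the write lock.
func (r *safeRecords) insert(recs ...*rec) {
	r.mu.Lock()
	r.arr = append(r.arr, recs...)
	r.mu.Unlock()
}

// count returns the number of stored records under the read lock.
func (r *safeRecords) count() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.arr)
}

// insertConcurrently inserts from several goroutines and returns the total.
func insertConcurrently(workers, perWorker int) int {
	rs := newSafeRecords(workers * perWorker)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				rs.insert(&rec{name: fmt.Sprintf("w%d-r%d", w, i)})
			}
		}(w)
	}
	wg.Wait()
	return rs.count()
}

func main() {
	fmt.Println(insertConcurrently(8, 100)) // prints: 800
}
```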
func (*Records) MarshalJSON ¶
func (*Records) Msgsize ¶
Msgsize returns an upper bound estimate of the number of bytes occupied by the serialized message
func (*Records) RecordMemorySize ¶
func (*Records) TotalObjectCount ¶
func (*Records) UnmarshalJSON ¶
type Shard ¶
type Shard struct {
	// Size is total size of shard to be created.
	Size int64 `msg:"s"`
	// Records contains all metadata to construct the shard.
	Records *Records `msg:"r"`
	// Name determines the output name of the shard.
	Name string `msg:"n"`
}
Shard represents the metadata required to construct a single shard (aka an archive file).
func (*Shard) MarshalJSON ¶
func (*Shard) Msgsize ¶
Msgsize returns an upper bound estimate of the number of bytes occupied by the serialized message
func (*Shard) UnmarshalJSON ¶
type SingleKeyExtractor ¶
type SingleKeyExtractor struct {
// contains filtered or unexported fields
}