iceberg

package module

Version: v0.0.0-...-aecf591 (latest) | Published: May 2, 2024 | License: Apache-2.0 | Imports: 24 | Imported by: 2

Repository

github.com/polarsignals/iceberg-go

README ¶

Iceberg Golang

iceberg is a Golang implementation of the Iceberg table spec.

Feature Support / Roadmap

FileSystem Support

Filesystem Type        Supported
S3                     X
Google Cloud Storage   -
Azure Blob Storage     -
Local Filesystem       X

Metadata

Operation                Supported
Get Schema               X
Get Snapshots            X
Get Sort Orders          X
Get Partition Specs      X
Get Manifests            X
Create New Manifests     X
Plan Scan                -
Plan Scan for Snapshot   -

Catalog Support

Operation                  REST  Hive  DynamoDB  Glue
Load Table                 -     -     -         X
List Tables                -     -     -         X
Create Table               -     -     -         -
Update Current Snapshot    -     -     -         -
Create New Snapshot        -     -     -         -
Rename Table               -     -     -         -
Drop Table                 -     -     -         -
Alter Table                -     -     -         -
Set Table Properties       -     -     -         -
Create Namespace           -     -     -         -
Drop Namespace             -     -     -         -
Set Namespace Properties   -     -     -         -

Read/Write Data Support

There is no intrinsic support for reading or writing data yet.
Data can currently be read manually by retrieving the data files listed in the manifests.
Apache Arrow support is planned.

Get in Touch

Iceberg community

Documentation ¶

Index ¶

Constants
Variables
func AvroSchemaFromEntriesV1(entries []ManifestEntry) string
func DataFileFromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (DataFile, *Schema, error)
func IndexByID(schema *Schema) (map[int]NestedField, error)
func IndexByName(schema *Schema) (map[string]int, error)
func IndexNameByID(schema *Schema) (map[int]string, error)
func ManifestEntryV1FromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (ManifestEntry, *Schema, error)
func Version() string
func Visit[T any](sc *Schema, visitor SchemaVisitor[T]) (res T, err error)
func WriteManifestListV1(w io.Writer, files []ManifestFile) error
func WriteManifestV1(w io.Writer, schema *Schema, entries []ManifestEntry) error
type AfterFieldVisitor
type AfterListElementVisitor
type AfterMapKeyVisitor
type AfterMapValueVisitor
type BeforeFieldVisitor
type BeforeListElementVisitor
type BeforeMapKeyVisitor
type BeforeMapValueVisitor
type BinaryType
- func (BinaryType) Equals(other Type) bool
- func (BinaryType) String() string
- func (BinaryType) Type() string
type BooleanType
- func (BooleanType) Equals(other Type) bool
- func (BooleanType) String() string
- func (BooleanType) Type() string
type BucketTransform
- func (t BucketTransform) MarshalText() ([]byte, error)
- func (BucketTransform) ResultType(Type) Type
- func (t BucketTransform) String() string
type DataFile
type DataFileBuilder
- func NewDataFileV1Builder(FilePath string, FileFormat FileFormat, PartitionSpec map[string]any, ...) DataFileBuilder
- func (builder DataFileBuilder) Build() DataFile
- func (d DataFileBuilder) ColumnSizes() map[int]int64
- func (d DataFileBuilder) ContentType() ManifestEntryContent
- func (d DataFileBuilder) Count() int64
- func (d DataFileBuilder) DistinctValueCounts() map[int]int64
- func (d DataFileBuilder) EqualityFieldIDs() []int
- func (d DataFileBuilder) FileFormat() FileFormat
- func (d DataFileBuilder) FilePath() string
- func (d DataFileBuilder) FileSizeBytes() int64
- func (d DataFileBuilder) KeyMetadata() []byte
- func (d DataFileBuilder) LowerBoundValues() map[int][]byte
- func (d DataFileBuilder) NaNValueCounts() map[int]int64
- func (d DataFileBuilder) NullValueCounts() map[int]int64
- func (d DataFileBuilder) Partition() map[string]any
- func (d DataFileBuilder) SortOrderID() *int
- func (d DataFileBuilder) SplitOffsets() []int64
- func (d DataFileBuilder) UpperBoundValues() map[int][]byte
- func (d DataFileBuilder) ValueCounts() map[int]int64
- func (builder DataFileBuilder) WithColumnSizes(columnSizes map[int]int64) DataFileBuilder
- func (builder DataFileBuilder) WithDistinctCounts(distinctCounts map[int]int64) DataFileBuilder
- func (builder DataFileBuilder) WithKeyMetadata(keyMetadata []byte) DataFileBuilder
- func (builder DataFileBuilder) WithLowerBounds(lowerBounds map[int][]byte) DataFileBuilder
- func (builder DataFileBuilder) WithNanValueCounts(nanValueCounts map[int]int64) DataFileBuilder
- func (builder DataFileBuilder) WithNullValueCounts(nullValueCounts map[int]int64) DataFileBuilder
- func (builder DataFileBuilder) WithSortOrderID(sortOrderID int) DataFileBuilder
- func (builder DataFileBuilder) WithSplitOffsets(splitOffsets []int64) DataFileBuilder
- func (builder DataFileBuilder) WithUpperBounds(upperBounds map[int][]byte) DataFileBuilder
- func (builder DataFileBuilder) WithValueCounts(valueCounts map[int]int64) DataFileBuilder
type Date
type DateType
- func (DateType) Equals(other Type) bool
- func (DateType) String() string
- func (DateType) Type() string
type DayTransform
- func (t DayTransform) MarshalText() ([]byte, error)
- func (DayTransform) ResultType(Type) Type
- func (DayTransform) String() string
type DecimalType
- func DecimalTypeOf(prec, scale int) DecimalType
- func (d DecimalType) Equals(other Type) bool
- func (d DecimalType) Precision() int
- func (d DecimalType) Scale() int
- func (d DecimalType) String() string
- func (d DecimalType) Type() string
type FieldSummary
type FileFormat
type FixedType
- func FixedTypeOf(n int) FixedType
- func (f FixedType) Equals(other Type) bool
- func (f FixedType) Len() int
- func (f FixedType) String() string
- func (f FixedType) Type() string
type Float32Type
- func (Float32Type) Equals(other Type) bool
- func (Float32Type) String() string
- func (Float32Type) Type() string
type Float64Type
- func (Float64Type) Equals(other Type) bool
- func (Float64Type) String() string
- func (Float64Type) Type() string
type HourTransform
- func (t HourTransform) MarshalText() ([]byte, error)
- func (HourTransform) ResultType(Type) Type
- func (HourTransform) String() string
type IdentityTransform
- func (t IdentityTransform) MarshalText() ([]byte, error)
- func (IdentityTransform) ResultType(t Type) Type
- func (IdentityTransform) String() string
type Int32Type
- func (Int32Type) Equals(other Type) bool
- func (Int32Type) String() string
- func (Int32Type) Type() string
type Int64Type
- func (Int64Type) Equals(other Type) bool
- func (Int64Type) String() string
- func (Int64Type) Type() string
type ListType
- func (l *ListType) ElementField() NestedField
- func (l *ListType) Equals(other Type) bool
- func (l *ListType) Fields() []NestedField
- func (l *ListType) MarshalJSON() ([]byte, error)
- func (l *ListType) String() string
- func (*ListType) Type() string
- func (l *ListType) UnmarshalJSON(b []byte) error
type ManifestContent
type ManifestEntry
- func NewManifestEntryV1(entryStatus ManifestEntryStatus, snapshotID int64, data DataFile) ManifestEntry
type ManifestEntryContent
type ManifestEntryStatus
type ManifestFile
- func ReadManifestList(in io.Reader) ([]ManifestFile, error)
type ManifestV1Builder
- func NewManifestV1Builder(path string, length int64, partitionSpecID int32, addedSnapshotID int64) *ManifestV1Builder
- func (b *ManifestV1Builder) AddedFiles(cnt int32) *ManifestV1Builder
- func (b *ManifestV1Builder) AddedRows(cnt int64) *ManifestV1Builder
- func (b *ManifestV1Builder) Build() ManifestFile
- func (b *ManifestV1Builder) DeletedFiles(cnt int32) *ManifestV1Builder
- func (b *ManifestV1Builder) DeletedRows(cnt int64) *ManifestV1Builder
- func (b *ManifestV1Builder) ExistingFiles(cnt int32) *ManifestV1Builder
- func (b *ManifestV1Builder) ExistingRows(cnt int64) *ManifestV1Builder
- func (b *ManifestV1Builder) KeyMetadata(km []byte) *ManifestV1Builder
- func (b *ManifestV1Builder) Partitions(p []FieldSummary) *ManifestV1Builder
type ManifestV2Builder
- func NewManifestV2Builder(path string, length int64, partitionSpecID int32, content ManifestContent, ...) *ManifestV2Builder
- func (b *ManifestV2Builder) AddedFiles(cnt int32) *ManifestV2Builder
- func (b *ManifestV2Builder) AddedRows(cnt int64) *ManifestV2Builder
- func (b *ManifestV2Builder) Build() ManifestFile
- func (b *ManifestV2Builder) DeletedFiles(cnt int32) *ManifestV2Builder
- func (b *ManifestV2Builder) DeletedRows(cnt int64) *ManifestV2Builder
- func (b *ManifestV2Builder) ExistingFiles(cnt int32) *ManifestV2Builder
- func (b *ManifestV2Builder) ExistingRows(cnt int64) *ManifestV2Builder
- func (b *ManifestV2Builder) KeyMetadata(km []byte) *ManifestV2Builder
- func (b *ManifestV2Builder) Partitions(p []FieldSummary) *ManifestV2Builder
- func (b *ManifestV2Builder) SequenceNum(num, minSeqNum int64) *ManifestV2Builder
type MapType
- func (m *MapType) Equals(other Type) bool
- func (m *MapType) Fields() []NestedField
- func (m *MapType) KeyField() NestedField
- func (m *MapType) MarshalJSON() ([]byte, error)
- func (m *MapType) String() string
- func (*MapType) Type() string
- func (m *MapType) UnmarshalJSON(b []byte) error
- func (m *MapType) ValueField() NestedField
type MonthTransform
- func (t MonthTransform) MarshalText() ([]byte, error)
- func (MonthTransform) ResultType(Type) Type
- func (MonthTransform) String() string
type NestedField
- func (n *NestedField) Equals(other NestedField) bool
- func (n NestedField) MarshalJSON() ([]byte, error)
- func (n NestedField) String() string
- func (n *NestedField) UnmarshalJSON(b []byte) error
type NestedType
type PartitionField
- func (p *PartitionField) String() string
- func (p *PartitionField) UnmarshalJSON(b []byte) error
type PartitionSpec
- func NewPartitionSpec(fields ...PartitionField) PartitionSpec
- func NewPartitionSpecID(id int, fields ...PartitionField) PartitionSpec
- func (ps *PartitionSpec) CompatibleWith(other *PartitionSpec) bool
- func (ps *PartitionSpec) Equals(other PartitionSpec) bool
- func (ps *PartitionSpec) Field(i int) PartitionField
- func (ps *PartitionSpec) FieldsBySourceID(fieldID int) []PartitionField
- func (ps *PartitionSpec) ID() int
- func (ps *PartitionSpec) IsUnpartitioned() bool
- func (ps *PartitionSpec) LastAssignedFieldID() int
- func (ps PartitionSpec) MarshalJSON() ([]byte, error)
- func (ps *PartitionSpec) NumFields() int
- func (ps *PartitionSpec) PartitionType(schema *Schema) *StructType
- func (ps PartitionSpec) String() string
- func (ps *PartitionSpec) UnmarshalJSON(b []byte) error
type PrimitiveType
type Properties
type Schema
- func NewSchema(id int, fields ...NestedField) *Schema
- func NewSchemaWithIdentifiers(id int, identifierIDs []int, fields ...NestedField) *Schema
- func PruneColumns(schema *Schema, selected map[int]Void, selectFullTypes bool) (*Schema, error)
- func (s *Schema) AsStruct() StructType
- func (s *Schema) Equals(other *Schema) bool
- func (s *Schema) Field(i int) NestedField
- func (s *Schema) Fields() []NestedField
- func (s *Schema) FindColumnName(fieldID int) (string, bool)
- func (s *Schema) FindFieldByID(id int) (NestedField, bool)
- func (s *Schema) FindFieldByName(name string) (NestedField, bool)
- func (s *Schema) FindFieldByNameCaseInsensitive(name string) (NestedField, bool)
- func (s *Schema) FindTypeByID(id int) (Type, bool)
- func (s *Schema) FindTypeByName(name string) (Type, bool)
- func (s *Schema) FindTypeByNameCaseInsensitive(name string) (Type, bool)
- func (s *Schema) HighestFieldID() int
- func (s *Schema) MarshalJSON() ([]byte, error)
- func (s *Schema) Merge(other *Schema) (*Schema, error)
- func (s *Schema) NumFields() int
- func (s *Schema) Select(caseSensitive bool, names ...string) (*Schema, error)
- func (s *Schema) String() string
- func (s *Schema) Type() string
- func (s *Schema) UnmarshalJSON(b []byte) error
type SchemaVisitor
type StringType
- func (StringType) Equals(other Type) bool
- func (StringType) String() string
- func (StringType) Type() string
type StructType
- func (s *StructType) Equals(other Type) bool
- func (s *StructType) Fields() []NestedField
- func (s *StructType) MarshalJSON() ([]byte, error)
- func (s *StructType) String() string
- func (*StructType) Type() string
type Time
type TimeType
- func (TimeType) Equals(other Type) bool
- func (TimeType) String() string
- func (TimeType) Type() string
type Timestamp
type TimestampType
- func (TimestampType) Equals(other Type) bool
- func (TimestampType) String() string
- func (TimestampType) Type() string
type TimestampTzType
- func (TimestampTzType) Equals(other Type) bool
- func (TimestampTzType) String() string
- func (TimestampTzType) Type() string
type Transform
- func ParseTransform(s string) (Transform, error)
type TruncateTransform
- func (t TruncateTransform) MarshalText() ([]byte, error)
- func (TruncateTransform) ResultType(t Type) Type
- func (t TruncateTransform) String() string
type Type
type UUIDType
- func (UUIDType) Equals(other Type) bool
- func (UUIDType) String() string
- func (UUIDType) Type() string
type Void
type VoidTransform
- func (t VoidTransform) MarshalText() ([]byte, error)
- func (VoidTransform) ResultType(t Type) Type
- func (VoidTransform) String() string
type YearTransform
- func (t YearTransform) MarshalText() ([]byte, error)
- func (YearTransform) ResultType(Type) Type
- func (YearTransform) String() string

Constants ¶

const (
	AvroManifestListV1Schema = `` /* 2271-byte string literal not displayed */

	AvroManifestListV2Schema = `` /* 3665-byte string literal not displayed */

	AvroManifestEntryV2Schema = `` /* 9599-byte string literal not displayed */

	// AvroEntryV1SchemaTmpl is a Go text/template template for the Avro schema of a v1 manifest entry.
	// It expects the partitions, as a map[string]any, as the templated object. It calls a custom Type function to determine the Avro type for each partition value.
	// It also calls a PartitionFieldID function to determine the field-id for each partition value.
	AvroEntryV1SchemaTmpl = `` /* 9279-byte string literal not displayed */

)

const (
	InitialPartitionSpecID = 0
)

Variables ¶

var (
	ErrInvalidTypeString = errors.New("invalid type")
	ErrNotImplemented    = errors.New("not implemented")
	ErrInvalidArgument   = errors.New("invalid argument")
	ErrInvalidSchema     = errors.New("invalid schema")
	ErrInvalidTransform  = errors.New("invalid transform syntax")
)

var PositionalDeleteSchema = NewSchema(0,
	NestedField{ID: 2147483546, Type: PrimitiveTypes.String, Name: "file_path", Required: true},
	NestedField{ID: 2147483545, Type: PrimitiveTypes.Int32, Name: "pos", Required: true},
)

var PrimitiveTypes = struct {
	Bool        PrimitiveType
	Int32       PrimitiveType
	Int64       PrimitiveType
	Float32     PrimitiveType
	Float64     PrimitiveType
	Date        PrimitiveType
	Time        PrimitiveType
	Timestamp   PrimitiveType
	TimestampTz PrimitiveType
	String      PrimitiveType
	Binary      PrimitiveType
	UUID        PrimitiveType
}{
	Bool:        BooleanType{},
	Int32:       Int32Type{},
	Int64:       Int64Type{},
	Float32:     Float32Type{},
	Float64:     Float64Type{},
	Date:        DateType{},
	Time:        TimeType{},
	Timestamp:   TimestampType{},
	TimestampTz: TimestampTzType{},
	String:      StringType{},
	Binary:      BinaryType{},
	UUID:        UUIDType{},
}

var UnpartitionedSpec = &PartitionSpec{id: 0}

UnpartitionedSpec is the default unpartitioned spec. Use it for comparisons, or as a convenient shared reference to the same unpartitioned spec object.

Functions ¶

func AvroSchemaFromEntriesV1 ¶

func AvroSchemaFromEntriesV1(entries []ManifestEntry) string

AvroSchemaFromEntriesV1 creates an Avro schema from the given manifest entries. The entries must all share the same partition spec.

func DataFileFromParquet ¶

func DataFileFromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (DataFile, *Schema, error)

func IndexByID ¶

func IndexByID(schema *Schema) (map[int]NestedField, error)

IndexByID performs a post-order traversal of the given schema and returns a mapping from field ID to field.
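
As a rough illustration of what a post-order ID index looks like, the sketch below uses a simplified, hypothetical field type instead of the package's Schema and NestedField. Children are indexed before the field that contains them.

```go
package main

import "fmt"

// field is a simplified, hypothetical stand-in for NestedField: a struct-typed
// field carries children, a primitive field does not.
type field struct {
	ID       int
	Name     string
	Children []field
}

// indexByID walks the fields post-order (children before their parent)
// and records every field under its ID.
func indexByID(fields []field, out map[int]field) {
	for _, f := range fields {
		indexByID(f.Children, out)
		out[f.ID] = f
	}
}

func main() {
	schema := []field{
		{ID: 1, Name: "id"},
		{ID: 2, Name: "location", Children: []field{
			{ID: 3, Name: "lat"},
			{ID: 4, Name: "lon"},
		}},
	}
	idx := make(map[int]field)
	indexByID(schema, idx)
	fmt.Println(len(idx), idx[3].Name) // 4 lat
}
```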

func IndexByName ¶

func IndexByName(schema *Schema) (map[string]int, error)

IndexByName performs a post-order traversal of the schema and returns a mapping from field name to field ID.

func IndexNameByID ¶

func IndexNameByID(schema *Schema) (map[int]string, error)

IndexNameByID performs a post-order traversal of the schema and returns a mapping from field ID to field name.

func ManifestEntryV1FromParquet ¶

func ManifestEntryV1FromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (ManifestEntry, *Schema, error)

func Version ¶

func Version() string

func Visit ¶

func Visit[T any](sc *Schema, visitor SchemaVisitor[T]) (res T, err error)

Visit accepts a visitor and performs a post-order traversal of the given schema.
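
The general shape of a generic post-order visitor can be sketched as follows. The node and visitor types are hypothetical simplifications, not the package's Schema or SchemaVisitor[T]. The point is only that each callback receives results already computed for the children.

```go
package main

import "fmt"

// node is a hypothetical, simplified schema node (not the library's types).
// A node without children is treated as a primitive leaf.
type node struct {
	Name     string
	Children []node
}

// visitor mirrors the idea behind a generic schema visitor: the Struct
// callback is handed the results of its already-visited children.
type visitor[T any] interface {
	Leaf(n node) T
	Struct(n node, children []T) T
}

// visit performs the post-order traversal: children first, then the parent.
func visit[T any](n node, v visitor[T]) T {
	if len(n.Children) == 0 {
		return v.Leaf(n)
	}
	results := make([]T, 0, len(n.Children))
	for _, c := range n.Children {
		results = append(results, visit(c, v))
	}
	return v.Struct(n, results)
}

// leafCounter counts primitive (leaf) fields.
type leafCounter struct{}

func (leafCounter) Leaf(node) int { return 1 }

func (leafCounter) Struct(_ node, children []int) int {
	total := 0
	for _, c := range children {
		total += c
	}
	return total
}

func main() {
	root := node{Name: "root", Children: []node{
		{Name: "id"},
		{Name: "location", Children: []node{{Name: "lat"}, {Name: "lon"}}},
	}}
	fmt.Println(visit[int](root, leafCounter{})) // 3
}
```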

func WriteManifestListV1 ¶

func WriteManifestListV1(w io.Writer, files []ManifestFile) error

func WriteManifestV1 ¶

func WriteManifestV1(w io.Writer, schema *Schema, entries []ManifestEntry) error

Types ¶

type AfterFieldVisitor ¶

type AfterFieldVisitor interface {
	AfterField(field NestedField)
}

type AfterListElementVisitor ¶

type AfterListElementVisitor interface {
	AfterListElement(elem NestedField)
}

type AfterMapKeyVisitor ¶

type AfterMapKeyVisitor interface {
	AfterMapKey(key NestedField)
}

type AfterMapValueVisitor ¶

type AfterMapValueVisitor interface {
	AfterMapValue(value NestedField)
}

type BeforeFieldVisitor ¶

type BeforeFieldVisitor interface {
	BeforeField(field NestedField)
}

type BeforeListElementVisitor ¶

type BeforeListElementVisitor interface {
	BeforeListElement(elem NestedField)
}

type BeforeMapKeyVisitor ¶

type BeforeMapKeyVisitor interface {
	BeforeMapKey(key NestedField)
}

type BeforeMapValueVisitor ¶

type BeforeMapValueVisitor interface {
	BeforeMapValue(value NestedField)
}

type BinaryType ¶

type BinaryType struct{}

func (BinaryType) Equals ¶

func (BinaryType) Equals(other Type) bool

func (BinaryType) String ¶

func (BinaryType) String() string

func (BinaryType) Type ¶

func (BinaryType) Type() string

type BooleanType ¶

type BooleanType struct{}

func (BooleanType) Equals ¶

func (BooleanType) Equals(other Type) bool

func (BooleanType) String ¶

func (BooleanType) String() string

func (BooleanType) Type ¶

func (BooleanType) Type() string

type BucketTransform ¶

type BucketTransform struct {
	NumBuckets int
}

BucketTransform transforms values into a bucket partition value. It is parameterized by a number of buckets. Bucket partition transforms compute a 32-bit hash of the source value and produce a non-negative result by taking that hash modulo the number of buckets.
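
The hash-then-modulo step can be sketched with the standard library alone. Note the assumption: FNV-1a is swapped in here as the 32-bit hash purely so the example is self-contained, so the bucket numbers it prints will not match those produced by an Iceberg implementation. The bucket function is a hypothetical helper, not part of this package.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps a string to a bucket in [0, numBuckets).
// ASSUMPTION: FNV-1a stands in for the real hash function so that only the
// standard library is needed; results differ from real Iceberg buckets.
// numBuckets must be greater than zero.
func bucket(value string, numBuckets int) int {
	h := fnv.New32a()
	h.Write([]byte(value)) // Write on a hash.Hash never returns an error
	// Clear the sign bit so the hash is non-negative as a signed 32-bit value,
	// then reduce it modulo the bucket count.
	return int(h.Sum32()&0x7fffffff) % numBuckets
}

func main() {
	for _, v := range []string{"alice", "bob", "carol"} {
		fmt.Printf("%s -> bucket %d\n", v, bucket(v, 16))
	}
}
```

Masking the sign bit before the modulo keeps the result non-negative regardless of the hash value.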

func (BucketTransform) MarshalText ¶

func (t BucketTransform) MarshalText() ([]byte, error)

func (BucketTransform) ResultType ¶

func (BucketTransform) ResultType(Type) Type

func (BucketTransform) String ¶

func (t BucketTransform) String() string

type DataFile ¶

type DataFile interface {
	// ContentType is the type of the content stored by the data file,
	// either Data, Equality deletes, or Position deletes. All v1 files
	// are Data files.
	ContentType() ManifestEntryContent
	// FilePath is the full URI for the file, complete with FS scheme.
	FilePath() string
	// FileFormat is the format of the data file: Avro, ORC, or Parquet.
	FileFormat() FileFormat
	// Partition returns a mapping of field name to partition value for
	// each of the partition spec's fields.
	Partition() map[string]any
	// Count returns the number of records in this file.
	Count() int64
	// FileSizeBytes is the total file size in bytes.
	FileSizeBytes() int64
	// ColumnSizes is a mapping from column id to the total size on disk
	// of all regions that store the column. Does not include bytes
	// necessary to read other columns, like footers. Map will be nil for
	// row-oriented formats (avro).
	ColumnSizes() map[int]int64
	// ValueCounts is a mapping from column id to the number of values
	// in the column, including null and NaN values.
	ValueCounts() map[int]int64
	// NullValueCounts is a mapping from column id to the number of
	// null values in the column.
	NullValueCounts() map[int]int64
	// NaNValueCounts is a mapping from column id to the number of NaN
	// values in the column.
	NaNValueCounts() map[int]int64
	// DistinctValueCounts is a mapping from column id to the number of
	// distinct values in the column. Distinct counts must be derived
	// using values in the file by counting or using sketches, but not
	// using methods like merging existing distinct counts.
	DistinctValueCounts() map[int]int64
	// LowerBoundValues is a mapping from column id to the lower bound
	// of the column, serialized as binary. Each bound must be less than
	// or equal to all non-null, non-NaN values in the column for the
	// file.
	LowerBoundValues() map[int][]byte
	// UpperBoundValues is a mapping from column id to the upper bound
	// of the column, serialized as binary. Each bound must be greater
	// than or equal to all non-null, non-NaN values in the column for
	// the file.
	UpperBoundValues() map[int][]byte
	// KeyMetadata is implementation-specific key metadata for encryption.
	KeyMetadata() []byte
	// SplitOffsets are the split offsets for the data file. For example,
	// all row group offsets in a Parquet file. Must be sorted ascending.
	SplitOffsets() []int64
	// EqualityFieldIDs are used to determine row equality in equality
	// delete files. It is required when the content type is
	// EntryContentEqDeletes.
	EqualityFieldIDs() []int
	// SortOrderID returns the id representing the sort order for this
	// file, or nil if there is no sort order.
	SortOrderID() *int
}

DataFile is the interface for reading information about a data file referenced by a manifest entry.

type DataFileBuilder ¶

type DataFileBuilder struct {
	// contains filtered or unexported fields
}

func NewDataFileV1Builder ¶

func NewDataFileV1Builder(
	FilePath string,
	FileFormat FileFormat,
	PartitionSpec map[string]any,
	RecordCount int64,
	FileSizeBytes int64,
) DataFileBuilder

func (DataFileBuilder) Build ¶

func (builder DataFileBuilder) Build() DataFile

func (DataFileBuilder) ColumnSizes ¶

func (d DataFileBuilder) ColumnSizes() map[int]int64

func (DataFileBuilder) ContentType ¶

func (d DataFileBuilder) ContentType() ManifestEntryContent

func (DataFileBuilder) Count ¶

func (d DataFileBuilder) Count() int64

func (DataFileBuilder) DistinctValueCounts ¶

func (d DataFileBuilder) DistinctValueCounts() map[int]int64

func (DataFileBuilder) EqualityFieldIDs ¶

func (d DataFileBuilder) EqualityFieldIDs() []int

func (DataFileBuilder) FileFormat ¶

func (d DataFileBuilder) FileFormat() FileFormat

func (DataFileBuilder) FilePath ¶

func (d DataFileBuilder) FilePath() string

func (DataFileBuilder) FileSizeBytes ¶

func (d DataFileBuilder) FileSizeBytes() int64

func (DataFileBuilder) KeyMetadata ¶

func (d DataFileBuilder) KeyMetadata() []byte

func (DataFileBuilder) LowerBoundValues ¶

func (d DataFileBuilder) LowerBoundValues() map[int][]byte

func (DataFileBuilder) NaNValueCounts ¶

func (d DataFileBuilder) NaNValueCounts() map[int]int64

func (DataFileBuilder) NullValueCounts ¶

func (d DataFileBuilder) NullValueCounts() map[int]int64

func (DataFileBuilder) Partition ¶

func (d DataFileBuilder) Partition() map[string]any

func (DataFileBuilder) SortOrderID ¶

func (d DataFileBuilder) SortOrderID() *int

func (DataFileBuilder) SplitOffsets ¶

func (d DataFileBuilder) SplitOffsets() []int64

func (DataFileBuilder) UpperBoundValues ¶

func (d DataFileBuilder) UpperBoundValues() map[int][]byte

func (DataFileBuilder) ValueCounts ¶

func (d DataFileBuilder) ValueCounts() map[int]int64

func (DataFileBuilder) WithColumnSizes ¶

func (builder DataFileBuilder) WithColumnSizes(columnSizes map[int]int64) DataFileBuilder

func (DataFileBuilder) WithDistinctCounts ¶

func (builder DataFileBuilder) WithDistinctCounts(distinctCounts map[int]int64) DataFileBuilder

func (DataFileBuilder) WithKeyMetadata ¶

func (builder DataFileBuilder) WithKeyMetadata(keyMetadata []byte) DataFileBuilder

func (DataFileBuilder) WithLowerBounds ¶

func (builder DataFileBuilder) WithLowerBounds(lowerBounds map[int][]byte) DataFileBuilder

func (DataFileBuilder) WithNanValueCounts ¶

func (builder DataFileBuilder) WithNanValueCounts(nanValueCounts map[int]int64) DataFileBuilder

func (DataFileBuilder) WithNullValueCounts ¶

func (builder DataFileBuilder) WithNullValueCounts(nullValueCounts map[int]int64) DataFileBuilder

func (DataFileBuilder) WithSortOrderID ¶

func (builder DataFileBuilder) WithSortOrderID(sortOrderID int) DataFileBuilder

func (DataFileBuilder) WithSplitOffsets ¶

func (builder DataFileBuilder) WithSplitOffsets(splitOffsets []int64) DataFileBuilder

func (DataFileBuilder) WithUpperBounds ¶

func (builder DataFileBuilder) WithUpperBounds(upperBounds map[int][]byte) DataFileBuilder

func (DataFileBuilder) WithValueCounts ¶

func (builder DataFileBuilder) WithValueCounts(valueCounts map[int]int64) DataFileBuilder

type Date ¶

type Date int32

type DateType ¶

type DateType struct{}

DateType represents a calendar date without a timezone or time, stored as a 32-bit integer denoting the number of days since the Unix epoch.

func (DateType) Equals ¶

func (DateType) Equals(other Type) bool

func (DateType) String ¶

func (DateType) String() string

func (DateType) Type ¶

func (DateType) Type() string

type DayTransform ¶

type DayTransform struct{}

DayTransform transforms a datetime value into a date value.

func (DayTransform) MarshalText ¶

func (t DayTransform) MarshalText() ([]byte, error)

func (DayTransform) ResultType ¶

func (DayTransform) ResultType(Type) Type

func (DayTransform) String ¶

func (DayTransform) String() string

type DecimalType ¶

type DecimalType struct {
	// contains filtered or unexported fields
}

func DecimalTypeOf ¶

func DecimalTypeOf(prec, scale int) DecimalType

func (DecimalType) Equals ¶

func (d DecimalType) Equals(other Type) bool

func (DecimalType) Precision ¶

func (d DecimalType) Precision() int

func (DecimalType) Scale ¶

func (d DecimalType) Scale() int

func (DecimalType) String ¶

func (d DecimalType) String() string

func (DecimalType) Type ¶

func (d DecimalType) Type() string

type FieldSummary ¶

type FieldSummary struct {
	ContainsNull bool    `avro:"contains_null"`
	ContainsNaN  *bool   `avro:"contains_nan"`
	LowerBound   *[]byte `avro:"lower_bound"`
	UpperBound   *[]byte `avro:"upper_bound"`
}

type FileFormat ¶

type FileFormat string

FileFormat defines constants for the format of data files.

const (
	AvroFile    FileFormat = "AVRO"
	OrcFile     FileFormat = "ORC"
	ParquetFile FileFormat = "PARQUET"
)

type FixedType ¶

type FixedType struct {
	// contains filtered or unexported fields
}

func FixedTypeOf ¶

func FixedTypeOf(n int) FixedType

func (FixedType) Equals ¶

func (f FixedType) Equals(other Type) bool

func (FixedType) Len ¶

func (f FixedType) Len() int

func (FixedType) String ¶

func (f FixedType) String() string

func (FixedType) Type ¶

func (f FixedType) Type() string

type Float32Type ¶

type Float32Type struct{}

Float32Type is the "float" type in the Iceberg spec.

func (Float32Type) Equals ¶

func (Float32Type) Equals(other Type) bool

func (Float32Type) String ¶

func (Float32Type) String() string

func (Float32Type) Type ¶

func (Float32Type) Type() string

type Float64Type ¶

type Float64Type struct{}

Float64Type represents the "double" type of the Iceberg spec.

func (Float64Type) Equals ¶

func (Float64Type) Equals(other Type) bool

func (Float64Type) String ¶

func (Float64Type) String() string

func (Float64Type) Type ¶

func (Float64Type) Type() string

type HourTransform ¶

type HourTransform struct{}

HourTransform transforms a datetime value into an hour value.
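
The arithmetic behind the hour and day transforms can be sketched as below. This assumes timestamps are microseconds since the Unix epoch, as in the Iceberg table spec, and uses floor division so that pre-epoch values land in negative ordinals. The helper names are hypothetical, and this is not the package's implementation.

```go
package main

import "fmt"

const (
	microsPerHour int64 = 3_600_000_000
	microsPerDay  int64 = 24 * microsPerHour
)

// floorDiv divides a by a positive b, rounding toward negative infinity.
func floorDiv(a, b int64) int64 {
	q := a / b
	if a%b < 0 { // Go truncates toward zero, so adjust negative remainders
		q--
	}
	return q
}

// hourOrdinal returns whole hours since the epoch for a microsecond timestamp.
func hourOrdinal(micros int64) int32 { return int32(floorDiv(micros, microsPerHour)) }

// dayOrdinal returns whole days since the epoch for a microsecond timestamp.
func dayOrdinal(micros int64) int32 { return int32(floorDiv(micros, microsPerDay)) }

func main() {
	ts := 25 * microsPerHour // one day and one hour after the epoch
	fmt.Println(hourOrdinal(ts), dayOrdinal(ts)) // 25 1
	fmt.Println(hourOrdinal(-1), dayOrdinal(-1)) // -1 -1
}
```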

func (HourTransform) MarshalText ¶

func (t HourTransform) MarshalText() ([]byte, error)

func (HourTransform) ResultType ¶

func (HourTransform) ResultType(Type) Type

func (HourTransform) String ¶

func (HourTransform) String() string

type IdentityTransform ¶

type IdentityTransform struct{}

IdentityTransform uses the identity function, performing no transformation but instead partitioning on the value itself.

func (IdentityTransform) MarshalText ¶

func (t IdentityTransform) MarshalText() ([]byte, error)

func (IdentityTransform) ResultType ¶

func (IdentityTransform) ResultType(t Type) Type

func (IdentityTransform) String ¶

func (IdentityTransform) String() string

type Int32Type ¶

type Int32Type struct{}

Int32Type is the "int"/"integer" type of the Iceberg spec.

func (Int32Type) Equals ¶

func (Int32Type) Equals(other Type) bool

func (Int32Type) String ¶

func (Int32Type) String() string

func (Int32Type) Type ¶

func (Int32Type) Type() string

type Int64Type ¶

type Int64Type struct{}

Int64Type is the "long" type of the Iceberg spec.

func (Int64Type) Equals ¶

func (Int64Type) Equals(other Type) bool

func (Int64Type) String ¶

func (Int64Type) String() string

func (Int64Type) Type ¶

func (Int64Type) Type() string

type ListType ¶

type ListType struct {
	ElementID       int  `json:"element-id"`
	Element         Type `json:"-"`
	ElementRequired bool `json:"element-required"`
}

func (*ListType) ElementField ¶

func (l *ListType) ElementField() NestedField

func (*ListType) Equals ¶

func (l *ListType) Equals(other Type) bool

func (*ListType) Fields ¶

func (l *ListType) Fields() []NestedField

func (*ListType) MarshalJSON ¶

func (l *ListType) MarshalJSON() ([]byte, error)

func (*ListType) String ¶

func (l *ListType) String() string

func (*ListType) Type ¶

func (*ListType) Type() string

func (*ListType) UnmarshalJSON ¶

func (l *ListType) UnmarshalJSON(b []byte) error

type ManifestContent ¶

type ManifestContent int32

ManifestContent indicates the type of data inside of the files described by a manifest. This will indicate whether the data files contain active data or deleted rows.

const (
	ManifestContentData    ManifestContent = 0
	ManifestContentDeletes ManifestContent = 1
)

type ManifestEntry ¶

type ManifestEntry interface {
	// Status returns the type of the file tracked by this entry.
	// Deletes are informational only and not used in scans.
	Status() ManifestEntryStatus
	// SnapshotID is the ID of the snapshot in which the file was added
	// or deleted. If null, it is inherited from the manifest list.
	SnapshotID() int64
	// SequenceNum returns the data sequence number of the file.
	// If it was null and the status is EntryStatusADDED then it
	// is inherited from the manifest list.
	SequenceNum() int64
	// FileSequenceNum returns the file sequence number indicating
	// when the file was added. If it was null and the status is
	// EntryStatusADDED then it is inherited from the manifest list.
	FileSequenceNum() *int64
	// DataFile provides the information about the data file indicated
	// by this manifest entry.
	DataFile() DataFile
	// contains filtered or unexported methods
}

ManifestEntry is an interface for both v1 and v2 manifest entries.

func NewManifestEntryV1 ¶

func NewManifestEntryV1(entryStatus ManifestEntryStatus, snapshotID int64, data DataFile) ManifestEntry

type ManifestEntryContent ¶

type ManifestEntryContent int8

ManifestEntryContent defines constants for the type of file contents in the file entries: data, position-based deletes, and equality-based deletes.

const (
	EntryContentData       ManifestEntryContent = 0
	EntryContentPosDeletes ManifestEntryContent = 1
	EntryContentEqDeletes  ManifestEntryContent = 2
)

type ManifestEntryStatus ¶

type ManifestEntryStatus int8

ManifestEntryStatus defines constants for the status of an entry: existing, added, or deleted.

const (
	EntryStatusEXISTING ManifestEntryStatus = 0
	EntryStatusADDED    ManifestEntryStatus = 1
	EntryStatusDELETED  ManifestEntryStatus = 2
)
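
Since deleted entries are informational only and are not used in scans, a reader typically keeps just the existing and added entries. The sketch below shows that filter with local, hypothetical types that reuse the same numeric status values. It does not use the package's ManifestEntry interface.

```go
package main

import "fmt"

// entryStatus mirrors the numeric values of ManifestEntryStatus.
type entryStatus int8

const (
	statusExisting entryStatus = 0
	statusAdded    entryStatus = 1
	statusDeleted  entryStatus = 2
)

// entry is a hypothetical, flattened stand-in for a manifest entry.
type entry struct {
	Status entryStatus
	Path   string
}

// liveFiles returns the paths of entries that a scan should still read,
// skipping entries marked as deleted.
func liveFiles(entries []entry) []string {
	var live []string
	for _, e := range entries {
		if e.Status == statusDeleted {
			continue
		}
		live = append(live, e.Path)
	}
	return live
}

func main() {
	entries := []entry{
		{statusExisting, "a.parquet"},
		{statusDeleted, "b.parquet"},
		{statusAdded, "c.parquet"},
	}
	fmt.Println(liveFiles(entries)) // [a.parquet c.parquet]
}
```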

type ManifestFile ¶

type ManifestFile interface {
	// Version returns the version number of this manifest file.
	// It should be 1 or 2.
	Version() int
	// FilePath is the location URI of this manifest file.
	FilePath() string
	// Length is the length in bytes of the manifest file.
	Length() int64
	// PartitionSpecID is the ID of the partition spec used to write
	// this manifest. It must be listed in the table metadata
	// partition-specs.
	PartitionSpecID() int32
	// ManifestContent is the type of files tracked by this manifest,
	// either data or delete files. All v1 manifests track data files.
	ManifestContent() ManifestContent
	// SnapshotID is the ID of the snapshot where this manifest file
	// was added.
	SnapshotID() int64
	// AddedDataFiles returns the number of entries in the manifest that
	// have the status of EntryStatusADDED.
	AddedDataFiles() int32
	// ExistingDataFiles returns the number of entries in the manifest
	// which have the status of EntryStatusEXISTING.
	ExistingDataFiles() int32
	// DeletedDataFiles returns the number of entries in the manifest
	// which have the status of EntryStatusDELETED.
	DeletedDataFiles() int32
	// AddedRows returns the number of rows in all files of the manifest
	// that have status EntryStatusADDED.
	AddedRows() int64
	// ExistingRows returns the number of rows in all files of the manifest
	// which have status EntryStatusEXISTING.
	ExistingRows() int64
	// DeletedRows returns the number of rows in all files of the manifest
	// which have status EntryStatusDELETED.
	DeletedRows() int64
	// SequenceNum returns the sequence number when this manifest was
	// added to the table. Will be 0 for v1 manifest lists.
	SequenceNum() int64
	// MinSequenceNum is the minimum data sequence number of all live data
	// or delete files in the manifest. Will be 0 for v1 manifest lists.
	MinSequenceNum() int64
	// KeyMetadata returns implementation-specific key metadata for encryption
	// if it exists in the manifest list.
	KeyMetadata() []byte
	// Partitions returns a list of field summaries for each partition
	// field in the spec. Each field in the list corresponds to a field in
	// the manifest file's partition spec.
	Partitions() []FieldSummary

	// HasAddedFiles returns true if AddedDataFiles > 0 or if it was null.
	HasAddedFiles() bool
	// HasExistingFiles returns true if ExistingDataFiles > 0 or if it was null.
	HasExistingFiles() bool
	// FetchEntries reads the manifest list file to fetch the list of
	// manifest entries using the provided bucket. It will return the schema of the table
	// when the manifest was written.
	// If discardDeleted is true, entries for files containing deleted rows
	// will be skipped.
	FetchEntries(bucket objstore.Bucket, discardDeleted bool) ([]ManifestEntry, *Schema, error)
}

ManifestFile is the interface which covers both V1 and V2 manifest files.

func ReadManifestList ¶

func ReadManifestList(in io.Reader) ([]ManifestFile, error)

ReadManifestList reads in an avro manifest list file and returns a slice of manifest files or an error if one is encountered.

type ManifestV1Builder ¶

type ManifestV1Builder struct {
	// contains filtered or unexported fields
}

ManifestV1Builder is a helper for building a V1 manifest file struct which will conform to the ManifestFile interface.

func NewManifestV1Builder ¶

func NewManifestV1Builder(path string, length int64, partitionSpecID int32, addedSnapshotID int64) *ManifestV1Builder

NewManifestV1Builder takes all of the required fields. The optional fields can then be set by calling the corresponding methods before calling ManifestV1Builder.Build to construct the object.

func (*ManifestV1Builder) AddedFiles ¶

func (b *ManifestV1Builder) AddedFiles(cnt int32) *ManifestV1Builder

func (*ManifestV1Builder) AddedRows ¶

func (b *ManifestV1Builder) AddedRows(cnt int64) *ManifestV1Builder

func (*ManifestV1Builder) Build ¶

func (b *ManifestV1Builder) Build() ManifestFile

Build returns the constructed manifest file. After calling Build, the builder should not be used further: to avoid copying, Build returns a pointer to the constructed manifest file, so further calls to the modifier methods would modify the returned ManifestFile.
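
The aliasing behaviour described for Build can be shown with a small stand-alone sketch. The manifest and builder types here are hypothetical stand-ins, not the package's own types; the sketch only assumes, as the description says, that Build hands out the same pointer the builder keeps writing to.

```go
package main

import "fmt"

// manifest is a hypothetical stand-in for the struct a builder constructs.
type manifest struct {
	path       string
	addedFiles int32
}

// builder mimics the shape of a manifest builder: it owns a pointer to the
// value under construction and returns that same pointer from Build.
type builder struct{ m *manifest }

func newBuilder(path string) *builder {
	return &builder{m: &manifest{path: path}}
}

// AddedFiles sets an optional field and returns the builder for chaining.
func (b *builder) AddedFiles(cnt int32) *builder {
	b.m.addedFiles = cnt
	return b
}

// Build returns the pointer without copying.
func (b *builder) Build() *manifest { return b.m }

func main() {
	b := newBuilder("manifest.avro")
	built := b.AddedFiles(3).Build()

	// Reusing the builder after Build mutates the value already handed out.
	b.AddedFiles(99)
	fmt.Println(built.addedFiles) // 99
}
```

If a second manifest is needed, creating a fresh builder avoids this shared state.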

func (*ManifestV1Builder) DeletedFiles ¶

func (b *ManifestV1Builder) DeletedFiles(cnt int32) *ManifestV1Builder

func (*ManifestV1Builder) DeletedRows ¶

func (b *ManifestV1Builder) DeletedRows(cnt int64) *ManifestV1Builder

func (*ManifestV1Builder) ExistingFiles ¶

func (b *ManifestV1Builder) ExistingFiles(cnt int32) *ManifestV1Builder

func (*ManifestV1Builder) ExistingRows ¶

func (b *ManifestV1Builder) ExistingRows(cnt int64) *ManifestV1Builder

func (*ManifestV1Builder) KeyMetadata ¶

func (b *ManifestV1Builder) KeyMetadata(km []byte) *ManifestV1Builder

func (*ManifestV1Builder) Partitions ¶

func (b *ManifestV1Builder) Partitions(p []FieldSummary) *ManifestV1Builder

type ManifestV2Builder ¶

type ManifestV2Builder struct {
	// contains filtered or unexported fields
}

ManifestV2Builder is a helper for building a V2 manifest file struct which will conform to the ManifestFile interface.

func NewManifestV2Builder ¶

func NewManifestV2Builder(path string, length int64, partitionSpecID int32, content ManifestContent, addedSnapshotID int64) *ManifestV2Builder

NewManifestV2Builder is constructed with the primary fields; the remaining fields keep their zero value unless set by calling the corresponding methods of the builder. Call ManifestV2Builder.Build to retrieve the constructed ManifestFile.

func (*ManifestV2Builder) AddedFiles ¶

func (b *ManifestV2Builder) AddedFiles(cnt int32) *ManifestV2Builder

func (*ManifestV2Builder) AddedRows ¶

func (b *ManifestV2Builder) AddedRows(cnt int64) *ManifestV2Builder

func (*ManifestV2Builder) Build ¶

func (b *ManifestV2Builder) Build() ManifestFile

Build returns the constructed manifest file. After calling Build, the builder should not be used further: to avoid copying, Build returns a pointer to the constructed manifest file, so further calls to the modifier methods would modify the returned ManifestFile.

func (*ManifestV2Builder) DeletedFiles ¶

func (b *ManifestV2Builder) DeletedFiles(cnt int32) *ManifestV2Builder

func (*ManifestV2Builder) DeletedRows ¶

func (b *ManifestV2Builder) DeletedRows(cnt int64) *ManifestV2Builder

func (*ManifestV2Builder) ExistingFiles ¶

func (b *ManifestV2Builder) ExistingFiles(cnt int32) *ManifestV2Builder

func (*ManifestV2Builder) ExistingRows ¶

func (b *ManifestV2Builder) ExistingRows(cnt int64) *ManifestV2Builder

func (*ManifestV2Builder) KeyMetadata ¶

func (b *ManifestV2Builder) KeyMetadata(km []byte) *ManifestV2Builder

func (*ManifestV2Builder) Partitions ¶

func (b *ManifestV2Builder) Partitions(p []FieldSummary) *ManifestV2Builder

func (*ManifestV2Builder) SequenceNum ¶

func (b *ManifestV2Builder) SequenceNum(num, minSeqNum int64) *ManifestV2Builder

type MapType ¶

type MapType struct {
	KeyID         int  `json:"key-id"`
	KeyType       Type `json:"-"`
	ValueID       int  `json:"value-id"`
	ValueType     Type `json:"-"`
	ValueRequired bool `json:"value-required"`
}

func (*MapType) Equals ¶

func (m *MapType) Equals(other Type) bool

func (*MapType) Fields ¶

func (m *MapType) Fields() []NestedField

func (*MapType) KeyField ¶

func (m *MapType) KeyField() NestedField

func (*MapType) MarshalJSON ¶

func (m *MapType) MarshalJSON() ([]byte, error)

func (*MapType) String ¶

func (m *MapType) String() string

func (*MapType) Type ¶

func (*MapType) Type() string

func (*MapType) UnmarshalJSON ¶

func (m *MapType) UnmarshalJSON(b []byte) error

func (*MapType) ValueField ¶

func (m *MapType) ValueField() NestedField

type MonthTransform ¶

type MonthTransform struct{}

MonthTransform transforms a datetime value into a month value.

func (MonthTransform) MarshalText ¶

func (t MonthTransform) MarshalText() ([]byte, error)

func (MonthTransform) ResultType ¶

func (MonthTransform) ResultType(Type) Type

func (MonthTransform) String ¶

func (MonthTransform) String() string

type NestedField ¶

type NestedField struct {
	Type `json:"-"`

	ID             int    `json:"id"`
	Name           string `json:"name"`
	Required       bool   `json:"required"`
	Doc            string `json:"doc,omitempty"`
	InitialDefault any    `json:"initial-default,omitempty"`
	WriteDefault   any    `json:"write-default,omitempty"`
}

func (*NestedField) Equals ¶

func (n *NestedField) Equals(other NestedField) bool

func (NestedField) MarshalJSON ¶

func (n NestedField) MarshalJSON() ([]byte, error)

func (NestedField) String ¶

func (n NestedField) String() string

func (*NestedField) UnmarshalJSON ¶

func (n *NestedField) UnmarshalJSON(b []byte) error

type NestedType ¶

type NestedType interface {
	Type
	Fields() []NestedField
}

NestedType is an interface that allows access to the child fields of a nested type such as a list/struct/map type.

type PartitionField ¶

type PartitionField struct {
	// SourceID is the source column id of the table's schema
	SourceID int `json:"source-id"`
	// FieldID is the partition field id across all the table partition specs
	FieldID int `json:"field-id,omitempty"`
	// Name is the name of the partition field itself
	Name string `json:"name"`
	// Transform is the transform used to produce the partition value
	Transform Transform `json:"transform"`
}

PartitionField represents how one partition value is derived from the source column by transformation.

func (*PartitionField) String ¶

func (p *PartitionField) String() string

func (*PartitionField) UnmarshalJSON ¶

func (p *PartitionField) UnmarshalJSON(b []byte) error

type PartitionSpec ¶

type PartitionSpec struct {
	// contains filtered or unexported fields
}

PartitionSpec captures the transformation from table data to partition values.

func NewPartitionSpec ¶

func NewPartitionSpec(fields ...PartitionField) PartitionSpec

func NewPartitionSpecID ¶

func NewPartitionSpecID(id int, fields ...PartitionField) PartitionSpec

func (*PartitionSpec) CompatibleWith ¶

func (ps *PartitionSpec) CompatibleWith(other *PartitionSpec) bool

CompatibleWith returns true if this partition spec is considered compatible with the passed in partition spec. This means that the two specs have equivalent field lists regardless of the spec id.

func (*PartitionSpec) Equals ¶

func (ps *PartitionSpec) Equals(other PartitionSpec) bool

Equals returns true iff the field lists are the same AND the spec id is the same between this partition spec and the provided one.
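
The difference between CompatibleWith and Equals can be sketched with stand-in types. The field and spec types below are hypothetical and do not come from the package, and treating "equivalent field lists" as exact field-by-field equality is an assumption of this sketch; the real comparison may ignore some field attributes.

```go
package main

import "fmt"

// field and spec are hypothetical stand-ins for PartitionField and PartitionSpec.
type field struct {
	SourceID  int
	FieldID   int
	Name      string
	Transform string
}

type spec struct {
	id     int
	fields []field
}

// sameFields reports whether both specs list identical fields in the same order.
func sameFields(a, b spec) bool {
	if len(a.fields) != len(b.fields) {
		return false
	}
	for i := range a.fields {
		if a.fields[i] != b.fields[i] {
			return false
		}
	}
	return true
}

// compatibleWith ignores the spec ID; equals also requires matching IDs.
func compatibleWith(a, b spec) bool { return sameFields(a, b) }
func equals(a, b spec) bool         { return a.id == b.id && sameFields(a, b) }

func main() {
	fields := []field{{SourceID: 1, FieldID: 1000, Name: "ts_day", Transform: "day"}}
	a := spec{id: 0, fields: fields}
	b := spec{id: 1, fields: fields}
	fmt.Println(compatibleWith(a, b), equals(a, b)) // true false
}
```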

func (*PartitionSpec) Field ¶

func (ps *PartitionSpec) Field(i int) PartitionField

func (*PartitionSpec) FieldsBySourceID ¶

func (ps *PartitionSpec) FieldsBySourceID(fieldID int) []PartitionField

func (*PartitionSpec) ID ¶

func (ps *PartitionSpec) ID() int

func (*PartitionSpec) IsUnpartitioned ¶

func (ps *PartitionSpec) IsUnpartitioned() bool

func (*PartitionSpec) LastAssignedFieldID ¶

func (ps *PartitionSpec) LastAssignedFieldID() int

func (PartitionSpec) MarshalJSON ¶

func (ps PartitionSpec) MarshalJSON() ([]byte, error)

func (*PartitionSpec) NumFields ¶

func (ps *PartitionSpec) NumFields() int

func (*PartitionSpec) PartitionType ¶

func (ps *PartitionSpec) PartitionType(schema *Schema) *StructType

PartitionType produces a struct of the partition spec.

The partition fields should be optional:

All partition transforms are required to produce null if the input value is null. This can happen when the source column is optional.
Partition fields may be added later, in which case not all files would have the result field and it may be null.

There is a case where we can guarantee that a partition field in the first and only partition spec that uses a required source column will never be null, but it doesn't seem worth tracking this case.

func (PartitionSpec) String ¶

func (ps PartitionSpec) String() string

func (*PartitionSpec) UnmarshalJSON ¶

func (ps *PartitionSpec) UnmarshalJSON(b []byte) error

type PrimitiveType ¶

type PrimitiveType interface {
	Type
	// contains filtered or unexported methods
}

type Properties ¶

type Properties map[string]string

type Schema ¶

type Schema struct {
	ID                 int   `json:"schema-id"`
	IdentifierFieldIDs []int `json:"identifier-field-ids"`
	// contains filtered or unexported fields
}

Schema is an Iceberg table schema, represented as a struct with multiple fields. The fields are exposed only through accessor methods, rather than as a slice, to ensure that a schema is immutable.

func NewSchema ¶

func NewSchema(id int, fields ...NestedField) *Schema

NewSchema constructs a new schema with the provided ID and list of fields.

func NewSchemaWithIdentifiers ¶

func NewSchemaWithIdentifiers(id int, identifierIDs []int, fields ...NestedField) *Schema

NewSchemaWithIdentifiers constructs a new schema with the provided ID and fields, along with a slice of field IDs to be listed as identifier fields.

func PruneColumns ¶

func PruneColumns(schema *Schema, selected map[int]Void, selectFullTypes bool) (*Schema, error)

PruneColumns visits a schema pruning any columns which do not exist in the provided selected set. Parent fields of a selected child will be retained.
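
The selected parameter is a set of field IDs expressed as map[int]Void, where Void is an alias for struct{}. A minimal sketch of building such a set follows; idSet is a hypothetical helper, not part of the package, and the Void alias is redeclared locally so the example runs without the package.

```go
package main

import "fmt"

// Void matches the alias declared by the package: type Void = struct{}.
type Void = struct{}

// idSet builds the kind of set that a map[int]Void parameter expects.
// Duplicate IDs collapse into a single key.
func idSet(ids ...int) map[int]Void {
	set := make(map[int]Void, len(ids))
	for _, id := range ids {
		set[id] = Void{}
	}
	return set
}

func main() {
	selected := idSet(1, 3, 3)
	_, has3 := selected[3]
	_, has2 := selected[2]
	fmt.Println(len(selected), has3, has2) // 2 true false
}
```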

func (*Schema) AsStruct ¶

func (s *Schema) AsStruct() StructType

AsStruct returns a Struct with the same fields as the schema which can then be used as a Type.

func (*Schema) Equals ¶

func (s *Schema) Equals(other *Schema) bool

Equals compares the fields and identifierIDs, but does not compare the schema ID itself.

func (*Schema) Field ¶

func (s *Schema) Field(i int) NestedField

func (*Schema) Fields ¶

func (s *Schema) Fields() []NestedField

func (*Schema) FindColumnName ¶

func (s *Schema) FindColumnName(fieldID int) (string, bool)

FindColumnName returns the name of the column identified by the passed in field id. The second return value reports whether or not the field id was found in the schema.

func (*Schema) FindFieldByID ¶

func (s *Schema) FindFieldByID(id int) (NestedField, bool)

FindFieldByID is like *Schema.FindColumnName, but returns the whole field rather than just the field name.

func (*Schema) FindFieldByName ¶

func (s *Schema) FindFieldByName(name string) (NestedField, bool)

FindFieldByName returns the field identified by the given name. The second return value is false if no field with this name is found.

Note: This search is done in a case sensitive manner. To perform a case insensitive search, use *Schema.FindFieldByNameCaseInsensitive.

func (*Schema) FindFieldByNameCaseInsensitive ¶

func (s *Schema) FindFieldByNameCaseInsensitive(name string) (NestedField, bool)

FindFieldByNameCaseInsensitive is like *Schema.FindFieldByName, but performs a case insensitive search.

func (*Schema) FindTypeByID ¶

func (s *Schema) FindTypeByID(id int) (Type, bool)

FindTypeByID is like *Schema.FindFieldByID, but returns only the data type of the field.

func (*Schema) FindTypeByName ¶

func (s *Schema) FindTypeByName(name string) (Type, bool)

FindTypeByName is a convenience function for calling *Schema.FindFieldByName, and then returning just the type.

func (*Schema) FindTypeByNameCaseInsensitive ¶

func (s *Schema) FindTypeByNameCaseInsensitive(name string) (Type, bool)

FindTypeByNameCaseInsensitive is like *Schema.FindTypeByName but performs a case insensitive search.

func (*Schema) HighestFieldID ¶

func (s *Schema) HighestFieldID() int

HighestFieldID returns the value of the numerically highest field ID in this schema.

func (*Schema) MarshalJSON ¶

func (s *Schema) MarshalJSON() ([]byte, error)

func (*Schema) Merge ¶

func (s *Schema) Merge(other *Schema) (*Schema, error)

Merge combines two schemas into a single schema. It returns a schema with an ID that is one greater than the ID of the first schema. If the two schemas have the same fields, the first schema is returned.

func (*Schema) NumFields ¶

func (s *Schema) NumFields() int

func (*Schema) Select ¶

func (s *Schema) Select(caseSensitive bool, names ...string) (*Schema, error)

Select creates a new schema containing only the fields identified by the given names, in the order they are provided. If caseSensitive is false, fields are matched by case-insensitive search.

An error is returned if a requested name cannot be found.

func (*Schema) String ¶

func (s *Schema) String() string

func (*Schema) Type ¶

func (s *Schema) Type() string

func (*Schema) UnmarshalJSON ¶

func (s *Schema) UnmarshalJSON(b []byte) error

type SchemaVisitor ¶

type SchemaVisitor[T any] interface {
	Schema(schema *Schema, structResult T) T
	Struct(st StructType, fieldResults []T) T
	Field(field NestedField, fieldResult T) T
	List(list ListType, elemResult T) T
	Map(mapType MapType, keyResult, valueResult T) T
	Primitive(p PrimitiveType) T
}

SchemaVisitor is an interface that can be implemented to allow for easy traversal and processing of a schema.

A SchemaVisitor can also optionally implement the Before/After Field, ListElement, MapKey, or MapValue interfaces to allow them to get called at the appropriate points within schema traversal.
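
The visitor shape above, where each callback receives the results already computed for its children, can be illustrated with a reduced stand-alone sketch. The node type, the two-method visitor interface, and the visit function below are hypothetical simplifications; the real SchemaVisitor has six callbacks and is driven by the package's Visit function.

```go
package main

import "fmt"

// node is a hypothetical, simplified schema tree. In this sketch a node
// without children is treated as a primitive, anything else as a struct.
type node struct {
	name     string
	children []node
}

// visitor mirrors the idea of SchemaVisitor[T] with only two callbacks.
type visitor[T any] interface {
	Struct(n node, fieldResults []T) T
	Primitive(n node) T
}

// visit walks the children first and passes their results to the parent callback.
func visit[T any](n node, v visitor[T]) T {
	if len(n.children) == 0 {
		return v.Primitive(n)
	}
	results := make([]T, 0, len(n.children))
	for _, c := range n.children {
		results = append(results, visit(c, v))
	}
	return v.Struct(n, results)
}

// leafCounter counts primitive leaves in the tree.
type leafCounter struct{}

func (leafCounter) Primitive(node) int { return 1 }

func (leafCounter) Struct(_ node, fieldResults []int) int {
	total := 0
	for _, r := range fieldResults {
		total += r
	}
	return total
}

func main() {
	schema := node{name: "root", children: []node{
		{name: "id"},
		{name: "location", children: []node{{name: "lat"}, {name: "lon"}}},
	}}
	fmt.Println(visit[int](schema, leafCounter{})) // 3
}
```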

type StringType ¶

type StringType struct{}

func (StringType) Equals ¶

func (StringType) Equals(other Type) bool

func (StringType) String ¶

func (StringType) String() string

func (StringType) Type ¶

func (StringType) Type() string

type StructType ¶

type StructType struct {
	FieldList []NestedField `json:"fields"`
}

func (*StructType) Equals ¶

func (s *StructType) Equals(other Type) bool

func (*StructType) Fields ¶

func (s *StructType) Fields() []NestedField

func (*StructType) MarshalJSON ¶

func (s *StructType) MarshalJSON() ([]byte, error)

func (*StructType) String ¶

func (s *StructType) String() string

func (*StructType) Type ¶

func (*StructType) Type() string

type Time ¶

type Time int64

type TimeType ¶

type TimeType struct{}

TimeType represents a number of microseconds since midnight.

func (TimeType) Equals ¶

func (TimeType) Equals(other Type) bool

func (TimeType) String ¶

func (TimeType) String() string

func (TimeType) Type ¶

func (TimeType) Type() string

type Timestamp ¶

type Timestamp int64

type TimestampType ¶

type TimestampType struct{}

TimestampType represents a number of microseconds since the unix epoch without regard for timezone.

func (TimestampType) Equals ¶

func (TimestampType) Equals(other Type) bool

func (TimestampType) String ¶

func (TimestampType) String() string

func (TimestampType) Type ¶

func (TimestampType) Type() string

type TimestampTzType ¶

type TimestampTzType struct{}

TimestampTzType represents a timestamp stored as UTC representing the number of microseconds since the unix epoch.

func (TimestampTzType) Equals ¶

func (TimestampTzType) Equals(other Type) bool

func (TimestampTzType) String ¶

func (TimestampTzType) String() string

func (TimestampTzType) Type ¶

func (TimestampTzType) Type() string

type Transform ¶

type Transform interface {
	fmt.Stringer
	encoding.TextMarshaler
	ResultType(t Type) Type
}

Transform is an interface for the various transformation types in partition specs. The transforms do not yet provide actual transformation functions; those are expected to follow as data reading is implemented.

func ParseTransform ¶

func ParseTransform(s string) (Transform, error)

ParseTransform takes the string representation of a transform as defined in the iceberg spec, and produces the appropriate Transform object or an error if the string is not a valid transform string.
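
The Iceberg spec writes transforms as identity, void, year, month, day, hour, bucket[N], and truncate[W]. The following stand-alone sketch parses those forms into a hypothetical parsed struct; it is not the package's ParseTransform, and details such as case handling in the real function are not covered here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// paramRe matches the two parameterized forms, bucket[N] and truncate[W].
var paramRe = regexp.MustCompile(`^(bucket|truncate)\[(\d+)\]$`)

// parsed is a hypothetical result type: the transform kind plus its
// numeric parameter (0 when the transform takes none).
type parsed struct {
	Kind  string
	Param int
}

// parseTransformString recognizes the transform strings listed in the spec.
func parseTransformString(s string) (parsed, error) {
	switch s {
	case "identity", "void", "year", "month", "day", "hour":
		return parsed{Kind: s}, nil
	}
	if m := paramRe.FindStringSubmatch(s); m != nil {
		n, err := strconv.Atoi(m[2])
		if err != nil {
			return parsed{}, err
		}
		return parsed{Kind: m[1], Param: n}, nil
	}
	return parsed{}, fmt.Errorf("invalid transform: %q", s)
}

func main() {
	for _, s := range []string{"bucket[16]", "year", "bucket[]"} {
		p, err := parseTransformString(s)
		fmt.Println(p.Kind, p.Param, err != nil)
	}
}
```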

type TruncateTransform ¶

type TruncateTransform struct {
	Width int
}

TruncateTransform is a transformation for truncating a value to a specified width.
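
Since the transforms here do not yet implement the transformation itself, the sketch below follows the Iceberg spec's definition of truncation rather than any code in this package: integers are rounded down to a multiple of the width, and strings keep at most Width Unicode code points. The helpers truncateInt and truncateString are hypothetical.

```go
package main

import "fmt"

// truncateInt rounds v down to the nearest multiple of width (width > 0).
// The double modulo keeps the remainder non-negative for negative v.
func truncateInt(width, v int64) int64 {
	return v - (((v % width) + width) % width)
}

// truncateString keeps at most width Unicode code points of s.
func truncateString(width int, s string) string {
	r := []rune(s)
	if len(r) <= width {
		return s
	}
	return string(r[:width])
}

func main() {
	fmt.Println(truncateInt(10, 17))          // 10
	fmt.Println(truncateInt(10, -1))          // -10
	fmt.Println(truncateString(3, "iceberg")) // ice
}
```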

func (TruncateTransform) MarshalText ¶

func (t TruncateTransform) MarshalText() ([]byte, error)

func (TruncateTransform) ResultType ¶

func (TruncateTransform) ResultType(t Type) Type

func (TruncateTransform) String ¶

func (t TruncateTransform) String() string

type Type ¶

type Type interface {
	fmt.Stringer
	Type() string
	Equals(Type) bool
}

Type is an interface representing any of the available iceberg types, such as primitives (int32/int64/etc.) or nested types (list/struct/map).

type UUIDType ¶

type UUIDType struct{}

func (UUIDType) Equals ¶

func (UUIDType) Equals(other Type) bool

func (UUIDType) String ¶

func (UUIDType) String() string

func (UUIDType) Type ¶

func (UUIDType) Type() string

type Void ¶

type Void = struct{}

type VoidTransform ¶

type VoidTransform struct{}

VoidTransform is a transformation that always returns nil.

func (VoidTransform) MarshalText ¶

func (t VoidTransform) MarshalText() ([]byte, error)

func (VoidTransform) ResultType ¶

func (VoidTransform) ResultType(t Type) Type

func (VoidTransform) String ¶

func (VoidTransform) String() string

type YearTransform ¶

type YearTransform struct{}

YearTransform transforms a datetime value into a year value.

func (YearTransform) MarshalText ¶

func (t YearTransform) MarshalText() ([]byte, error)

func (YearTransform) ResultType ¶

func (YearTransform) ResultType(Type) Type

func (YearTransform) String ¶

func (YearTransform) String() string

Directories ¶

Path	Synopsis
catalog
table
