chunk

package
v0.0.0-...-9d95335 Latest
Warning

This package is not in the latest version of its module.

Published: Aug 22, 2024 License: MIT Imports: 11 Imported by: 2

Documentation


Constants

const (
	InitialCapacity = 32
)

Capacity constants.

Variables

This section is empty.

Functions

This section is empty.

Types

type Chunk

type Chunk struct {
	// contains filtered or unexported fields
}

Chunk stores multiple rows of data in Apache Arrow format (see https://arrow.apache.org/docs/memory_layout.html). Values are appended in a compact format and can be accessed directly without decoding. Once a chunk has been processed, its allocated memory can be reused by resetting it.
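The append-then-reset life cycle described above can be illustrated with a minimal sketch. The types below (toyColumn, toyChunk) are hypothetical stand-ins that hold only int64 values; they are not this package's Chunk and ignore null bitmaps and variable-length data.

```go
package main

import "fmt"

// toyColumn is a hypothetical stand-in for one column of fixed-width values.
type toyColumn struct {
	data []int64
}

// toyChunk is a hypothetical column-oriented container, not the package's Chunk.
type toyChunk struct {
	columns []toyColumn
}

// newToyChunk allocates numCols columns, each with room for capacity values.
func newToyChunk(numCols, capacity int) *toyChunk {
	c := &toyChunk{columns: make([]toyColumn, numCols)}
	for i := range c.columns {
		c.columns[i].data = make([]int64, 0, capacity)
	}
	return c
}

// appendInt64 appends one value to the column at colIdx.
func (c *toyChunk) appendInt64(colIdx int, v int64) {
	c.columns[colIdx].data = append(c.columns[colIdx].data, v)
}

// numRows reports the number of rows, taken from the first column.
func (c *toyChunk) numRows() int {
	if len(c.columns) == 0 {
		return 0
	}
	return len(c.columns[0].data)
}

// reset drops all rows but keeps the backing arrays, so later appends
// reuse the memory that was already allocated.
func (c *toyChunk) reset() {
	for i := range c.columns {
		c.columns[i].data = c.columns[i].data[:0]
	}
}

func main() {
	chk := newToyChunk(2, 32)
	chk.appendInt64(0, 1)
	chk.appendInt64(1, 10)
	fmt.Println(chk.numRows(), cap(chk.columns[0].data)) // 1 32

	chk.reset()
	fmt.Println(chk.numRows(), cap(chk.columns[0].data)) // 0 32
}
```

After reset the row count is zero while the capacity is unchanged, which is why any data read from the chunk before the reset must no longer be in use.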

func NewChunk

func NewChunk(fields []*types.FieldType) *Chunk

NewChunk creates a new chunk with field types.

func NewChunkWithCapacity

func NewChunkWithCapacity(fields []*types.FieldType, cap int) *Chunk

NewChunkWithCapacity creates a new chunk with field types and capacity.

func (*Chunk) Append

func (c *Chunk) Append(other *Chunk, begin, end int)

Append appends the rows in [begin, end) of another Chunk to this Chunk.

func (*Chunk) AppendBytes

func (c *Chunk) AppendBytes(colIdx int, b []byte)

AppendBytes appends a bytes value to the chunk.

func (*Chunk) AppendDatum

func (c *Chunk) AppendDatum(colIdx int, d *types.Datum)

AppendDatum appends a datum into the chunk.

func (*Chunk) AppendDuration

func (c *Chunk) AppendDuration(colIdx int, dur types.Duration)

AppendDuration appends a Duration value to the chunk.

func (*Chunk) AppendEnum

func (c *Chunk) AppendEnum(colIdx int, enum types.Enum)

AppendEnum appends an Enum value to the chunk.

func (*Chunk) AppendFloat32

func (c *Chunk) AppendFloat32(colIdx int, f float32)

AppendFloat32 appends a float32 value to the chunk.

func (*Chunk) AppendFloat64

func (c *Chunk) AppendFloat64(colIdx int, f float64)

AppendFloat64 appends a float64 value to the chunk.

func (*Chunk) AppendInt64

func (c *Chunk) AppendInt64(colIdx int, i int64)

AppendInt64 appends an int64 value to the chunk.

func (*Chunk) AppendJSON

func (c *Chunk) AppendJSON(colIdx int, j json.BinaryJSON)

AppendJSON appends a JSON value to the chunk.

func (*Chunk) AppendMyDecimal

func (c *Chunk) AppendMyDecimal(colIdx int, dec *types.MyDecimal)

AppendMyDecimal appends a MyDecimal value to the chunk.

func (*Chunk) AppendNull

func (c *Chunk) AppendNull(colIdx int)

AppendNull appends a null value to the chunk.

func (*Chunk) AppendPartialRow

func (c *Chunk) AppendPartialRow(colIdx int, row Row)

AppendPartialRow appends the columns of a row to the chunk, starting at column colIdx.

func (*Chunk) AppendRow

func (c *Chunk) AppendRow(row Row)

AppendRow appends a row to the chunk.

func (*Chunk) AppendSet

func (c *Chunk) AppendSet(colIdx int, set types.Set)

AppendSet appends a Set value to the chunk.

func (*Chunk) AppendString

func (c *Chunk) AppendString(colIdx int, str string)

AppendString appends a string value to the chunk.

func (*Chunk) AppendTime

func (c *Chunk) AppendTime(colIdx int, t types.Time)

AppendTime appends a Time value to the chunk. TODO: change the time structure so it can be directly written to memory.

func (*Chunk) AppendUint64

func (c *Chunk) AppendUint64(colIdx int, u uint64)

AppendUint64 appends a uint64 value to the chunk.

func (*Chunk) GetRow

func (c *Chunk) GetRow(idx int) Row

GetRow gets the Row in the chunk with the row index.

func (*Chunk) LowerBound

func (c *Chunk) LowerBound(colIdx int, ad *types.Datum) (index int, match bool)

LowerBound searches the non-decreasing column colIdx and returns the smallest index i such that the value at row i is not less than `ad`.
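A lower-bound search of this kind can be sketched with the standard library's sort.Search. The helper below is hypothetical and works on a plain sorted []int64 rather than on a Chunk column. It assumes that match reports whether the value at the returned index equals the target, which the documentation above does not state.

```go
package main

import (
	"fmt"
	"sort"
)

// lowerBound is a hypothetical helper: it returns the smallest index i such
// that col[i] >= target, or len(col) if no such index exists. The meaning of
// match (exact equality at index) is an assumption made for this sketch.
func lowerBound(col []int64, target int64) (index int, match bool) {
	index = sort.Search(len(col), func(i int) bool { return col[i] >= target })
	match = index < len(col) && col[index] == target
	return index, match
}

func main() {
	col := []int64{1, 3, 3, 7} // must be non-decreasing
	fmt.Println(lowerBound(col, 3)) // 1 true
	fmt.Println(lowerBound(col, 4)) // 3 false
	fmt.Println(lowerBound(col, 9)) // 4 false
}
```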

func (*Chunk) MakeRef

func (c *Chunk) MakeRef(srcColIdx, dstColIdx int)

MakeRef makes the column at "dstColIdx" reference the column at "srcColIdx".

func (*Chunk) MemoryUsage

func (c *Chunk) MemoryUsage() (sum int64)

MemoryUsage returns the total memory usage of a Chunk in bytes. The sizes of column.length and column.nullCount are ignored, since they have little effect on the total memory usage.

func (*Chunk) NumCols

func (c *Chunk) NumCols() int

NumCols returns the number of columns in the chunk.

func (*Chunk) NumRows

func (c *Chunk) NumRows() int

NumRows returns the number of rows in the chunk.

func (*Chunk) Reset

func (c *Chunk) Reset()

Reset resets the chunk so that the memory it allocated can be reused. Make sure none of the data in the chunk is still in use before reusing it.

func (*Chunk) SetNumVirtualRows

func (c *Chunk) SetNumVirtualRows(numVirtualRows int)

SetNumVirtualRows sets the virtual row number for a Chunk. It should only be used when the Chunk has no columns.

func (*Chunk) SwapColumn

func (c *Chunk) SwapColumn(colIdx int, other *Chunk, otherIdx int)

SwapColumn swaps column "c.columns[colIdx]" with column "other.columns[otherIdx]".

func (*Chunk) SwapColumns

func (c *Chunk) SwapColumns(other *Chunk)

SwapColumns swaps columns with another Chunk.

func (*Chunk) TruncateTo

func (c *Chunk) TruncateTo(numRows int)

TruncateTo truncates the Chunk to "numRows" rows, removing rows from the tail.

type CompareFunc

type CompareFunc = func(l Row, lCol int, r Row, rCol int) int

CompareFunc is a function that compares two values in Rows. The two columns must have the same type.
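The idea of picking a comparison function by column type can be sketched as follows. Everything here (toyRow, kind, getCompareFunc) is hypothetical and only mirrors the shape of the CompareFunc signature. The return convention of -1, 0, or 1 is an assumption, since the documentation does not spell it out.

```go
package main

import (
	"fmt"
	"strings"
)

// toyRow is a hypothetical row: one dynamically typed value per column.
type toyRow []interface{}

// compareFunc mirrors the shape of CompareFunc for the toy row type.
type compareFunc = func(l toyRow, lCol int, r toyRow, rCol int) int

// kind is a hypothetical, much reduced stand-in for a field type.
type kind int

const (
	kindInt64 kind = iota
	kindString
)

// cmpInt64 compares two int64 columns; it returns -1, 0, or 1 (assumed convention).
func cmpInt64(l toyRow, lCol int, r toyRow, rCol int) int {
	a, b := l[lCol].(int64), r[rCol].(int64)
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

// cmpString compares two string columns lexicographically.
func cmpString(l toyRow, lCol int, r toyRow, rCol int) int {
	return strings.Compare(l[lCol].(string), r[rCol].(string))
}

// getCompareFunc selects the comparison once, so callers do not have to
// switch on the type for every pair of rows.
func getCompareFunc(k kind) compareFunc {
	if k == kindString {
		return cmpString
	}
	return cmpInt64
}

func main() {
	l := toyRow{int64(1), "b"}
	r := toyRow{"a", int64(2)}
	fmt.Println(getCompareFunc(kindInt64)(l, 0, r, 1))  // -1
	fmt.Println(getCompareFunc(kindString)(l, 1, r, 0)) // 1
}
```

Note that the compared columns may sit at different positions in the two rows, which is why the signature takes a column index for each side.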

func GetCompareFunc

func GetCompareFunc(tp *types.FieldType) CompareFunc

GetCompareFunc gets a compare function for the field type.

type Iterator

type Iterator interface {
	// Begin resets the cursor of the iterator and returns the first Row.
	Begin() Row

	// Next returns the next Row.
	Next() Row

	// End returns the invalid end Row.
	End() Row

	// Len returns the length.
	Len() int

	// Current returns the current Row.
	Current() Row

	// ReachEnd reaches the end of iterator.
	ReachEnd()
}

Iterator is used to iterate a number of rows.

for row := it.Begin(); row != it.End(); row = it.Next() {
    ...
}
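To show how the loop above terminates, here is a minimal sketch of an iterator that follows the same Begin/Next/End convention. The toyRow and toyIterator types are hypothetical; the sentinel row with index -1 is an assumption of this sketch, not the package's representation of the end Row.

```go
package main

import "fmt"

// toyRow is a hypothetical comparable row handle; idx -1 marks the end row.
type toyRow struct{ idx int }

// toyIterator is a hypothetical iterator over a slice of strings.
type toyIterator struct {
	vals   []string
	cursor int
}

// Begin resets the cursor and returns the first row (or the end row if empty).
func (it *toyIterator) Begin() toyRow {
	it.cursor = 0
	return it.Current()
}

// Next advances the cursor and returns the row it now points at.
func (it *toyIterator) Next() toyRow {
	if it.cursor < len(it.vals) {
		it.cursor++
	}
	return it.Current()
}

// Current returns the row under the cursor, or the end row when exhausted.
func (it *toyIterator) Current() toyRow {
	if it.cursor >= len(it.vals) {
		return it.End()
	}
	return toyRow{idx: it.cursor}
}

// End returns the invalid sentinel row that terminates the loop.
func (it *toyIterator) End() toyRow { return toyRow{idx: -1} }

// Len returns the number of rows.
func (it *toyIterator) Len() int { return len(it.vals) }

// ReachEnd moves the cursor past the last row.
func (it *toyIterator) ReachEnd() { it.cursor = len(it.vals) }

// collect walks the iterator with the documented loop shape.
func collect(it *toyIterator) []string {
	var out []string
	for row := it.Begin(); row != it.End(); row = it.Next() {
		out = append(out, it.vals[row.idx])
	}
	return out
}

func main() {
	it := &toyIterator{vals: []string{"a", "b", "c"}}
	fmt.Println(collect(it), it.Len()) // [a b c] 3
}
```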

func NewIterator4List

func NewIterator4List(li *List) Iterator

NewIterator4List returns an Iterator for List.

func NewIterator4RowPtr

func NewIterator4RowPtr(li *List, ptrs []RowPtr) Iterator

NewIterator4RowPtr returns an Iterator for RowPtrs.

func NewIterator4Slice

func NewIterator4Slice(rows []Row) Iterator

NewIterator4Slice returns an Iterator for a Row slice.

type Iterator4Chunk

type Iterator4Chunk struct {
	// contains filtered or unexported fields
}

Iterator4Chunk is used to iterate rows inside a chunk.

func NewIterator4Chunk

func NewIterator4Chunk(chk *Chunk) *Iterator4Chunk

NewIterator4Chunk returns an iterator for Chunk.

func (*Iterator4Chunk) Begin

func (it *Iterator4Chunk) Begin() Row

Begin implements the Iterator interface.

func (*Iterator4Chunk) Current

func (it *Iterator4Chunk) Current() Row

Current implements the Iterator interface.

func (*Iterator4Chunk) End

func (it *Iterator4Chunk) End() Row

End implements the Iterator interface.

func (*Iterator4Chunk) Len

func (it *Iterator4Chunk) Len() int

Len implements the Iterator interface.

func (*Iterator4Chunk) Next

func (it *Iterator4Chunk) Next() Row

Next implements the Iterator interface.

func (*Iterator4Chunk) ReachEnd

func (it *Iterator4Chunk) ReachEnd()

ReachEnd implements the Iterator interface.

type List

type List struct {
	// contains filtered or unexported fields
}

List holds a slice of chunks. It is used to append rows while keeping each chunk within the max chunk size.

func NewList

func NewList(fieldTypes []*types.FieldType, maxChunkSize int) *List

NewList creates a new List with field types and max chunk size.

func (*List) Add

func (l *List) Add(chk *Chunk)

Add adds a chunk to the List. The chunk may be modified later by the List. The caller must make sure that the input chk is not empty, is no longer used elsewhere, and has the same field types as the List.

func (*List) AppendRow

func (l *List) AppendRow(row Row) RowPtr

AppendRow appends a row to the List. The row is copied into the List.

func (*List) GetChunk

func (l *List) GetChunk(chkIdx int) *Chunk

GetChunk gets the Chunk at index chkIdx.

func (*List) GetMemTracker

func (l *List) GetMemTracker() *memory.Tracker

GetMemTracker returns the memory tracker of this List.

func (*List) GetRow

func (l *List) GetRow(ptr RowPtr) Row

GetRow gets a Row from the list by RowPtr.

func (*List) Len

func (l *List) Len() int

Len returns the length of the List.

func (*List) NumChunks

func (l *List) NumChunks() int

NumChunks returns the number of chunks in the List.

func (*List) Reset

func (l *List) Reset()

Reset resets the List.

func (*List) Walk

func (l *List) Walk(walkFunc ListWalkFunc) error

Walk iterates over the list and calls walkFunc for each row.

type ListWalkFunc

type ListWalkFunc = func(row Row) error

ListWalkFunc is used to walk the list. If it returns an error, the walk stops.
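The stop-on-error behavior can be sketched with a small hypothetical walk helper over a slice of ints. That the walk hands the callback's error back to the caller is an assumption here, suggested by Walk's error return value but not stated in the documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// errStop is a hypothetical sentinel a callback returns to end the walk early.
var errStop = errors.New("stop walking")

// toyWalkFunc mirrors the shape of ListWalkFunc for rows that are plain ints.
type toyWalkFunc = func(row int) error

// walk calls fn for each row in order and stops at the first error,
// returning that error to the caller (assumed behavior).
func walk(rows []int, fn toyWalkFunc) error {
	for _, r := range rows {
		if err := fn(r); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	visited := 0
	err := walk([]int{1, 2, 3, 4}, func(row int) error {
		visited++
		if row == 2 {
			return errStop
		}
		return nil
	})
	fmt.Println(visited, err) // 2 stop walking
}
```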

type MutRow

type MutRow Row

MutRow represents a mutable Row. The underlying columns contain only one row and are not exposed to the user.

func MutRowFromDatums

func MutRowFromDatums(datums []types.Datum) MutRow

MutRowFromDatums creates a MutRow from a datum slice.

func MutRowFromTypes

func MutRowFromTypes(types []*types.FieldType) MutRow

MutRowFromTypes creates a MutRow from a FieldType slice. Each column is initialized to its zero value.

func MutRowFromValues

func MutRowFromValues(vals ...interface{}) MutRow

MutRowFromValues creates a MutRow from an interface slice.

func (MutRow) Len

func (mr MutRow) Len() int

Len returns the number of columns.

func (MutRow) SetDatum

func (mr MutRow) SetDatum(colIdx int, d types.Datum)

SetDatum sets the MutRow with colIdx and datum.

func (MutRow) SetDatums

func (mr MutRow) SetDatums(datums ...types.Datum)

SetDatums sets the MutRow with datum slice.

func (MutRow) SetRow

func (mr MutRow) SetRow(row Row)

SetRow sets the MutRow with Row.

func (MutRow) SetValue

func (mr MutRow) SetValue(colIdx int, val interface{})

SetValue sets the MutRow with colIdx and value.

func (MutRow) SetValues

func (mr MutRow) SetValues(vals ...interface{})

SetValues sets the MutRow with values.

func (MutRow) ToRow

func (mr MutRow) ToRow() Row

ToRow converts the MutRow to Row, so it can be used to read data.

type Row

type Row struct {
	// contains filtered or unexported fields
}

Row represents a row of data and can be used to access values.

func (Row) GetBytes

func (r Row) GetBytes(colIdx int) []byte

GetBytes returns the bytes value with the colIdx.

func (Row) GetDatum

func (r Row) GetDatum(colIdx int, tp *types.FieldType) types.Datum

GetDatum implements the types.Row interface.

func (Row) GetDatumRow

func (r Row) GetDatumRow(fields []*types.FieldType) types.DatumRow

GetDatumRow converts chunk.Row to types.DatumRow. Keep in mind that GetDatumRow holds a reference to r.c, the underlying chunk, so it works only while that chunk remains valid and unchanged.

func (Row) GetDuration

func (r Row) GetDuration(colIdx int) types.Duration

GetDuration returns the Duration value with the colIdx.

func (Row) GetEnum

func (r Row) GetEnum(colIdx int) types.Enum

GetEnum returns the Enum value with the colIdx.

func (Row) GetFloat32

func (r Row) GetFloat32(colIdx int) float32

GetFloat32 returns the float32 value with the colIdx.

func (Row) GetFloat64

func (r Row) GetFloat64(colIdx int) float64

GetFloat64 returns the float64 value with the colIdx.

func (Row) GetInt64

func (r Row) GetInt64(colIdx int) int64

GetInt64 returns the int64 value with the colIdx.

func (Row) GetJSON

func (r Row) GetJSON(colIdx int) json.BinaryJSON

GetJSON returns the JSON value with the colIdx.

func (Row) GetMyDecimal

func (r Row) GetMyDecimal(colIdx int) *types.MyDecimal

GetMyDecimal returns the MyDecimal value with the colIdx.

func (Row) GetSet

func (r Row) GetSet(colIdx int) types.Set

GetSet returns the Set value with the colIdx.

func (Row) GetString

func (r Row) GetString(colIdx int) string

GetString returns the string value with the colIdx.

func (Row) GetTime

func (r Row) GetTime(colIdx int) types.Time

GetTime returns the Time value with the colIdx. TODO: use Time structure directly.

func (Row) GetUint64

func (r Row) GetUint64(colIdx int) uint64

GetUint64 returns the uint64 value with the colIdx.

func (Row) Idx

func (r Row) Idx() int

Idx returns the index of the row within its Chunk.

func (Row) IsNull

func (r Row) IsNull(colIdx int) bool

IsNull implements the types.Row interface.

func (Row) Len

func (r Row) Len() int

Len returns the number of values in the row.

type RowPtr

type RowPtr struct {
	ChkIdx uint32
	RowIdx uint32
}

RowPtr is used to get a row from a list. It is only valid for the list that returns it.
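The way a (ChkIdx, RowIdx) pair addresses a row in a list of size-limited chunks can be sketched as below. The toyList type is hypothetical and stores plain strings; it only mimics the behavior described for List.AppendRow and List.GetRow.

```go
package main

import "fmt"

// rowPtr mirrors the two fields of RowPtr.
type rowPtr struct {
	chkIdx uint32
	rowIdx uint32
}

// toyList is a hypothetical list of string "chunks", each holding at most
// maxChunkSize rows.
type toyList struct {
	chunks       [][]string
	maxChunkSize int
}

// appendRow copies v into the last chunk, starting a new chunk when the last
// one is full, and returns a pointer that locates the row later.
func (l *toyList) appendRow(v string) rowPtr {
	last := len(l.chunks) - 1
	if last < 0 || len(l.chunks[last]) >= l.maxChunkSize {
		l.chunks = append(l.chunks, make([]string, 0, l.maxChunkSize))
		last++
	}
	l.chunks[last] = append(l.chunks[last], v)
	return rowPtr{chkIdx: uint32(last), rowIdx: uint32(len(l.chunks[last]) - 1)}
}

// getRow resolves a pointer previously returned by appendRow on the same list.
func (l *toyList) getRow(p rowPtr) string {
	return l.chunks[p.chkIdx][p.rowIdx]
}

func main() {
	l := &toyList{maxChunkSize: 2}
	var ptrs []rowPtr
	for _, v := range []string{"a", "b", "c"} {
		ptrs = append(ptrs, l.appendRow(v))
	}
	fmt.Println(ptrs[2], l.getRow(ptrs[2]), len(l.chunks)) // {1 0} c 2
}
```

Because the pointer is just a pair of indexes, it carries no information about which list produced it, which is why it is only meaningful for the list that returned it.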
