table

package
v0.5.0-prerelease Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 28, 2024 License: Apache-2.0 Imports: 17 Imported by: 0

README

About Chunkers

Chunkers are used to split a table into multiple chunks for copying. The downside of a chunk being too large is as follows:

  • Replicas fall behind the master (we don't support read-replicas, but at a certain point it impacts DR).
  • Locks are held for the duration of a chunk copy, so a large chunk can block other operations on the table.

All chunkers support "dynamic chunking" which is our way of saying that from a configuration perspective you specify the ideal chunk size in time-based units (e.g. 500ms) and the chunker will adjust the chunk size to try to meet that target. This tends to be a better approach than specifying a fixed chunk size, because the chunk size can vary wildly depending on the table. As the _new table gets larger, we typically see the chunk size reduce significantly to compensate for larger insert times. We believe this is more likely to occur on Aurora than MySQL because on IO bound workloads it does not have the change buffer.

Our thoughts are that spirit should be aggressive in copying, but there should only be minimal elevation in p99 response times. If you consider that a table regularly has DML queries that take 1-5ms, then it is reasonable to assume a chunk-time of 500ms will elevate some query to 505ms. Assuming this contention is limited, it may only be observed by the pMax and not the p99. It is usually application-dependent how much of a latency hit is acceptable. Our belief is that 500ms is on the high-end of acceptable for defaults, and users will typically lower it rather than increase it. We limit the maximum chunk time to 5s because it is unlikely that users can tolerate larger than a 5s latency hit for a single query on an OLTP system. Since we also adjust various lock wait timeouts based on the assumption that chunks are about this size, increasing beyond 5s would require additional tuning.

Chunking becomes a complicated problem because data can have an uneven distribution, and some tables have composite or strange data types for PRIMARY KEYs. We have chosen to solve the chunking problem by not using a one-size-fits-all approach, but rather an interface that has two current implementations: composite and optimistic.

Composite Chunker

The composite chunker is our newest chunker, and it is selected unless the table has an AUTO_INCREMENT single-column PRIMARY KEY.

Its implementation is very similar to how the chunker in gh-ost works:

  • A SELECT statement is performed to find the exact PK value of the row that is chunkSize larger than the current chunk pointer:
SELECT pk FROM table WHERE pk > chunkPointer ORDER BY pk LIMIT 1 OFFSET {chunkSize}
  • An INSERT .. SELECT statement is run on the table to copy between the last chunk pointer and the new value.

The composite chunker is very good at dividing the chunks up equally, since sparing a brief race condition each chunk will match exactly the chunkSize value. The main downside is that it becomes a little bit wasteful when you have an auto_increment PRIMARY KEYs and rarely delete data. In this case, you waste the initial SELECT statement, since the client could easily calculate the next chunk pointer by adding chunkSize to the previous chunk pointer. A second issue is that the composite chunker always returns FALSE for the KeyAboveHighWatermark optimization. It is possible that it could be implemented correctly in future, but for code-simplicity we have chosen not to for now.

Many of our use-cases have auto_increment PRIMARY KEYs, so despite the composite chunker also being able to support non-composite PRIMARY KEYs, we have no plans to switch to it entirely.

Optimistic Chunker

The optimistic chunker was our first chunker, and it's ideal for one of our main use case: AUTO_INCREMENT PRIMARY KEYs.

Its basic implementation is as follows:

  • Find the min/max of the PRIMARY KEY column.
  • Have a special chunk retrieve values less than the min value*
  • Advance by chunk size until you reach the max value.
  • Have a special chunk retrieve values greater than the max value*

* Because the value is cached, it's possible a small number of rows exist, particularly at the end of the table.

To a certain extent, the chunk-size will automatically adjust to small gaps in the table as dynamic chunking adjusts to compensate for slightly faster copies. However, this is intentionally limited with dynamic chunking having a hard limit on the chunk size of 100K rows. It can also only expand the chunk-size by 50% at a time. This helps prevent the scenario that quickly processed chunks (likely caused by table gaps) expand the chunk size too quickly, causing future chunks to be too large and causing QoS issues.

To deal with large gaps, the optimistic chunker also supports a special "prefetching mode". The prefetching mode is enabled when the chunk size has already reached the 100K limit, and each chunk is still only taking 20% of the target time for chunk copying. Prefetching was first developed when we discovered a user with ~20 million rows in the table but a big gap between the auto_increment value of 20 million and the end of the table (300 billion). You can think of the prefetching mode as not that much different from how the composite chunker works, as it will perform a SELECT query to find the next PK value it should use as a pointer. Prefetching is automatically disabled again if the chunk size is ever reduced below the 100K limit.

Documentation

Overview

Package table contains some common utilities for working with tables such as a 'Chunker' feature.

Index

Constants

View Source
const (
	// StartingChunkSize is the initial chunkSize
	StartingChunkSize = 1000
	// MaxDynamicStepFactor is the maximum amount each recalculation of the dynamic chunkSize can
	// increase by. For example, if the newTarget is 5000 but the current target is 1000, the newTarget
	// will be capped back down to 1500. Over time the number 5000 will be reached, but not straight away.
	MaxDynamicStepFactor = 1.5
	// MinDynamicRowSize is the minimum chunkSize that can be used when dynamic chunkSize is enabled.
	// This helps prevent a scenario where the chunk size is too small (it can never be less than 1).
	MinDynamicRowSize = 10
	// MaxDynamicRowSize is the max allowed chunkSize that can be used when dynamic chunkSize is enabled.
	// This seems like a safe upper bound for now
	MaxDynamicRowSize = 100000
	// DynamicPanicFactor is the factor by which the feedback process takes immediate action when
	// the chunkSize appears to be too large. For example, if the PanicFactor is 5, and the target *time*
	// is 50ms, an actual time 250ms+ will cause the dynamic chunk size to immediately be reduced.
	DynamicPanicFactor = 5

	// ChunkerDefaultTarget is the default chunker target
	ChunkerDefaultTarget = 100 * time.Millisecond
)

Variables

View Source
var (
	ErrTableIsRead       = errors.New("table is read")
	ErrTableNotOpen      = errors.New("please call Open() first")
	ErrUnsupportedPKType = errors.New("unsupported primary key type")
)

Functions

func LazyFindP90

func LazyFindP90(a []time.Duration) time.Duration

LazyFindP90 finds the second to last value in a slice. This is the same as a p90 if there are 10 values, but if there were 100 values it would technically be a p99 etc.

func QuoteColumns

func QuoteColumns(cols []string) string

Types

type Boundary

type Boundary struct {
	Value     []Datum
	Inclusive bool
}

Boundary is used by chunk for lower or upper boundary

func (*Boundary) JSON

func (b *Boundary) JSON() string

JSON encodes a boundary as JSON. The values are represented as strings, to avoid JSON float behavior. See Issue #125

type Chunk

type Chunk struct {
	Key                  []string
	ChunkSize            uint64
	LowerBound           *Boundary
	UpperBound           *Boundary
	AdditionalConditions string
}

Chunk is returned by chunk.Next() Applications can use it to iterate over the rows.

func (*Chunk) JSON

func (c *Chunk) JSON() string

func (*Chunk) String

func (c *Chunk) String() string

String strigifies a chunk into a fragment what can be used in a WHERE clause. i.e. pk > 100 and pk < 200

type Chunker

type Chunker interface {
	Open() error
	OpenAtWatermark(watermark string, datum Datum) error
	IsRead() bool
	Close() error
	Next() (*Chunk, error)
	Feedback(chunk *Chunk, duration time.Duration)
	GetLowWatermark() (string, error)
	KeyAboveHighWatermark(key interface{}) bool
}

func NewChunker

func NewChunker(t *TableInfo, chunkerTarget time.Duration, logger loggers.Advanced) (Chunker, error)

func NewCompositeChunker

func NewCompositeChunker(t *TableInfo, chunkerTarget time.Duration, logger loggers.Advanced, keyName string, whereCondition string) (Chunker, error)

NewCompositeChunker returns a chunkerComposite , setting its Key if keyName and where conditions are provided

type Datum

type Datum struct {
	Val interface{}
	Tp  datumTp // signed, unsigned, binary
}

Datum could be a binary string, uint64 or int64.

func NewNilDatum

func NewNilDatum(tp datumTp) Datum

func (Datum) Add

func (d Datum) Add(addVal uint64) Datum

func (Datum) GreaterThanOrEqual

func (d Datum) GreaterThanOrEqual(d2 Datum) bool

func (Datum) IsNil

func (d Datum) IsNil() bool

func (Datum) IsNumeric

func (d Datum) IsNumeric() bool

IsNumeric checks if it's signed or unsigned

func (Datum) MaxValue

func (d Datum) MaxValue() Datum

func (Datum) MinValue

func (d Datum) MinValue() Datum

func (Datum) Range

func (d Datum) Range(d2 Datum) uint64

Range returns the diff between 2 datums as an uint64.

func (Datum) String

func (d Datum) String() string

String returns the datum as a SQL escaped string

type JSONBoundary

type JSONBoundary struct {
	Value     []string
	Inclusive bool
}

type JSONChunk

type JSONChunk struct {
	Key        []string
	ChunkSize  uint64
	LowerBound JSONBoundary
	UpperBound JSONBoundary
}

type Operator

type Operator string

Operator is used to compare values in a WHERE clause.

const (
	OpGreaterThan  Operator = ">"
	OpGreaterEqual Operator = ">="
	OpLessThan     Operator = "<"
	OpLessEqual    Operator = "<="
)

type TableInfo

type TableInfo struct {
	sync.Mutex

	EstimatedRows       uint64
	SchemaName          string
	TableName           string
	QuotedName          string
	Columns             []string // all the column names
	NonGeneratedColumns []string // all the non-generated column names
	Indexes             []string // all the index names

	KeyColumns []string // the column names of the primaryKey

	KeyIsAutoInc bool // if pk[0] is an auto_increment column

	DisableAutoUpdateStatistics atomic.Bool
	// contains filtered or unexported fields
}

func NewTableInfo

func NewTableInfo(db *sql.DB, schema, table string) *TableInfo

func (*TableInfo) AutoUpdateStatistics

func (t *TableInfo) AutoUpdateStatistics(ctx context.Context, interval time.Duration, logger loggers.Advanced)

AutoUpdateStatistics runs a loop that updates the table statistics every interval. This will continue until Close() is called on the tableInfo, or t.DisableAutoUpdateStatistics is set to true.

func (*TableInfo) Close

func (t *TableInfo) Close() error

Close currently does nothing

func (*TableInfo) DescIndex

func (t *TableInfo) DescIndex(keyName string) ([]string, error)

DescIndex describes the columns in an index.

func (*TableInfo) MaxValue

func (t *TableInfo) MaxValue() Datum

MaxValue as a datum

func (*TableInfo) PrimaryKeyIsMemoryComparable

func (t *TableInfo) PrimaryKeyIsMemoryComparable() error

PrimaryKeyIsMemoryComparable checks that the PRIMARY KEY type is compatible. We no longer need this check for the chunker, since it can handle any type of key in the composite chunker. But the migration still needs to verify this, because of the delta map feature, which requires binary comparable keys.

func (*TableInfo) PrimaryKeyValues

func (t *TableInfo) PrimaryKeyValues(row interface{}) ([]interface{}, error)

PrimaryKeyValues helps extract the PRIMARY KEY from a row image. It uses our knowledge of the ordinal position of columns to find the position of primary key columns (there might be more than one). For minimal row image, you need to send the before image to extract the PK. This is because in the after image, the PK might be nil.

func (*TableInfo) SetInfo

func (t *TableInfo) SetInfo(ctx context.Context) error

SetInfo reads from MySQL metadata (usually infoschema) and sets the values in TableInfo.

func (*TableInfo) WrapCastType

func (t *TableInfo) WrapCastType(col string) string

Directories

Path Synopsis
Package asserty offers functionality to assert for certain DB properties.
Package asserty offers functionality to assert for certain DB properties.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL